Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
07-10-2022 09:55 PM - edited 07-10-2022 11:58 PM
I am building a Spring Boot service that connects to a neo4j database via the neo4j-java-driver. One REST endpoint builds a paginated search query with skip and limit that looks like the following:
MATCH (i:Item)-[:HAS_PRICE]->(p:Price)
WHERE p.a>= $aMin AND p.a <= $aMax AND p.b >= $bMin
WITH i, p
ORDER BY p.b asc
SKIP $skip
LIMIT $limit
RETURN i.code
With the query, I am expecting 320 unique items but only get 208. Also on some pages there are duplicate items that were already present in one of the previous pages. When returning all items without skip/limit the result is as expected. When not ordering, the result is also correct. The problem appears to be the combination of ordering with skip/limit.
I also tested the same query on the same database in python with the python neo4j driver version 4.4.2. The skip and limit sizes were the same and the query results were correct, no duplicates. Making the same queries in the Neo4j Browser (with the same skip/limit) does also return the correct results.
So I am thinking that the problem may be with the neo4j-java-driver. I tested it with versions 4.4.8, 5.0.0-alpha03 and 4.0.3. All have the same faulty behavior.
Please ask if you need more information and thank you for helping.
07-11-2022 11:46 AM
Edit: Using this solution I found worked. I still have no idea why there has to be a second ordering for it to avoid duplicates.
07-11-2022 12:53 PM
Is it possible for an item to have multiple prices or multiple items to have the same code?
07-11-2022 01:08 PM
The code is constraint to be unique and each item can only have one price
07-11-2022 01:41 PM
I thought that is what you would say. Its not making sense to me. do you mind running the following query?
MATCH (i:Item)-[:HAS_PRICE]->(p:Price)
WHERE p.a>= $aMin AND p.a <= $aMax AND p.b >= $bMin
return count(i) as iCount, count(p) as pCount, count(distinct i) as distinctCount
I assume your $skip value is n*$limit, where 'n' is the page number.
07-11-2022 03:05 PM - edited 07-11-2022 05:27 PM
Yes $skip is pageNumber * $limit
I was running the query with this test method:
public void test() {
String query = String.join(" ",
"MATCH (i:Item)-[:HAS_PRICE]->(p:Price)",
"WHERE p.a >= 7 AND p.a <= 39 AND p.b >= 1",
"return count(i) as iCount, count(p) as pCount, count(distinct i) as distinctCount"
);
try (Session session = driver.session(SessionConfig.forDatabase(databaseName))) {
Record rec = session.run(query).single();
int iCount = rec.get("iCount").asInt();
int pCount = rec.get("pCount").asInt();
int distinctCount = rec.get("distinctCount").asInt();
System.out.println("iCount: "+iCount+", pCount: "+pCount+", distinctCount: "+distinctCount);
}
}
This method produces the following output:
iCount: 320, pCount: 320, distinctCount: 320
And that is the correct number of items that match the criteria.
However, when I execute this test method:
public void test2() {
String query = String.join(" ",
"MATCH (i:Item)-[:HAS_PRICE]->(p:Price)",
"WHERE p.a >= 7 AND p.a <= 39 AND p.b >= 1",
"WITH i, p",
"ORDER BY p.b asc",
"SKIP $skip",
"LIMIT $limit",
"RETURN i.code"
);
int limit = 25;
int maxPage = 13; //25*13 exceeding the total number of 320 expected items
for(int page = 0; page < maxPage; page++){
int skip = page * limit;
System.out.println("ON PAGE: "+ page);
try (Session session = driver.session(SessionConfig.forDatabase(databaseName))) {
session.run(query, parameters("skip", skip, "limit", limit))
.stream()
.forEach(rec -> {
String code = rec.get(0).asString();
System.out.println(code);
});
}
}
}
I also get 320 item codes but for example one item code is present on page 5, 6, 7, 8 and 9 and there are many more duplicate items. When removing the duplicates, the total number of unique items is 208.
EDIT: it works correctly if I add another order by criteria like suggested in this post.
ORDER BY p.b asc, i.foundAt desc
Thank you for helping!!
07-11-2022 03:18 PM
When I run this python function, the result is also correct, 320 unique items with no duplicates.
def test():
limit = 25;
maxPage = 13; #25*13 exceeding the total number of 320 expected items
for page in range(maxPage):
skip = page * limit;
print("ON PAGE: ", page)
query = f"""
MATCH (i:Item)-[:HAS_PRICE]->(p:Price)
WHERE p.a >= 7 AND p.a <= 39 AND p.b >= 1
WITH i, p
ORDER BY p.b asc
SKIP {skip}
LIMIT {limit}
RETURN i.code
"""
res = connection.query(query)
for rec in res:
code = rec[0]
print(code)
All the sessions of the conference are now available online