Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
03-05-2020 02:27 AM
Do we really need to perform 2 queries when doing pagination:
For example: MATCH (u:User) WHERE ... RETURN u ORDER BY u.username SKIP 100 LIMIT 25
already has to get all the (filtered) nodes, sort them, then return just the nodes of the "page" (SKIP/LIMIT) as shown by a PROFILE
query:
I - and possibly millions of other users out there - ALWAYS need to show the total count of matching nodes to the user ... which is the number of rows after the (last) Filter step (4099 in this case).
Wouldn't it be possible for the DB to return this count as part of the query-stats?
Or is this possible with some secret parameter already?
Or is there a performant workaround that I am unaware of?
Cheers, Chris
03-05-2020 04:25 AM
03-05-2020 07:26 AM
I am - of course - already executing 2 (optimized) queries BUT it seams so stupid since the result of the count query is already part of the "real" query - that's why I am asking and proposing that the number should be returned in the query-stats.
03-05-2020 07:42 AM
But if you’re only returning 25, it doesn’t know the complete count
I was only commenting on the indexes, since the profile output does not look very optimized…
03-06-2020 02:00 AM
"But if you’re only returning 25, it doesn’t know the complete count"... it HAS to know the total count in the query plan (see 4099 rows in the screenshot of my original post) in order to sort and then skip/limit and THAT number is what we need
08-10-2020 07:34 AM
Ah, I totally see what you're up to.
I'm completely new to Neo4j, but I might have found a solution for you.
The key problem we're trying to solve here is "dragging" along a mid-query value to the final RETURN
statement. The obvious solution would be to have multiple RETURN
s, which, unfortunately, is not possible.
So the unsatisfying solution to still have two RETURN
s is two do two queries xD
Unless.. we use aggregation / lists.
I have to admit this is not a general solution to the "multiple RETURN
s problem", there are many many more cases where this would help and aggregation can't help out, but in your case it does the job.
MATCH (u:User)
WHERE u.age > 21
WITH u.username as username
ORDER BY username
WITH collect(username) AS usernames // <== usernames is now a list :p
RETURN size(usernames) AS userCount, usernames[0..25] as usernames
I am not familiar with the implementation, but maybe maybe this query isn't even that unperformant 😄 (despite dealing with huge arrays instead of natural streams of records xDD).
Would be nice to hear what your benchmarks are saying @chris3
08-13-2020 10:01 AM
@chris3 @Thomas_Silkjaer
I found another way 😄
My new query should be really really efficient, because in contrast to my first suggestion it does not aggregate a huge list at any point of time, but instead uses ordinary records/rows that are truncated with LIMIT
, just as @chris3 initially requested.
MATCH (u:User)
WHERE u.age > 21
WITH count(u) AS userCount
// now second match (cross product with the single userCount number record)
MATCH (u:User)
WITH u.username as username, userCount
// each username row/record now contains the same global userCount number as a second column
ORDER BY username
LIMIT 25
// from now on we only deal with 25 records (definitely computationally cheap)
RETURN
collect(DISTINCT userCount) AS userCount,
collect(DISTINCT username) AS usernames
Notice that userCount
still is a list here, but this shouldn't be a problem 😉
If you wanted you could unwrap it 😛
// ...
LIMIT 25
WITH
collect(DISTINCT userCount) AS userCountArr,
collect(DISTINCT username) AS usernames
UNWIND userCountArr AS userCount
RETURN userCount, usernames
Hope this helps. I have a lot of fun with Cypher Queries ;D
08-18-2020 04:00 AM
thanks for your answers!
I am using spring-data-neo4j (OGM) and kinda need a result with full nodes (1 per row => List)
I still think that the query plan could/should return this "count before skip/limit" 😉
08-18-2020 11:56 PM
I am trying to figure out why my proposal queries don't work in your setup. Unfortunately I'm not familiar with SpringData Neo4j.
Do you mean that in your setup the users MUST be returned as individual rows and not as a list (single row)?
Then really the only way is to use my second proposal and strip the collect()
aggregation part at the end.
Then each username row would come with the same total count. You then could simply read this number from any row received. Straight-Forward would be the first one.
MATCH (u:User)
WHERE u.age > 21
WITH count(u) AS userCount
MATCH (u:User)
RETURN u.username as username, userCount // <== every row stores this same number
ORDER BY username
LIMIT 25
I know it would be nice to have multiple RETURN
statements that commit to a Response Data Object. But I can assure you that this is currently only possible by running two separate queries.
It don't know if this is applicable to your stack, but for me there would be two possible ways to run these two queries in one database roundtrip:
.runMany()
procedure
10-20-2021 06:34 AM
I know it's over 1 year later but I just wanted to THANK YOU for your reply!
In the meantime I have been working on optimizing some queries in our production systems and I just wanted to let you know that your solution works and also works for OGM.
The only thing I don't like is that every row of the result has the count column (which requires extra domain objects for OGM). "Facets" (see MongoDB) would be nice.
All the sessions of the conference are now available online