06-17-2022 02:45 AM
I have 2 (very similar) questions regarding the WITH wildcard (*): Note that I'm not concerned about variable name clashes, as any variable introduced is guaranteed to have a unique name.
The purpose of this is to find out whether I can simplify the query generation tool I'm building by always using the wildcard. Otherwise I have to keep track of which variables are used where, and which need to be passed between scopes.
06-17-2022 07:26 AM - edited 06-17-2022 07:27 AM
I don’t know the answer, but I highly doubt it with any practical query, where the number of variables would be small. Of course, always using the wildcard would not work for a WITH clause with aggregate functions.
Not sure what your project is, but maybe the Neo4j DSL will help you build your queries more easily than creating them as strings.
06-17-2022 10:13 AM
Thanks, your point about the aggregation functions is an important one.
06-17-2022 10:24 AM
From a clarity perspective, it's better to call out the variables that are passed through.
I mostly use WITH * when I want to apply an in-between filter or pagination.
Performance-wise, listing only the variables you still need can reduce the width of the "register" that Cypher has to carry through, so it can free up the memory held by things that are no longer needed.
50 variables also sounds really dangerous, like a generated query. You should be careful with those, and make sure you profile them.
Also, if you generate unique names (e.g. UUIDs), then the Cypher planner and parser cannot cache your query plans, as every query will be a unique new one, and you'll lose a lot of performance from replanning on every request.
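For a query generator, the in-between filter and pagination cases could look roughly like the sketch below. It is only an illustration: the helper names, the Person/Movie pattern, and the $year, $skip and $limit parameters are assumptions, not anything from this thread.

```python
def filter_stage(predicate):
    """Pass every variable through unchanged and keep only matching rows."""
    return f"WITH * WHERE {predicate}"


def page_stage(order_by):
    """Pass every variable through unchanged and keep one page of rows."""
    # ORDER BY is included so that SKIP/LIMIT pages are stable between calls.
    return f"WITH * ORDER BY {order_by} SKIP $skip LIMIT $limit"


# Hypothetical pattern; the labels and properties are placeholders.
query = "\n".join([
    "MATCH (p:Person)-[:ACTED_IN]->(m:Movie)",
    filter_stage("m.released > $year"),
    page_stage("m.title"),
    "RETURN p.name AS actor, m.title AS movie",
])
print(query)
```

The filter and the pagination are kept as two separate WITH * stages so that the filter is applied before the page is cut.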
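If the generator already knows which variables later clauses refer to, emitting an explicit WITH is a small step. A minimal sketch, assuming the tool tracks variables as plain strings (the function name and the var1/var2/var3 names are hypothetical); whether this measurably helps memory use is something to confirm with PROFILE on the real queries.

```python
def explicit_with(in_scope, needed_later):
    """Build a WITH clause that carries forward only the variables still needed.

    in_scope:     variable names currently in scope, in declaration order.
    needed_later: names referenced by any later clause.
    """
    kept = [name for name in in_scope if name in needed_later]
    if not kept:
        # A WITH clause needs at least one projection item.
        raise ValueError("no variables left to carry forward")
    return "WITH " + ", ".join(kept)


print(explicit_with(["var1", "var2", "var3"], {"var1", "var3"}))
```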
06-17-2022 11:07 AM
Thanks Michael, your comments have led me to the decision to avoid the wildcard, except for the use case you mention of pagination or filtering. Related to this, it made me wonder if there's any reason, other than the clarity and readability, why Cypher doesn't allow one to use a LIMIT or WHERE clause without a prior WITH (or RETURN/MATCH) clause?
I highly doubt a query will ever get close to 50 variables, I just invented a large number to demonstrate my question. Regarding the unique names, they will be created sequentially (i.e. var1, var2...), so there shouldn't be an issue with replanning.
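The sequential naming described here could be sketched as follows. The class and function names are made up for illustration, and the sketch assumes that values are sent as parameters rather than inlined, so two requests with the same shape produce identical query text for the plan cache to reuse.

```python
import itertools


class NameAllocator:
    """Hands out var1, var2, ... starting from 1 for every new query."""

    def __init__(self, prefix="var"):
        self._prefix = prefix
        self._counter = itertools.count(1)

    def fresh(self):
        return f"{self._prefix}{next(self._counter)}"


def build_lookup(label, value):
    """Return (query_text, parameters) for a lookup by name (hypothetical shape)."""
    names = NameAllocator()  # new allocator per query, so names repeat across queries
    node = names.fresh()
    text = f"MATCH ({node}:{label} {{name: $name}}) RETURN {node}"
    return text, {"name": value}


text_a, params_a = build_lookup("Person", "Alice")
text_b, params_b = build_lookup("Person", "Bob")
print(text_a)
print(text_a == text_b)
```

Only the parameter maps differ between the two calls; the query strings are the same.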