05-28-2019 03:36 AM
I am wondering how the following query works. We start with an empty DB.
WITH [{a:1, b:null}, {a:1, b:null}] AS rows
UNWIND rows AS row
MERGE (t:Test {a:row.a, b:coalesce(row.b, 'none')}) SET t.b=row.b
Added 1 label, created 1 node, set 3 properties, completed after 3 ms.
MATCH (t:Test) return t
╒═══════╕
│"t" │
╞═══════╡
│{"a":1}│
└───────┘
Why does this only create one node?
I would assume the following:
Is coalesce or the SET doing something different here?
Note that if I run the two rows as separate queries (one per row), the result is exactly what I expect and describe above. If the rows are unwound in a single query, the result is different.
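For reference, the one-query-per-row variant mentioned above might look like the following sketch. The exact statements were not posted, so how the rows are split here is an assumption. Run each statement on its own, starting from an empty DB.

```cypher
// Statement 1: the first row on its own.
// MERGE finds no :Test node with a = 1 and b = 'none', so it creates one.
// SET t.b = row.b then assigns null, which removes the b property again.
WITH {a:1, b:null} AS row
MERGE (t:Test {a:row.a, b:coalesce(row.b, 'none')}) SET t.b=row.b;

// Statement 2: the second row, run as a separate query.
// The node from statement 1 no longer has b = 'none', so MERGE cannot match it.
WITH {a:1, b:null} AS row
MERGE (t:Test {a:row.a, b:coalesce(row.b, 'none')}) SET t.b=row.b;
```

Because the SET of the first statement has already completed before the second MERGE runs, the second statement creates its own node.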
05-28-2019 03:56 PM
If you EXPLAIN or PROFILE the query, you'll see there's an Eager operation in the query plan. This changes how the query executes.
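As a small sketch, this is the query from the question with EXPLAIN in front of it. The exact operator names and the shape of the plan can differ between Neo4j versions, so treat the presence and position of Eager as something to check in your own output.

```cypher
// EXPLAIN only produces the query plan; it does not execute the query or write data.
// Swap EXPLAIN for PROFILE to run the query and see per-operator row counts as well.
EXPLAIN
WITH [{a:1, b:null}, {a:1, b:null}] AS rows
UNWIND rows AS row
MERGE (t:Test {a:row.a, b:coalesce(row.b, 'none')}) SET t.b=row.b
```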
Without an Eager, its execution matches how you described, executing row by row until completion.
With an Eager, the planner has recognized that this type of query will cause the ordering of row processing to have an impact on the results, something we generally want to avoid. The planner will then favor a mode of execution which minimizes the chances for this to happen.
Instead of executing the entirety of the query per row, the planner will instead apply each subsequent operation across all rows, then repeat for the next operation.
So in this case, after the UNWIND, the MERGE executes for both rows: it creates the node for the first row and matches that newly created node for the second row. Then SET is applied for both rows, each of them operating on that single node. Since row.b is null, the SET removes the b property, which leaves the single node {a:1}.
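One way to observe the MERGE step in isolation is to run it without the SET. This is only a sketch and assumes an empty DB; the nodeId, a and b column aliases are made up for illustration.

```cypher
// Same UNWIND and MERGE as in the original query, but with the SET left out.
// id(t) is used only to tell nodes apart; newer Neo4j versions offer elementId(t) instead.
WITH [{a:1, b:null}, {a:1, b:null}] AS rows
UNWIND rows AS row
MERGE (t:Test {a:row.a, b:coalesce(row.b, 'none')})
RETURN id(t) AS nodeId, t.a AS a, t.b AS b
```

Both returned rows should refer to the same node, still carrying b = 'none'. That is the state the SET then operates on for each row in the original query.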
06-17-2019 11:29 PM
Very interesting, thank you for your explanation!
Is this documented somewhere? I can't remember seeing this behaviour in the docs.