06-17-2021 11:04 AM
I want to run two independent queries in the same query. First I want to create nodes using the nodesData object, and then I want to create some relationships using edgesData. Running the two queries one by one gives the expected result, but combining them produces multiple edges between nodes.
session.run(
// create nodes
` WITH $nodesData as value
UNWIND value as data
CALL apoc.create.node([data.class], data)
YIELD node
// breakpoint. Running above query and then below query gives
// expected result.
WITH $edgesData as value
UNWIND value AS data
MATCH (n {newtId: data.source}), (m { newtId: data.target})
WITH n, m, data
CALL apoc.create.relationship(n,data.class,data,m)
YIELD rel
RETURN rel
`,
{ nodesData: nodesData, edgesData: edgesData }
)
06-17-2021 01:37 PM
The issue is one of cardinality. Cypher operations yield rows, and each subsequent operation executes once per row. That is critical to keep in mind: because both parts live in one query, the operations in the second part (and the data they generate) depend on the rows produced by the first part.
Since the first part of the query yields more than one row, the subsequent operations execute once per row, redundantly. That multiplies not only the work being done but also the results returned at the end.
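To make the multiplication concrete, here is a toy model in plain JavaScript. It does not touch Neo4j or the driver: rows are plain objects, `unwind` is a hypothetical helper that mimics how UNWIND emits one row per incoming row and list element, and the sample data is made up. It is only a sketch of the row-counting argument.

```javascript
// Toy model of Cypher row semantics: a clause runs once per incoming row.
// `unwind` is an illustrative helper, not part of any Neo4j API.
function unwind(rows, list) {
  // One output row for every (incoming row, list element) pair.
  return rows.flatMap((row) => list.map((data) => ({ ...row, data })));
}

// Made-up sample parameters.
const nodesData = [{ newtId: "a" }, { newtId: "b" }, { newtId: "c" }];
const edgesData = [
  { source: "a", target: "b" },
  { source: "b", target: "c" },
];

// First half of the query: one row per created node.
const afterNodes = unwind([{}], nodesData);

// Second half chained directly: the edge list is unwound once per node row.
const chained = unwind(afterNodes, edgesData);

// With an aggregation in between, a single row is left to unwind against.
const aggregated = [{ nodeCount: afterNodes.length }];
const separated = unwind(aggregated, edgesData);

console.log(chained.length);   // 6 (3 node rows x 2 edges)
console.log(separated.length); // 2 (1 row x 2 edges)
```

In the chained case every edge would be created three times, which matches the duplicate relationships described in the question.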
So the question is, how do we make the data independent? We can aggregate: collect the nodes into a single row of nodesData, or count them into nodeCount. Aggregation resets cardinality to a single row, so the operations in the second part of the query happen only once, and no operations or results get multiplied by the input rows. Then you collect the edgesData (or count into relCount), and can return that if needed. And if you want clearer separation (as well as protection from cases where either $nodesData or $edgesData is empty), then use subqueries around each:
CALL {
  UNWIND $nodesData AS data
  CALL apoc.create.node([data.class], data) YIELD node
  WITH count(node) AS nodeCount // needed to protect against empty parameter list
  RETURN nodeCount // subqueries must return something
}
WITH nodeCount // only a single row at this point from the earlier aggregation
CALL {
  UNWIND $edgesData AS data
  // you should be using labels or this will be really really slow!
  MATCH (n {newtId: data.source}), (m {newtId: data.target})
  WITH n, m, data
  CALL apoc.create.relationship(n, data.class, data, m) YIELD rel
  WITH count(rel) AS relCount
  RETURN relCount
}
RETURN nodeCount, relCount
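The comment about protecting against an empty parameter list can be illustrated with the same kind of toy model in plain JavaScript. Again, this is not driver code: `unwind` and `count` are hypothetical helpers standing in for UNWIND and the count() aggregation, and the data is made up. The sketch assumes the usual Cypher behavior that UNWIND over an empty list yields no rows, while an aggregation without grouping keys still yields one row.

```javascript
// Toy model (no Neo4j): rows are plain objects, clauses are array transforms.
function unwind(rows, list) {
  // One output row for every (incoming row, list element) pair.
  return rows.flatMap((row) => list.map((data) => ({ ...row, data })));
}

// Models count(): always yields exactly one row, even for zero input rows.
function count(rows, alias) {
  return [{ [alias]: rows.length }];
}

// Made-up sample: no nodes to create, one edge to create.
const emptyNodesData = [];
const edgesData = [{ source: "a", target: "b" }];

// Without aggregation: zero rows survive, so the edge part never runs.
const noAggregation = unwind(unwind([{}], emptyNodesData), edgesData);

// With count(): one row (nodeCount = 0) survives, so the edge part runs once.
const nodeRows = count(unwind([{}], emptyNodesData), "nodeCount");
const withAggregation = unwind(nodeRows, edgesData);

console.log(noAggregation.length);   // 0
console.log(nodeRows[0].nodeCount);  // 0
console.log(withAggregation.length); // 1
```

Without the count() step, an empty $nodesData would silently skip the relationship creation as well, which is why the aggregation sits inside each subquery.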