Avoid cartesian product when creating relationships

wdefu
Node Link

Hi! I'm new to Neo4j and hoping to find some help here.
I have two sets of nodes, say one is people and the other is cars. I want to create relationships between these two sets of nodes, but I don't want to create them based on the cartesian product of the results returned by MATCH, since that takes a long time.
The Cypher statement I use is below:

MATCH (a:test_11), (b:test_12)
WHERE a.car = "0" AND b.price = "1"
MERGE (a)-[r:own {time: 2020, location: "LA"}]->(b)

Is there a way to write Cypher that avoids generating a cartesian product between a and b?

9 REPLIES

So, you already have nodes for people and nodes for cars in the graph?
Are there existing relationships between them?
How do you decide which ones you want to relate?
Do you have the id of a Car and the id of a Person and want to put a relationship between them?
I'm not understanding your criteria for relating them.

Yes, the nodes are already in the graph. There are no relationships between them. The problem is that I want to create relationships between nodes that satisfy some property conditions, not between specific nodes, and hence a cartesian product is generated. In a word, I want to match two sets of nodes in one cell without cartesian products.

If you have a WHERE condition on both of your inputs, then it's not a full cartesian product.

But if your WHERE condition for (a) and (b) is not a node-unique key, then you end up in a situation where (a) and (b) get bound to lots of different nodes. Say you have 5 a's and 10 b's, then you get a cross-product of 50 total relationships created, which I'm guessing isn't what you want.

The solution is to have unique keys on each node, and then do exactly what you're doing, except instead of matching car=0 and price=1, match on the IDs so you get exactly 1 a and 1 b.
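
A minimal sketch of that approach, assuming each node carries a unique `id` property. The property name and the two values below are hypothetical and do not appear in the original question; the labels and relationship properties are taken from it.

```cypher
// Hypothetical: `id` is assumed to be a unique key on both labels,
// and "person-1" / "car-1" are placeholder values.
MATCH (a:test_11 {id: "person-1"})
MATCH (b:test_12 {id: "car-1"})
// Each MATCH binds exactly one node, so only one relationship is merged.
MERGE (a)-[:own {time: 2020, location: "LA"}]->(b)
```

With a uniqueness constraint or index on the key, each lookup should be cheap, but that depends on how the graph is set up.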

Thank you, David. That's one way to solve my question. But is there a way to avoid the cartesian product when executing two matches together? Like getting just 5 a's and 10 b's instead of 50 rows in total.

Well that depends on what you want to do with them. If the intent is to create relationships between every one of those a nodes and every one of those b nodes, then a cartesian product for the nodes matched is what you want.

If you want a query that creates some different set of relationships, then you probably want something different.

You said you want to "match two sets of nodes in one cell without cartesian products", it may be helpful for you to provide an example of your desired input and output.

I do want to match two sets of nodes. Here is the scenario: set a has 1000 nodes and set b has 1000 nodes. There are 100 nodes in a that have the target attribute value, and 100 nodes in b as well. I want to connect these 200 nodes. To do that, I first need to do two matches:
match (a {property1:0})
match (b {property2:1})
which return 10000 a and 10000 b and take a long time (around 10 minutes on my machine). Is there a way to return just 100 a and 100 b?

If you collect them between matching them, sure:

MATCH (a {property1:0})
WITH collect(a) as aNodes
MATCH (b {property2:1})
WITH aNodes, collect(b) as bNodes

(also you should be using labels, otherwise these will be expensive allNodesScans, two of them).

So now you have a list of 100 a nodes and a list of 100 b nodes. To connect them to each other, you either need two FOREACH clauses (one nested in the other) or to UNWIND both lists back to rows. That still gives you a cartesian product as a result, but without the extra work from the back-to-back matches. Then you create your relationships.
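
Putting those steps together, the full query might look like the sketch below. It assumes the labels, property names, and relationship properties from the original question (`test_11`, `test_12`, `car`, `price`, `own`) rather than the unlabeled `property1`/`property2` patterns above, and it uses the UNWIND variant.

```cypher
// Collect the matching a nodes into a single row first.
MATCH (a:test_11)
WHERE a.car = "0"
WITH collect(a) AS aNodes
// The second MATCH now runs once, not once per a node.
MATCH (b:test_12)
WHERE b.price = "1"
WITH aNodes, collect(b) AS bNodes
// Expand both lists back to rows: one row per (a, b) pair.
UNWIND aNodes AS a
UNWIND bNodes AS b
MERGE (a)-[r:own {time: 2020, location: "LA"}]->(b)
```

The nested FOREACH variant would replace the two UNWIND lines and the MERGE with `FOREACH (a IN aNodes | FOREACH (b IN bNodes | MERGE (a)-[:own {time: 2020, location: "LA"}]->(b)))`. Either way, 100 a nodes and 100 b nodes still produce 10000 relationships.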

Thank you, Andrew! That's exactly what I need!

clem
Graph Steward

I believe this is simpler:

MATCH (a:test_11)
MATCH (b:test_12)
WHERE a.car = "0" AND b.price = "1"
MERGE (a)-[r:own {time: 2020, location: "LA"}]->(b)

The WHERE clause filters by condition and avoids the cartesian product. The cartesian product occurs with MATCH (a:test_11),(b:test_12)

It's a subtle distinction I just learned today!
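
One way to check how either form is actually planned on your own data is to prefix the query with `EXPLAIN`, which shows the execution plan without running the query. The sketch below uses the comma-separated form from the original question; running the two-MATCH form the same way lets you compare the operators. Whether the plans differ may depend on the Neo4j version and on which indexes exist, so treat this as a diagnostic and not as a statement about what the planner will do.

```cypher
// EXPLAIN only produces the plan; no relationships are created.
// Look for a CartesianProduct operator and where the filters are applied.
EXPLAIN
MATCH (a:test_11), (b:test_12)
WHERE a.car = "0" AND b.price = "1"
MERGE (a)-[r:own {time: 2020, location: "LA"}]->(b)
```

Swapping `EXPLAIN` for `PROFILE` runs the query and reports actual row counts per operator, which shows how many rows each step really handled.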