Neo4j

starz10de · 03-15-2022

I have a graph containing many nodes. Each node in the graph has the same properties with different values. I loaded the graph efficiently using:

CALL apoc.periodic.iterate('LOAD CSV FROM $url AS data FIELDTERMINATOR "," RETURN data',
'CREATE (:G {x: data[0], y: data[1]})',
{batchSize:1000, iterateList:true, parallel:true, params:{url:"file:///file.csv"}});

I indexed the nodes on their properties.
I then wanted to link nodes that share a property value. I used:

MATCH (a:G)
MATCH (b:G)
WHERE a.x = b.y
MERGE (a)-[:similar]->(b)
RETURN *

However, this solution is very slow. Are there any hints about another way to do this? Also, is it possible to create the relationships during data loading, inside the CALL apoc.periodic.iterate block?

glilienfield · 03-15-2022

I guess it is slow because the query has to evaluate each node against every other node to find a match. If you have N nodes, that is N*N comparisons, since the condition a.x = b.y is directional (every ordered pair is checked) and each node is also compared to itself.
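One way to avoid the all-pairs scan is to drive the match from one side and look up the other side by property, in batches. The following is only a sketch, not a tested solution: it assumes Neo4j 4.x or later index syntax, `g_y` is a hypothetical index name, and the batch size is arbitrary.

```cypher
// Assumption: Neo4j 4.x+ syntax; g_y is a hypothetical index name.
// Does nothing if an equivalent index already exists.
CREATE INDEX g_y IF NOT EXISTS FOR (n:G) ON (n.y);

// The first statement streams every :G node as a.
// The second runs per node and finds b by its y value,
// so the index on :G(y) can serve the lookup instead of a full scan.
CALL apoc.periodic.iterate(
  'MATCH (a:G) RETURN a',
  'MATCH (b:G) WHERE b.y = a.x MERGE (a)-[:similar]->(b)',
  {batchSize: 1000, parallel: false}
);
```

`parallel` is set to `false` here because creating a relationship locks both end nodes, and parallel batches touching the same nodes can run into lock contention or deadlocks.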

The issue I see with combining the node and relationship creation is that both nodes have to exist before the relationship can be created, and I assume the order of your import file is random.

Can you rearrange your import file so that each line contains the two matching nodes?
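If the file can be rearranged that way, the nodes and the relationship could be created in one pass. This is a rough sketch under assumptions not stated in the thread: a hypothetical file `pairs.csv` where each line holds `x, y` for the first node followed by `x, y` for the second.

```cypher
// Assumption: pairs.csv is a hypothetical file with four columns per line:
// a.x, a.y, b.x, b.y (the two nodes that should be linked).
CALL apoc.periodic.iterate(
  'LOAD CSV FROM $url AS data FIELDTERMINATOR "," RETURN data',
  'MERGE (a:G {x: data[0], y: data[1]})
   MERGE (b:G {x: data[2], y: data[3]})
   MERGE (a)-[:similar]->(b)',
  {batchSize: 1000, parallel: false, params: {url: "file:///pairs.csv"}}
);
```

MERGE is used instead of CREATE because the same node may appear on several lines. `parallel` is `false` so that two batches do not try to MERGE the same node at the same time.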

Creating relationships efficiently using Apoc