03-15-2022 11:34 AM
I have a graph containing many nodes. Every node has the same set of properties, with different values. I loaded the graph efficiently using:
CALL apoc.periodic.iterate(
  'LOAD CSV FROM $url AS data FIELDTERMINATOR "," RETURN data',
  'CREATE (:G {x: data[0], y: data[1]})',
  {batchSize: 1000, iterateList: true, parallel: true, params: {url: "file:///file.csv"}});
I created indexes on the node properties.
I want to link nodes whose property values match, so I used:
MATCH (a:G)
MATCH (b:G)
WHERE a.x = b.y
MERGE (a)-[:similar]->(b)
RETURN *
However, this solution is very slow. Are there any hints about another way to do this? Also, is it possible to create the relationships during the data load, inside the CALL apoc.periodic.iterate block?
03-15-2022 12:42 PM
I suspect it is slow because the query has to evaluate each node against every other node to find a match. With N nodes, that is N*N comparisons, since both (a, b) and (b, a) are checked and each node is also compared with itself.
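One possible way to reduce the work, sketched here rather than tested against your data, is to drive the match from one side and look up the other side by property, committing in batches. This assumes an index on :G(y) (the index name g_y is made up for illustration) and a Neo4j version that supports the CREATE INDEX ... IF NOT EXISTS syntax. parallel is set to false here as an assumption, to avoid lock conflicts when several batches write relationships to the same nodes.

```cypher
// Assumed index so the lookup on y does not scan every :G node.
CREATE INDEX g_y IF NOT EXISTS FOR (n:G) ON (n.y);

// First statement streams the nodes; second runs once per node, in batches.
CALL apoc.periodic.iterate(
  'MATCH (a:G) RETURN a',
  'MATCH (b:G {y: a.x}) MERGE (a)-[:similar]->(b)',
  {batchSize: 1000, parallel: false}
);
```

Unlike the original query, this does not RETURN every matched pair, which also avoids sending the whole result set back to the client.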
The issue I see with combining the node and relationship creation is that both nodes must exist before the relationship between them can be created, and I assume the rows in your import file are in no particular order.
Can you rearrange your import file so that each line contains the two matching nodes?
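If the file could be rearranged that way, the load might look roughly like the sketch below. The column layout (data[0] and data[1] holding x and y of the first node, data[2] and data[3] holding x and y of the second) and the file name pairs.csv are hypothetical, and the sketch assumes x and y together identify a node. MERGE is used so that a node appearing on several rows is created only once, and parallel is off so batches touching the same nodes do not conflict. Nodes with no match would not appear in such a file and would still need a separate load.

```cypher
// Hypothetical layout: a.x, a.y, b.x, b.y on each row of pairs.csv.
CALL apoc.periodic.iterate(
  'LOAD CSV FROM $url AS data FIELDTERMINATOR "," RETURN data',
  'MERGE (a:G {x: data[0], y: data[1]})
   MERGE (b:G {x: data[2], y: data[3]})
   MERGE (a)-[:similar]->(b)',
  {batchSize: 1000, parallel: false, params: {url: "file:///pairs.csv"}}
);
```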