11-16-2022 11:54 PM - edited 11-17-2022 12:00 AM
When my data contains 100 million nodes (half labeled Material and the other half labeled Fgprodno, each with only a `label` property), the following query:
###
CALL apoc.periodic.iterate("MATCH(n:Material), (p:Fgprodno) WHERE n.label = p.label RETURN n,p",
"CREATE (n)-[r:come_from]->(p)",
{batchSize:10000, parallel: true})
###
finishes in 3 minutes.
When my data contains 500 million nodes (the same data structure, just larger), the above query does not work, even when I try different batchSize values. The query just keeps running without showing any result, and CPU usage reaches about 90% across 44 threads (2 sockets, 44 cores, 88 logical processors). However, the following query:
###
CALL apoc.periodic.iterate(
'MATCH (n) RETURN id(n) AS id',
'MATCH (n) WHERE id(n)=id DETACH DELETE n',
{batchSize: 10000, parallel: true });
###
finishes in 18 minutes (in that case my data contains only the 500 million nodes), and in 2.5 minutes for the 100 million node case (which contains not only the 100 million nodes but also 50 million relationships).
This confuses me a lot. Both are just apoc.periodic.iterate calls, but the CREATE relationship one only works for 100 million nodes, while the delete one works for both 100 and 500 million. Is there a time or memory complexity problem with the CREATE one? My free disk space and free memory are both still over 200GB.
Question: Why does the query that CREATEs the relationships not work, or HOW can I modify it to work? Thanks.
11-20-2022 02:28 AM
Hi @Peter_Lian,
Before going into more detail, have you tried with parallel: false?
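For reference, here is a rough sketch of what that could look like. It rests on some assumptions that are not confirmed in this thread: that creating relationships in parallel batches leads to lock contention on the shared Fgprodno nodes, that no index exists yet on the `label` property, and that you are on Neo4j 4.x or later (the `CREATE INDEX ... FOR ... ON` syntax). The index name `fgprodno_label` is just a placeholder.
###
// Assumption: no index on :Fgprodno(label) exists yet; the index name is a placeholder.
CREATE INDEX fgprodno_label IF NOT EXISTS FOR (p:Fgprodno) ON (p.label);

// Outer statement streams only the Material nodes, so no Material x Fgprodno pairs
// are built up front. The inner statement looks up the matching Fgprodno node(s)
// per batch and creates the relationship. parallel is false so that batches do not
// compete for locks on the same nodes.
CALL apoc.periodic.iterate(
"MATCH (n:Material) RETURN n",
"MATCH (p:Fgprodno {label: n.label}) CREATE (n)-[:come_from]->(p)",
{batchSize: 10000, parallel: false});
###
Let the index finish populating before running the second statement. If it still does not complete, the batches, total, and errorMessages columns returned by apoc.periodic.iterate should help narrow down where it stops.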