11-15-2019 11:56 PM
I received a large file, and these two queries perform badly on it.
They now need to be optimized.
Splitting the queries up seems like a good idea, but if I split them I will have to go through this big file several times.
Which is better: separating the queries, or improving them as they are?
Maybe somebody has ideas about how to optimize them?
CALL apoc.periodic.iterate(
'WITH apoc.convert.fromJsonList(data) as arr UNWIND arr as v RETURN v',
'FOREACH (i in CASE WHEN v.dog=false THEN [1] ELSE [] END | MERGE (c:Cat {id: v.id, version: "${version}"}))
FOREACH (i in CASE WHEN v.dog=true THEN [1] ELSE [] END | MERGE (c:Dog {id: v.id, version: "${version}"}))
WITH v
MATCH (c {id: v.id, version: "${version}"})
UNWIND RANGE(0, CASE WHEN length(v.weightOfAllCat) > length(v.weightOfAllDog) THEN length(v.weightOfAllCat) ELSE length(v.weightOfAllDog) END) as i
MERGE (p:Prod {ean: v.name, version: "${version}"})
MERGE (a:Pro {ean: v.name, version: "${version}"})
WITH v, c, p, a
CALL apoc.do.when(v.dog=false, "MERGE (c)-[:PRI]->(p) MERGE (c)-[:ALTER]->(a)", "MERGE (c)-[:PRI_A]->(p) MERGE (c)-[:ALTER_A]->(a)", {v:v, c:c, p:p, a:a}) YIELD value
RETURN value',
{batchSize: 5000, iterateList: true, parallel: true, params: {data: '${data}'}})
UNWIND split("${prod}", ",") as prod_id
MATCH (p:Prod {id: prod_id, version: "${version}"})<-[a:ALTER]-(c:Cat {version: "${version}"})
WITH max(toInteger(apoc.text.replace(c.id, '[A-Za-z+]', ""))) as max, p
MATCH (c:Cat {version: "${version}"})-[:ALTER]->(p)
WHERE toInteger(apoc.text.replace(c.id, '[A-Za-z+]', "")) = max
MATCH (c:Cat {version: "${version}"})-[d:ALTER]->(p)
MERGE (c)-[:PRIM]->(p)
MERGE (c)-[:ALTER_P]->(p)
DETACH DELETE d
RETURN collect(DISTINCT(p.prod_id)) as proc
11-26-2019 02:29 AM
First of all, you are running apoc.periodic.iterate in parallel mode while also adding relationships. Creating a relationship takes locks on both connected nodes, so you risk a deadlock (unless you are sure that no relationships are made to the same nodes anywhere in the entire set).
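To make the point about parallel mode concrete, here is a minimal sketch of the same kind of call with parallel switched off. It is not the poster's full query: the write statement is cut down to one relationship, and the inline JSON, the $version parameter and the retries value are assumptions made only for illustration.

```cypher
// Sketch under assumptions: sample data and parameter values are made up.
CALL apoc.periodic.iterate(
  // Read side: turn the JSON string into one row per element.
  'UNWIND apoc.convert.fromJsonList($data) AS v RETURN v',
  // Write side: creating the relationship locks both c and p.
  'MATCH (c:Cat {id: v.id, version: $version})
   MERGE (p:Prod {ean: v.name, version: $version})
   MERGE (c)-[:PRI]->(p)',
  // parallel: false runs the batches one after another, so two batches
  // never compete for locks on the same nodes.
  {batchSize: 5000, parallel: false, retries: 2,
   params: {data: '[{"id": "c1", "name": "n1"}]', version: 'v1'}}
)
YIELD batches, total, failedBatches, errorMessages
RETURN batches, total, failedBatches, errorMessages
```

Checking failedBatches and errorMessages after a run shows whether any batch was rolled back, which is worth doing whichever mode is used.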
MATCH (c{id:v.id, version:"${version}"})
does not specify a label, so it cannot use an index; adding a label and an index would help.
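One possible way to give that lookup a label, sketched as a fragment that would replace the unlabelled MATCH inside the second statement of the apoc.periodic.iterate call. Since v.dog decides whether the node is a Cat or a Dog, this version looks up both labels and keeps whichever exists. The variable names cat and dog are introduced here for illustration, and running two separate imports (one per label) would be an equally valid choice.

```cypher
// Fragment, not a standalone query: v comes from the surrounding statement.
WITH v
// Each lookup names a label, so the planner can use an index on it.
OPTIONAL MATCH (cat:Cat {id: v.id, version: "${version}"})
OPTIONAL MATCH (dog:Dog {id: v.id, version: "${version}"})
// Keep whichever node was found and drop rows where neither exists.
WITH v, coalesce(cat, dog) AS c
WHERE c IS NOT NULL
```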
You are also matching and merging nodes on multiple properties, e.g. MERGE (p:Prod {ean: v.name, version: "${version}"})
– are these covered by composite indexes?
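If they are not, composite indexes for the label and property combinations used in the queries above could look like the sketch below. The choice of labels and properties is read off the posted queries and is an assumption about the real data model. The syntax shown is the Neo4j 3.5 form; later versions use a different form, noted in the comment.

```cypher
// Neo4j 3.5 syntax: one composite index per label/property pair
// that appears in a MATCH or MERGE in the queries.
CREATE INDEX ON :Cat(id, version);
CREATE INDEX ON :Dog(id, version);
CREATE INDEX ON :Prod(ean, version);
CREATE INDEX ON :Pro(ean, version);

// On Neo4j 4.x and later the equivalent form is, for example:
// CREATE INDEX cat_id_version FOR (c:Cat) ON (c.id, c.version);
```

A composite index is only used when the query supplies every indexed property, which the MERGE and MATCH patterns in the question do.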
Is this being imported into an existing database? If not, preprocessing the content into CSV files and using neo4j-admin import is likely the fastest approach (depending on the size of the dataset).