02-03-2020 12:36 PM
My current code below runs for a long time for 100001 alias nodes:
CALL apoc.periodic.iterate("MATCH (a:alias) RETURN a",
"Match
path=((a) -- (c1:citation) -[p1]-> (t:BIOTERM) <-[p2]- (c2:citation) -- (b:alias))
WHERE id(a) < id(b) AND id(c1) <> id(c2)
With a, b, p1, p2, 2 as precision
WITH a, b, p1, p2, 10^precision as factor
Create (a)-[e:through_topic]->(b)
Set e.weight= round(factor* (1/(2+p1.weight+p2.weight))) / factor", {batchSize:1000}) YIELD batches, total, errorMessages
When I ran it for a single alias:
Match
path=((a:alias {name: 293} ) -- (c1:citation) -[p1]-> (t:BIOTERM) <-[p2]- (c2:citation) -- (b:alias))
WHERE id(a) < id(b) AND id(c1) <> id(c2)
With a, b, p1, p2, 2 as precision
WITH a, b, p1, p2, 10^precision as factor
Create (a)-[e:through_topic]->(b)
Set e.weight= round(factor* (1/(2+p1.weight+p2.weight))) / factor
it completed in 1 or 2 ms. Should I try to optimize my code, tune the batchSize of apoc.periodic.iterate, or both? I had no luck decreasing the batchSize.
I ran EXPLAIN and PROFILE with
Thanks,
Lavanya
02-03-2020 12:55 PM
You may want to rearrange your query somewhat, doing the heavy lifting of the MATCH and calculation in your driving query, and only doing the CREATE in the updating query:
CALL apoc.periodic.iterate("MATCH path=((a:alias) -- (c1:citation) -[p1]-> (t:BIOTERM) <-[p2]- (c2:citation) -- (b:alias))
WHERE id(a) < id(b) AND id(c1) <> id(c2)
WITH a, b, p1, p2, 2 as precision
WITH a, b, p1, p2, 10^precision as factor
WITH a, b, round(factor* (1/(2+p1.weight+p2.weight))) / factor as weight
RETURN a, b, weight",
"CREATE (a)-[e:through_topic]->(b)
SET e.weight= weight", {batchSize:5000}) YIELD batches, total, errorMessages
As for execution time, if you're seeing around 500k rows being processed for just a single alias, then yes, I would expect this to take a long time.
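To see how many rows a single alias actually produces before committing to the full run, you could count the matches without creating anything. This is only a sketch: it reuses the pattern and the name: 293 example from the question, and rowCount is just an illustrative column name.

```cypher
// Count the paths that would each produce one through_topic relationship
// for a single alias; nothing is written to the graph.
MATCH (a:alias {name: 293}) -- (c1:citation) -[p1]-> (t:BIOTERM) <-[p2]- (c2:citation) -- (b:alias)
WHERE id(a) < id(b) AND id(c1) <> id(c2)
RETURN count(*) AS rowCount
```

Prefixing the same query with PROFILE should also show where the row counts grow in the plan.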
You may also want to check your memory settings with neo4j-admin memrec.