06-29-2021 05:44 PM
I want to clone a graph and set the node attributes of the new copy based on an array of dictionaries. I wrote the following query:
MATCH (rootSkill:Skill{name: 'null'}),
(rootStudent:Student{id: '9f828c12-5134-409a-846e-bbc5a6463bff'})
CALL apoc.path.subgraphAll(rootSkill, {relationshipFilter:'PARENT>|DEPENDS_ON>'})
YIELD nodes, relationships
CALL apoc.refactor.cloneSubgraph(
nodes,
[rel in relationships WHERE type(rel) = 'PARENT' OR type(rel) = 'DEPENDS_ON'],
{ standinNodes:[[rootSkill, rootStudent]],
skipProperties:['id'] })
YIELD input, output, error
WITH collect(output) as nodes, rootStudent
UNWIND nodes as node
UNWIND
[{skill: "Skill 1", value: 0.9999999863}, {skill: "Skill 2", value: 0.3}]
as score
WITH DISTINCT score, node, rootStudent
WHERE node.name = score.skill
SET node.id = apoc.create.uuid(), node.score = score.value
WITH rootStudent
MATCH (student:Student{id: '9f828c12-5134-409a-846e-bbc5a6463bff'})-[relation:PARENT]->(s:Skill)
CREATE (student)-[learned:LEARNED {version: 1, created_at: timestamp()}]->(s)
DELETE relation
RETURN DISTINCT learned;
It works perfectly, but when the array is large, for example
[{skill: "name 1", value: 0.9999999863}, {skill: "name 2", value: 0.3}, ...]
I get a MemoryPoolOutOfMemoryError. I am using Neo4j Aura, so I can't change the neo4j.conf file. Is there any way to optimize this query? Can I split it into multiple parts?
07-07-2021 05:41 AM
Look into apoc.periodic.iterate and break the query up into two pieces -- the first enumerates what you have to do and the second takes that action on batches. This query probably won't work exactly, but it'll give you the right general process:
CALL apoc.periodic.iterate(
"MATCH (rootSkill:Skill{name: 'null'}),
(rootStudent:Student{id: '9f828c12-5134-409a-846e-bbc5a6463bff'})
CALL apoc.path.subgraphAll(rootSkill, {relationshipFilter:'PARENT>|DEPENDS_ON>'})
YIELD nodes, relationships
CALL apoc.refactor.cloneSubgraph(
nodes,
[rel in relationships WHERE type(rel) = 'PARENT' OR type(rel) = 'DEPENDS_ON'],
{ standinNodes:[[rootSkill, rootStudent]],
skipProperties:['id'] })
YIELD input, output, error
RETURN output AS node, rootStudent",
"UNWIND
[{skill: 'Skill 1', value: 0.9999999863}, {skill: 'Skill 2', value: 0.3}]
as score
WITH DISTINCT score, node, rootStudent
WHERE node.name = score.skill
SET node.id = apoc.create.uuid(), node.score = score.value
WITH rootStudent
MATCH (student:Student{id: '9f828c12-5134-409a-846e-bbc5a6463bff'})-[relation:PARENT]->(s:Skill)
CREATE (student)-[learned:LEARNED {version: 1, created_at: timestamp()}]->(s)
DELETE relation
RETURN DISTINCT learned;
",
{ batchSize: 1000, parallel: false });
Note I got rid of one of your UNWINDs. The first query feeds a stream of results to the second mutating query.
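As a rough illustration of the same two-statement pattern applied to the large score list itself, the sketch below streams the scores from the first statement and applies one small update per row. It assumes the `params` config option of apoc.periodic.iterate is used to hand the list over, and the inline list merely stands in for the real array. Note that it matches every :Skill with a given name, so restricting the update to the cloned copy is deliberately left out here.

```cypher
// Sketch only: the score list is passed once through the `params` config
// and streamed row by row, instead of being unwound for every cloned node.
CALL apoc.periodic.iterate(
  // First statement: emits one row per score entry.
  "UNWIND $scores AS score RETURN score",
  // Second statement: runs per row, committed every `batchSize` rows.
  // Assumption: matches every :Skill with that name, clones or not.
  "MATCH (s:Skill {name: score.skill}) SET s.score = score.value",
  {
    batchSize: 1000,
    parallel: false,
    params: { scores: [{skill: 'Skill 1', value: 0.9999999863},
                       {skill: 'Skill 2', value: 0.3}] }
  });
```

Because each batch commits in its own transaction, memory use should depend on the batch size rather than on the full length of the list.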
07-11-2021 06:28 AM
Hi @david.allen ! Thanks for the answer.
I tried the following:
CALL apoc.periodic.iterate(
"MATCH (rootSkill:Skill{name: 'null'}),
(rootStudent:Student{id: '9f828c12-5134-409a-846e-bbc5a6463bff'})
CALL apoc.path.subgraphAll(rootSkill, {relationshipFilter:'PARENT>|DEPENDS_ON>'})
YIELD nodes, relationships
CALL apoc.refactor.cloneSubgraph(
nodes,
[rel in relationships WHERE type(rel) = 'PARENT' OR type(rel) = 'DEPENDS_ON'],
{ standinNodes:[[rootSkill, rootStudent]],
skipProperties:['id'] })
YIELD input, output, error
RETURN output as item, rootStudent",
"UNWIND
[{skill: 'Skill 1', value: 0.9999999863}, {skill: 'Skill 2', value: 0.3}]
as score
WITH DISTINCT score, item, rootStudent
WHERE item.name = score.skill
SET item.id = apoc.create.uuid(), item.score = score.value
WITH rootStudent
MATCH (student:Student{id: '9f828c12-5134-409a-846e-bbc5a6463bff'})-[relation:PARENT]->(s:Skill)
CREATE (student)-[learned:LEARNED {version: 5, created_at: timestamp()}]->(s)
DELETE relation
RETURN DISTINCT learned;
",
{ batchSize: 1000, parallel: false });
It ran with no errors, but no node was created.
I tried to return each node from the cloneSubgraph output and pass it to the second statement of the iterate call to process and update accordingly. Any ideas why it doesn't work? Thanks
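One way to investigate a run like this, offered as a sketch rather than a diagnosis: apoc.periodic.iterate generally reports failed batches in its result row instead of raising an error, so a call that appears to succeed can still have failed batches. The statements below are placeholders (the `checked` property is hypothetical); the point is the YIELD clause, which can be appended to any apoc.periodic.iterate call.

```cypher
// Sketch: surface the batch statistics that apoc.periodic.iterate returns.
CALL apoc.periodic.iterate(
  "MATCH (s:Skill) RETURN s AS item",   // placeholder first statement
  "SET item.checked = true",            // placeholder second statement
  { batchSize: 1000, parallel: false })
YIELD batches, total, committedOperations, failedOperations, errorMessages
RETURN batches, total, committedOperations, failedOperations, errorMessages;
```

A `total` of 0 would suggest the first statement produced no rows, while a non-empty `errorMessages` map would point at the second statement.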