Create a relationship between a graph vertex and all vertices reachable via a specific path

Hello community!

I need to connect graph nodes to other nodes that are reachable through a certain path. Specifically, I want to create a RELATED relationship between each Source node and every A node reachable from it through a RELATED / MEMBER_OF path of any depth. See the attached example; it produces the correct result.

The starting node has the Source label. The nodes it needs to be connected to have the A label.
A Source node can be connected directly to several A nodes by the directed RELATED relationship.
A nodes can be connected to each other by the directed MEMBER_OF relationship.
Any A node can be connected to a T node by a directed REQUIRE or GIVES_ACCESS relationship.

See example:

I am able to use the APOC library.

I have a query that does the job:

MATCH (s:Source)
CALL apoc.path.subgraphAll(s, {
    relationshipFilter: "RELATED>|MEMBER_OF>",
    labelFilter: '+A',
    minLevel: 1
})
YIELD nodes
UNWIND nodes AS node
MERGE (s)-[:RELATED]-(node)

but its performance is not good enough for me. I need something faster.
The working dataset has a couple of million nodes: several hundred to several thousand Sources and more than a million groups (A nodes) connected to each other as described above. Some Sources include almost all existing groups; others include only a couple of thousand. Sources are unique: no two Sources are directly RELATED to the same set of groups.

Any solution that works faster than this is considered better.

Neo4j version: 4.3.3

Update:
Attaching an example that can be used to recreate the test data:

CREATE (s:Source {name:"source"})

CREATE (a1:A {name:"A1"})
CREATE (a2:A {name:"A2"})
CREATE (a3:A {name:"A3"})
CREATE (a4:A {name:"A4"})
CREATE (a5:A {name:"A5"})
CREATE (a6:A {name:"A6"})
CREATE (a7:A {name:"A7"})
CREATE (a8:A {name:"A8"})
CREATE (a9:A {name:"A9"})
CREATE (a10:A {name:"A10"})
CREATE (a11:A {name:"A11"})
CREATE (a12:A {name:"A12"})
CREATE (a13:A {name:"A13"})
CREATE (a14:A {name:"A14"})
CREATE (a15:A {name:"A15"})
CREATE (a16:A {name:"A16"})
CREATE (a17:A {name:"A17"})
CREATE (a18:A {name:"A18"})

CREATE (t1:T {name:"T1"})
CREATE (t2:T {name:"T2"})
CREATE (t3:T {name:"T3"})

MERGE (s)-[:RELATED]->(a1)
MERGE (s)-[:RELATED]->(a2)
MERGE (s)-[:RELATED]->(a3)

MERGE (a1)-[:MEMBER_OF]->(a4)
MERGE (a4)-[:GIVES_ACCESS]->(t1)
MERGE (a1)-[:MEMBER_OF]->(a5)
MERGE (a5)-[:MEMBER_OF]->(a6)
MERGE (a6)-[:REQUIRE]->(t1)
MERGE (a1)-[:MEMBER_OF]->(a7)
MERGE (a7)-[:GIVES_ACCESS]->(t1)

MERGE (a2)-[:MEMBER_OF]->(a8)
MERGE (a8)-[:MEMBER_OF]->(a7)
MERGE (a8)-[:MEMBER_OF]->(a9)
MERGE (a8)-[:MEMBER_OF]->(a10)
MERGE (a7)-[:MEMBER_OF]->(a10)
MERGE (a10)-[:GIVES_ACCESS]->(t1)
MERGE (a10)-[:GIVES_ACCESS]->(t2)
MERGE (a10)-[:GIVES_ACCESS]->(t3)

MERGE (a3)-[:MEMBER_OF]->(a11)
MERGE (a3)-[:MEMBER_OF]->(a12)
MERGE (a11)-[:MEMBER_OF]->(a13)
MERGE (a11)-[:MEMBER_OF]->(a14)
MERGE (a13)-[:GIVES_ACCESS]->(t2)
MERGE (a13)-[:GIVES_ACCESS]->(t3)
MERGE (a14)-[:GIVES_ACCESS]->(t3)

MERGE (a15)-[:MEMBER_OF]->(a11)
MERGE (a16)-[:MEMBER_OF]->(a1)
MERGE (a17)-[:GIVES_ACCESS]->(t1)
MERGE (a18)-[:GIVES_ACCESS]->(t3)

RETURN s, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12, a13, a14, a15, a16, a17, a18, t1, t2, t3
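
For reference, the expected targets can be listed with a read-only query before anything is written. This is a minimal sketch in plain Cypher, assuming (as in the sample above) that RELATED only leads out of Source nodes and MEMBER_OF only connects A nodes. On the sample data it should list A1 through A14, in some order, and leave out A15 to A18.

```cypher
// Read-only check: which A nodes are reachable from each Source?
// *0.. keeps the directly RELATED nodes in the result as well.
MATCH (s:Source)-[:RELATED]->(:A)-[:MEMBER_OF*0..]->(a:A)
RETURN s.name AS source, collect(DISTINCT a.name) AS reachable
```

Variable-length patterns enumerate paths rather than nodes, so this is meant for checking small samples, not for the full dataset.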

Update:

I was able to speed up the query performance on my dataset from 60 seconds to 1.5 seconds using the following query:

MATCH (s:Source)
CALL apoc.path.subgraphNodes(s, {
    relationshipFilter: "RELATED>|MEMBER_OF>",
    labelFilter: '+A',
    minLevel: 1
})
YIELD node
CREATE (s)-[:RELATED]->(node)
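
If a single large transaction becomes a problem on the full dataset, the same traversal could be run in batches. The following is only a sketch, assuming apoc.periodic.iterate is available in the installed APOC version; the batchSize of 1 (one Source per transaction) is a starting guess, not a tuned value. It uses MERGE instead of CREATE so that the direct RELATED relationships that already exist are not created a second time, at the cost of an extra lookup per pair.

```cypher
// Sketch: one transaction per batch of Source nodes.
// batchSize is an assumption and should be tuned for the real data.
CALL apoc.periodic.iterate(
  "MATCH (s:Source) RETURN s",
  "CALL apoc.path.subgraphNodes(s, {
     relationshipFilter: 'RELATED>|MEMBER_OF>',
     labelFilter: '+A',
     minLevel: 1
   })
   YIELD node
   MERGE (s)-[:RELATED]->(node)",
  {batchSize: 1, parallel: false}
)
YIELD batches, total, errorMessages
RETURN batches, total, errorMessages
```

parallel is left at false here because concurrent batches writing relationships to the same A nodes may run into lock contention.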

ameyasoft
Try this:

MATCH (s:Source)
CALL apoc.path.subgraphAll(s, {
    relationshipFilter: "RELATED>|MEMBER_OF>",
    labelFilter: '+A',
    minLevel: 1
})
YIELD nodes
UNWIND nodes AS n1
WITH n1 ORDER BY id(n1) ASC
WITH collect(n1) AS n2
CALL apoc.nodes.link(n2, 'RELATED')
RETURN n2

Result:

Hi ameyasoft! Thanks for the reply!

Perhaps I did not describe the task clearly; my fault.

What I want is to create a RELATED relationship between the Source and all A nodes that are reachable through a RELATED / MEMBER_OF path of any depth. See the attached example; it produces the correct result.
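
One way to check that outcome after running a write query is to count the RELATED relationships per Source. This is a sketch; the column names are only illustrative. If relationships is larger than distinctTargets for a Source, some of its A nodes were linked more than once.

```cypher
// Compare the relationship count with the number of distinct targets.
MATCH (s:Source)-[r:RELATED]->(a:A)
RETURN id(s) AS sourceId,
       s.name AS source,
       count(r) AS relationships,
       count(DISTINCT a) AS distinctTargets
```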