Neo4j

pg466799665 · ‎09-05-2021

Hi all,

My problem is this:
The big graph has roughly 2,000-5,000 nodes and 2,000-15,000 relationships, with only 4 different labels (A, B, C, D), each with its own properties.
Now, I have many subgraph structures, and I need to search the big graph for each one and find all possible matches. A match must be exact and must satisfy the given property filters.

So far I have tried Cypher to get the results, but the performance is not good and there are lots of duplicated nodes in the results. From my perspective the data size is pretty small, but unfortunately the performance is far from ideal.

I think there must be something wrong with the way I have written my query.

Can anyone give me some advice? Any tips will be helpful!

Thanks 🙂

ameyasoft · ‎09-05-2021

Please explain your problem a little bit more, especially with respect to 'subgraph'. If I understand correctly, you have A, B, C, D nodes, node A has its own child nodes, and the same is true of the other nodes. Let me know if this is correct.

pg466799665 · ‎09-10-2021

Hi,

Sorry for the rough problem description. Yes, the 4 nodes have their own child nodes, but to keep it simple I will ignore the child nodes.

Here is the kind of subgraph I want to query in the big graph:
Tip: the relationships are all the same, and the direction can be ignored.

Here is my Cypher query. The performance is poor, so I hope someone can help me tune it or give me some advice.

match (AbtoAa1:A)--(AatoAb1:A)
where not AatoAb1.pid in [AbtoAa1.pid]
match (AbtoAa2:A{pid:AbtoAa1.pid})--(AatoAb2:A{pid:AatoAb1.pid})
where size(apoc.coll.toSet([id(AbtoAa1), id(AbtoAa2)])) = 2 and size(apoc.coll.toSet([id(AatoAb1), id(AatoAb2)])) = 2

match (AbtoAa3:A{pid:AbtoAa1.pid})--(AatoAb3:A{pid:AatoAb1.pid})
where size(apoc.coll.toSet([id(AbtoAa1), id(AbtoAa2),id(AbtoAa3)])) = 3 and size(apoc.coll.toSet([id(AatoAb1), id(AatoAb2),id(AatoAb3)])) = 3

match (AbtoBa1:A{pid:AbtoAa1.pid})--(BatoAb1:B)
where size(apoc.coll.toSet([id(AbtoAa1), id(AbtoAa2),id(AbtoAa3),id(AbtoBa1)])) = 4

match (AbtoBa2:A{pid:AbtoAa1.pid})--(BatoAb2:B{pid:BatoAb1.pid})
where size(apoc.coll.toSet([id(AbtoAa1), id(AbtoAa2),id(AbtoAa3),id(AbtoBa1), id(AbtoBa2)])) = 5 and size(apoc.coll.toSet([id(BatoAb1), id(BatoAb2)])) = 2

match (AbtoBa3:A{pid:AbtoAa1.pid})--(BatoAb3:B{pid:BatoAb1.pid})
where size(apoc.coll.toSet([id(AbtoAa1), id(AbtoAa2),id(AbtoAa3),id(AbtoBa1), id(AbtoBa2),id(AbtoBa3)])) = 6 and size(apoc.coll.toSet([id(BatoAb1), id(BatoAb2),id(BatoAb3)])) = 3

match (AbtoCa1:A{pid:AbtoAa1.pid})--(CatoAb1:C)
where size(apoc.coll.toSet([id(AbtoAa1), id(AbtoAa2),id(AbtoAa3),id(AbtoBa1), id(AbtoBa2),id(AbtoBa3), id(AbtoCa1)])) = 7

match (AatoAc1:A {pid:AatoAb1.pid})--(ActoAa1:A)
where not ActoAa1.pid in [AbtoAa1.pid, AatoAb1.pid] and size(apoc.coll.toSet([id(AatoAb1), id(AatoAb2), id(AatoAb3), id(AatoAc1)])) = 4


match (ActoBa1:A{pid:ActoAa1.pid})--(BatoAc1:B{pid:BatoAb1.pid})
where size(apoc.coll.toSet([id(ActoAa1), id(ActoBa1)])) = 2 and size(apoc.coll.toSet([id(BatoAb1), id(BatoAb2), id(BatoAb3), id(BatoAc1)])) = 4

match (ActoCa1:A{pid:ActoAa1.pid})--(CatoAc1:C{pid:CatoAb1.pid})
where size(apoc.coll.toSet([id(ActoAa1), id(ActoBa1),id(ActoCa1)])) = 3 and size(apoc.coll.toSet([id(CatoAb1), id(CatoAc1)])) = 2

return

andrew_bowman · ‎09-24-2021

Sorry, but it is really hard to understand what you are doing and why.

Is there any way you can describe a scenario in terms that are easier to understand? Some kind of use case, restrictions and conditions that are more intuitive?

A sample dataset that we could use to generate a simple graph, and then queries to run on it, might be better for demonstrating your model and your use case.
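
To illustrate the kind of sample that would help, here is a minimal sketch, not your actual model. The labels A, B, C and the `pid` property come from the posts above; the relationship type `LINK`, the `pid` values, and the shape of the pattern are assumptions made up purely for illustration.

```cypher
// Hypothetical sample data: four nodes and five relationships.
// LINK and the pid values are invented; only the labels and `pid` come from the thread.
CREATE (a1:A {pid: 1}), (a2:A {pid: 2}), (b1:B {pid: 10}), (c1:C {pid: 20})
CREATE (a1)-[:LINK]->(a2),
       (a1)-[:LINK]->(b1),
       (a1)-[:LINK]->(c1),
       (a2)-[:LINK]->(b1),
       (a2)-[:LINK]->(c1);

// Hypothetical query: two connected A nodes with different pids
// that share a B neighbor and a C neighbor. Direction is ignored.
MATCH (x:A)--(y:A), (x)--(b:B)--(y), (x)--(c:C)--(y)
WHERE x.pid <> y.pid
RETURN x.pid, y.pid, b.pid, c.pid;
```

Writing the whole pattern in a single MATCH clause means Cypher will not traverse the same relationship twice within one match, which may reduce the need for some of the explicit id comparisons. A small example like this, together with the result you expect from it, would make it much easier to see what you are after.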


Subgraph query in graphDB