07-10-2022 03:06 AM
My question is about how Neo4j searches for the correct relationship type. Let's assume that we have two different relationship types connecting nodes labeled ALPHA with nodes labeled BETA.
In other words:
( :ALPHA {id:1} )-[ :REL_TYPE_A ]->( :BETA {id:2} )
( :ALPHA {id:1} )-[ :REL_TYPE_B ]->( :BETA {id:3} )
( :ALPHA {id:1} )-[ :REL_TYPE_B ]->( :BETA {id:4} )
...
( :ALPHA {id:1} )-[ :REL_TYPE_B ]->( :BETA {id:1000} )
If I run the following query:
MATCH ( :ALPHA {id:1} )-[ :REL_TYPE_A ]->( :BETA )
would it need to go through all the relationship types on that path from the ALPHA node, or would it be able to discard everything else? In other words, will it take the same amount of time regardless of how many relationship types exist?
07-10-2022 06:51 AM
Hi @alexander
I created the data.
CREATE (alpha:ALPHA {id:1})-[:REL_TYPE_A]->(:BETA {id:2})
WITH alpha
UNWIND range(3, 1000) AS idValue
CREATE (alpha)-[:REL_TYPE_B]->(:BETA {id:idValue})
Then I ran PROFILE on the MATCH.
PROFILE MATCH (alpha:ALPHA {id:1})-[:REL_TYPE_A]->(beta:BETA)
RETURN alpha, beta
The MATCH searched just 1 record, the one with the relationship type REL_TYPE_A.
This is the Cypher for REL_TYPE_B:
PROFILE MATCH (alpha:ALPHA {id:1})-[:REL_TYPE_B]->(beta:BETA)
RETURN alpha, beta
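To put the two PROFILE results in context, it can help to check how many relationships of each type actually hang off the ALPHA node. The query below is only a sketch, and it assumes the creation script above was loaded into an otherwise empty database:

```cypher
// Count the outgoing relationships of the ALPHA node, grouped by type.
// With the sample data this should report 1 REL_TYPE_A relationship
// and 998 REL_TYPE_B relationships (ids 3 through 1000).
MATCH (alpha:ALPHA {id: 1})-[r]->()
RETURN type(r) AS relType, count(*) AS relCount
ORDER BY relType
```

Comparing these counts with the rows and db hits reported by each PROFILE shows whether the work done tracks the number of relationships of the requested type or the total number of relationships on the node.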
07-10-2022 08:43 AM
I don't have any knowledge of the internals of Neo4j to answer your question definitively, but I would assume it does have to visit each relationship. If you look at the profile resulting from your query, you can see it gets the 'alpha' node, then does an 'expand all', then filters the results for attached 'beta' nodes. The 'expand all' is finding all the paths that have a 'REL' relationship attached to the 'alpha' node.
I would imagine this is very fast. As an example, I have written a few custom procedures using the Neo4j API, where I navigate a graph recursively from a root node. At each node, I get the node's relationships, get each relationship's 'other node', perform some calculations, then repeat this for each child node. My test data set is small right now, but I have some graphs that may be at most five levels deep, with 3-5 relationships per node. This calculation is very fast on my laptop. One of my use cases is showing the calculation results in a list view. Even when I do this, the results are instantaneous. I expect it to be much faster in a production environment.
I think the real performance issue is when you begin to increase the path length in a pattern match, as the number of searches can grow very fast.
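A rough way to see that growth is to profile a variable-length match and count the paths found at each depth. This is a sketch only: the `Item` label and `CHILD` relationship type are hypothetical names for a tree-like graph such as the one described above, not part of the original data.

```cypher
// Hypothetical schema: (:Item)-[:CHILD]->(:Item), with the root at id 1.
// Counts how many paths of each length (1 to 5 hops) start at the root.
PROFILE
MATCH path = (root:Item {id: 1})-[:CHILD*1..5]->(descendant:Item)
RETURN length(path) AS depth, count(*) AS paths
ORDER BY depth
```

With up to 5 relationships per node, the number of paths at depth five can reach 5^5 = 3125, so each extra hop in the pattern multiplies the work by roughly the branching factor.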