Neo4j

alexander · 07-10-2022

My question is about how Neo4j finds relationships of the correct type. Let's assume that we have two different relationship types connecting nodes with label ALPHA to nodes with label BETA:

1x (ALPHA) - [ REL_TYPE_A ] -> (BETA)
1000x (ALPHA) - [ REL_TYPE_B ]-> (BETA)

In other words:

( :ALPHA {id:1} )-[ :REL_TYPE_A ]->( :BETA {id:2} )
( :ALPHA {id:1} )-[ :REL_TYPE_B ]->( :BETA {id:3} )
( :ALPHA {id:1} )-[ :REL_TYPE_B ]->( :BETA {id:4} )
...
( :ALPHA {id:1} )-[ :REL_TYPE_B ]->( :BETA {id:1000} )

If I ran the following query:

MATCH ( :ALPHA {id:1} )-[ :REL_TYPE_A ]->( :BETA )

would Neo4j have to go through all the relationships of every type attached to the ALPHA node, or would it be able to discard everything else? In other words, will it take the same amount of time regardless of how many relationships of other types exist?

koji · 07-10-2022

Hi @alexander

I created the data.

CREATE (alpha:ALPHA {id:1})-[:REL_TYPE_A]->(:BETA {id:2})
// WITH carries alpha forward; some Neo4j versions require it between CREATE and UNWIND
WITH alpha
UNWIND range(3, 1000) AS idValue
CREATE (alpha)-[:REL_TYPE_B]->(:BETA {id:idValue})
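
As a quick sanity check on the data above, a query along these lines should list how many relationships of each type leave the ALPHA node. This is only a sketch: it assumes the database was empty before the CREATE statement ran, and the aliases relType and relCount are made up for illustration.

```cypher
// Count outgoing relationships from the ALPHA node, grouped by type.
MATCH (:ALPHA {id:1})-[r]->(:BETA)
RETURN type(r) AS relType, count(r) AS relCount
ORDER BY relType
```

Since range(3, 1000) produces 998 values, this should report one REL_TYPE_A relationship and 998 REL_TYPE_B relationships.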

Then I ran PROFILE on the MATCH:

PROFILE MATCH (alpha:ALPHA {id:1})-[:REL_TYPE_A]->(beta:BETA)
RETURN alpha, beta

The MATCH finds just 1 record with the relationship type REL_TYPE_A.
[Image: REL_TYPE_A]

This is the same query for REL_TYPE_B:

PROFILE MATCH (alpha:ALPHA {id:1})-[:REL_TYPE_B]->(beta:BETA)
RETURN alpha, beta

glilienfield · 07-10-2022

I don't have any knowledge of the internals of Neo4j to answer your question definitively, but I would assume it does have to visit each relationship. If you look at the profile resulting from your query, you can see it gets the alpha node, then 'expands all', then filters the results for attached 'beta' nodes. The 'expand all' is finding all the paths that have a 'REL' relationship attached to the 'alpha' node.

I would imagine this is very fast. As an example, I have written a few custom procedures using the Neo4j API, where I navigate a graph recursively from a root node. At each node, I get the node's relationships, get each relationship's 'other node', perform some calculations, then repeat this for each child node. My test data set is small right now, but I have some graphs that may be at most five levels deep, with 3-5 relationships per node. This calculation is very fast on my laptop. One of my use cases is to show the calculation results in a list view. Even when I do this, the results are instantaneous. I expect it to be much faster in a production environment.

I think the real performance issue is when you begin to increase the path length in a pattern match, as the number of searches can grow very fast.
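
To see that growth on a real graph, one option is to count the paths found at each depth with a variable-length pattern. This is a rough sketch rather than a benchmark: the upper bound of 3 and the aliases depth and paths are arbitrary choices, and the ALPHA node is borrowed from the sample data earlier in the thread.

```cypher
// Count how many paths start at the ALPHA node, grouped by path length.
MATCH p = (:ALPHA {id:1})-[*1..3]->()
RETURN length(p) AS depth, count(p) AS paths
ORDER BY depth
```

On the sample data only depth 1 has results, because the BETA nodes have no outgoing relationships. On a graph where every node has several outgoing relationships, the count at each further depth multiplies accordingly.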

Relationship type question