How to count relationships of nodes in subgraph

Let's say I have a graph with 50 nodes. With Cypher I find 3 possible paths from node A to node B (the result of the Cypher query is also a graph, a sub-graph).

MATCH p=((a:label {Pro1:'A'})-[*7]->(b:label {Pro1:'B'})) RETURN p

On the path from A to B (in the sub-graph) there are several nodes. For example the node C in this sub-graph has 3 relationships.

How can I count the relationships of the C node in this sub-graph?

Regards and thank you

filantrop

Hi and welcome andrej.zivec,

If it is the number of relationships in the path you are looking for, it is:

return size(relationships(p))

If you are looking for a count of the number of relationships in and/or out of each node along a path, the following query will give you a count for each path node:

match p=((a:label {Pro1:'A'})-[*]->(b:label {Pro1:'B'}))
unwind nodes(p) as node
call {
with node
match(c)--(node)
return count(c) as count
}
return node, count

You can change line 5 if you want to restrict the count to incoming or outgoing relationships, instead of both as in the query above.

relationships in either direction: match(c)--(node)
only incoming relationships: match(c)-->(node)
only outgoing relationships: match(c)<--(node)
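
To make it concrete what each of the three patterns counts, here is a small sketch in plain Python rather than Cypher. It models relationships as (start, end) pairs. The edge list and the `degree` helper are made up for illustration and are not part of Neo4j.

```python
# Hypothetical directed relationships as (start, end) pairs.
edges = [("A", "C"), ("C", "B"), ("C", "D"), ("E", "C")]

def degree(node, edges, direction="both"):
    """Count relationships touching `node`, mirroring the three patterns."""
    incoming = sum(1 for start, end in edges if end == node)    # (c)-->(node)
    outgoing = sum(1 for start, end in edges if start == node)  # (c)<--(node)
    if direction == "in":
        return incoming
    if direction == "out":
        return outgoing
    return incoming + outgoing                                  # (c)--(node)

print(degree("C", edges), degree("C", edges, "in"), degree("C", edges, "out"))
```

The sketch assumes there are no self-loops, so the count for either direction is simply the sum of the other two.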

Thank you @filantrop and @glilienfield for the answers, but maybe I was not so clear about what I want.

Here is an example of the graph:

CREATE (n9:label {Name: "Pro1"})-[:FRIEND]->(:label {Name: "Pro2"})-[:FRIEND]->(:label { Name: "Pro3"})-[:FRIEND]->(n0:label {Name: "Pro4"})-[:FRIEND]->(n1:label {Name: "Pro5"})-[:BROTHER]->(n4:label {Name: "Pro6"})<-[:COWORKER]-(n0)<-[:FRIEND]-(n6:label {Name:"Pro7"}),
(n9)-[:EMPLOYEE]->(:Company {Name: "Com2"})<-[:EMPLOYEE]-(n0)-[:FRIEND]->(n3:label { Name: "Pro9"})-[:FRIEND]->(:label {Name: "Pro8"})-[:BROTHER]->(n6)<-[:COWORKER]-(n4),
(n1)-[:EMPLOYEE]->(n7:Company {Name: "Com1"})<-[:EMPLOYEE]-(n3), (:label {Name: "Pro11"})<-[:EMPLOYEE]-(n7)-[:CEO]->(n9), (n7)-[:EMPLOYEE]->(:label { Name: "Pro10"})

In this graph the node Pro4 has 6 relationships.

What I really want is to count the relationships of the node Pro4 within this subgraph (path). The correct result is 3.

MATCH p=((a:label {Name:'Pro7'})-[*4]->(b:label {Name:'Pro1'})) RETURN p

If I do

MATCH p=((a:label {Name:'Pro7'})-[*4]->(b:label {Name:'Pro1'})) return size(relationships(p))

The result is 4 relationships for each path from Pro7 to Pro1

If I do

match p=((a:label {Name:'Pro7'})-[*4]->(b:label {Name:'Pro1'}))
unwind nodes(p) as node
call {
with node
match(c)--(node)
return count(c) as count
}
return node, count

The result is 6 for node Pro4

How can I count just the 3 relationships?

Regards

Now I understand. You would like to count, for each node in your subgraph, only the relationships to other nodes that are also in your subgraph. This turned out to be tricky, because your match results in a list of paths and you want to process the collection of paths as one subgraph. I achieved this by using a reduce operation on the individual paths to get a consolidated list of nodes along all the paths resulting from your match statement. The list contains duplicates, which I was not able to remove until the very end, so there is a little bit of inefficiency, as the nodes common to more than one path get processed multiple times. The results are correct even so. I then calculated the number of relationships for each node in your subgraph, but limited the count to relationships with other nodes in the subgraph. I was able to get the counts you want.

match p=((a:label {Name:'Pro7'})-[*4]->(b:label {Name:'Pro1'}))
with reduce(s = [], i in collect(nodes(p)) | i + s) as allSubGraphNodes
unwind allSubGraphNodes as node
call {
with node, allSubGraphNodes
match(c)--(node)
where (c in allSubGraphNodes)
return count(c) as count
}
return distinct node, count

here is the output I got:

I was able to figure out a method of removing the redundant nodes earlier in the query to avoid processing them.

match p=((a:label {Name:'Pro7'})-[*4]->(b:label {Name:'Pro1'}))
with reduce(s = [], i in collect(nodes(p)) | i + s) as allSubGraphNodes
unwind allSubGraphNodes as node
with distinct node, allSubGraphNodes
call {
with node, allSubGraphNodes
match(c)--(node)
where (c in allSubGraphNodes)
return count(c) as count
}
return node, count
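
As a cross-check on what this query computes, the same counting can be sketched in plain Python on an in-memory copy of the sample graph. The edge list is transcribed from the CREATE statement earlier in the thread, and the two paths were worked out by hand from it, so treat both as assumptions. `count_within` is a hypothetical helper, not a Neo4j function.

```python
# Relationships from the CREATE statement earlier in the thread, as (start, end) pairs.
edges = [
    ("Pro1", "Pro2"), ("Pro2", "Pro3"), ("Pro3", "Pro4"), ("Pro4", "Pro5"),
    ("Pro5", "Pro6"), ("Pro4", "Pro6"), ("Pro7", "Pro4"), ("Pro1", "Com2"),
    ("Pro4", "Com2"), ("Pro4", "Pro9"), ("Pro9", "Pro8"), ("Pro8", "Pro7"),
    ("Pro6", "Pro7"), ("Pro5", "Com1"), ("Pro9", "Com1"), ("Com1", "Pro11"),
    ("Com1", "Pro1"), ("Com1", "Pro10"),
]

# The two directed 4-hop paths from Pro7 to Pro1 (derived by hand, an assumption).
paths = [
    ["Pro7", "Pro4", "Pro5", "Com1", "Pro1"],
    ["Pro7", "Pro4", "Pro9", "Com1", "Pro1"],
]

# Equivalent of allSubGraphNodes after DISTINCT: every node on any path.
sub_nodes = {n for path in paths for n in path}

def count_within(node, edges, allowed):
    """Count relationships of `node` whose other end is also in `allowed`."""
    return sum(
        1 for a, b in edges
        if (a == node and b in allowed) or (b == node and a in allowed)
    )

counts = {n: count_within(n, edges, sub_nodes) for n in sorted(sub_nodes)}
print(counts)
```

Like the query, this counts every relationship between two subgraph nodes, including any that are not themselves on a matched path. In the sample graph the two notions coincide.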

Thank you very much ...
I am new to Cypher and my thinking is still in terms of tables. Maybe because of this I see the query result as a sub-graph, but as I understand it, it is really a list of paths.
You write "because your match results in a list of paths and you want to process the collection of paths as one subgraph". Is there another way to do this?
Can I store the result of a match as a graph and use this graph in the next query (like subqueries in SQL)?

Thank you
Regards

No worries, I am also learning Neo4j. I wrote an application using a relational DBMS, but it naturally fits a graph database, so I got excited when I learned of Neo4j. I am in the process of replatforming it.

Cypher basically pattern matches. Each result row is a match of your pattern. In your case, your match statement below finds two paths that match.

match p=((a:label {Name:'Pro7'})-[*4]->(b:label {Name:'Pro1'}))

The Neo4j Browser view of the result shows the matching nodes and renders them with all the relationships, so it looks like the result is a subgraph. The actual results of the query are shown under the text or table views. As you can see below, the result consists of two paths.

It is easy to count the number of relationships ingoing/outgoing from a node with the query below:

match(n{key:value})
match(n)--(c)
return count(c)

In your case, you wanted to restrict the relationship count to only those relationships that are part of your query results. Assuming allSubGraphNodes is a list of those nodes resulting from your query, you can modify the above query to restrict the count to just those relationships between the allSubGraphNodes:

match(n{key:value})
match(n)--(c)
where (c in allSubGraphNodes)
return count(c)

To get this to work for your case, we need to take the query results, which consist of two separate paths, and convert them into a single list of all the nodes along either path. That is the goal of the following part of the query:

match p=((a:label {Name:'Pro7'})-[*4]->(b:label {Name:'Pro1'}))
with reduce(s = [], i in collect(nodes(p)) | i + s) as allSubGraphNodes
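
The flattening step can be mimicked in plain Python, which may help if reduce is unfamiliar. This is only a sketch: the node names stand in for real nodes, and the two paths are assumed from the sample graph. It starts from an empty list and prepends each path's nodes to the accumulator.

```python
from functools import reduce

# collect(nodes(p)) yields one list of nodes per matched path (assumed values).
collected = [
    ["Pro7", "Pro4", "Pro5", "Com1", "Pro1"],
    ["Pro7", "Pro4", "Pro9", "Com1", "Pro1"],
]

# Start from an empty list and prepend each path's nodes (i + s) to the accumulator s.
all_sub_graph_nodes = reduce(lambda s, i: i + s, collected, [])

# A later DISTINCT drops the repeats; dict.fromkeys keeps first-seen order.
distinct_nodes = list(dict.fromkeys(all_sub_graph_nodes))
print(len(all_sub_graph_nodes), distinct_nodes)
```

The flattened list still contains the nodes shared by both paths twice, which is why a de-duplication step is needed afterwards.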

To process the list of allSubGraphNodes through the query part that counts the relationships per node, we need to convert the list into rows using the UNWIND clause, then pass the rows and the list of allSubGraphNodes to the next phase of the query using the WITH clause. The DISTINCT clause removes the duplicate nodes that are in multiple paths. That is what this part of the query does:

unwind allSubGraphNodes as node
with distinct node, allSubGraphNodes

To answer your question specifically, it is the WITH clause that allows you to chain query results into a pipeline that builds up the final result you want.

You can also execute subqueries using the CALL clause. This allows you to execute a full Cypher query block for each row of the outer query's result. The subquery's result is added to the outer query's result, row by row. For a given outer row, the subquery can return zero, one, or N rows. If zero, that outer row is dropped from the result. If one, the subquery's values are added to that outer row. If N, the outer row is repeated N times, once for each subquery row. BTW, the subquery can either perform more querying to augment the outer query (similar to a join in SQL), or perform a mutation operation, such as creating new nodes based on the outer query result.
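
The per-row behaviour of a returning subquery can be modelled with a few lines of Python. This is a rough mental model, not how Neo4j is implemented. `call_subquery`, the rows, and the `neighbours` table are all hypothetical.

```python
def call_subquery(outer_rows, subquery):
    """Run `subquery` once per outer row and merge each result into that row."""
    result = []
    for row in outer_rows:
        for sub_row in subquery(row):  # zero, one, or N sub-rows
            result.append({**row, **sub_row})
    return result

outer = [{"node": "A"}, {"node": "B"}, {"node": "C"}]

# Hypothetical data: A has no neighbours, B has one, C has two.
neighbours = {"A": [], "B": ["x"], "C": ["y", "z"]}

# Non-aggregating subquery: one sub-row per neighbour.
rows = call_subquery(outer, lambda row: [{"c": c} for c in neighbours[row["node"]]])

# Aggregating subquery (like count(c)): always exactly one sub-row per outer row.
counted = call_subquery(outer, lambda row: [{"count": len(neighbours[row["node"]])}])
print(rows)
print(counted)
```

In this model an aggregating subquery keeps every outer row, because a count produces one row even when nothing matched.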

As discussed, Cypher returns its results as collections of paths. If this doesn't work for your requirements and you want to work with subgraphs, then you may consider two alternatives. First, you can use one of the client drivers to iteratively walk the graph and build up your subgraph for additional processing. You would do this by getting your root node, then getting the child nodes through the root node's relationships. You can continue this process iteratively until you reach the terminal nodes of your subgraph. You can program an iterative algorithm and execute it in one ReadTransaction.
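
The shape of such an iterative walk might look like the following Python sketch. It runs against an in-memory dictionary instead of a live database, so no driver calls appear. `children`, `walk_subgraph`, and `get_children` are hypothetical names; a real client would fetch each node's children through the driver inside the transaction.

```python
from collections import deque

# Hypothetical outgoing relationships; a real client would fetch these through the driver.
children = {
    "Pro7": ["Pro4"],
    "Pro4": ["Pro5", "Pro9"],
    "Pro5": ["Com1"],
    "Pro9": ["Com1"],
    "Com1": ["Pro1"],
    "Pro1": [],
}

def walk_subgraph(root, get_children):
    """Breadth-first walk from `root`; returns visited nodes and relationships followed."""
    seen = {root}
    rels = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in get_children(node):
            rels.append((node, child))
            if child not in seen:  # expand each node only once
                seen.add(child)
                queue.append(child)
    return seen, rels

nodes, rels = walk_subgraph("Pro7", lambda n: children.get(n, []))
print(sorted(nodes), len(rels))
```

Once the nodes and relationships are held on the client side, counting relationships per node within the subgraph is ordinary list processing.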

Second, if this is something you do often and performance is an issue, you can consider writing a custom procedure using the Neo4j Java API to run the algorithm on the server itself. You would then call the procedure in Cypher. For example, you would match the root node in Cypher and, in the same query, send that node to your procedure and get back your processed subgraph. I have done this for a project I am working on, so if you want to get jumpstarted I would be happy to help. It took me a while to wade through the documentation and get it working. I now have a nice working platform to develop and unit test on.

Thank you for the detailed explanation. First I have to try your solution on other examples to completely understand it, and then take the next step.

It would be much easier if there were a native Neo4j function (maybe one exists, but I don't know of it) to transform the path into a graph, something like
MATCH p=(....)
WITH to_graph(p) AS newgraph
MATCH (...)

so the second MATCH will match the pattern in the newgraph

Thank you
Regards