Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
03-12-2020 10:51 AM
Hi,
I have a large tree structure and my goal is to find the first intersection between to points.
In the test here I start a point and following a path to another node. Hopefully you can see the picture.
The query returns a path from the target all the way back to base root. The start point is not included on the path. Returned path
class
"A01D41/1276"
"A01D41/127"
"A01D41/12"
"A01D41/00"
"A01D34/00"
"A01D"
"A01"
"A"
Notice the start node, 'A01D41/10', is not on the list.
Second part: There is property key, level (int), associated with each node. How do I return the minimum level (lowest common node) and its ID?
So what is wrong with this query:
MATCH (start:cpc{cpcClass:'A01D41/10'}),(target:cpc{cpcClass:'A01D41/1276'})
CALL algo.dfs.stream('cpc', 'Reports_to', 'start', Id(target)) YIELD nodeIds
UNWIND nodeIds as nodeId
RETURN algo.asNode(nodeId).cpcClass as class
03-12-2020 11:13 AM
Which version of the graph algos library are you using? Also I don't see any place you've provided the start node or its id, you've given a string 'start' which isn't associated with your start
variable.
You should review the signature of the procedure:
call dbms.procedures() yield name, signature
where name contains 'dfs.stream'
return name, signature
03-12-2020 11:37 AM
I have Graph Algorithms vers 3.5.14.0 installed.
The documentation gives this
algo.dfs.stream
CALL algo.dfs.stream(label:String, relationshipType:String, startNodeId:long, direction:Direction, {writeProperty:String, target:long, maxDepth:long, weightProperty:String, maxCost:double}) YIELD nodeId
From that I am guessing that I need to pass:
label which is 'cpc as string
realationshipType which is 'Reports_to' as string
Start Node ID as long which I try as ID(start) but this returns error.
direction is a bit unclear since I want both ways though the relationship Reports_to is directional
I can't find a definition for write property so I am at bit of a loss?
The signature returned is which is even more cryptic and does not match the documentation.
algo.dfs.stream" "algo.dfs.stream(label :: STRING?, relationshipType :: STRING?, direction :: STRING?, startNodeId :: INTEGER?, config = {} :: MAP?) :: (startNodeId :: INTEGER?, nodeIds :: LIST? OF INTEGER?, path :: PATH?)"
Note: I am learning at this and unfortunately the "Graph Algorithms" Book NEO4J recommends does not have an example of this algorithm. It talks about it but does not give example which would help with the exact syntax needed.
Andy
03-12-2020 04:19 PM
I think you gave the version of Neo4j, not the graph algos version, can you check again?
For direction, there are a number of accepted values, but "OUTGOING", "INCOMING" or "BOTH" will work.
I'll check with the algos team about the signature and documented param differences, that will require a documentation update, but what you're seeing from the query I provided should be the source of truth for the ordering.
See if this will work for you:
MATCH (start:cpc{cpcClass:'A01D41/10'}),(target:cpc{cpcClass:'A01D41/1276'})
CALL algo.dfs.stream('cpc', 'Reports_to', 'BOTH', id(start), {targetNodes:[id(target)]}) YIELD nodeIds
UNWIND nodeIds as nodeId
RETURN algo.asNode(nodeId).cpcClass as class
03-12-2020 04:36 PM
Hi Andrew,
I gave it a try and it did work thank you.
The results are bit different then my naive expectation, but that is on me.
Andy
03-12-2020 04:44 PM
If you could clarify again exactly what you're trying to find, that may help us give you a better solution. Simply finding a dfs path between the nodes seemed like a first step and not the end result.
03-13-2020 06:40 AM
Hi Andrew,
What I am looking for is the actually the deepest node. Without trying to give too much away. I am looking at patents and there is prior work on the topic though I think it did not fully exploit graph theory and my goal is strategies that use other ML techniques to create additional meta data that create new and additional relationships and additional strength to the graph. So with that premise.
I have two master nodes A & B and each of them will have certain classifications assigned to each of them with the number of classifications varying from 1 to as much as 200 though typically in the range of 2-10. The classifications are a pure tree graph. So for any two pair of master nodes what is the deepest deepest node for any pair of classification nodes. I actually don't need the whole path just the deepest node. The end point is a table of classification nodes for each pair and deepest node. I could impose a limit depth of 3 or from each side returning null if there is no common node within that number of levels. This will be input into other analysis tools.
Endpoint is a table for export.
Master A Master B Deepest
XXLB XXLA XXL
XXLBD XXLBF XXLB
XXLBD XXLA XXL
On Top of this is that the master nodes come from two holding groups and I would then do the pairwise look on each pair from the holding groups. There could be a fair amount of redundancy in the classification pairs so if there is a strategy of once the classification pair is identified with its deepest common node that result could be copied that may be a faster search strategy. But at the moment I working of looking if this search algorithm brings value.
Andy
03-13-2020 06:52 AM
I see, so in your image, C would be the deepest common node for your start and target nodes.
If your query is only finding the common node for two input nodes, then this is easy: first match the entire path for one of the nodes and collect all nodes in that path as a list, then match from the other node such that the end node must be in that list, and limit to a single result. Cypher expansions by default use depth first expansion, so (assuming no cycles in the graph for this relationship) the first hit will be the deepest common node.
For your query:
MATCH (start:cpc{cpcClass:'A01D41/10'})-[:Reports_to*0..]->(ancestor)
WITH collect(ancestor) as ancestors
OPTIONAL MATCH (target:cpc{cpcClass:'A01D41/1276'})-[:Reports_to*0..]->(ancestor)
WHERE ancestor IN ancestors
RETURN ancestor
LIMIT 1
03-12-2020 06:25 PM
Also a follow-up, the graph algos library is now deprecated, take a look at the Graph Data Science library instead.
Algos should have been ported to this new library, with docs and bug fixes applied, though you should review the documentation as some of the procs and usage have changed.
03-13-2020 09:24 AM
Hi,
Thank you it seems to work. Now I need to work to get all the cpc codes for a given patent.
To give you a picture of what is happening.
Code C is shared by both fine.
The A-D path yield one a common ancestor while the B-D path has a more distant common ancestor. The paths A-C and D-C do not have a common ancestor.
So the challenge now is craft a query that returns all the pairs with NULL if no common ancestor exists.
Thank you again.
Andy
All the sessions of the conference are now available online