

Adding nodes to correct graph branch

Please have a look at the model below to understand the problem. The question concerns adding new tail nodes to the model, here the 'X' node.

New nodes are read from a data file containing a dozen rows, with each row representing one node.

The parent of an 'X' node is a node labeled '05'. The only problem is that there are two nodes labeled '05'.

As the model below shows, the sibling nodes at level 2 are intentional and planned.

Let us assume that a particular row contains the instruction: create a new node 'X' and add it to the node labeled '05', but only to the '05' node that is a child of the node labeled '350'.

What would be the most elegant and effective way to write a simple Cypher query that creates the new node and attaches it to the correct parent, i.e. the one under '350' and not the one under '250'?


ameyasoft
Graph Maven
Assuming you have the parent info for "50":
 
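// Locate the "50" node that is connected to the "350" parent.
MATCH (a:N1 {val: toInteger("350")})
MATCH (b:N2 {val: toInteger("50")})
MATCH (a)-[]-(b)

// $x is the value of the new node, supplied as a query parameter.
// MERGE requires a relationship type; HAS is assumed here.
MERGE (c:N3 {val: $x})
MERGE (b)-[:HAS]-(c)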

My business case requires the outcome shown in the figure below, where the 250 and 350 branches each get their own dedicated 'X' node. Those 'X' nodes will also get a connects_to relationship afterwards.

This is not what the suggested script produces, as shown below. I need some kind of WHERE or OPTIONAL MATCH arrangement to check whether the input calls for creating a node or merging it.

What is the condition under which the node would be merged (that is, only a single node created, with relationships to two parents)?

While reading the input file: if the path described by a row, say from node '350' through node '05', already ends at an existing 'X' node, then do nothing (I assume MERGE verifies and handles this correctly). Otherwise, if the row describes a branch such as ('250')-[:HAS]-('05') and the Cypher query matches it, but the branch is missing its bottom node, then create that node.
Simply put, the use case allows several '05' nodes across different branches, but only one per particular branch. The correct result is the top figure in my answer above; the wrong one is the bottom figure.

Since the path to the node has to be considered in order to uniquely identify it as the one to modify, it makes sense for each node to also carry a property representing the full path to it. That property can be indexed and used for a quick lookup.

So assuming 350 and 250 are properties on each root node, the 05 nodes would have a property like "250/05" and "350/05" respectively. With a list of path elements, you can apoc.text.join() the list using a slash delimiter to get the identifier string to use in your MATCH. If the property is indexed, that will be a fast lookup.
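A minimal sketch of that path-property approach, under some assumptions not stated in the thread: every node carries the label `:Node`, the relationship type is `HAS`, the path is stored in a property named `path` (a hypothetical name), and each input row is passed in as a list parameter `$elements`. The index statement uses the syntax of recent Neo4j versions; older versions use `CREATE INDEX ON :Node(path)` instead. `apoc.text.join()` requires the APOC library to be installed.

```cypher
// One-time setup: index the assumed `path` property so the lookup is fast.
CREATE INDEX node_path IF NOT EXISTS FOR (n:Node) ON (n.path);

// Per input row: $elements is the full path of the new node,
// e.g. ["350", "05", "X"].
WITH $elements AS elements
WITH elements,
     apoc.text.join(elements[..-1], "/") AS parentPath, // e.g. "350/05"
     apoc.text.join(elements, "/")       AS childPath   // e.g. "350/05/X"
// Indexed lookup of the one parent on the intended branch.
MATCH (parent:Node {path: parentPath})
// Create the child under that parent only if it is not already there.
MERGE (parent)-[:HAS]->(child:Node {path: childPath})
  ON CREATE SET child.value = elements[-1]
```

Because the parent is matched by its full path, a row for "350/05/X" never touches the '05' node under '250', and running the same row twice does not create a second 'X' on the same branch.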

Other (less efficient and more complex to use) options include MATCHing to candidates and comparing the path to root node to filter to the correct one, or MATCHing down from the root to the one in question.
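For the fixed-depth case in the original question, matching down from the root could look like the sketch below. It assumes the same hypothetical schema as the other queries in this thread (`:Node` label, `value` property, `HAS` relationships pointing from parent to child) and that '350' is the direct parent of '05'.

```cypher
// Anchor on the '350' node, then follow HAS down to its own '05' child.
MATCH (:Node {value: "350"})-[:HAS]->(parent:Node {value: "05"})
// Give that '05' node its own 'X' child if it does not have one yet.
MERGE (parent)-[:HAS]->(:Node {value: "X"})
```

Since `parent` is already bound when the MERGE runs, the pattern is only checked against that one '05' node, so the '05' node under '250' is left untouched.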

I am inexperienced with graphs, but your proposal worries me with regard to data redundancy. You suggest that child nodes be made aware of their parents through a dedicated property, e.g. '350/05', while the same information can be obtained by examining relationships, which I thought was the graph way of deriving paths and node networks. Am I wrong? Please elaborate on this dilemma.

I think there's some misunderstanding about your use case. Your starting post was about adding an "X" node only to one particular branch, but then your followup comment is about adding an X node onto all 05 nodes regardless of branch.

My original comment was about your first case, how to identify the branch for which to add the X node (and not include any other). As mentioned there are several ways to do it, but the fastest and easiest way is to save the "path" as a property, construct the "path" for the new node based on the requirements, and MERGE the new node to it.

For your second case, how to add an X node onto all 05 nodes (where each one gets its own X node if it needs one), then you can do that easily:

MATCH (n:Node {value: "05"})
MERGE (n)-[:HAS]->(:Node {value:"X"})

The MERGE on the second line is executed once per n from the MATCH. If no "X" node is found that fits the MERGE pattern for that n, then a new :Node with the value "X" will be created (it won't reuse some other "X" node in the graph).

Regarding the data redundancy concern: as mentioned, there are other approaches, but the choice hinges on your use case. Most notably, is the path length static (always two nodes, then the new node) or is it dynamic?

If it's dynamic, that's not exactly easy to express and filter during traversal, since we don't really have a looping mechanism in Cypher for this. We can MATCH on potential end nodes, MATCH back to the root, and filter to the correct branch based on the properties of the nodes on the branch. Or we can have each node keep its path as a property and use an indexed lookup.
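The first of those two options, matching candidate end nodes and filtering by the values along the path back to the root, might be sketched as follows. The assumptions here are hypothetical: the graph is a tree of `:Node` nodes linked by `HAS` from parent to child, a root is a node with no incoming `HAS`, `$elements` lists the values from the root down to the intended parent (e.g. `["350", "05"]`), and `$newValue` is the value of the node to add (e.g. `"X"`).

```cypher
// Candidate end nodes are all nodes whose value equals the last path element.
// The variable-length pattern walks each candidate back to an ancestor.
MATCH p = (root:Node)-[:HAS*0..]->(target:Node {value: $elements[-1]})
// Keep only paths that start at a true root ...
WHERE NOT ()-[:HAS]->(root)
  // ... and whose node values, root to target, equal the requested path.
  AND [n IN nodes(p) | n.value] = $elements
// Attach the new node to the surviving target only.
MERGE (target)-[:HAS]->(:Node {value: $newValue})
```

This works for any path length without storing a redundant property, at the cost of a traversal per candidate instead of a single indexed lookup.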

It is possible to MATCH down from the root, but it's not so straightforward. It would require dividing the path into subsequences, doing a MATCH on each subsequence, finding the longest one that matches, and then doing a MERGE on the rest of the path. That can also handle cases where we may not know how much of the path currently exists.