[Cypher] Creating a hyperedge with an exact series of relationships

fabio

TL;DR: Is there a way to create a node (or, better said, a hyperedge) only if a node with exactly the specified set of relationships doesn't already exist?

 

The long version: suppose you have the following graph model, which describes different attacks. Each attack is related to its consequences (i.e. one "Attack" node can have multiple "Consequence" nodes). Each consequence is in turn related to particular aspects, such as security properties or impacts.

1)1.png

So, to create this particular example I would do something like (suppose that the "Attack" node already exists):

MATCH (a:Attack {name:"Attack1"})
MERGE (i:Impact {name:"Execute unauthorized code"})
MERGE (p1:Property {name:"Confidentiality"})
MERGE (p2:Property {name:"Availability"})
MERGE (c:Consequence)-[:HAS_IMPACT]->(i)
MERGE (c)-[:AFFECTS]->(p1)
MERGE (c)-[:AFFECTS]->(p2)
MERGE (a)-[:HAS_CONSEQUENCE]->(c)

Now, suppose you want to add a second attack like this one:

2)3.png

If I run the following Cypher code I don't get the expected result:

MATCH (a:Attack {name:"Attack2"})
MERGE (i:Impact {name:"Execute unauthorized code"})
MERGE (p:Property {name:"Confidentiality"})
MERGE (c:Consequence)-[:HAS_IMPACT]->(i)
MERGE (c)-[:AFFECTS]->(p)
MERGE (a)-[:HAS_CONSEQUENCE]->(c)

Because the consequence of "Attack2" is a subset of the consequence of "Attack1", the MERGE clauses match the existing "Consequence" node instead of creating a new one, and I get this graph:

3)4.png
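
The behaviour described above can be reproduced outside Neo4j with a small in-memory model. The sketch below is only an illustration under stated assumptions: the `merge_like` and `exact_lookup` helpers and the `c1`-style ids are hypothetical, and the dictionary of relationship sets is a toy stand-in for the graph, not how Neo4j stores or matches data. It shows that the chained MERGE clauses reuse any "Consequence" that already has the impact, whereas the goal is a lookup on the exact set of relationships.

```python
def merge_like(graph, impact, properties):
    """Rough stand-in for the chained MERGE clauses in the queries above."""
    # MERGE (c:Consequence)-[:HAS_IMPACT]->(i) binds c to every consequence
    # that already has this impact, whatever else it is connected to.
    matched = [cid for cid, rels in graph.items() if ("HAS_IMPACT", impact) in rels]
    if not matched:
        cid = f"c{len(graph) + 1}"  # hypothetical id scheme for the toy model
        graph[cid] = {("HAS_IMPACT", impact)}
        matched = [cid]
    # MERGE (c)-[:AFFECTS]->(p) only adds the relationships that are missing.
    for cid in matched:
        graph[cid] |= {("AFFECTS", p) for p in properties}
    return matched


def exact_lookup(graph, impact, properties):
    """The desired behaviour: match only on the exact set of relationships."""
    wanted = {("HAS_IMPACT", impact)} | {("AFFECTS", p) for p in properties}
    return [cid for cid, rels in graph.items() if rels == wanted]


graph = {}
attack1 = merge_like(graph, "Execute unauthorized code", ["Confidentiality", "Availability"])
attack2 = merge_like(graph, "Execute unauthorized code", ["Confidentiality"])
exact = exact_lookup(graph, "Execute unauthorized code", ["Confidentiality"])

print(attack1, attack2, exact)
```

In this model both attacks end up attached to the same consequence, which still affects "Availability", while the exact lookup finds no reusable node, so a new one would have to be created for "Attack2".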

To solve this problem I could use "CREATE" instead of "MERGE" when creating the "Consequence" node, but that is not ideal either, since it creates a new node every time, even if the right one already exists (by "right one" I mean a node that is already related to exactly the same properties and impacts).

So, I was wondering if there's a specific approach to this kind of problem/situation.

8 REPLIES

Are you saying that you want to reuse an existing 'Consequence' node if it has the same set of 'Impact' and 'Property' nodes as the new 'Attack' node will have? In your example, Attack1 is related to all three nodes, while Attack2 is related to only two, thus resulting in a new Consequence node?

Yeah, that's exactly my goal! 

That is complicated. As an input to the query, can you provide a list of the 'impact' and 'property' nodes, with the properties that can be used to determine whether they exist? Will there be additional properties to persist on these nodes besides the ones used for matching? If so, the two kinds need to be differentiated.

At creation time I can easily provide a list with all the properties of the "impact" and "property" nodes related to a particular "consequence". Querying the graph is a different story, since I will only have the "attack" name (which is unique) and I would like to retrieve those nodes.

The only property that I didn't mention before and that might be present is a "description" on the "consequence" node. I say "might" because it's not always provided; sadly, in most cases it is missing.

I thought you wanted help with the query to build the graph, given a new Attack node. The query to get the consequence and other nodes for a given attack is straightforward. Is this what you were referring to?

What information is needed to identify the impact and property nodes? Is it just the label and the name, or are there more properties to match? Performance will degrade with more properties to match on.

Will the impact and property nodes exist or do they need to be created if they don’t exist? 

If you have the description of the Consequence node, and all of them had descriptions, would that be enough to tell whether an existing one can be reused or a new one needs to be created? Or do you always need to compare the Impact and Property nodes?

Yeah, you're right, the help I need is with the query to build the nodes. I also mentioned the part about retrieving them because I was misled by the term "query" in your last post, my bad 😅 (and I totally agree with you that retrieving them is quite easy given an "attack" node).

 

What information is needed to identify the impact and property nodes? Is it just the label and the name, or are there more properties to match? Performance will degrade with more properties to match on.

I have just the label and the name for those nodes, so no additional properties (sadly).

 


Will the impact and property nodes exist or do they need to be created if they don’t exist? 

Yeah, if they don't exist they need to be created.

 


If you have the description of the Consequence node, and all of them had descriptions, would that be enough to tell whether an existing one can be reused or a new one needs to be created? Or do you always need to compare the Impact and Property nodes?


No, I don't think the "description" will be enough to decide that. I think the second way is better, especially because the "description" is only provided for a handful of attacks.

So, we will have something like this to define the query inputs:

{
  "attack": "attack1",
  "nodes": [
    {
      "label": "Impact",
      "name": "name1"
    },
    {
      "label": "Property",
      "name": "name2"
    },
    {
      "label": "Property",
      "name": "name3"
    }
  ]
}

Yeah, that sounds correct!
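
As a rough sketch of how an input of that shape could drive the creation query, one possible approach (not one proposed in the thread) is to give each "Consequence" an explicit identity derived from its exact set of related nodes, so that a plain MERGE on that identity reuses a node only when the whole set is the same. Everything named here is an assumption made for illustration: the `signature` property, the `build_params` helper and the `UPSERT_CONSEQUENCE` query text are hypothetical, and the query has not been run against a database.

```python
import json

# Hypothetical query. `signature` is an assumed extra property on Consequence,
# not part of the model described in the thread.
# Caveat: if $impacts is empty, UNWIND produces no rows and the Property part
# is skipped as well.
UPSERT_CONSEQUENCE = """
MATCH (a:Attack {name: $attack})
MERGE (c:Consequence {signature: $signature})
MERGE (a)-[:HAS_CONSEQUENCE]->(c)
WITH c
UNWIND $impacts AS impactName
MERGE (i:Impact {name: impactName})
MERGE (c)-[:HAS_IMPACT]->(i)
WITH DISTINCT c
UNWIND $properties AS propertyName
MERGE (p:Property {name: propertyName})
MERGE (c)-[:AFFECTS]->(p)
"""


def build_params(payload):
    """Turn the JSON input into query parameters with an order-independent signature."""
    # Sort and de-duplicate so the same set of nodes always yields the same signature.
    pairs = sorted({(node["label"], node["name"]) for node in payload["nodes"]})
    return {
        "attack": payload["attack"],
        "impacts": [name for label, name in pairs if label == "Impact"],
        "properties": [name for label, name in pairs if label == "Property"],
        "signature": json.dumps(pairs),
    }


payload = {
    "attack": "attack1",
    "nodes": [
        {"label": "Impact", "name": "name1"},
        {"label": "Property", "name": "name2"},
        {"label": "Property", "name": "name3"},
    ],
}
params = build_params(payload)
print(params["signature"])
```

The labels are split into separate lists because the query uses fixed labels rather than taking them as parameters. With the official Python driver, the query and parameters could then be passed to something like `session.run(UPSERT_CONSEQUENCE, params)`. The trade-off is that the signature has to be kept in sync if a consequence's relationships are ever changed by other queries.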