Neo4j

michalkomorowsk · ‎01-09-2020

Let's assume that we have 2 types of nodes, i.e. Node1 and Node2. They can be connected via a BELONGS_TO relationship, e.g.:

(:Node1) - [:BELONGS_TO] -> (:Node2).

Now we want to associate additional information with each BELONGS_TO relationship. Let's call it context. This context will be used for querying/filtering. We see 3 possibilities of how to implement that:

1. Add a property to the relationship, e.g.

(:Node1) - [:BELONGS_TO {context: 'XXX'}] -> (:Node2)

2. Use the context as the relationship type, e.g.

(:Node1) - [:XXX] -> (:Node2)

3. Add an intermediate node in between, e.g.

(:Node1) - [:BELONGS_TO] -> (:Context {name: 'XXX'}) - [:BELONGS_TO] -> (:Node2)
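As a rough sketch of how the three options differ at query time, the Python snippet below builds the Cypher text for filtering by one context under each model. The function names are hypothetical, and it assumes a Neo4j version where a relationship type cannot be passed as a query parameter. Under that assumption, option 2 has to splice the context into the query text (quoted with backticks), while options 1 and 3 can use a `$context` parameter.

```python
# Hypothetical helpers for illustration only. The labels Node1/Node2/Context
# and the BELONGS_TO type come from the question; everything else is assumed.


def escape_rel_type(name: str) -> str:
    """Quote a relationship type with backticks, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def query_with_property() -> str:
    # Option 1: the context is a relationship property, supplied as $context.
    return (
        "MATCH (a:Node1)-[r:BELONGS_TO]->(b:Node2) "
        "WHERE r.context = $context "
        "RETURN a, b"
    )


def query_with_type(context: str) -> str:
    # Option 2: the context is the relationship type itself. Assumption: the
    # type cannot be a parameter, so it becomes part of the query text.
    return f"MATCH (a:Node1)-[:{escape_rel_type(context)}]->(b:Node2) RETURN a, b"


def query_with_node() -> str:
    # Option 3: the context is an intermediate node, matched via $context.
    return (
        "MATCH (a:Node1)-[:BELONGS_TO]->(c:Context {name: $context})"
        "-[:BELONGS_TO]->(b:Node2) "
        "RETURN a, b"
    )


print(query_with_property())
print(query_with_type("XXX"))  # MATCH (a:Node1)-[:`XXX`]->(b:Node2) RETURN a, b
print(query_with_node())
```

With option 2, each context produces a different query string, whereas options 1 and 3 reuse one parameterized query for every context.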

The estimated number of different contexts is around 100 thousand.

We read that the 1st solution is not optimal because properties of relationships cannot be indexed.

We suspect that it might be better/faster to have intermediate nodes than to have 100 thousand relationship types. On the other hand, solution 2 seems easier to grasp and maintain.

To sum up, which solution is the way to go if we don't want to deteriorate the performance of queries?

MuddyBootsCode · ‎01-09-2020

Welcome to the community, Michael. You're correct that method 1 is not optimal because you're not able to index the relationship, although if you have the IDs of Node1 and Node2 available you can still do it pretty quickly. The most commonly suggested way to accomplish what you're after is method 3, with a node representing the context included in the relationship. However, just to muddle up the explanation a bit, if your context is only one property, then you could easily represent it by the relationship joining the nodes, as in method 2.

So the short answer is: if you can model the context with a relationship, choose method 2. If you need more information from the context, use method 3.

MuddyBootsCode · ‎01-09-2020

To clarify, judging from the example you provided:

Method 2 would probably look something like:

(:Node1)-[:OWNS]->(:Node2)
(:Node1)-[:LEASES]->(:Node2)
(:Node1)-[:RENTS]->(:Node2)
(:Node1)-[:SHARES]->(:Node2)

etc.

michalkomorowsk · ‎01-09-2020

@MuddyBootsCode, thanks for the quick response.

Yes, in our case the context is just one property. However, this property is a kind of identifier, so method 2 will look as follows:

(:Node1)-[:ContextId1]->(:Node2)
(:Node1)-[:ContextId2]->(:Node2)
(:Node1)-[:ContextId3]->(:Node2)
...
(:Node1)-[:ContextIdN]->(:Node2)

Where N is around 100 thousand.

Do you still think that method 2 is OK in this case?

MuddyBootsCode · ‎01-09-2020

No problem, yes that should work the same way. However, you might want to double-check how many different relationship types your Neo4j instance allows. If I remember correctly, you can only have 65k different relationship types in an instance. So this might not work for your use case unless you simplify your relationships a bit. If the limit is a problem, method 3 would be the way to go.
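To make the arithmetic concrete, here is a minimal sketch that compares the estimated number of contexts with the limit recalled above. It assumes the "65k" figure means 2^16 = 65,536 relationship types. That constant and the function name are assumptions for illustration, so the actual limit for a given store format should be checked in the Neo4j documentation.

```python
# Assumption: the "65k" limit recalled in the reply corresponds to 2**16
# relationship types. Verify the real limit for your Neo4j store format.
ASSUMED_MAX_REL_TYPES = 2 ** 16  # 65_536


def fits_as_relationship_types(n_contexts: int,
                               limit: int = ASSUMED_MAX_REL_TYPES) -> bool:
    """Return True if every context could get its own relationship type."""
    return n_contexts <= limit


estimated_contexts = 100_000  # "around 100 thousand" from the question

print(fits_as_relationship_types(estimated_contexts))  # False
print(estimated_contexts - ASSUMED_MAX_REL_TYPES)      # 34464 contexts over the assumed limit
```

Under that assumption, method 2 would run out of relationship types well before all contexts are created. The types already in use in a database can be listed with `CALL db.relationshipTypes()`.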

What is a preferable way to model a context of relationships?