Neo4j

jonaslm899 · ‎06-08-2020

Hi everyone,

I believe that reusability in a graph structure should promote uniqueness for attributes (at least in theory), but I'm only taking my first steps with Neo4j.

So I was wondering whether more experienced developers could tell me about the pros and cons of using nodes instead of property keys, and putting the values (and any units) on the relationship that ties them together?

E.g.:
(:Class {name:"water"})-[:Relation {val:"138",unit:"C"}]->(:Attr {name:"temperature"})

Well of course any attribute is (or can be) theoretically a class. What I really find interesting in this approach:

classes are purely modular
relations are instances (although you could add an id property to get an object)
it feels more natural to keep the business data in the edges and empty the nodes
you can easily find any object consuming an attribute using the capabilities of graph technology
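
A minimal Cypher sketch of what this model could look like, reusing the labels from the example above. The data is purely illustrative, and the semicolon-separated statements assume a client that accepts multiple statements (such as cypher-shell).

```cypher
// Illustrative data only: one shared attribute node, one "class" node,
// and the value/unit stored on the relationship between them.
MERGE (t:Attr {name: "temperature"})
MERGE (w:Class {name: "water"})
MERGE (w)-[:Relation {val: "138", unit: "C"}]->(t);

// Find every node that "consumes" the temperature attribute,
// together with the value and unit held on the relationship.
MATCH (c:Class)-[r:Relation]->(:Attr {name: "temperature"})
RETURN c.name AS class, r.val AS value, r.unit AS unit;
```

Using MERGE rather than CREATE for the :Attr node is what keeps the attribute unique, which is the reusability argument made above.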

I would be delighted to get theoretical and technical feedback on this approach, please.

pingelsan · ‎06-09-2020

Hi Jonas,
interesting post and indeed very similar to my question from yesterday.
What do you mean by "modular", and why are you talking about classes?
In a labeled property graph, there are no classes (per se). There are nodes with one or more labels, and these labels correspond to classes in just one respect: they make it possible to refer to exactly the group of nodes with that label, as if they were "instances of a class", but the possible properties of those nodes don't depend in the least on the label. I think it's better to avoid talking about classes for that reason.
Technically it's perfectly valid to put values (and units) as properties on an edge that links to a node with the "meaning" of this pair. However, this way you end up with "super nodes" that have a very large number of incoming links, like (:Attr {name:"temperature"}), which may be problematic for querying and visualization.
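
As a rough way to check whether this is already happening in a given graph, one could count the incoming relationships per attribute node. This is only a sketch, assuming the :Attr label from the example above; the LIMIT is arbitrary.

```cypher
// List the attribute nodes with the most incoming relationships.
// Nodes at the top of this list are the candidates for "super nodes".
MATCH (a:Attr)<-[r]-()
RETURN a.name AS attribute, count(r) AS incoming
ORDER BY incoming DESC
LIMIT 10;
```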
In "polygon", an ancient graph-based information system from the late 1990s, the modeling went like this:
(:FluidInContainer {name:"My cup of hot water"})-[:HAS_TEMPERATURE]->(:ValueNode {value:25,unit:"°C"})
This is something I had in mind in my other post. This has the charming side effect that you have a kind of reverse index right there inside the graph. Example:
MATCH (p:Person)-[:HAS_GIVENNAME]->(v:ValueNode) WHERE v.value = "Christoph" RETURN p
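
To illustrate the "reverse index" effect, here is a small sketch of how such data might be loaded. The `id` property on :Person is a hypothetical key added only for this example. The important part is that the value node is merged, so every person with the same given name points at the same :ValueNode.

```cypher
// One shared value node per distinct given name (MERGE, not CREATE).
MERGE (v:ValueNode {value: "Christoph"})
// Two hypothetical persons, identified by an assumed `id` property.
MERGE (p1:Person {id: 1})
MERGE (p2:Person {id: 2})
// Both persons link to the same value node.
MERGE (p1)-[:HAS_GIVENNAME]->(v)
MERGE (p2)-[:HAS_GIVENNAME]->(v);
```

With this data in place, the query above returns both persons by traversing from the single "Christoph" value node.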
But frankly I myself am still waiting for expert input on the subject.
Where are the pro modelers?
best regards,
Christoph

soham_dhodapka1 · ‎06-09-2020

Let's take a step-by-step approach to answering this question.
Properties on a Node can be indexed, whereas properties on a relationship cannot be indexed yet.
For example, having dates on a relationship as a property could help when filtering events.
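
A hedged sketch of both points: the index statement uses the `CREATE INDEX ... FOR ... ON` form, which assumes Neo4j 4.x or later, and the index name `person_name`, the :PURCHASED relationship, and its `date` property are assumptions made up for this illustration.

```cypher
// Index on a node property (Neo4j 4.x+ syntax; the index name is arbitrary).
CREATE INDEX person_name FOR (p:Person) ON (p.name);

// The node lookup can use the index; the date on the relationship
// is then used only to filter the events found by traversal.
MATCH (p:Person {name: "Christoph"})-[r:PURCHASED]->(i:Item)
WHERE r.date >= date("2020-01-01")
RETURN i, r.date AS purchasedOn;
```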
Next, as @pingelsan said, having units of measurement like °C or °F as nodes could lead to extremely dense nodes.

MATCH (p:Person)-[:HAS_GIVENNAME]->(v:ValueNode) WHERE v.value="Christoph"

can be useful in cases where some kind of entity resolution or de-duplication is taking place, for instance where a person with the same given name and the same SSN is creating multiple accounts.
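
A possible shape for such a de-duplication query, assuming the SSN is modeled the same way as the given name. The HAS_SSN relationship type is hypothetical and does not appear elsewhere in this thread.

```cypher
// Group persons by the (given name, SSN) value nodes they share
// and keep only the groups with more than one person.
MATCH (p:Person)-[:HAS_GIVENNAME]->(n:ValueNode),
      (p)-[:HAS_SSN]->(s:ValueNode)
WITH n, s, collect(p) AS persons
WHERE size(persons) > 1
RETURN n.value AS givenName, s.value AS ssn, persons;
```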

MATCH (p:Person{name:"Christoph"}) -[:PURCHASED]-> (i:Item)

Here, having a HAS_GIVENNAME relationship in between Person and Item might not make sense, since that is probably not the use case we'll be looking at.

It will ultimately boil down to what you want to achieve with the data in your graph and the kind of queries you wish to run.

Attributes as nodes?