Neo4j

matthias_richte · ‎01-14-2020

I am trying to model complex product descriptions and got a bit stuck, so I could use a push in the right direction.

What do I mean by complex product description? The data I want to model is modelled according to the ISO 13584 / IEC 61360 dictionary meta model. This means I have instances of particular entities that together make up a structured model of classes of products and their classification.

The classification part was easily solved: it is a tree of (:CategorizationClass) nodes with [:IS_A] hierarchical relationships between them. When I want to classify a product, I can link it to the best matching (:CategorizationClass) and am done. Actually it feels sort of wrong to directly connect "instances" with the "model" in one graph without some separation layer or mechanism, but I can stand this pain for the time being.

The characterization part was much more difficult to handle. To make it easier to understand, I reduce the model for now to this: a (:CharacterizationClass) has [:DESCRIBED_BY]-> relationships with (:Characteristic) nodes and can have a [:CLASSIFIED_AS]-> relationship to a (:CategorizationClass). Some of the (:Characteristic) nodes in turn have a [:REFERENCED_CLASS]-> relationship with another (:CharacterizationClass). Through these two relationships we get an arbitrarily deep tree-ish graph which ends in a lot of (:Characteristic) nodes or "contexts". I say tree-ish because there is a lot of re-use of elements inside the graph, so in most cases it isn't really a tree anymore.

Now I have to model my products and somehow need to relate elegantly to paths to a (:Characteristic). Such paths start at a (:CharacterizationClass)-[:CLASSIFIED_AS]->(:CategorizationClass) and end in (:Characteristic) leaf nodes that have no further [:REFERENCED_CLASS]. The path context is where the values of the characteristic need to be recorded for the product description, but I have not yet found a graphy and elegant way to do this properly. So ideas or pointers for reading would be very much appreciated.

To give an example with more real-world data: assume we have a (Table:CharacterizationClass). It is [:DESCRIBED_BY]->(board:Characteristic)-[:REFERENCED_CLASS]->(board:CharacterizationClass)-[:DESCRIBED_BY]->(length:Characteristic) and [:DESCRIBED_BY]->(leg:Characteristic)-[:REFERENCED_CLASS]->(leg:CharacterizationClass)-[:DESCRIBED_BY]->(length:Characteristic). I get values for the length of the leg and for the length of the board, and I need to store them separately and distinctly.
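The table example can be sketched in plain Python to make the "path to a characteristic" idea concrete. This is only an illustration under assumptions: the two dictionaries stand in for the [:DESCRIBED_BY] and [:REFERENCED_CLASS] relationships, and the function name `leaf_paths` and the `max_depth` guard are made up for this sketch (the guard is there because the real model may contain cycles).

```python
# Hypothetical in-memory stand-in for the model graph from the example.
# class name -> characteristics it is DESCRIBED_BY
DESCRIBED_BY = {
    "Table": ["board", "leg"],
    "Board": ["length"],
    "Leg": ["length"],
}
# characteristic name -> class it points to via REFERENCED_CLASS
REFERENCED_CLASS = {
    "board": "Board",
    "leg": "Leg",
}


def leaf_paths(cls, prefix=(), max_depth=10):
    """Yield each path of characteristic names from cls down to a leaf characteristic."""
    if len(prefix) >= max_depth:
        return  # stop unrolling; the model may be cyclic
    for characteristic in DESCRIBED_BY.get(cls, []):
        path = prefix + (characteristic,)
        target = REFERENCED_CLASS.get(characteristic)
        if target is None:
            yield path  # no further REFERENCED_CLASS: this is a leaf
        else:
            yield from leaf_paths(target, path, max_depth)


paths = list(leaf_paths("Table"))
print(paths)
```

The same `length` characteristic shows up at the end of two different paths, which is exactly why a value cannot be attached to the characteristic alone.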

What I could of course do is create properties on my (:ProductDescription) that literally correspond to the path I need to traverse. While this strategy is proven to work in other representations, it feels more forced to fit than convenient or graphy.

What I could also do is replicate the model structure in my (:ProductDescription) and maybe link back to the model at each level, but this still feels inelegant and inconvenient, though, at least on paper, somewhat graphy.

david_allen · ‎01-14-2020

I'd like to answer here but this question is a bit too abstract to provide real help with. Some things I'm wondering:

What questions do you want to ask of whatever your resulting model is? How does this model help you?
Can you give more concrete examples of products you're trying to describe and what kinds of characteristics they'd have? This would intersect with the answer to the first question. I found the table example a little bit confusing because I can't see how to use the length of a plank and the length of a leg as two separate facets, although sure, those things would be valid and measurable.

What I could of course do is create properties on my (:ProductDescription) that literally correspond to the path I need to traverse. While this strategy is proven to work in other representations, it feels more forced to fit than convenient or graphy.

Not feeling like I totally understand what you're saying, but this seems like the wrong approach. Graphs are meant to be traversed by relationship type and so forth. If you're putting a property on a node to inform what traversal to do, something is definitely awkward and may not be right about your model.

matthias_richte · ‎01-15-2020

Thank you for your reply.

What questions do you want to ask of whatever your resulting model is?

The obvious questions would be related to matches (retrieve data on products that fulfill certain requirements) and to a further abstraction layer (product individuals such as equipment, digital twins, ...). One of the first questions I want to answer is whether a graph can represent such product descriptions well enough (I already sort of assume yes) and if so, how.

How does this model help you?

As far as the graph representation is concerned, I don't understand this fully yet; I'm currently trying to explore possibilities. Regarding the underlying meta model, well, it is a standardised knowledge source and I hear a lot about data integration and knowledge graphs being the way, so I try to learn and tap into this source. A good question is whether enough semantics can be kept in the property graph to support potential use cases that come up well enough. I need to proceed further to soundly answer this question and then bring up the real questions that will appear.

Can you give more concrete examples of products you're trying to describe and what kinds of characteristics they'd have?

I tried to construct a minimal example where my problem already occurs because the real data is much more complex. Full examples would be contained in standards such as the ones published in the IEC CDD or eCl@ss advanced. Fully describing e.g. a single measurement device from these reference dictionaries could easily fill a couple of hundred pages, and even more building patterns would be involved. I wanted to avoid that, as essentially it can be boiled down to contextual paths to characteristics.

This would intersect with the answer to the first question. I found the table example a little bit confusing because I can't see how to use the length of a plank and the length of a leg as two separate facets, although sure, those things would be valid and measurable.

Maybe I oversimplified the model but overdid the language. The model says that a table consists of a board and legs, and there is a characteristic "length" that describes each of them. So when I have the graph from above, I can't just speak about a length; I need to speak about the length of a leg of the table or the length of the board of the table. When I want to describe a table product, I need to say that there is e.g. a specified value of 100 in the unit cm for the board and a specified value of 70 in the unit cm for the leg, and I need to store this in a way that leaves each value in its context. (This simplifies away that a table can have a variable number of legs which could each have a different length, and that tolerances on these values may be specified, as well as conditions on the measurement or on the value.)

Further thoughts have led me to the conclusion that I will most probably need to follow the second of the initial options: either rebuild the structure of the model in the product description layer, or create intermediate nodes for each path to a characteristic whenever I need them (I can't do this in advance because in some cases there is an infinite number of possibilities which must not be unrolled without further knowledge).
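A minimal Python sketch of the "intermediate nodes created whenever I need them" idea, using the 100 cm and 70 cm values from the table example. Everything here is an assumption for illustration: `ContextNode`, `child` and `record` are hypothetical names, and `model_element` is just a string standing in for a link back to the model node.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ContextNode:
    """One node of a product description; points back to a model element by name."""
    model_element: str
    parent: Optional["ContextNode"] = field(default=None, repr=False, compare=False)
    children: dict = field(default_factory=dict)
    value: Optional[float] = None
    unit: Optional[str] = None

    def child(self, characteristic):
        """Return the child context for a characteristic, creating it on first use."""
        if characteristic not in self.children:
            self.children[characteristic] = ContextNode(characteristic, parent=self)
        return self.children[characteristic]


def record(root, path, value, unit):
    """Walk (and lazily create) the context nodes along path, then store the value."""
    node = root
    for characteristic in path:
        node = node.child(characteristic)
    node.value, node.unit = value, unit
    return node


# The root stands for the product description of one table.
table = ContextNode("Table")
record(table, ["board", "length"], 100, "cm")
record(table, ["leg", "length"], 70, "cm")
```

Only the paths that actually carry a value get nodes, so nothing has to be unrolled in advance, and the two `length` values live in separate contexts.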

Creating node properties out of the paths to the characteristics would indeed be a bad choice.

david_allen · ‎01-15-2020

Graphs are great for knowledge management. Let me first address this part:

A good question is whether enough semantics can be kept in the property graph to support potential use cases that come up well enough.

The answer to this is an unambiguous yes; property graphs are great at this. I've done several projects in semantics & knowledge graphs before, and by far the bigger challenge is being precise about your semantics. There's no issue whatsoever in the representation once you have that precision. But it is common to see that if the semantic description of a domain is a bit loose, no representation feels like it quite fits.

On this axis, a thing that occurs to me about the model you've presented is that it seems to be lacking the notion of a "type of measurement" (e.g. "length", or "density") and it also lacks the notion of a "metric" (that is, 30cm, or 20kg, or 5 degrees celsius). These are very distinct concepts from how you've represented "Characteristic", because the notion of "characteristic" will include lots of other concepts that aren't measurements or metrics (or are at least special cases of them). Examples of those might be categoricals like color or size -- or booleans like hazmat or not.

This domain modeling exercise is pretty huge actually -- and it sounds like maybe you want to derive that straight from your industry standard spec. (I'm not familiar at all with the IEC CDD or eCl@ss)

Back to your original question:

Now I have to model my products and somehow need to relate elegantly to paths to a (:Characteristic). Such paths start at a (:CharacterizationClass)-[:CLASSIFIED_AS]->(:CategorizationClass) and end in (:Characteristic) leaf nodes that have no further [:REFERENCED_CLASS]. The path context is where the values of the characteristic need to be recorded for the product description, but I have not yet found a graphy and elegant way to do this properly.

So it seems the seed of your answer is already in there. You have a thing (table) related to a characterization class (let's say "dining room object", whatever). That characterization class itself participates in a taxonomy. So "dining room object" is a subclass of "household object", whatever.

Dining room objects have to have certain characteristics, like length and weight. (Maybe we're a moving company, so physical dimensions & weight are what's important for our movers). So I think the modification is that you link the thing (Table) to the characteristics, and you do not link the characterization class to characteristics.

You can have a parallel taxonomy of characteristics. For example length is a subclass of physical measurement. (Width might be a different subclass of physical measurement). In this way, you could ask a question like, "Give me a list of all objects which have a physical measurement, denominated in cm, which exceeds some value". This gives you the list of objects which won't fit in the front door.
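A rough Python sketch of that kind of question, just to show the shape of the traversal. The taxonomy entries beyond length and width, the objects, and all the numbers are invented for illustration, and `is_subclass` and `objects_exceeding` are hypothetical helper names.

```python
# Hypothetical taxonomy of characteristics: child -> parent.
IS_A = {
    "length": "physical measurement",
    "width": "physical measurement",
    "color": "categorical",
}

# Hypothetical measurements linked to things: (thing, characteristic, value, unit).
MEASUREMENTS = [
    ("table", "length", 100, "cm"),
    ("table", "width", 60, "cm"),
    ("chair", "length", 45, "cm"),
    ("sofa", "length", 2.2, "m"),
]


def is_subclass(kind, ancestor):
    """Follow IS_A upwards from kind and report whether ancestor is reached."""
    while kind is not None:
        if kind == ancestor:
            return True
        kind = IS_A.get(kind)
    return False


def objects_exceeding(ancestor, unit, threshold):
    """Things with any measurement under ancestor, in unit, above threshold."""
    return sorted({
        thing
        for thing, kind, value, measured_unit in MEASUREMENTS
        if measured_unit == unit and value > threshold and is_subclass(kind, ancestor)
    })


too_big = objects_exceeding("physical measurement", "cm", 90)
print(too_big)
```

Note that the sofa is skipped only because its value is recorded in metres; a real model would need unit conversion rather than a plain unit match.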

matthias_richte · ‎01-16-2020

david.allen:

On this axis, a thing that occurs to me about the model you've presented is that it seems to be lacking the notion of a "type of measurement" (e.g. "length", or "density") and it also lacks the notion of a "metric" (that is, 30cm, or 20kg, or 5 degrees celsius). These are very distinct concepts from how you've represented "Characteristic", because the notion of "characteristic" will include lots of other concepts that aren't measurements or metrics (or are at least special cases of them). Examples of those might be categoricals like color or size -- or booleans like hazmat or not.

This domain modeling exercise is pretty huge actually -- and it sounds like maybe you want to derive that straight from your industry standard spec. (I'm not familiar at all with the IEC CDD or eCl@ss)

You are right, of course. This is an important part of the complete model but was not the focus of my question. We can expect the characteristics to actually be of certain data types which cover all you mentioned; there are also quite sophisticated unit and quantity systems available / included, so I don't worry about this.

Indeed it is big, therefore an automated conversion/derivation is the only feasible way for me. This has the benefit that I have to think first about what I will be doing before I do it.

Yes, I even have data on that from two sources: the quantities of the units of measure and a classification of characteristics (https://www.google.com/search?q=DET+CLASSIFICATION+site%3Acdd.iec.ch)

Managing complex structured product descriptions