Neo4j

ovidiu · ‎04-17-2020

Hello everyone, I would like to model our graph to support content in multiple languages. The problem is that we do not know in advance which properties will be filled in, and the number and choice of languages will also vary from property to property. Can you suggest a modelling example for content in multiple languages?

For example, we have (TERM)-[:defines]->(RISK):
The TERM entity has properties such as CATEGORY, TYPE, DEFINITION, etc., which could be in English, French, Spanish, German, or another language. The RISK entity is filled in by the user with properties such as VALUE, DESCRIPTION, etc., in whichever language that user works in.

How should we best proceed with it?
Thanks in advance,
Ovidiu

MintyOrb · ‎05-02-2020

In my own project I modeled this as:

(term)-[:has_translation]->(properties)

with the has_translation relationship carrying a languageCode property.
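A minimal sketch of that first model, in case it helps. The `Term` and `TermProperties` labels, the `id` key, and the sample values are assumptions made up for illustration; only `has_translation` and `languageCode` come from my description above.

```cypher
// Hypothetical labels and sample values, not taken from a real project.
CREATE (t:Term {id: 'term-1'})
CREATE (en:TermProperties {definition: 'Possibility of loss', type: 'operational'})
CREATE (fr:TermProperties {definition: 'Possibilité de perte', type: 'opérationnel'})
// One relationship per translation; the language lives on the relationship.
CREATE (t)-[:has_translation {languageCode: 'en'}]->(en)
CREATE (t)-[:has_translation {languageCode: 'fr'}]->(fr);

// Read the translated properties for the language passed in as $lang.
MATCH (t:Term {id: 'term-1'})-[r:has_translation]->(p:TermProperties)
WHERE r.languageCode = $lang
RETURN p.definition AS definition, p.type AS type;
```

Because each translation is its own node, a language that only has some properties filled in simply omits the others.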

While this worked, it may not be ideal. Someone with much more experience than me recently suggested the following:

(term)-[:property_translations]->(language { languageCode: 'en' })

i.e. putting the property values on the edge and having a single node for each supported language.
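Roughly, this second model could look like the sketch below. The `Term` and `Language` labels, the `id` key, and the `definition` property are assumptions for illustration, and the fallback to English is just one possible way to deal with properties that are not filled in for every language.

```cypher
// Hypothetical labels and values; one shared node per language.
MERGE (en:Language {languageCode: 'en'})
MERGE (fr:Language {languageCode: 'fr'})
MERGE (t:Term {id: 'term-1'})
// The translated values are stored on the relationship itself.
MERGE (t)-[e:property_translations]->(en)
  SET e.definition = 'Possibility of loss'
MERGE (t)-[f:property_translations]->(fr)
  SET f.definition = 'Possibilité de perte';

// Read the definition in $lang, falling back to English when it is missing.
MATCH (t:Term {id: 'term-1'})
OPTIONAL MATCH (t)-[wanted:property_translations]->(:Language {languageCode: $lang})
OPTIONAL MATCH (t)-[fallback:property_translations]->(:Language {languageCode: 'en'})
RETURN coalesce(wanted.definition, fallback.definition) AS definition;
```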

Depending on the use case and specifics of your graph, it sounds like category and type could also be modeled as separate nodes with relationships rather than as property values.
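As a rough illustration of category as its own node (the `Category` label, the `key` property, and the `in_category` relationship type are all hypothetical names):

```cypher
// Hypothetical names; the category is shared by every term that points to it.
MERGE (c:Category {key: 'finance'})
MERGE (t:Term {id: 'term-1'})
MERGE (t)-[:in_category]->(c);

// All terms in a category become a simple traversal.
MATCH (t:Term)-[:in_category]->(:Category {key: 'finance'})
RETURN t.id AS termId;
```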

ovidiu · ‎05-02-2020

Thanks @MintyOrb for your input!

You are right, category and type could also be separate nodes, and we are indeed going through the entire graph to decide where it makes sense to create additional nodes and where to stick with properties.

The idea of putting the properties on the edge is indeed a totally different way of viewing the topic, but I wonder what the benefit is, given the inflation of vertices. And what should the language node contain in this case?

MintyOrb · ‎05-02-2020

Unless you put all of the translations on the node itself (e.g. enDescription, esDescription, etc. as properties), you're going to see a proliferation of nodes and relationships either way.
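For comparison, the everything-on-the-node variant might be queried like this. This is only a sketch: the `Term` label and `id` key are assumptions, and it relies on Cypher's dynamic property lookup (`node[key]`), which returns null when the property does not exist.

```cypher
// Hypothetical node with one property per language.
CREATE (:Term {id: 'term-1',
               enDescription: 'Possibility of loss',
               esDescription: 'Posibilidad de pérdida'});

// Build the property name from the $lang parameter, e.g. 'es' + 'Description'.
// coalesce falls back to the English value when the lookup yields null.
MATCH (t:Term {id: 'term-1'})
RETURN coalesce(t[$lang + 'Description'], t.enDescription) AS description;
```

This keeps the graph small, but every new language adds another set of property names to every node type.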

My understanding is that if you're tempted to use arrays as a property, it's probably a good candidate to be broken off into relationships. When to use properties vs relationships is a matter of how you want to query the graph.

In the case of the second example I gave, the language node would just contain the language code as a property, e.g. languageCode: 'en'.
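A minimal sketch of what those language nodes could look like, assuming a `Language` label and a `Term` label with an `id` key (all hypothetical names apart from `languageCode` and `property_translations`):

```cypher
// One node per supported language, holding nothing but the code.
UNWIND ['en', 'fr', 'es', 'de'] AS code
MERGE (:Language {languageCode: code});

// The shared nodes make it easy to ask which languages a term already has.
MATCH (t:Term {id: 'term-1'})-[:property_translations]->(l:Language)
RETURN collect(l.languageCode) AS availableLanguages;
```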

Example of graph modelling for content in multiple languages