Neo4j

jkratz55 · ‎01-07-2021

Hello, I'm new to Neo4j and graph databases and was hoping to pick the Neo4j community's collective brain on best practices for using relationship properties vs node properties. I've included a diagram to hopefully help explain what I have in mind. What I am trying to achieve is to fetch all products in a given collection. A collection can have one or more products, and each product has a plan (effectively, when the product is available for sale) and pricing. The availability/planning and pricing vary by country, so a product can have many plans and many different prices. I will need to query the plan based on the floorPlan, startDate, and endDate properties. Similarly, for pricing I need to query on the country/currency. Where I am racking my brain a little is understanding whether it is better to have those as relationship properties or node properties: which is more optimal? I was thinking that if they are relationship properties, maybe that would prevent the relationship from being traversed in the first place when it doesn't need to be. Any feedback or thoughts are appreciated.

clem · ‎01-07-2021

To add more things to think about...

In some cases it may be better to use a Label instead of a property. It's often faster to search by Labels. What a lot of people don't know is that a Node can have multiple Labels. However, this probably won't make sense for something fine-grained like Price.

So, in the Movie DB, you could (hypothetically) have:

CREATE (p:Person:Director:Actor {name:"Clint Eastwood"})

BUT... on the other hand, if an Actor has never directed before and you add their director info, you have to remember to also add the Director label to their node.
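
As a minimal sketch of that extra step, assuming the hypothetical Clint Eastwood node above had originally been created without the Director label:

```cypher
// Find the existing node and attach the additional label.
MATCH (p:Person {name: "Clint Eastwood"})
SET p:Director
// labels(p) lists every label now on the node.
RETURN p.name, labels(p);
```

SET with a label is idempotent, so running it on a node that already has the label does no harm.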

Relationships can have only one type (the relationship equivalent of a Label).

I think a lot of this hinges on what sorts of queries you want to make. Often this is hard to predict ahead of time, which is why Neo4j is great: you have a lot of flexibility in adding to or modifying your DB.

It's something you have to experiment with. As you gain experience, you'll be able to make better decisions.

dominicvivek06 · ‎01-07-2021

Hi @jkratz55, welcome to the Neo4j community.
My 2 cents: your data model is both one-to-many and time-based (kinda).

one-to-many - collection -> product
A collection has 0 to n products. Since a product has various attributes, and the combination of those attributes is unique, it's best to keep them as properties of a node.

time-based (kinda) - product -> plan
From your model, you have startDate and endDate.
Question: if you don't need the plan after its endDate (expired), you can add a boolean attribute such as isActive = false. Filtering on this predicate should have a direct, positive impact on performance. Alternatively, you can store the status (e.g. active) as a property on HAS_PLAN. The two suggestions are mutually exclusive, so you would pick one.

product -> price
This is also one-to-many.

Are you going to query the prices for all countries?
The price for a country can change over time (I am guessing that would be the case). So, again, test the two mutually exclusive options: having the property on the node or on the relationship.

jkratz55 · ‎01-08-2021

Thanks for the reply and feedback!

Regarding plans, there are a couple of reasons why I don't believe I can make an isActive property on the Product node:

Product plans are country-specific. As an example, a product may be in plan for the US from Jan 1 to Dec 31 and not in plan at all for the UK, or vice versa. It is also possible that a product is in plan in multiple countries but with different start and end dates. So when we want to query all the products in a collection, we need to take the country, startDate, and endDate into account. The question I need Neo4j to answer with a query is: what are all the products in a given collection that are in plan for the country and current datetime, and what is the price of each product for that country?
We support a timeboxing feature, where the datetime can be overridden to the past or future. This is used by people setting up the content to preview the product catalog and content for the future.

Because we need to ask if a product is in plan by country, startDate, and endDate, I wonder if those properties are best on the Plan node, or on the HAS_PLAN relationship?
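
To make the two options concrete, here is a rough sketch of that query in both shapes. Only Product, Plan, HAS_PLAN, country, startDate, and endDate come from the description above; the :Collection and :Price labels, the CONTAINS and HAS_PRICE relationship types, the name/amount/currency properties, and the inclusive date bounds are all assumptions for illustration. Passing the datetime in as the $now parameter is one way the timeboxing override could be supported.

```cypher
// Variant A: filter properties live on the Plan and Price nodes.
MATCH (c:Collection {name: $collection})-[:CONTAINS]->(p:Product)
MATCH (p)-[:HAS_PLAN]->(plan:Plan {country: $country})
WHERE plan.startDate <= $now AND $now <= plan.endDate
MATCH (p)-[:HAS_PRICE]->(price:Price {country: $country})
RETURN p, price.amount AS amount, price.currency AS currency;

// Variant B: the same filter expressed on the HAS_PLAN relationship.
MATCH (c:Collection {name: $collection})-[:CONTAINS]->(p:Product)
MATCH (p)-[hp:HAS_PLAN {country: $country}]->(:Plan)
WHERE hp.startDate <= $now AND $now <= hp.endDate
MATCH (p)-[:HAS_PRICE]->(price:Price {country: $country})
RETURN p, price.amount AS amount, price.currency AS currency;
```

Running each variant with PROFILE against representative data would show which shape does less work for this access pattern.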

dana_canzano · ‎01-08-2021

Although I haven't come up to full speed on all of the above, one can index a node property with a simple

create index on :<node label>(<property>);

but, as of this writing, not so on a relationship property.

So if your queries include a lot of

WHERE <property>=?????

then it may be better to make it a node property and thus get the benefit of the index.
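
For reference, Neo4j 4.x also accepts a newer, named form of the same statement. The sketch below assumes a :Plan label carrying the country, startDate, and endDate properties from the question; the index names are made up for illustration.

```cypher
// Single-property index on the assumed :Plan(country) property.
CREATE INDEX plan_country FOR (pl:Plan) ON (pl.country);

// Composite index over the properties the plan lookup filters on.
CREATE INDEX plan_lookup FOR (pl:Plan) ON (pl.country, pl.startDate, pl.endDate);
```

Whether the planner actually picks either index for a given query is best checked with EXPLAIN or PROFILE.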

Use cases for node properties vs relationship properties