11-08-2019 11:12 AM
We have a service that processes a potential "match" between a user (Profile) and a Project (a paid opportunity).
In the graph, we create a relationship with a score property.
The number of relationships created to a :Project node could be 20k or more.
Some stats about the data
Right now we have the following model:
MATCH (p:Project{id:""})<-[r:MATCHES]-(pm:ProjectMatch)
MATCH (profile:Profile)-[:MATCH_PROJECT]->(pm:ProjectMatch)
RETURN profile
order by r.score desc
r contains the score between the project and the profile
ProjectMatch is a node created per profile for each month and year, with these properties:
year: 2019
month: 8
profileId: ""
We've experienced slow queries, e.g. when getting all matches ordered by score, which made us rethink the model and consider simplifying it to just:
MATCH (p:Project{id:""})<-[r:MATCHES]-(pm:ProjectMatch)
MATCH (profile:Profile)-[:MATCH_PROJECT]->(pm:ProjectMatch)
RETURN profile
order by r.score desc
I am finding the same number of dbhits or very similar between the 2 models. Any advice?
Which data model is "better" or is supposed to perform better? Is it scalable in the long run?
Query we run
WHERE NOT ((profile)-[:HAS_EMAILS]->(:Emails)-[:SENT]->(:Email{projectId: ""}))
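For readers following along, here is a sketch of how that WHERE NOT fragment might attach to the model query shown earlier. How the original query was actually assembled is an assumption, and $projectId is a hypothetical parameter standing in for the empty id strings above.

```cypher
// Sketch only: combines the model query from this post with the exclusion
// fragment. $projectId is a hypothetical parameter, not from the original.
MATCH (p:Project {id: $projectId})<-[r:MATCHES]-(pm:ProjectMatch)
MATCH (profile:Profile)-[:MATCH_PROJECT]->(pm)
// Skip profiles that have already been sent an email for this project.
WHERE NOT (profile)-[:HAS_EMAILS]->(:Emails)-[:SENT]->(:Email {projectId: $projectId})
RETURN profile
ORDER BY r.score DESC
```

The pattern predicate is evaluated once per candidate profile, so with 20k or more matches it may account for a noticeable share of the db hits.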
11-11-2019 06:03 AM
You may find that moving some of the properties from your ProjectMatch node into the relationship from Profile to ProjectMatch, and using them in your queries, helps speed things up.
For example, let's say we take the properties year and month from ProjectMatch, and have something like:
(Profile)-[:MATCH_PROJECT_2019_11]->(ProjectMatch). If you use that specific relationship type in the query, you reduce the number of relationships that need to be traversed to get to ProjectMatch and Project. Of course, this only works if you can be specific with dates, but it gives you an idea of how more fine-grained relationship types can speed up queries.
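As a rough illustration, the query from the question could look like this with a month-specific relationship type. This is a sketch that assumes MATCH_PROJECT_2019_11 relationships have actually been created in the graph; $projectId is a hypothetical parameter.

```cypher
// Sketch only: assumes month-specific relationship types such as
// MATCH_PROJECT_2019_11 exist. $projectId is a hypothetical parameter.
MATCH (p:Project {id: $projectId})<-[r:MATCHES]-(pm:ProjectMatch)
// Only relationships of this one type are expanded from pm.
MATCH (profile:Profile)-[:MATCH_PROJECT_2019_11]->(pm)
RETURN profile
ORDER BY r.score DESC
```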
I would recommend you have a look at the following for some modelling tips and tricks:
11-12-2019 07:12 AM
Thanks @lju
That wouldn't work for us. For example:
What if you're on November 1st and you need matches from last month (a day ago)? You would need to dynamically generate the two relationship type names:
MATCH_PROJECT_2019_10|MATCH_PROJECT_2019_11
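In a full query, that alternation would sit inside the relationship pattern, roughly as sketched below. The two type names are the hypothetical month-specific types from the suggestion above, and $projectId is an assumed parameter. Relationship types cannot be passed as ordinary query parameters, so the application would have to build this part of the query string itself.

```cypher
// Sketch only: the type names are hypothetical and would be generated by the
// application for the months in the requested window.
MATCH (p:Project {id: $projectId})<-[r:MATCHES]-(pm:ProjectMatch)
MATCH (profile:Profile)-[:MATCH_PROJECT_2019_10|MATCH_PROJECT_2019_11]->(pm)
RETURN profile
ORDER BY r.score DESC
```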
Does the current modelling make sense though?
Should we just keep the structure and filter ProjectMatch by year and month?
(Profile)-[:MATCH_PROJECT]->(pm:ProjectMatch{year: 2019})
WHERE pm.month = 10 OR pm.month = 11
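Put together with the rest of the model, the property-filter variant might look like the sketch below. It keeps the generic MATCH_PROJECT type and filters on the ProjectMatch properties; $projectId is a hypothetical parameter and the literal year and months are just the values from this example.

```cypher
// Sketch only: keeps the existing model and filters ProjectMatch by its
// year and month properties. $projectId is a hypothetical parameter.
MATCH (p:Project {id: $projectId})<-[r:MATCHES]-(pm:ProjectMatch {year: 2019})
WHERE pm.month IN [10, 11]
MATCH (profile:Profile)-[:MATCH_PROJECT]->(pm)
RETURN profile
ORDER BY r.score DESC
```

A window that crosses a year boundary (December to January) cannot be expressed with a single year filter. One option is to compare the pairs explicitly, e.g. WHERE (pm.year = 2019 AND pm.month = 12) OR (pm.year = 2020 AND pm.month = 1).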