Neo4j

t_j_schonborn · 10-19-2021

Dear Neo4j community,

I'm very new to graph databases and to big data analysis in general. I'm working on a feature that visualizes different patient trajectories as a tree. A patient's trajectory is basically a sequence of typed events that have been performed, for example (consultation)-[next]->(surgery)-[next]->(consultation). I want to combine many of these simple paths into a tree by merging the paths that emanate from a common start point. In Gremlin, the .tree() step does this. My trees will be very large: the average patient trajectory has about 300 steps, and I have a few thousand patients per diagnosis.
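To make the aggregation concrete, here is a minimal sketch in plain Python (not Gremlin or Cypher) of merging event sequences into a tree of shared prefixes. The function name `build_tree`, the dictionary layout, and the sample trajectories are assumptions made up for illustration, not part of any library or real dataset.

```python
from typing import Dict, Iterable, List


def build_tree(trajectories: Iterable[List[str]]) -> Dict:
    """Merge event sequences into a prefix tree with per-node patient counts."""
    root = {"event": "ROOT", "count": 0, "children": {}}
    for events in trajectories:
        node = root
        node["count"] += 1
        # Iterative walk, so the path length is not bounded by recursion depth.
        for event in events:
            child = node["children"].get(event)
            if child is None:
                child = {"event": event, "count": 0, "children": {}}
                node["children"][event] = child
            child["count"] += 1
            node = child
    return root


# Hypothetical sample data for illustration only.
trajectories = [
    ["consultation", "surgery", "consultation"],
    ["consultation", "surgery", "discharge"],
    ["consultation", "medication"],
]
tree = build_tree(trajectories)
```

Each node's `count` is the number of trajectories that pass through it, which is the kind of weight a tree visualization would typically need.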

My initial approach precomputed the trees for each diagnosis, which worked fine. However, I want to add more dynamic filtering: instead of showing a tree for all COPD patients, I want to show, for example, only female patients who also have diabetes and are under 40. This is why I switched to a graph database.
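The filtering requirement could be sketched like this in plain Python, assuming the patient attributes and trajectories fit in memory. The `Patient` class, its field names, and the sample records are hypothetical. Every distinct prefix of a trajectory corresponds to one tree node, so counting prefixes after filtering gives the node weights of the filtered tree.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Patient:
    sex: str
    age: int
    diagnoses: frozenset
    events: Tuple[str, ...]


def prefix_counts(patients: List[Patient]) -> Counter:
    """Count how many patients share each trajectory prefix (one prefix = one tree node)."""
    counts = Counter()
    for p in patients:
        for i in range(1, len(p.events) + 1):
            counts[p.events[:i]] += 1
    return counts


# Hypothetical sample records for illustration only.
patients = [
    Patient("F", 35, frozenset({"COPD", "diabetes"}), ("consultation", "surgery")),
    Patient("F", 38, frozenset({"COPD", "diabetes"}), ("consultation", "medication")),
    Patient("M", 35, frozenset({"COPD", "diabetes"}), ("consultation", "surgery")),
    Patient("F", 62, frozenset({"COPD", "diabetes"}), ("consultation", "surgery")),
    Patient("F", 30, frozenset({"COPD"}), ("consultation",)),
]

# Female, under 40, with both COPD and diabetes.
selected = [
    p for p in patients
    if p.sex == "F" and p.age < 40 and {"COPD", "diabetes"} <= p.diagnoses
]
counts = prefix_counts(selected)
```

Copying every prefix is quadratic in the trajectory length, so for 300-step paths a nested-node structure would be the more economical choice; the flat counter is only meant to show the filter-then-aggregate order.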

I built a prototype on CosmosDB, but after finally setting everything up I ran into a limit: the maximum .repeat() depth is 32, far below the path lengths I have.

Therefore, I'm now looking for different ways to solve this. My first question: do you think Neo4j can handle this operation without running into limits? I don't expect the trees to be built instantly, but it should definitely take less than 10 seconds.
I'm also wondering whether a graph database is the right approach at all. How would you approach this problem? I'm extremely thankful for any advice or ideas. As I said, I'm very new to this topic, but I'm eager to learn!

Advice building trees from very long paths