cancel
Showing results for 
Search instead for 
Did you mean: 

Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.

Understanding performance of basic graph traversal

Hi everyone,

I'm new to Neo4j, but I have read the O'Reilly book and some of the Cypher manual. I am familiar with Graphs in general (from a mathematical perspective). I am getting confused with the basics of traversal performance in Neo4j.

Context

I am making a Knowledge Management system, consisting of Entities with Key/Value Properties. These Values are often going to be other Entities in the system.

For example, the user can store the Simpsons children, with Bart['sister'] = Lisa, and Maggie['brother'] = Bart etc;

@Bart
    sister:@Lisa
    sister:@Maggie

@Lisa
    brother:@Bart
    sister:@Maggie

@Maggie
    brother:@Bart
    sister:@Lisa

I need the Keys to be able arranged into a flexible, user-defined hierarchy.
So they can also say that sister and brother are sub-types of sibling.
Then, if they ask for Bart['sibling'] they should get [Maggie, Lisa], and Lisa['sibling'] should get [Maggie, Bart].

I also want to be able to get all Entities with specific Properties, so HAS('brother') would get [Maggie, Lisa], and HAS('sibling') would get them all (even though none directly has a sibling property).

My model

I am modelling this in Neo4j with nodes like (b:Entity {name:'Bart'}) and (s:Property {name:'sister'}). The inheritance is modeled as (p:Property {name:'brother'})-[:IS_A]->(p:Property {name:'sibling'}).
To model a property having a particular value, I introduce a Context node which connects up an Entity, Key, and Value;

(sister)-[:VALUE]->(context)-[:VALUE]->(maggie)
(context)-[:CONTEXT]->(lisa)

JSON export of this if I'm not being clear enough: records.txt (1.0 KB)

Performance

This encodes enough information to allow all the queries I want to be able to do, but I am worried/confused about the performance. In particular, basic "JSON-style" employee['manager']['manager']['personal assistant']['phone number'] I feel like should be O(1) or close to it, if the keys are unique. This is how it would be if the graph was built with maps/dicts.
The docs all seem to say that this type of pointer chasing is fast, but the actual implementation seems to be that a Node stores it's Relationships in a linked list; so every time you follow a Relationship you actually have to loop through every Relationship until you find the one you are looking for? I feel like that's going to be a problem here because things like (n:Property {name:'DOB'} are going to have loads of connections. Is there any kind of optimisation I can use here to allow for fast "JSON-style" chaining of Keys?
I feel like in my system, if I have the Entity node (fast - indexed on name), and the Property node (fast - indexed on name), then I should be able to get it's value without looping...? Am I misunderstanding how traversal works?
General tips or pointers to helpful material appreciated! Thanks.

2 REPLIES 2

I will have to double check tomorrow when but I think your graph doesn't have at all the structure to solve the questions you are trying to solve.

You can check the modelisation course from the neo4j academy it's awsome to understand what you are looking for.