cancel
Showing results for 
Search instead for 
Did you mean: 

Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.

Data Model for Relationships in Time

I am relatively new to data modelling using Neo4j and I'd like to get some suggestions on how to model relationships in time.
Let's say that John went to ACME school from 01/Aug/1983 to 30/Sep/1988.
Peter also went to ACME school from 01/Jan/1985 to 30/Oct/1990
Mary went to ACME school from 01/Oct/1988 to 30/Oct/1993
In this simple example, Peter shares time with both John and Mary at that school. But Mary doesn't share time with John.
I need to query hundreds of thousands (maybe a few million) of those records looking for overlapping periods between students with a minimum overlap of 180 days.
What would be the best way to model this data in order to have decent performance at query time?
Thanks!

1 REPLY 1

Hi @flavio

This case is very interesting.

I added test data to the 3 data.

CREATE (acme:School {name:"ACME"})
CREATE (:Student {name:"John"})-[:WENT_TO {startDate:date("1983-08-01"), endDate:date("1988-09-30")}]->(acme)
CREATE (:Student {name:"Peter"})-[:WENT_TO {startDate:date("1985-01-01"), endDate:date("1990-10-30")}]->(acme)
CREATE (:Student {name:"Mary"})-[:WENT_TO {startDate:date("1988-10-01"), endDate:date("1993-10-30")}]->(acme)
// = 180
CREATE (:Student {name:"Student1"})-[:WENT_TO {startDate:date("1980-01-01"), endDate:date("1984-01-28")}]->(acme)
CREATE (:Student {name:"Student2"})-[:WENT_TO {startDate:date("1984-02-01"), endDate:date("1984-07-30")}]->(acme)
CREATE (:Student {name:"Student3"})-[:WENT_TO {startDate:date("1988-04-03"), endDate:date("1993-01-01")}]->(acme)
// < 180
CREATE (:Student {name:"Student4"})-[:WENT_TO {startDate:date("1980-01-01"), endDate:date("1984-01-27")}]->(acme)
CREATE (:Student {name:"Student5"})-[:WENT_TO {startDate:date("1984-02-01"), endDate:date("1984-06-30")}]->(acme)
CREATE (:Student {name:"Student6"})-[:WENT_TO {startDate:date("1988-04-04"), endDate:date("1993-01-01")}]->(acme)

This is the Cypher code for search.
Since there are only a few records, there is no index. If the number of records increases, it is better to add indexes to startDate and endDate.

WITH 180 AS overlap
MATCH (john:Student {name:"John"})-[johnDate:WENT_TO]->(school:School),
      (someone:Student)-[someoneDate:WENT_TO]->(school)
WHERE someoneDate.endDate > johnDate.startDate
  AND someoneDate.startDate < johnDate.endDate
  AND duration.inDays(johnDate.startDate, someoneDate.endDate).days >= overlap
  AND duration.inDays(someoneDate.startDate, johnDate.endDate).days >= overlap
  AND duration.inDays(someoneDate.startDate, someoneDate.endDate).days >= overlap
RETURN someone

John, Student 1, Student 2, and Student 3 were selected.