Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
12-30-2021 04:04 AM
I am relatively new to data modelling using Neo4j and I'd like to get some suggestions on how to model relationships in time.
Let's say that John went to ACME school from 01/Aug/1983 to 30/Sep/1988.
Peter also went to ACME school from 01/Jan/1985 to 30/Oct/1990
Mary went to ACME school from 01/Oct/1988 to 30/Oct/1993
In this simple example, Peter shares time with both John and Mary at that school. But Mary doesn't share time with John.
I need to query hundreds of thousands (maybe a few million) of those records looking for overlapping periods between students with a minimum overlap of 180 days.
What would be the best way to model this data in order to have decent performance at query time?
Thanks!
12-31-2021 05:07 PM
Hi @flavio
This case is very interesting.
I added test data to the 3 data.
CREATE (acme:School {name:"ACME"})
CREATE (:Student {name:"John"})-[:WENT_TO {startDate:date("1983-08-01"), endDate:date("1988-09-30")}]->(acme)
CREATE (:Student {name:"Peter"})-[:WENT_TO {startDate:date("1985-01-01"), endDate:date("1990-10-30")}]->(acme)
CREATE (:Student {name:"Mary"})-[:WENT_TO {startDate:date("1988-10-01"), endDate:date("1993-10-30")}]->(acme)
// = 180
CREATE (:Student {name:"Student1"})-[:WENT_TO {startDate:date("1980-01-01"), endDate:date("1984-01-28")}]->(acme)
CREATE (:Student {name:"Student2"})-[:WENT_TO {startDate:date("1984-02-01"), endDate:date("1984-07-30")}]->(acme)
CREATE (:Student {name:"Student3"})-[:WENT_TO {startDate:date("1988-04-03"), endDate:date("1993-01-01")}]->(acme)
// < 180
CREATE (:Student {name:"Student4"})-[:WENT_TO {startDate:date("1980-01-01"), endDate:date("1984-01-27")}]->(acme)
CREATE (:Student {name:"Student5"})-[:WENT_TO {startDate:date("1984-02-01"), endDate:date("1984-06-30")}]->(acme)
CREATE (:Student {name:"Student6"})-[:WENT_TO {startDate:date("1988-04-04"), endDate:date("1993-01-01")}]->(acme)
This is the Cypher code for search.
Since there are only a few records, there is no index. If the number of records increases, it is better to add indexes to startDate and endDate.
WITH 180 AS overlap
MATCH (john:Student {name:"John"})-[johnDate:WENT_TO]->(school:School),
(someone:Student)-[someoneDate:WENT_TO]->(school)
WHERE someoneDate.endDate > johnDate.startDate
AND someoneDate.startDate < johnDate.endDate
AND duration.inDays(johnDate.startDate, someoneDate.endDate).days >= overlap
AND duration.inDays(someoneDate.startDate, johnDate.endDate).days >= overlap
AND duration.inDays(someoneDate.startDate, someoneDate.endDate).days >= overlap
RETURN someone
John, Student 1, Student 2, and Student 3 were selected.
All the sessions of the conference are now available online