cancel
Showing results for 
Search instead for 
Did you mean: 

Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.

Optimizing a query against GTFS data

I have a working query to help me find transit routes. It seems to take a long time to run, and the plan looks a bit asymmetrical to me, but I'm hardly an expert. Any ideas on how to optimize this?

Here's my query:

PROFILE
MATCH (origin:Stop {id: "4009"})--(origin_st:Stoptime)
WHERE origin_st.departure_time > "08:00:00"
AND origin_st.departure_time < "09:00:00"

WITH origin, origin_st
MATCH (destination:Stop {id:"2625"})--(destination_st:Stoptime)
WHERE destination_st.arrival_time < "10:00:00"
AND destination_st.arrival_time > origin_st.departure_time
 
WITH origin, origin_st, destination, destination_st
MATCH p = allshortestpaths((origin_st)-[*]->(destination_st))

WITH nodes(p) AS nodes, origin, destination, origin_st, destination_st
UNWIND nodes AS n
MATCH (n)-[r:PART_OF_TRIP]-(trip:Trip)-[]-(sd:ServiceDate)
WHERE sd.date = datetime({epochMillis:apoc.date.parse('20200624','ms', 'yyyyMMdd')})

WITH DISTINCT trip, origin, origin_st, destination, destination_st
MATCH (trip)-[:USES]->(route)
RETURN DISTINCT trip, route, origin, origin_st, destination, destination_st

Here's the plan:

Neo4j version, using Bolt:

2020-07-13 16:30:59.289+0000 INFO  ======== Neo4j 4.0.6 ========
2020-07-13 16:30:59.295+0000 INFO  Starting...
2020-07-13 16:31:10.334+0000 INFO  Called db.clearQueryCaches(): Query cache already empty.
2020-07-13 16:31:10.410+0000 INFO  Bolt enabled on 0.0.0.0:7687.
2020-07-13 16:31:10.410+0000 INFO  Started.
2020-07-13 16:31:11.188+0000 INFO  Remote interface available at http://localhost:7474/
0 REPLIES 0