Neo4j

utsavnepal7 · ‎11-10-2020

I need to create a dashboard that uses 8 queries to filter the data by different attributes. It takes more than 5 seconds to load all the data. Is there a way to reduce the query time?

clem · ‎11-10-2020

Example query please.

Do you have indexes on the properties that the queries are using?

see: https://neo4j.com/docs/cypher-manual/current/administration/indexes-for-search-performance/

utsavnepal7 · ‎11-10-2020

Thank you, Clem, for the response.
I have almost 10 queries like this. I don't have indexes on the properties, but I have one on the node.

Match (c:CustomerName)-[:initiates_PO]->(n:SalesDoc)<-[:associated_to]-(p:Plant)
MATCH (h:DeliveryBlock)<-[:has_deliveryBlock]-(n:SalesDoc)-[:comprises]->(s:Shipment)
WHERE s.ShipDate IS NOT NULL AND c.CustomerName = "AMAZON.COM" WITH DISTINCT(n) AS Order, h, s, apoc.date.parse(s.ShipDate, 's',"yyyy-MM-dd") AS shippedS, apoc.date.parse("2020-08-31", 's',"yyyy-MM-dd" ) AS todayS WITH h, Order, shippedS, todayS,collect(Order.SalesDoc) AS orders, apoc.date.add(todayS, 's', 2, 'd') as twoDaysFromNow WHERE  shippedS = twoDaysFromNow AND h.DeliveryBlock IS NOT NULL return COUNT(orders), h.DeliveryBlock ORDER BY COUNT(orders) LIMIT 10

clem · ‎11-10-2020

Some further thoughts:

You should have an index on CustomerName.CustomerName (this is a little confusing, as you have a Node Type (label) the same as a property name.)
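A minimal sketch of what that index could look like, assuming Neo4j 4.x syntax. The index name customer_name_idx is a hypothetical choice, not something from the original schema:

```cypher
// Index the CustomerName property on nodes labelled CustomerName,
// so the lookup for c.CustomerName = "AMAZON.COM" can use an index seek.
// Neo4j 4.x syntax; on 3.5 the older form is CREATE INDEX ON :CustomerName(CustomerName)
CREATE INDEX customer_name_idx FOR (c:CustomerName) ON (c.CustomerName);
```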

I think you should also parse the date and store the parsed version as a property of Shipment. Then index that parsed date. As it is, you are recalculating something that should be only calculated once (during node creation.)
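Roughly, this could be done as a one-time migration. The sketch below assumes that ShipDate is stored as a yyyy-MM-dd string (which the built-in date() function can parse directly). ShipDateParsed and the index name are hypothetical names chosen for illustration:

```cypher
// One-time migration: store a native date next to the original string.
// Assumes every non-null s.ShipDate is a yyyy-MM-dd string.
MATCH (s:Shipment)
WHERE s.ShipDate IS NOT NULL
SET s.ShipDateParsed = date(s.ShipDate);

// Index the parsed date so equality and range filters can use it.
// Neo4j 4.x syntax; the index name is a hypothetical choice.
CREATE INDEX shipment_ship_date_idx FOR (s:Shipment) ON (s.ShipDateParsed);
```

On a large graph the SET would probably need to be run in batches rather than in a single transaction.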

I'm new to Neo4j, but I'm a bit concerned about:
apoc.date.add(todayS, 's', 2, 'd') as twoDaysFromNow
because I think it might mean that twoDaysFromNow is recalculated many times instead of just once. I think it might be possible to calculate this once as a variable and reuse it, but I don't know Neo4j well enough to say how to do that.
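One way this might be done is to compute the value in a leading WITH, before any MATCH, so it is evaluated once on a single row and then carried through the rest of the query. This is only a sketch: it uses the built-in date() and duration() functions instead of APOC, assumes ShipDate is a yyyy-MM-dd string, and leaves out the DeliveryBlock and Plant parts of the original query:

```cypher
// Compute the target date once, up front.
WITH date("2020-08-31") + duration({days: 2}) AS twoDaysFromNow
MATCH (c:CustomerName {CustomerName: "AMAZON.COM"})-[:initiates_PO]->(n:SalesDoc)
MATCH (n)-[:comprises]->(s:Shipment)
// Assumes s.ShipDate is a yyyy-MM-dd string that date() can parse.
WHERE s.ShipDate IS NOT NULL AND date(s.ShipDate) = twoDaysFromNow
RETURN count(DISTINCT n) AS matchingOrders;
```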

Your query is a bit hard to read, as it is one long line. I would break it up so it's easier to read. You might get more help that way! See: https://neo4j.com/developer/cypher/style-guide/

ameyasoft · ‎11-15-2020

apoc.date.parse returns the time in the unit passed to the function.

RETURN apoc.date.parse("2020-08-31", 's', "yyyy-MM-dd") returns 1598832000 (seconds), and apoc.date.add(todayS, 's', 2, 'd') gives 1599004800 (seconds). As you can see, these are epoch timestamps, not dates.

Try this:

Match (c:CustomerName)-[:initiates_PO]->(n:SalesDoc)
where c.CustomerName = "AMAZON.COM" 

//setting todayS.........
with c, n, apoc.date.format(apoc.date.parse("2020-08-31", 's',"yyyy-MM-dd"), 's', 'yyyy-MM-dd') as todayS

//setting twoDaysFromNow......
with n, c, date(todayS) + duration({days: 2}) as twoDaysFromNow 

match (p:Plant)-[:associated_to]->(n)-[:comprises]->(s:Shipment)
WHERE s.ShipDate  is not null

//converting s.ShipDate to date.....shippedS
with n, c, twoDaysFromNow, p, s, apoc.date.format(apoc.date.parse(s.ShipDate, 's',"yyyy-MM-dd"), 's', 'yyyy-MM-dd') as shippedS

match (n)-[:has_deliveryBlock]->(h:DeliveryBlock)
where h.DeliveryBlock IS NOT NULL 

//adding the where clause.........
with n, h, s, shippedS, twoDaysFromNow where date(shippedS) = twoDaysFromNow

//final result..........
with h.DeliveryBlock as delB, count(distinct n.SalesDoc) as Cnt

return delB, Cnt limit 10

Hope this helps!

clem · ‎11-13-2020

Have you tried PROFILE? I haven't tried it myself, but it looks like it could help in your case. Maybe it would give you a clue as to what's slow.
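As a rough illustration, PROFILE is simply prefixed to a query. The example below profiles only the first hop of the query posted earlier in the thread, as a simplified stand-in for the full query:

```cypher
// PROFILE runs the query and returns the execution plan with row counts and db hits.
PROFILE
MATCH (c:CustomerName)-[:initiates_PO]->(n:SalesDoc)
WHERE c.CustomerName = "AMAZON.COM"
RETURN count(n);
```

In the resulting plan, a NodeByLabelScan on CustomerName followed by a Filter would suggest that no index is being used, whereas a NodeIndexSeek would suggest that one is. Operators with a high number of db hits are the first places to look.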

utsavnepal7 · ‎11-20-2020

I guess the problem is not with the query itself. I used PROFILE to see the query execution plan, and I also created indexes on the attributes for search performance. I need to populate 15 graphs using 15 different queries, and if each query takes 200-300 ms, it takes about 5 seconds to fetch all the required data for my dashboard. Maybe the flaw is in the architecture. If I could execute the queries in parallel and get the data all at once, it would be faster.

terryfranklin82 · ‎11-15-2020

While it's not a specific solution to your problem, I highly recommend taking the new Cypher Query Tuning in Neo4j 4.0 course available for free in the GraphAcademy.

It covers a lot of extremely useful skills for building efficient queries, such as optimal anchor selection, early aggregation, index hints, interpreting execution plans and more. It will give you a much better understanding of how to write fast queries.