Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
01-10-2020 11:17 PM
Hello Folks,
I'm looking for some assistance with one of my live graphs. My nodes and relationships looks something like this
This is an example of one end to end relationship within the graph.
The issue is related to node (Store) having multiple (Notifications) and they being related to multiple (Notification Action Types) & (Notification Types)
The query looks something like this.
MATCH (s:app)-[:HAS_GEOZONE]->(g:geozone)-[:HAS_ENGAGEMENTZONE]->(e:engagement_zone)-[:HAS_STORE]->(st:store)-[:HAS_NOTIFICATION]->(nt:notification), shortestPath((nt)-[r:HAS_NOTIFICATION_TYPE]->(ntt:notification_type)), shortestPath((nt)-[ra:HAS_NOTIFICATION_ACTION_TYPE]->(ntta:notification_action_type)), (st)-[:HAS_ATTRIBUTE]->(sta:store_attribute) WHERE s.uuid={app_id} AND sta.key='name' OPTIONAL MATCH (nt)-[:HAS_BRAND]->(br:brand) OPTIONAL MATCH (nt)-[:HAS_LABEL]->(l:loyalty) RETURN nt.uuid as nt_id, COLLECT(DISTINCT st.uuid) as st_ids, COLLECT(DISTINCT sta.value) as store_names, COLLECT(DISTINCT properties(br)) as notification_brands, COLLECT(DISTINCT properties(l)) as notification_labels, COLLECT(DISTINCT properties(nt)) as notification, COLLECT(DISTINCT properties(ntt)) as notification_type, COLLECT(DISTINCT properties(ntta)) as notification_action_type ORDER BY nt_id
And the response times for one query is more than 8 secs. And my application ends up requiring this information fairly frequently. Which was causing overall poor response and crashes which is why in the interim I have introduced redis in between to cache some of this data required by the app.
Please provide the following information if you ran into a more serious issue:
neo4j version: neo4j:3.5.12-enterprise and running in Docker on AWS m5.xlarge machine with 12G heap and cache size configured
what kind of API / driver do you use: Rest API on ECS and Node JS on separate instances
screenshot of [PROFILE
or EXPLAIN
]
which plugins / extensions / procedures do you use: neo4j spatial plugin
Also, attaching query.log which explains the execution time lines on live environment.
Any assistance on this is much appreciated!
Thanks,
Arnab
01-11-2020 04:16 AM
What exactly are you trying to accomplish with your query? It would help to know what information you're trying to get from the graph. I don't want to assume but it looks like you're finding stores, their locations, and then directions? But it's hard to tell.
01-11-2020 06:02 AM
Thanks for the reply. Sorry if it wasn't clear. But the query is trying find the related notification nodes associated to the stores where the profiling query looks like:
PROFILE MATCH (s:app)-[:HAS_GEOZONE]->(g:geozone)-[:HAS_ENGAGEMENTZONE]->(e:engagement_zone)-[:HAS_STORE]->(st:store)-[:HAS_NOTIFICATION]->(nt:notification), shortestPath((nt)-[r:HAS_NOTIFICATION_TYPE]->(ntt:notification_type)), shortestPath((nt)-[ra:HAS_NOTIFICATION_ACTION_TYPE]->(ntta:notification_action_type)), (st)-[:HAS_ATTRIBUTE]->(sta:store_attribute) WHERE s.uuid='dd669910-e0d3-11e8-b431-1d83bfd113a3' AND sta.key='name' OPTIONAL MATCH (nt)-[:HAS_BRAND]->(br:brand) OPTIONAL MATCH (nt)-[:HAS_LABEL]->(l:loyalty) RETURN nt.uuid as nt_id, COLLECT(DISTINCT st.uuid) as st_ids, COLLECT(DISTINCT sta.value) as store_names, COLLECT(DISTINCT properties(br)) as notification_brands, COLLECT(DISTINCT properties(l)) as notification_labels, COLLECT(DISTINCT properties(nt)) as notification, COLLECT(DISTINCT properties(ntt)) as notification_type, COLLECT(DISTINCT properties(ntta)) as notification_action_type ORDER BY nt_id
And the e.g. of the response of one entry from the original query is
And, the json looks like below where the column names are the nodes in the graph
[
{
"nt_id": "002a3ba0-2584-11ea-93de-118eb121a0f8",
"st_ids": [
"e5fb2cc0-2246-11ea-a327-c1a6ac2ca4a0"
],
"store_names": [
"AND"
],
"notification_brands": [],
"notification_labels": [],
"notification": [
{
"sub_text": "Happy shopping!!",
"action_url": "https://www.tatacliq.com/and/c-mbh11a00015",
"image_url": "",
"notification_match_type": "GENERAL",
"validity_start": 1,
"text": "Welcome to {{store_name}}",
"inventory_request_params": "",
"isActive": true,
"validity_end": 1,
"uuid": "002a3ba0-2584-11ea-93de-118eb121a0f8",
"active_days": "{\"SUNDAY\":\"1100-2100\",\"MONDAY\":\"1100-2100\",\"TUESDAY\":\"1100-2100\",\"WEDNESDAY\":\"1100-2100\",\"THURSDAY\":\"1100-2100\",\"FRIDAY\":\"1100-2100\",\"SATURDAY\":\"1100-2100\"}"
}
],
"notification_type": [
{
"name": "Deals & Offers",
"uuid": "2fdc2b20-4faf-11e9-bfff-47192e190163"
}
],
"notification_action_type": [
{
"name": "In Store",
"uuid": "ce78fc50-4fae-11e9-b974-7995b4e2b93d"
}
]
}
]
01-11-2020 06:50 AM
I can tell you that the shortest path algorithm is pretty expensive. I’m not sure in your use case that it’s necessary. You could probably restructure your relationships a bit and make it work better. I’ll try to make some better suggestions when I get more time. But, I would start there.
01-13-2020 09:31 PM
Thanks for the answer. Well, I've managed to resolve this now. The reason why I had introduced shortestPath was I had this query earlier. Which is
MATCH (s:app)-[:HAS_GEOZONE]->(g:geozone)-[:HAS_ENGAGEMENTZONE]->(e:engagement_zone)-[:HAS_STORE]->(st:store)-[:HAS_NOTIFICATION]->(nt:notification), (nt)-[r:HAS_NOTIFICATION_TYPE]->(ntt:notification_type), (nt)-[ra:HAS_NOTIFICATION_ACTION_TYPE]->(ntta:notification_action_type), (st)-[:HAS_ATTRIBUTE]->(sta:store_attribute)
WHERE s.uuid={app_id} AND sta.key='name'
OPTIONAL MATCH (nt)-[:HAS_BRAND]->(br:brand)
OPTIONAL MATCH (nt)-[:HAS_LABEL]->(l:loyalty)
RETURN nt.uuid as nt_id, COLLECT(DISTINCT st.uuid) as st_ids, COLLECT(DISTINCT sta.value) as store_names, COLLECT(DISTINCT properties(br)) as notification_brands, COLLECT(DISTINCT properties(l)) as notification_labels, COLLECT(DISTINCT properties(nt)) as notification, COLLECT(DISTINCT properties(ntt)) as notification_type, COLLECT(DISTINCT properties(ntta)) as notification_action_type ORDER BY nt_id
And if in my graph there was an incoming relationship i.e. [r:HAS_NOTIFICATION_TYPE]
& [ra:HAS_NOTIFICATION_ACTION_TYPE]
for each Notification node, and there were multiple notification to each Store node. When the graph grew i.e. When I had more than 1500+ store and ~200 notifications under each and each of those had the above incoming relationships. The query would just not work and hang and crash the server. Changing those to shortestPath did stop that from happening but the results time was still high.
Now, I'm saving the Notif_Type & Notif_Action_Type nodes' uuid in the Notification node's properties. And updated the query like so:
MATCH (n:business_entity)-[:HAS_APP]->(s:app)-[:HAS_GEOZONE]->(g:geozone)-[:HAS_ENGAGEMENTZONE]->(e:engagement_zone)-[:HAS_STORE]->(st:store)-[:HAS_NOTIFICATION]->(nt:notification),
(st)-[:HAS_ATTRIBUTE]->(sta:store_attribute)
WHERE n.uuid={b_id} AND s.uuid={app_id} AND sta.key='name'
OPTIONAL MATCH (nt)-[:HAS_BRAND]->(br:brand)
OPTIONAL MATCH(ntt:notification_type) WHERE ntt.uuid = nt.notification_type
OPTIONAL MATCH(ntta:notification_action_type) WHERE ntta.uuid = nt.notification_action_type
RETURN nt.uuid as nt_id, COLLECT(DISTINCT st.uuid) as st_ids, COLLECT(DISTINCT sta.value) as store_names,
COLLECT(DISTINCT properties(br)) as notification_brands, COLLECT(DISTINCT properties(nt)) as notification,
COLLECT(DISTINCT properties(ntt)) as notification_type, COLLECT(DISTINCT properties(ntta)) as notification_action_type
And, the response times are under 500ms now. Thanks for the help! Cheers!
01-15-2020 07:11 AM
Glad to be able to help. Sometimes all it takes is a little different perspective. Happy you got it working.
All the sessions of the conference are now available online