04-28-2019 04:11 AM
Hi, guys!
I have the following nodes and relationships in my graph:
(c:Contact {tel: properties.tel})
(r:Role {roleId: properties.roleId})
(c:Contact)-[ri:ROLE_IS]->(r:Role)
(c1:Contact)-[htc:HAS_TEL_CONTACT]->(c2:Contact)
There are about 100 million nodes and 200 million relationships in the graph.
I want to find the roles of a contact's contacts of contacts, that is, the HAS_TEL_CONTACT relationship is traversed to depth 2. I wrote the following CQL:
MATCH (:Contact {tel: 'xxxx'})-[:HAS_TEL_CONTACT]->(:Contact)-[:HAS_TEL_CONTACT]->(:Contact)-[:ROLE_IS]->(r:Role)
WITH DISTINCT r
RETURN r.roleId;
But the query is too slow for me. I profiled the query; the result is below:
It takes about 2 minutes to finish, which is much slower than doing the same thing in MySQL.
I am running Neo4j Community Edition v3.5.4. The machine has 8 CPU cores. The initial and max heap sizes are both 16G.
Can I improve the CQL or the Neo4j configuration to make the query faster?
Thanks!
04-28-2019 11:00 AM
I'm also very much a beginner with CQL, but how about a query like this?
MATCH(c1:Contact)
MATCH(c2:Contact)
MATCH(c3:Contact)
MATCH(r:Role)
MERGE(c1)-[htc:HAS_TEL_CONTACT]->(c2)-[htc:HAS_TEL_CONTACT]->(c3)-[:ROLE_IS]->(r)
WITH DISTINCT r
RETURN r.roleId
However, I'm not entirely sure the syntax I have written is correct. I hope it gives you a hint. Please let me know if this works for you.
Thanks,
Bhojendra
04-28-2019 01:55 PM
Try this query:
MATCH (cr:Contact)-[:ROLE_IS]->(r:Role)
WITH COLLECT(r) as r1, COLLECT(cr) as cr1
UNWIND r1 as r2
UNWIND cr1 as cr2
MATCH (:Contact {tel: 'xxxx'})-[:HAS_TEL_CONTACT]->(:Contact)-[:HAS_TEL_CONTACT]->(cr2)-[:ROLE_IS]->(r2)
WITH DISTINCT r2
RETURN r2.roleId;
04-29-2019 01:09 AM
Thanks for your reply! The query is also very slow, even slower than mine. It seems the first MATCH takes too much time.
04-29-2019 07:32 AM
Let's see the count of distinct contacts vs non-distinct. Try running each of these, noting both the number of results and the time taken:
MATCH (:Contact {tel: 'xxxx'})-[:HAS_TEL_CONTACT*2]->(c:Contact)
RETURN count(c) as count
and
MATCH (:Contact {tel: 'xxxx'})-[:HAS_TEL_CONTACT*2]->(c:Contact)
RETURN count(DISTINCT c) as distinctCount
04-29-2019 06:45 PM
I picked a tel from my graph at random.
The first query (without DISTINCT) takes 3233 ms. The count is 46156.
The second query (with DISTINCT) takes 144 ms, which is very fast. However, I think that is because the cache was already warm after the first query. The count is 35296.
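These counts suggest that about 10,000 of the 46156 depth-2 paths end at a contact that was already reached by another path. One thing that might be worth trying is to de-duplicate the depth-2 contacts before expanding to their roles, so that each contact's ROLE_IS relationships are followed only once. The following is a minimal sketch, assuming the same labels, relationship types, and placeholder tel value as in the queries above; it has not been measured on this graph.

```cypher
// Step 1: reach the contacts two HAS_TEL_CONTACT hops away from the start contact.
MATCH (:Contact {tel: 'xxxx'})-[:HAS_TEL_CONTACT*2]->(c:Contact)
// Step 2: collapse duplicate end contacts before doing any further expansion.
WITH DISTINCT c
// Step 3: expand ROLE_IS once per distinct contact.
MATCH (c)-[:ROLE_IS]->(r:Role)
// Step 4: return each role id once.
RETURN DISTINCT r.roleId;
```

Whether this helps depends on how much of the total time is spent in the role expansion rather than in the two-hop traversal itself, so it is worth comparing the PROFILE output of both versions.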
04-29-2019 12:19 AM
It doesn't work. Also, I don't need MERGE; I just need MATCH.
04-29-2019 08:06 PM
Did you create an index on the node for that property? This will make your query faster.
I suggest using EXPLAIN to understand how many nodes are being processed.
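As a rough sketch of what that check could look like on Neo4j 3.5 (the version mentioned earlier in the thread), assuming the index is meant to be on :Contact(tel) and using the placeholder tel value from the original query. Each statement is meant to be run separately.

```cypher
// List the existing indexes and their state, to confirm an index on :Contact(tel) is ONLINE.
CALL db.indexes();

// EXPLAIN shows the planned operators without running the query.
// The first operator should be an index seek on :Contact(tel) rather than a label scan.
EXPLAIN
MATCH (:Contact {tel: 'xxxx'})-[:HAS_TEL_CONTACT*2]->(c:Contact)
RETURN count(c) AS count;

// PROFILE actually runs the query and reports rows and db hits per operator.
PROFILE
MATCH (:Contact {tel: 'xxxx'})-[:HAS_TEL_CONTACT*2]->(c:Contact)
RETURN count(c) AS count;
```

EXPLAIN is cheap because nothing is executed, while PROFILE gives the real row counts at the cost of running the full query.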
04-29-2019 10:38 PM
Yes, I have created indexes on the "tel" property and the "roleId" property. You can see the number of nodes processed in the profile result picture.