Neo4j

xanadu860122 · ‎04-28-2019

Hi, guys!

I have the following nodes and relationships in my graph:

(c:Contact {tel: properties.tel})
(r:Role {roleId: properties.roleId})
(c:Contact)-[ri:ROLE_IS]->(r:Role)
(c1:Contact)-[htc:HAS_TEL_CONTACT]->(c2:Contact)

There are about 0.1 billion nodes and 0.2 billion relationships in the graph.

I want to find the roles of my contacts' contacts, which means following the HAS_TEL_CONTACT relationship to depth 2. I wrote the CQL below:

MATCH (:Contact {tel: 'xxxx'})-[:HAS_TEL_CONTACT]->(:Contact)-[:HAS_TEL_CONTACT]->(:Contact)-[:ROLE_IS]->(r:Role)
WITH DISTINCT r
RETURN r.roleId;

But the query is too slow for me. I profiled the query; the result is below:

[image: PROFILE result of the query]

It takes about 2 minutes to finish, which is much slower than doing the same thing in MySQL.
I am running Neo4j Community Edition v3.5.4. The machine has 8 CPU cores. The initial and max heap size are both 16G.

Is there anything I can improve in the CQL or the Neo4j configuration to make the query faster?

Thanks!

ri8ika · ‎04-28-2019

I'm also a beginner with CQL, but how about a query like this?

MATCH(c1:Contact)
MATCH(c2:Contact)
MATCH(c3:Contact)
MATCH(r:Role)
MERGE(c1)-[htc:HAS_TEL_CONTACT]->(c2)-[htc:HAS_TEL_CONTACT]->(c3)-[:ROLE_IS]->(r)
WITH DISTINCT r
RETURN r.roleId

However, I'm not sure the syntax I have written is correct. I hope you'll find a hint in it. Let me know if this works for you.

Thanks,
Bhojendra

ameyasoft · ‎04-28-2019

Try this query:

MATCH (cr:Contact)-[:ROLE_IS]->(r:Role)
WITH COLLECT(r) as r1, COLLECT(cr) as cr1
UNWIND r1 as r2
UNWIND cr1 as cr2

MATCH (:Contact {tel: 'xxxx'})-[:HAS_TEL_CONTACT]->(:Contact)-[:HAS_TEL_CONTACT]->(cr2)-[:ROLE_IS]->(r2)
WITH DISTINCT r2
RETURN r2.roleId;

xanadu860122 · ‎04-29-2019

Thanks for your reply! The query is also very slow, even slower than mine. It seems the first MATCH takes too much time.

andrew_bowman · ‎04-29-2019

Let's see the count of distinct contacts vs non-distinct. Try running each of these, noting both the number of results and the time taken:

MATCH (:Contact {tel: 'xxxx'})-[:HAS_TEL_CONTACT*2]->(c:Contact)
RETURN count(c) as count

and

MATCH (:Contact {tel: 'xxxx'})-[:HAS_TEL_CONTACT*2]->(c:Contact)
RETURN count(DISTINCT c) as distinctCount

xanadu860122 · ‎04-29-2019

I picked a tel in my graph at random.
The first 'without distinct' query takes 3233 ms. The count is 46156.
The second 'with distinct' query takes 144 ms, which is very fast, though I think that is because the cache was hit. The count is 35296.
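
Since the two counts differ (46156 paths versus 35296 distinct contacts), one thing that may be worth trying is to deduplicate the second-hop contacts before expanding to their roles, so each contact's ROLE_IS relationship is followed only once. A minimal sketch, untested against this dataset and assuming the same labels and properties as the original query:

```cypher
// Sketch: collapse duplicate second-hop contacts first,
// then expand ROLE_IS only from the distinct contacts.
MATCH (:Contact {tel: 'xxxx'})-[:HAS_TEL_CONTACT*2]->(c:Contact)
WITH DISTINCT c
MATCH (c)-[:ROLE_IS]->(r:Role)
RETURN DISTINCT r.roleId;
```

Whether this helps depends on how much of the 2 minutes is spent on the role expansion rather than on the two HAS_TEL_CONTACT hops; running it with PROFILE would show that.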

xanadu860122 · ‎04-29-2019

It doesn't work. And I don't need MERGE; I just need to MATCH.

dariusaudryc1 · ‎04-29-2019

Did you create an index on the node for that property? This will make your query faster.

I suggest using EXPLAIN to understand how many nodes are being processed.
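
For reference, a sketch of what that could look like in Neo4j 3.5 syntax, using the labels and properties from the original post (run each statement separately). EXPLAIN only shows the planned operators, while PROFILE actually runs the query and reports rows and db hits per operator:

```cypher
// Neo4j 3.5 index syntax; not needed if the indexes already exist.
CREATE INDEX ON :Contact(tel);
CREATE INDEX ON :Role(roleId);

// Show the execution plan without running the query.
EXPLAIN
MATCH (:Contact {tel: 'xxxx'})-[:HAS_TEL_CONTACT]->(:Contact)-[:HAS_TEL_CONTACT]->(:Contact)-[:ROLE_IS]->(r:Role)
WITH DISTINCT r
RETURN r.roleId;

// Run the query and report actual rows and db hits per operator.
PROFILE
MATCH (:Contact {tel: 'xxxx'})-[:HAS_TEL_CONTACT]->(:Contact)-[:HAS_TEL_CONTACT]->(:Contact)-[:ROLE_IS]->(r:Role)
WITH DISTINCT r
RETURN r.roleId;
```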

xanadu860122 · ‎04-29-2019

Yes, I have created indexes on the "tel" and "roleId" properties. You can see the number of nodes processed in the profile result picture.

Query slow when relationship depth is 2