Neo4j

eelir · ‎12-16-2020

I have a graph of homogeneous nodes (7K) and relationships (27K).

I need to randomly pick a node (n), randomly pick four of its neighbours (a, b, c, d), and then do the same for each of those (so for a, I need to pick a1, a2, a3, a4). All of these nodes must be distinct.

The Cypher works fine up to this point (I have not worked out the random selection yet; for now I just want to return one valid result):

MATCH (n {term:'Car'})
MATCH (n)<--(a)<--(a1)
MATCH (n)<--(a)<--(a2)
MATCH (n)<--(a)<--(a3)
MATCH (n)<--(a)<--(a4)
MATCH (n)<--(b)<--(b1)
MATCH (n)<--(b)<--(b2)
MATCH (n)<--(b)<--(b3)
WHERE a<>b
AND a1<>a2 AND a1<>a3 AND a1<>a4 AND a2<>a3 AND a2<>a4 AND a3<>a4
AND b1<>b2 AND b1<>b3 AND b2<>b3
RETURN n,a,a1,a2,a3,a4,b,b1,b2,b3
LIMIT 1

As soon as I add

MATCH (n)<--(b)<--(b4)

The query runs forever until it breaks my desktop Neo4j completely. I increased the heap memory to 4GB. What am I doing wrong, and is there a more elegant and faster way to do this?

david_allen · ‎12-16-2020

I think you're working much too hard with this query and there's a simpler way to do it.

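// :Something is a placeholder; substitute your actual node label
MATCH (n:Something { term: "Car" })<-[]-(a)<-[]-(b)
WHERE id(a) <> id(b)
// DISTINCT, because each a shows up once per matching b
WITH n, collect(DISTINCT a) AS firstHops, collect(DISTINCT b) AS secondHops
RETURN n, firstHops, secondHops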
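The query above flattens both hops into two lists, so it no longer shows which second-hop node belongs to which neighbour. If you need that pairing (a together with its own a1, a2, ...), one possible variant is to group by a instead. This is only a sketch: :Something is still the placeholder label, and the extra b <> n check is my assumption that you don't want the start node to reappear as a second hop.

```cypher
// :Something is the placeholder label from the query above.
MATCH (n:Something { term: "Car" })<--(a)<--(b)
WHERE a <> b AND b <> n
// Grouping by a keeps each neighbour paired with its own second-hop nodes.
RETURN n, a, collect(DISTINCT b) AS secondHops
```

Each returned row is one neighbour a plus the list of nodes pointing at it.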

I think your Cypher is taking so long for two reasons:

1. You don't specify a node label on your initial n match. This means Neo4j has to check every node in the entire database, which is bad.
2. You specify the same pattern many times, with conditions that the a's and b's can't match each other and so forth. Every additional MATCH can multiply the number of candidate rows, which is likely why one more clause tips the query over. You don't need to do that at all. Just ask for a two-hop path (three nodes) like I did, and then whatever matches the first hop or the second hop will already be unique. The reason I added a <> condition is to make sure that the intermediate nodes never point back to themselves.
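For the random part of the question, here is a minimal sketch in plain Cypher, with no plugins. It rests on several assumptions: :Something is the placeholder label, the direction of the relationships follows the original query, and ordering rows by rand() before collect() is used as a shuffle (a common idiom, though not a formal guarantee). It also only keeps second-hop nodes away from n and the first-hop nodes; it does not stop the same second-hop node from showing up under two different neighbours, so add a further check if you need that.

```cypher
// Sketch only: assumes the placeholder label :Something.

// 1. Pick one random start node.
MATCH (n:Something)
WITH n ORDER BY rand() LIMIT 1

// 2. Shuffle its distinct incoming neighbours and keep at most four.
MATCH (n)<--(a)
WHERE a <> n
WITH DISTINCT n, a
WITH n, a ORDER BY rand()
WITH n, collect(a)[..4] AS firstHops

// 3. For each kept neighbour, shuffle its own incoming neighbours and
//    keep at most four, skipping the start node and the first-hop nodes.
UNWIND firstHops AS a
MATCH (a)<--(x)
WHERE x <> n AND NOT x IN firstHops
WITH DISTINCT n, a, x
WITH n, a, x ORDER BY rand()
RETURN n, a, collect(x)[..4] AS secondHops
```

If a node has fewer than four eligible neighbours, the lists simply come back shorter, so check their sizes before relying on the result. Sorting by rand() touches every candidate row, which should be fine for a graph of 7K nodes.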

Cypher never resolves (takes too long and consumes much memory)