06-05-2020 01:14 PM
I am using Neo4j 3.5.17 Enterprise with GDS 1.2.
I am specifically trying to create a query that takes the top n nodes (by PageRank) and, for every node within 2 hops (ego radius = 2) of each of those nodes, computes the Euclidean distance to that top node. For example, suppose nodes A, B, and C are the 3 nodes with the highest PageRank. I then want to get all nodes that are within 2 hops of each of those nodes individually. That might be a set of nodes something like:
Node A: D, E, F
Node B: G, H
Node C: I, J, K, L
So I want to loop through nodes A-C, find their respective nodes in the ego graph, and set the Euclidean distance on nodes D-L based on their relationship to their parent node (A-C). So I might get some result:
Node D has a Euclidean distance of 100.0 from Node A
...
Node G has a Euclidean distance of 200.0 from Node B
...
etc.
I have managed to make this work for single nodes, such as providing node A explicitly using:
MATCH (r1:NodeLabel)-[*..2]-(r2:NodeLabel {nodeName: 'A'})
SET r1.distance = gds.alpha.similarity.euclideanDistance(r1.myVector, r2.myVector)
RETURN DISTINCT r1.nodeName, r1.pagerank, r2.nodeName
ORDER BY r1.distance
However, I would like to be able to loop this over several values of r2.nodeName. To do this, I have tried the following:
MATCH (r1:nodeLabel) WHERE r1.pagerank > 40.
MATCH (r2:nodeLabel)-[*..2]-r1
SET r2.distance = gds.alpha.similarity.euclideanDistance(r1.myVector, r2.myVector)
RETURN DISTINCT r1.nodeName, r1.pagerank, r2.nodeName
ORDER BY r2.distance
However, I get the following error:
Invalid input '(': expected whitespace, comment, '.', node labels, '[', "=~", IN, STARTS, ENDS, CONTAINS, IS, '^', '*', '/', '%', '+', '-', '=', '~', "<>", "!=", '<', '>', "<=", ">=", AND, XOR, OR, FROM GRAPH, CONSTRUCT, LOAD CSV, START, MATCH, UNWIND, MERGE, CREATE UNIQUE, CREATE, SET, DELETE, REMOVE, FOREACH, WITH, CALL, RETURN, UNION, ';' or end of input (line 2, column 7 (offset: 56))
"MATCH (r2:nodeLabel)-[*..2]-r1"
Any suggestions? Thanks in advance!
06-06-2020 03:27 AM
Hello @cj2001
There is a syntax error in your query: you forgot to put r1 between ():
MATCH (r1:nodeLabel) WHERE r1.pagerank > 40.
MATCH (r2:nodeLabel)-[*..2]-(r1)
SET r2.distance = gds.alpha.similarity.euclideanDistance(r1.myVector, r2.myVector)
RETURN DISTINCT r1.nodeName, r1.pagerank, r2.nodeName
ORDER BY r2.distance
Moreover, you can have a look at the ORDER BY clause if you want the top n nodes (by PageRank).
Regards,
Cobra
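The ORDER BY suggestion above could be sketched roughly as follows. This is only an illustration under stated assumptions: n = 3 is a placeholder, the label and property names (nodeLabel, pagerank, myVector, nodeName) are the ones used in the thread, and the `WHERE r2 <> r1` filter and the `distance` alias are additions that are not in the original query. Note that a node lying within 2 hops of more than one top node will have its stored `distance` property overwritten, so the returned rows are the more reliable record of each pair.

```cypher
// Pick the top n nodes by PageRank (n = 3 is a placeholder value).
MATCH (r1:nodeLabel)
WITH r1
ORDER BY r1.pagerank DESC
LIMIT 3
// Expand to every node within 2 hops of each top-ranked node.
MATCH (r2:nodeLabel)-[*..2]-(r1)
WHERE r2 <> r1  // assumption: skip the top node itself
// Collapse multiple paths between the same pair and compute the distance once.
WITH DISTINCT r1, r2,
     gds.alpha.similarity.euclideanDistance(r1.myVector, r2.myVector) AS distance
SET r2.distance = distance
RETURN r1.nodeName, r1.pagerank, r2.nodeName, distance
ORDER BY r1.nodeName, distance
```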
06-08-2020 08:59 AM
Oh, yes. That was silly on my part. The typo comes from transcribing the very specific query on my system into a generalized one for this post. The ()'s are actually in my query as you have written it above, and I am still getting the original error.
06-08-2020 09:03 AM
Ok.
Have a look at apoc.cypher.doIt() in the APOC documentation.
Put this part in it:
MATCH (r2:nodeLabel)-[*..2]-(r1)
SET r2.distance = gds.alpha.similarity.euclideanDistance(r1.myVector, r2.myVector)
RETURN DISTINCT r1.nodeName, r1.pagerank, r2.nodeName
Moreover, shouldn't it be [*0..2]?
Regards,
Cobra
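For reference, wrapping that fragment in apoc.cypher.doIt() might look something like the sketch below. It assumes the APOC library is installed, that the outer MATCH supplies r1, and that r1 is passed in through the params map so the inner statement can refer to it by name. The aliases source, pagerank, and target are hypothetical names added so the yielded map has plain keys.

```cypher
// Outer query selects the high-PageRank nodes (threshold taken from the thread).
MATCH (r1:nodeLabel) WHERE r1.pagerank > 40.0
// Run the write fragment once per r1; the params map exposes r1 to the inner statement.
CALL apoc.cypher.doIt(
  'MATCH (r2:nodeLabel)-[*..2]-(r1)
   SET r2.distance = gds.alpha.similarity.euclideanDistance(r1.myVector, r2.myVector)
   RETURN DISTINCT r1.nodeName AS source, r1.pagerank AS pagerank, r2.nodeName AS target',
  {r1: r1}
) YIELD value
RETURN value.source, value.pagerank, value.target
```

The [*0..2] variant asked about above would also match each r1 itself as a zero-hop result, with a distance of 0.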
06-08-2020 09:19 AM
OK, this worked, but not for the reasons we thought.
It turns out that what it was objecting to was the line WHERE r1.pagerank > 40. (with the trailing dot). In particular, it didn't like that the number ended with a bare '.'. Once I replaced 40. with 40.0, it worked.
Thank you for your help!
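A plausible reading of the error, offered as an interpretation and not something confirmed in the thread: the parser seems to treat the trailing `.` as the start of a property lookup, so `40.` followed by `MATCH` on the next line is read as `40.MATCH`. The first unexpected character is then the `(` at line 2, column 7, which matches the reported position. A side-by-side sketch of the failing and working forms, using the names from the thread:

```cypher
// Reported to fail on Neo4j 3.5.17: the literal ends in a bare "."
//   MATCH (r1:nodeLabel) WHERE r1.pagerank > 40.
//   MATCH (r2:nodeLabel)-[*..2]-(r1)

// Working form: write the float literal with a digit after the decimal point.
MATCH (r1:nodeLabel) WHERE r1.pagerank > 40.0
MATCH (r2:nodeLabel)-[*..2]-(r1)
SET r2.distance = gds.alpha.similarity.euclideanDistance(r1.myVector, r2.myVector)
RETURN DISTINCT r1.nodeName, r1.pagerank, r2.nodeName
ORDER BY r2.distance
```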
06-08-2020 09:21 AM
Oh I'm happy to hear this
No problem
Regards,
Cobra