Accessing NEAREST node

wadael
Node Clone

Hello

For a given label, with an index on a Point property, I am looking for the (geographically speaking) NEAREST node.
I have a precise geoposition linked to an item, and I want to link that item to another geoposition through a different relationship.

My first attempts do not seem to use the spatial index.
I get a message that the index is not being used, because I do a calculation on the lat/lon of the points (I was trying the product of the absolute values of lon2-lon1 and lat2-lat1).
So it's a full label scan that happens.

I thought maybe I could write a Java function that returns the nearest node to a given node, so I looked at the API, thinking there could be a way to do a getIndex("indexname").fetch(node),
then have a getPrevious, getNext, or even getClosest on a spatial index, but I can't see that in the
sources neo4j/community/spatial-index/src/main/java/org/neo4j/gis/spatial/index at 4.2 · neo4j/neo4j · GitHu...

How do you guys get the nearest node while maintaining performance?
Thank you

1 ACCEPTED SOLUTION

RETURN l2, distance(l1,l2) AS dist

I'm not quite sure that works. You're applying the distance function on two node variables, when it should only be run on two point types. If there was a location property, then it should be:

RETURN l2, distance(l1.location, l2.location) AS dist

We should be able to get better performance with an index on the point properties of the nodes. That allows us to do an index-backed spatial distance search.

However, it does require providing some acceptable maximum distance within which to perform the search (distance is in meters when points are on a WGS-84 coordinate system). That way the spatial distance search can use the index to find nodes with locations within the given radius, and your ordering and limiting will then apply only to that smaller result set.

CYPHER runtime=slotted
MATCH (f:Thing)-[:LOCATED]->(l1:Place)
CALL {
  WITH l1
  WITH l1, l1.location as here
  MATCH (l2:Place)
  WHERE distance(here, l2.location) < 20000 // 20km search radius
  RETURN l2
  ORDER BY distance(l1.location, l2.location) ASC 
  LIMIT 1
}
WITH f, l1, l2
MERGE (f)-[:NEARBY]->(l2)
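
The index-backed search above assumes an index exists on the point property. Here is a minimal sketch of creating one and checking that it is used, assuming Neo4j 4.x syntax and the :Place(location) property from this thread. The index name place_location and the coordinates are hypothetical sample values. EXPLAIN shows the plan without running the query, so you can see whether an index seek replaces the NodeByLabelScan.

```cypher
// Assumption: Neo4j 4.x. The index name place_location is hypothetical.
CREATE INDEX place_location FOR (p:Place) ON (p.location);

// Inspect the plan only (nothing is executed). The coordinates are sample values.
EXPLAIN
MATCH (p:Place)
WHERE distance(p.location, point({latitude: 48.8566, longitude: 2.3522})) < 20000
RETURN p;
```

If the plan still starts with NodeByLabelScan, the predicate is probably not in a form the planner can match to the index, for example a hand-written calculation on separate lat/lon properties. On Neo4j 5 the equivalent function is point.distance() and point indexes are created with CREATE POINT INDEX.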


4 REPLIES

wadael
Node Clone
CYPHER runtime=slotted
MATCH (f:Thing)-[:LOCATED]->(l1:Place)
CALL {
  WITH f,l1
  MATCH (l2:Place)
  RETURN l2, distance(l1,l2) AS dist
  ORDER BY dist ASC LIMIT 1
}
WITH f,l1,l2
MERGE (f)-[:NEARBY]->(l2)

A subquery does the trick.

46s for 1.2k things.
Acceptable for a one-off run.

wadael
Node Clone

Thank you Andrew

I presume the property name was lost in translation, and it worked because I also have lat and lon properties in addition to the point.

FWIW the time drops to 3s when doing a WHERE clause like yours.

I am puzzled by the double WITH in the CALL though.

This is due to a technical limitation of the first WITH clause in a subquery. That clause tells the subquery what's in scope, but currently it only allows plain variables, without any kind of projection or aliasing. So using

...
CALL {
  WITH l1, l1.location as here
  ...

would error out. That forces us to use one WITH for the import of variables into scope, then a second one for the projection.
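
To see the two-WITH pattern in isolation, here is a small self-contained sketch that needs no data in the graph. It assumes a Neo4j version that supports importing WITH in CALL subqueries; the variable names x and doubled are made up for illustration.

```cypher
UNWIND [1, 2, 3] AS x
CALL {
  WITH x                     // importing WITH: plain variables only
  WITH x, x * 2 AS doubled   // second WITH: projections and aliases are allowed here
  RETURN doubled
}
RETURN x, doubled;
```

Merging the two WITH lines into a single `WITH x, x * 2 AS doubled` at the top of the subquery is the form that gets rejected, which is why the query in this thread imports l1 first and only then aliases l1.location as here.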