How to write a Cypher query to count the number of nodes in a graph based on Levenshtein similarity

Hello everyone, I need to write a Cypher query for the scenario below.

Given a list of strings, count the number of nodes in the graph where the Levenshtein similarity between the node's name property and any of the strings in the list is above a certain threshold.

I was able to write the query for a single string, but I am not sure how to write it for multiple strings, e.g. ['string 1', 'string 2', 'string 3'].

MATCH (n:Node)
UNWIND (n.name) as name_lst
RETURN SUM(toInteger(apoc.text.levenshteinSimilarity(name_lst, 'string 1') > 0.6))

Any thoughts on how to adapt the above query for multiple strings?

1 ACCEPTED SOLUTION

Then you have to use a predicate function:

MATCH (n:Node) 
WHERE any(string IN ['string 1', 'string 2', 'string 3'] WHERE apoc.text.levenshteinSimilarity(n.name, string) > 0.6) 
RETURN count(n) AS nb_nodes
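
To make the counting semantics of the any() predicate concrete, here is a minimal Python sketch of the same logic outside the database. It is an illustration only: the helper names (levenshtein_distance, levenshtein_similarity, count_matching_nodes) and the sample data are hypothetical, and the similarity formula (1 minus the edit distance divided by the length of the longer string) is an assumption about how apoc.text.levenshteinSimilarity normalizes its result, so exact values may differ from APOC.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest


def count_matching_nodes(names, strings, threshold=0.6):
    """Count names similar to at least one string, like any(...) in Cypher."""
    return sum(
        1 for name in names
        if any(levenshtein_similarity(name, s) > threshold for s in strings)
    )


# Hypothetical sample data standing in for the n.name values of (n:Node).
names = ["string 1", "string 12", "something else"]
strings = ["string 1", "string 2", "string 3"]
print(count_matching_nodes(names, strings))  # prints 2
```

Note that a node is counted once even if its name is close to several strings in the list, which is what wrapping the comparison in any() achieves.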

3 REPLIES

Hello @atinesh 😊

This query should work:

MATCH (n:Node) 
WHERE apoc.text.levenshteinSimilarity(n.name, 'string 1') > 0.6 
RETURN count(n) AS nb_nodes

Regards,
Cobra

Hello @Cobra, but what if we have a list of strings to compare against, e.g. ['string 1', 'string 2', 'string 3']?