How to return the value generated inside where clause in Cypher?

steves
Node Clone

I have the following Cypher query:

MATCH (n)-[r]->(k)
WHERE ANY(x IN keys(n) WHERE round(apoc.text.levenshteinSimilarity(TRIM(REDUCE(mergedString = "", item IN n[x] | mergedString + item + " ")), "syn"), 4) > 0.8)
RETURN n, r, k

How can I return the score that the similarity function generates inside the WHERE clause? I tried to do this with WITH, without luck:

MATCH (n)-[r]->(k)
WITH COLLECT(x in keys(n) WHERE round(apoc.text.levenshteinSimilarity(TRIM(REDUCE(mergedString = '', item in n[x] | mergedString + item + ' ')), 'syn'), 4)) AS Score
WHERE Score > 0.8
RETURN n, r, k, Score


4 REPLIES

Try this. I removed the reduce operation so the solution is easier to understand.

MATCH (n)-[r]->(k)
UNWIND keys(n) as key
WITH n, r, k, key, round(apoc.text.levenshteinSimilarity(TRIM(n[key]), "syn"), 4) as score
WITH n, r, k, collect({key:key, value:n[key], score:score}) as keyScores
WHERE any(x in keyScores where x.score > 0.8)
RETURN *

The keyScores list is returned for each combination of n, r, and k that satisfies your constraint. It is a list of maps, where each map contains the key, the value, and the score for one key/value pair of 'n'.
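To see the shape of keyScores without touching real data, here is a minimal sketch. The literal maps and their score values are made up for illustration; they are not actual output of apoc.text.levenshteinSimilarity.

```cypher
// Hypothetical keyScores list; keys, values, and scores are invented for illustration
WITH [
  {key: 'name', value: 'syn',  score: 1.0},
  {key: 'city', value: 'Oslo', score: 0.0}
] AS keyScores
// Same guard as in the query above: keep the row only if some score passes
WHERE any(x IN keyScores WHERE x.score > 0.8)
// Expand the list so each map becomes its own row
UNWIND keyScores AS ks
RETURN ks.key AS key, ks.value AS value, ks.score AS score
```

Note that the guard keeps the whole list once any entry passes, so entries with low scores are still returned alongside the matching ones.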

The following is the solution with the 'reduce' put back in. I assume it is there to collapse properties that are lists of strings into a single string.

MATCH (n)-[r]->(k)
UNWIND keys(n) as key
WITH n, r, k, key, round(apoc.text.levenshteinSimilarity(TRIM(REDUCE(mergedString = "", item in n[key] | mergedString + item + " ")), "syn"), 4) as score
WITH n, r, k, collect({key:key, value:n[key], score:score}) as keyScores
WHERE any(x in keyScores where x.score > 0.8)
RETURN *
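As a quick check of what the REDUCE/TRIM expression does on its own, the sketch below runs it against a literal list instead of a node property. The list contents are an assumption chosen for illustration.

```cypher
// Hypothetical list standing in for a list-valued property such as n[key]
WITH ['syn', 'synonym'] AS values
// reduce appends each item plus a trailing space; trim drops the final space
RETURN trim(reduce(mergedString = '', item IN values | mergedString + item + ' ')) AS merged
// merged = 'syn synonym'
```

The merged string is what gets compared against "syn", so a list property is scored as one space-separated string rather than item by item.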

steves
Node Clone

@glilienfield Thanks a lot for your solution. Please check out mine. What do you think, and what drawbacks do you see?

MATCH (n)-[r]->(k)
WITH *, [x in keys(n) | [x, round(apoc.text.jaroWinklerDistance(TRIM(REDUCE(mergedString = '', item in n[x] | mergedString + item + ' ')), 'syn'), 4)]] as scores
WHERE ANY(s in scores WHERE s[1] >= 0.8)
RETURN n, r, k, [s in scores WHERE s[1] >= 0.8] as n_attr_scores

steves
Node Clone

How can I avoid checking the >= 0.8 condition twice, once in the WHERE clause and again when filtering the list in RETURN? @glilienfield

MATCH (n)-[r]->(k)
UNWIND keys(n) as key
WITH n, r, k, key, round(apoc.text.levenshteinSimilarity(TRIM(REDUCE(mergedString = "", item in n[key] | mergedString + item + " ")), "syn"), 4) as score
WITH n, r, k, collect({key:key, value:n[key], score:score}) as keyScores
WHERE ANY(s in keyScores WHERE s.score >= 0.8)
RETURN n, r, k, [s in keyScores WHERE s.score >= 0.8 | s] AS attr_scores

Your alternative solution using a list of pairs is good. It is more compact than what I proposed with the unwind. You can remove the need to check the condition twice with the following refactor of your solution. Note that you could combine lines 2 and 3, but the result would be much harder to understand and would not provide any benefit.

MATCH (n)-[r]->(k)
WITH n, r, k, [x in keys(n) | [x, round(apoc.text.jaroWinklerDistance(TRIM(REDUCE(mergedString = '', item in n[x] | mergedString + item + ' ')), 'syn'), 4)]] as scores
WITH n, r, k, [s in scores WHERE s[1] >= 0.8] as n_attr_scores
WHERE size(n_attr_scores) > 0
RETURN n, r, k, n_attr_scores
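The filter-once idea can be tried in isolation with a literal list of pairs. This is only a sketch: the [key, score] pairs below are invented stand-ins for the `scores` list and do not come from apoc.text.jaroWinklerDistance.

```cypher
// Hypothetical [key, score] pairs standing in for the computed `scores` list
WITH [['name', 0.91], ['alias', 0.42], ['title', 0.8]] AS scores
// Apply the threshold a single time and keep only the passing pairs
WITH [s IN scores WHERE s[1] >= 0.8] AS n_attr_scores
// A non-empty filtered list means at least one attribute matched
WHERE size(n_attr_scores) > 0
RETURN n_attr_scores
// n_attr_scores = [['name', 0.91], ['title', 0.8]]
```

Because the threshold is applied when building n_attr_scores, the size check replaces the earlier ANY(...) test and the condition is written only once.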