08-14-2020 05:55 AM
I am new to Cypher and work for a non-profit looking into financial crime. Most of our graph contains persons and entities, and I want to check for duplicates. I tried the following simple query, but it returned everything:
MATCH (a), (b)
WHERE a.name = b.name
How do I match nodes with exactly the same name property? How do I match nodes where one name is contained in the other? For example, one node is Mike Green and the other is Mike Green Smith (Mike Green is contained completely in Mike Green Smith).
I really appreciate any advice you have! Just getting started and learning my way around.
08-14-2020 12:32 PM
Hello @mkretsch and welcome to the Neo4j community!
We will need an index on name for a quick search. I suppose you are using Person and Entity as node labels:
CALL db.index.fulltext.createNodeIndex("node_name", ["Person", "Entity"], ["name"])
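Before running the larger query, it may help to check the index by looking up a single name. This is only a sketch: it assumes the "node_name" index above was created, and it uses "Mike Green" from the question as a sample value. The inner double quotes make it a phrase search, so a node named Mike Green Smith should come back as well.

```cypher
// Look up one sample name in the "node_name" full-text index.
// '"Mike Green"' is a phrase query: both words must appear next to each other.
CALL db.index.fulltext.queryNodes("node_name", '"Mike Green"') YIELD node, score
RETURN node.name AS name, score
ORDER BY score DESC
```

Without the inner quotes, the search matches nodes that contain either word, so the result list is usually much longer.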
This query uses a subquery to collect, for each name, the nodes that have the same name (ignoring case) or a similar name:
MATCH (a)
WITH a.name AS name
CALL {
WITH name
MATCH (b)
WHERE name =~ '(?i)' + b.name
WITH name, collect(b) AS nodes
CALL db.index.fulltext.queryNodes("node_name", name) YIELD node
WITH nodes, collect(node) AS similar
RETURN nodes + similar AS nodes
}
RETURN DISTINCT name, nodes
Regards,
Cobra
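As for why the query in the question returned everything: MATCH (a), (b) pairs every node with every other node, including itself, so each node matches its own name. A minimal sketch of the two cases from the question, without the full-text index, could look like the following. It assumes every relevant node has a name property, and the column aliases are only illustrative. Run the two statements separately.

```cypher
// Exact duplicates: id(a) < id(b) drops self-matches and mirrored pairs.
MATCH (a), (b)
WHERE a.name = b.name AND id(a) < id(b)
RETURN a.name AS name, a, b;

// Containment: a's name appears inside b's name,
// e.g. "Mike Green" inside "Mike Green Smith".
MATCH (a), (b)
WHERE a <> b AND b.name CONTAINS a.name
RETURN a.name AS shorter, b.name AS longer;
```

Both statements compare all pairs of nodes, so they can be slow on a large graph. Adding labels to the patterns, for example (a:Person), (b:Person), keeps the comparison smaller.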
08-17-2020 06:25 AM
Thank you Cobra for your help. Unfortunately I am getting an error when I run this in Neo4j. It appears unhappy with the curly brackets and how the "name" was defined. Any thoughts on how to avoid these errors?
08-17-2020 06:27 AM
Can you show the error?
08-17-2020 06:39 AM
Invalid input '{': expected whitespace, comment, namespace of a procedure or a procedure name (line 3, column 6 (offset: 99))
"CALL {"
^
08-17-2020 06:50 AM
MATCH (a)
CALL {
WITH a
MATCH (b)
WHERE a.name =~ '(?i)' + b.name
WITH a, collect(b) AS nodes
CALL db.index.fulltext.queryNodes("node_name", a.name) YIELD node
WITH a, nodes, collect(node) AS similar
RETURN a.name AS name, nodes + similar AS nodes
}
RETURN DISTINCT name, nodes
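One thing to watch in the query above: the '(?i)' + b.name test treats b.name as a regular expression, so a name containing characters such as ( or . may not match the way you expect. A rough alternative for the case-insensitive comparison alone, assuming plain string equality is what you want, is toLower():

```cypher
// Case-insensitive exact match without a regular expression.
// id(a) < id(b) keeps each pair once and skips a node matching itself.
MATCH (a), (b)
WHERE id(a) < id(b) AND toLower(a.name) = toLower(b.name)
RETURN a.name AS name, collect(b.name) AS sameNameIgnoringCase
```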
08-17-2020 07:23 AM
Still getting an error, any suggestions?
Invalid input '{': expected whitespace, comment, namespace of a procedure or a procedure name (line 2, column 6 (offset: 15))
"CALL {WITH a"
^
08-17-2020 07:24 AM
Which version of Neo4j are you using?
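If you are not sure, one way to check is to run this in Neo4j Browser:

```cypher
// Reports the product name, version number(s) and edition of the running server.
CALL dbms.components() YIELD name, versions, edition
RETURN name, versions, edition
```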
08-17-2020 09:05 AM
That's why it's not working: the CALL { ... } subquery syntax in this query needs Neo4j 4.1 or later.
Can you upgrade or do you want another query for your current version?
08-17-2020 09:18 AM
This query does not use the CALL { ... } subquery syntax, so it also works on Neo4j versions before 4.1, but it requires APOC:
MATCH (a)
WITH a.name AS name
CALL apoc.cypher.run('
MATCH (b)
WHERE name =~ "(?i)" + b.name
WITH name, collect(b) AS nodes
CALL db.index.fulltext.queryNodes("node_name", name) YIELD node
WITH name, nodes, collect(node) AS similar
RETURN name, nodes + similar AS nodes
', {name:name})
YIELD value
RETURN DISTINCT value.name AS name, value.nodes AS nodes
Regards,
Cobra
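If the query fails with an unknown procedure error, APOC is probably not installed. A quick check, assuming the plugin was added in the usual way, is:

```cypher
// Returns the installed APOC version; fails with an "unknown function" error if APOC is missing.
RETURN apoc.version() AS apocVersion
```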
08-17-2020 11:29 AM
Thank you, that seems to work now! I really appreciate you reaching out to help.
08-17-2020 12:19 PM
No problem, I'm happy to hear it.