08-20-2021 06:05 AM
Hi, I am running the following query:
MATCH (fn1: full_name)
MATCH (fn2: full_name)
WHERE fn1.full_name <> fn2.full_name AND apoc.text.fuzzyMatch(fn1.full_name, fn2.full_name)
MERGE (fn1)-[:FUZZY_MATCH]-(fn2)
which is currently taking more than an hour to run. The graph contains approximately 54K full_name nodes, and the goal is to create a FUZZY_MATCH relationship between nodes with similar names.
Is there a way for me to optimize this process?
(Screenshot of query map for reference)
08-20-2021 06:49 AM
Hi, have you set indexes already?
08-20-2021 07:01 AM
Index creation is itself taking a while, but I'm currently creating an index on the full_name nodes. I'll report back on performance once it finishes.
08-20-2021 07:03 AM
You could also look into creating a uniqueness constraint. I believe this can speed things up even more (at the cost of requiring the field's values to be unique).
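For reference, a minimal sketch of both options, assuming Neo4j 4.x syntax (newer versions write the constraint as FOR ... REQUIRE instead of ON ... ASSERT). The names full_name_index and full_name_unique are made up for illustration. A uniqueness constraint is backed by its own index, so you would normally create one or the other, not both, and the constraint will fail to create if duplicate full_name values already exist.

```cypher
// Option 1: a plain index on the full_name property of :full_name nodes.
// The index name is a hypothetical placeholder.
CREATE INDEX full_name_index FOR (n:full_name) ON (n.full_name);

// Option 2: a uniqueness constraint (Neo4j 4.x form).
// The constraint name is a hypothetical placeholder.
CREATE CONSTRAINT full_name_unique ON (n:full_name) ASSERT n.full_name IS UNIQUE;
```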
08-20-2021 07:08 AM
Is there anything I'm missing out on in terms of query tuning? I was hoping that might be an area to explore as well.
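One tuning that may be worth trying, as a sketch rather than a confirmed fix: the two MATCH clauses produce every ordered pair of nodes, so each pair is compared twice (once as a,b and once as b,a), even though the MERGE is undirected. Ordering the pair by internal node id compares each unordered pair only once. This assumes the id() function, which is available in Neo4j 4.x. The query is still quadratic (roughly 1.46 billion pairs for 54K nodes), and an index on full_name probably won't help much here because the query never looks nodes up by value.

```cypher
// Sketch only: visit each unordered pair once instead of twice.
MATCH (fn1:full_name)
MATCH (fn2:full_name)
WHERE id(fn1) < id(fn2)
  AND fn1.full_name <> fn2.full_name
  AND apoc.text.fuzzyMatch(fn1.full_name, fn2.full_name)
MERGE (fn1)-[:FUZZY_MATCH]-(fn2)
```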