

Duplicate Master Records issue

My current schema is shown below: each "Customer_ERP_Auto" node (in brown) is connected to other related "Customer_ERP_Auto" nodes via multiple relationships.

sethus_0-1674192253832.png

My objective is to create a new "Customer_Master" node for every group (cluster) of nodes connected by the "FUZZY_SIMILARITY" relationship. The newly created "Customer_Master" node must be linked to all the nodes in the group through the "HAS_MASTER" relationship, as shown below.

sethus_1-1674192393464.png

I wrote the following Cypher query for this purpose.

CALL apoc.periodic.iterate(
  "MATCH (a:Customer_ERP_Auto)-[r:FUZZY_SIMILARITY]->(b:Customer_ERP_Auto)
   WHERE NOT exists((a)-[:HAS_MASTER]->())
   RETURN a, b",
  "WITH a, collect(b.customer_id) AS customer_ids
   WITH a, [cust_id IN customer_ids WHERE NOT exists((:Customer_ERP_Auto {customer_id: cust_id})-[:HAS_MASTER]->())] AS f_customer_ids
   UNWIND f_customer_ids AS f_cust_id
   MERGE (m:Customer_Master {master_id: a.customer_id})
   MERGE (c:Customer_ERP_Auto {customer_id: f_cust_id})
   MERGE (c)-[:HAS_MASTER]->(m)",
  {parallel: false});

But the above query ends up creating duplicate Customer_Master records as shown below.

sethus_2-1674192426710.png

I need your help fixing the above query so that there is only ONE "Customer_Master" for every Customer_ERP_Auto node in the DB.

Looking forward to your valuable inputs.

Thanks in advance.

2 REPLIES

The first line of your first query returns multiple rows with different values of 'a'. Assuming there are no HAS_MASTER relationships yet, the 'data driven' query of the iterate procedure will therefore produce many rows covering many different 'a' nodes.

In the processing query, all the customer_ids values for each 'a' will pass through the list comprehension, because no HAS_MASTER relationship has been created yet. The result is one main node for each value of 'a', with a HAS_MASTER relationship to that main node from each customer_id node matched for that 'a'. A node that is similar to several 'a' nodes therefore ends up linked to several main nodes.

I am not sure I understand the requirement. Your conceptual figure suggests that you simply connect each node to a common 'main' node. How does the FUZZY_SIMILARITY relationship influence what gets connected to a main node?
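A quick way to confirm this behaviour is to list the nodes that ended up with more than one main node. This is only a diagnostic sketch; it assumes the labels, relationship type, and the customer_id property used in the query above, and nothing else about the data.

```cypher
// List every Customer_ERP_Auto node linked to more than one Customer_Master.
MATCH (c:Customer_ERP_Auto)-[:HAS_MASTER]->(m:Customer_Master)
WITH c, count(DISTINCT m) AS master_count
WHERE master_count > 1
RETURN c.customer_id AS customer_id, master_count
ORDER BY master_count DESC;
```

If this returns no rows, every linked node has exactly one main node.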

Hi @glilienfield,
Thanks for your quick response. Conceptually, I want each Customer_ERP_Auto node to be connected to only one common "main" node. I use the "FUZZY_SIMILARITY" relationship to identify the nodes related to a given node 'a'. Node 'a' and all the nodes associated with it through "FUZZY_SIMILARITY" should then be connected to the same "Customer_Master" node using the "HAS_MASTER" relationship.

I would appreciate it if you could help me correct the above Cypher query to fulfil this requirement.
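One possible way to meet this requirement, offered as a sketch rather than a confirmed answer from the thread: treat each cluster as a connected component over FUZZY_SIMILARITY (ignoring direction) and derive the master key from the whole component instead of from a single node 'a'. The sketch assumes that APOC is installed, that customer_id is unique and comparable, and that using the smallest customer_id in the cluster as master_id is acceptable; that last choice is an assumption made here, not something stated in the question.

```cypher
// For every customer node, collect its whole FUZZY_SIMILARITY cluster.
// No direction marker in relationshipFilter means both directions are followed,
// and the start node itself is part of the result.
MATCH (a:Customer_ERP_Auto)
CALL apoc.path.subgraphNodes(a, {
  relationshipFilter: 'FUZZY_SIMILARITY',
  labelFilter: '+Customer_ERP_Auto'
}) YIELD node
// Every member of a cluster computes the same minimum, so they all share one key.
WITH a, min(node.customer_id) AS master_id
MERGE (m:Customer_Master {master_id: master_id})
MERGE (a)-[:HAS_MASTER]->(m);
```

Because every node in a cluster resolves to the same master_id, MERGE creates a single Customer_Master per cluster, and rerunning the query does not add duplicates. Nodes with no FUZZY_SIMILARITY relationship get a master of their own; adding WHERE (a)-[:FUZZY_SIMILARITY]-() after the MATCH would skip them. On large graphs, a weakly connected components algorithm may be a better fit than one traversal per node.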

 
