Reload data from elastic using scrollid?

Hi @michael.hunger @Thomas_Silkjaer @alicia.frame1
I load data from Elasticsearch using a scroll ID:

CREATE CONSTRAINT ON (cus:Customer) ASSERT cus.CIN IS UNIQUE;
CREATE CONSTRAINT ON (mer:Merchant) ASSERT mer.MerchantDetailsLocation IS UNIQUE;

CALL apoc.es.query("http://localhost:9200",'index','doc','size=100&scroll=5m',null) yield value with value._scroll_id as scrollId, value.hits.hits as hits
UNWIND hits as hit
UNWIND hit._source AS trans
MERGE (C1:Customer {CIN: trans.CIN})
MERGE (M1:Merchant {MerchantName: trans.MerchantDetailsLocation})
MERGE (C1)-[:Transfered {CardNo: trans.CardNo, TxnAmount: trans.TxnAmount, UniqueId: trans.UniqueId} ]->(M1)

WITH range(101,10000,100) as list, scrollId
UNWIND list as count
CALL apoc.es.get("http://localhost:9200","_search","scroll",null,{scroll:"5m",scroll_id:scrollId},null) yield value with value._scroll_id as scrollId, value.hits.hits as nextHits
UNWIND nextHits as hit
UNWIND hit._source AS trans
MERGE (C1:Customer {CIN: trans.CIN})
MERGE (M1:Merchant {MerchantName: trans.MerchantDetailsLocation})
MERGE (C1)-[:Transfered {CardNo: trans.CardNo, TxnAmount: trans.TxnAmount, UniqueId: trans.UniqueId} ]->(M1)
return C1, M1

I have set unique constraints on two node labels, Customer and Merchant. A Customer is linked to a Merchant by a Transfered relationship.

In Elasticsearch I have 1 million records.

Initially I loaded 10,000 records using the code above, 100 at a time. If I run the same code again to fetch another 10,000, it duplicates the relationships for the first 10,000 and then goes on to the next 10,000.

So I have two questions:

  1. If possible, how can I save the scroll ID for later use, so that I can run only the second segment to fetch the next 10,000 records?
  2. To avoid duplicating the relationship, how can I set a constraint on the relationship based on one of its properties? The relationship has properties, and UniqueId is unique.
    I tried CREATE CONSTRAINT ON ()-[r:Transfered]-() ASSERT r.UniqueId IS UNIQUE; but it does not work.

Thanks in advance.

2 REPLIES

The first thing I notice is that you have a UNIQUE constraint on the :Merchant property MerchantDetailsLocation, but you are writing the property MerchantName. So every time you run the query, the Merchant will be duplicated, and with it a new relationship is created.
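As a minimal sketch of lining the two up, the MERGE can use the same property the constraint covers. This assumes MerchantDetailsLocation is the property that identifies a merchant; if MerchantName is the intended key, the constraint would be changed instead. The host, index and type values are copied from the question.

```cypher
// Sketch only: MERGE the Merchant on the property covered by the unique constraint,
// so a second run matches the existing node instead of creating a new one.
CALL apoc.es.query("http://localhost:9200", 'index', 'doc', 'size=100&scroll=5m', null) YIELD value
UNWIND value.hits.hits AS hit
WITH hit._source AS trans
MERGE (c:Customer {CIN: trans.CIN})
MERGE (m:Merchant {MerchantDetailsLocation: trans.MerchantDetailsLocation})
RETURN c, m
```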

So fixing the constraint will also fix your duplication issue.
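Separately from the constraint, the relationship MERGE can be made less fragile by matching only on the identifying property and setting the others on creation. The following is a sketch under assumptions: `$rows` is a hypothetical parameter holding a list of maps shaped like `hit._source`, and UniqueId is taken to identify a transfer, as the question states.

```cypher
// $rows is a hypothetical parameter: a list of maps with the fields used in the question.
UNWIND $rows AS trans
MERGE (c:Customer {CIN: trans.CIN})
MERGE (m:Merchant {MerchantDetailsLocation: trans.MerchantDetailsLocation})
// Match the relationship on UniqueId only; the other properties are set
// just once, when the relationship is first created.
MERGE (c)-[t:Transfered {UniqueId: trans.UniqueId}]->(m)
ON CREATE SET t.CardNo = trans.CardNo, t.TxnAmount = trans.TxnAmount
```

With this pattern, a repeated run finds the existing relationship between the same two nodes by UniqueId even if CardNo or TxnAmount differ in the incoming data, rather than creating a second one.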

Yep, thanks, that's actually a mistake.
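The first question, about keeping the scroll ID for a later run, is not covered in the replies. One possible approach, sketched here under assumptions, is to store the ID on a bookkeeping node and read it back in the next run. The :ScrollState label and its name key are hypothetical and not part of the original model. Note that the Elasticsearch scroll context expires after the scroll timeout (5m in the question), so a saved ID is only usable within that window unless a longer timeout is requested.

```cypher
// Run 1: fetch the first page and remember the scroll ID.
// :ScrollState {name: 'transactions'} is a hypothetical bookkeeping node.
CALL apoc.es.query("http://localhost:9200", 'index', 'doc', 'size=100&scroll=5m', null) YIELD value
MERGE (s:ScrollState {name: 'transactions'})
SET s.scrollId = value._scroll_id;

// Later run: read the saved ID, fetch the next page, and store the ID returned with it.
MATCH (s:ScrollState {name: 'transactions'})
CALL apoc.es.get("http://localhost:9200", "_search", "scroll", null, {scroll: "5m", scroll_id: s.scrollId}, null) YIELD value
SET s.scrollId = value._scroll_id
WITH value.hits.hits AS hits
UNWIND hits AS hit
RETURN hit._source AS trans;
```

The second statement only returns the rows; the MERGE clauses from the question would replace the final RETURN to load them.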
