

How to compare 2 different versions (basically JSONs) from neo4j

I am designing an application in which data is stored in Neo4j as nodes and relationships. Depending on other activities, a new set of data can arrive after 30 minutes.

In the new data, some nodes or relationships may have been added or removed, and properties may have been updated.

I am in the design phase of the database, so I would like to know how I can achieve this effectively.

1 REPLY

jo_nathan
Node Clone

Let's revive this.

Hey Kishan, it must have been a great disappointment that your first (and only) post got no reply.
Anyway, I hope you'll get a mail about my reply.
I am asking this question too.
Did you solve it?

I see the following approaches:

Drop database/detach delete all.

Import everything from scratch. This is only suitable for small amounts of data, and it is terrible on a single instance because of the downtime.
It might be useful in a cluster, container, or Kubernetes scenario where instances can go down without users noticing. Still, not a good idea.

Use the MERGE keyword

Create constraints first! Then use MERGE and simply load your data again. For properties: overwrite every property, taking the same actions on create and on match.

// Something like this: upsert one node per JSON row.
CALL apoc.load.json("://file")
YIELD value AS row // if the rows are nested in an array, UNWIND that array instead
MERGE (node:label {id: row.id})
ON MATCH SET
  node.name = row.name
ON CREATE SET
  node.name = row.name

You'll also have to deal with relationships. One option: create the new relationships under a temporary type name that differs from the existing one, delete the old relationships, then rename the temporary type back to the old name.
I am not sure about relationships, though.
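As an alternative to the rename step (not something I have run in production), you could compare relationships as plain sets. Below is a minimal Python sketch. It assumes every relationship can be written as a (start id, type, end id) triple and that nodes carry an `id` property. The function and template names are made up for illustration. The returned (query, parameters) pairs are meant to be handed to whatever Neo4j driver you use.

```python
from collections import defaultdict

# Assumption: nodes are identified by an `id` property, and a relationship
# is fully described by a (start_id, rel_type, end_id) triple.
CREATE_TEMPLATE = (
    "UNWIND $rows AS row "
    "MATCH (a {{id: row.start}}), (b {{id: row.end}}) "
    "MERGE (a)-[:`{rel_type}`]->(b)"
)
DELETE_TEMPLATE = (
    "UNWIND $rows AS row "
    "MATCH (a {{id: row.start}})-[r:`{rel_type}`]->(b {{id: row.end}}) "
    "DELETE r"
)


def diff_relationships(old_edges, new_edges):
    """Return (to_create, to_delete) as sorted lists of triples."""
    old_set, new_set = set(old_edges), set(new_edges)
    return sorted(new_set - old_set), sorted(old_set - new_set)


def build_statements(edges, template):
    """Build one (query, params) pair per relationship type.

    Relationship types cannot be Cypher parameters, so the type is
    interpolated into the query text. Only use trusted type names here.
    """
    rows_by_type = defaultdict(list)
    for start, rel_type, end in edges:
        rows_by_type[rel_type].append({"start": start, "end": end})
    return [
        (template.format(rel_type=rel_type), {"rows": rows})
        for rel_type, rows in sorted(rows_by_type.items())
    ]
```

With the old and new triples at hand, `diff_relationships(old, new)` gives the two lists, and `build_statements(to_delete, DELETE_TEMPLATE)` or `build_statements(to_create, CREATE_TEMPLATE)` turns them into queries. Relationship properties are not handled in this sketch.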

Use diff

You'll need another programming language or user defined procedures.
Save the .json after loading data.
The next time, before you load data, make a diff between new and old .json.
Write the corresponding Cypher queries for the added, removed, and changed items.
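The steps above could look roughly like this in Python. This is only a sketch under assumptions: each snapshot is a JSON array of flat objects with a unique `id`, the nodes use the `label` label from the MERGE example, and the helper names are invented here. It builds (query, parameters) pairs and leaves running them to your driver of choice.

```python
import json

# Assumption: nodes are labelled `label` and keyed by a unique `id`.
UPSERT = "UNWIND $rows AS row MERGE (n:label {id: row.id}) SET n += row"
DELETE = "UNWIND $rows AS row MATCH (n:label {id: row.id}) DETACH DELETE n"


def load_snapshot(path):
    """Read a saved snapshot: a JSON array of flat objects with an `id`."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def diff_snapshots(old_records, new_records):
    """Return (added, removed, changed) record lists, ordered by id."""
    old = {rec["id"]: rec for rec in old_records}
    new = {rec["id"]: rec for rec in new_records}
    added = [new[k] for k in sorted(new.keys() - old.keys())]
    removed = [old[k] for k in sorted(old.keys() - new.keys())]
    changed = [new[k] for k in sorted(new.keys() & old.keys()) if new[k] != old[k]]
    return added, removed, changed


def to_statements(added, removed, changed):
    """Turn the diff into parameterized Cypher statements."""
    statements = []
    if added or changed:
        # `SET n += row` keeps properties that are missing from the new row;
        # use `SET n = row` instead if stale properties should disappear.
        statements.append((UPSERT, {"rows": added + changed}))
    if removed:
        statements.append((DELETE, {"rows": [{"id": rec["id"]} for rec in removed]}))
    return statements
```

Only the rows that actually differ reach the database, which is the main point of diffing first instead of merging everything again.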

For example, you could use jq or jd, or csv-diff for Python.

Kafka

Kafka can capture changes continuously. See:
Kafka Connector

Call to action

If anyone knows a better way, feel free to improve my answer!