How to set up a version history for the changing data

Shashika
Node Link

I have a data source that is updated every two weeks, and I want a version history that keeps track of the changes in each update. How can I do that in Neo4j?

 

13 REPLIES

ameyasoft
Graph Maven

There are many ways to do that, and the right one depends on your data model. I need more information before I can give an answer.

Thanks a lot for your reply.

I have data coming from different sources such as Lightcast (EMSI), solaris, and katie, and I have connected all of this data in one graph. Every two weeks the Lightcast (EMSI) data is updated. Lightcast has an API that gives us the version changes: https://api.lightcast.io/apis/skills#versions-version-changes . I want to know how to integrate this API into the Neo4j dashboard so that I can check the version. What type of information do you need about my data model?

Thanks for the information. When you imported the data from Lightcast, you will have created some nodes and relationships in the Neo4j database, and those nodes may have properties. You can add a property named 'version' and store the version number in it, for example version: "7.0".
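As a minimal sketch of the property approach, assuming a hypothetical label Skill for the imported nodes and "7.0" as the version that was just loaded (both are placeholders to adapt to your own model):

```cypher
// Hypothetical label `Skill`; replace it with the label your import creates.
// Stamp every node that has no version yet with the version just imported.
MATCH (n:Skill)
WHERE n.version IS NULL
SET n.version = "7.0"
RETURN count(n) AS stamped
```

Restricting the update to nodes without a version keeps earlier stamps intact when the query is rerun after a later import.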

Q: Are you keeping the data from all versions, or updating the nodes in place with the latest version's data?

If you are updating the nodes in place, then adding a property to store the version number is good enough.

If you are keeping the data from all versions, then you are creating new nodes with every import. If possible, run CALL db.schema.visualization() and post a screenshot.

Thanks for your reply.
Yes, the plan is to have a version history so that we can always see how the data has changed over time.
How do I add the property to store the version number?

Shashika_0-1672746612859.png

 

ameyasoft
Graph Maven

Thanks for sharing the schema diagram. I am guessing that the 'Product' node is a kind of root node. If you want to keep the data for all versions, my recommendation is to add a 'Version' node, just as we would add a date node for date-wise storage: (Version {number: "7.0"})-[:PRODUCT]->(Product)-[:HAS_RELATED]-(b).... See if this fits your model.

Thanks for being patient with me. I am new to Neo4j and it is taking me some time to understand the model. I don't think I have understood it yet. Can you suggest a starting point?
I tried the query you sent.

 

Shashika_0-1672828779001.png

 

ameyasoft
Graph Maven

(Version {number: "7.0"})-[:PRODUCT]->(Product)-[:HAS_RELATED]-(b) is not a Cypher query. I meant it to show the sequence of nodes, with the 'Version' node (if implemented) as the root node. Please run the query below and post a screenshot of the resulting graph.
Query: MATCH (n) RETURN n LIMIT 20

Thanks a lot for being so patient and for helping me.
Please find the screenshot below.

Shashika_0-1672907350278.png

 

Looking at the screenshot, node 'BC' (green) is the starting node. When a new feed comes in, are you using 'MERGE' or 'CREATE'? For example: MERGE (a:BC {prop1: "V1"}) or CREATE (a:BC {prop1: "V1"}). Let me know. Thanks.

Try this. The property values I added are for demonstration purposes only.

Say version number =  7.0

MERGE (a:Version {version: "7.0"})
MERGE (b:BC {prop1: "P1"})
MERGE (c:BA {prop1: "A1"})
MERGE (d:LoB {prop1: "L1"})

MERGE (a)-[:HAS_BROADER]->(b)
MERGE (b)-[:HAS_BROADER]->(c)
MERGE (c)-[:HAS_BROADER]->(d)
RETURN a, b, c, d

For version number = 7.1
MERGE (a:Version {version: "7.1"})
MERGE (b:BC {prop1: "P2"})
MERGE (c:BA {prop1: "A2"})
MERGE (d:LoB {prop1: "L2"})

MERGE (a)-[:HAS_BROADER]->(b)
MERGE (b)-[:HAS_BROADER]->(c)
MERGE (c)-[:HAS_BROADER]->(d)

Result:
Screen Shot 2023-01-05 at 3.28.59 PM.png
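To read a stored version back, one option is to start from its Version node and follow the HAS_BROADER chain. The sketch below assumes only the demonstration data from the two MERGE examples; the second query lists nodes reachable from 7.1 that are not reachable from 7.0, which is one simple way to see what a new version added.

```cypher
// Every node reachable from version 7.0 through HAS_BROADER.
MATCH (v:Version {version: "7.0"})-[:HAS_BROADER*]->(n)
RETURN v.version AS version, labels(n) AS labels, n.prop1 AS prop1;

// Nodes that belong to 7.1 but not to 7.0.
MATCH (:Version {version: "7.1"})-[:HAS_BROADER*]->(n)
WHERE NOT (:Version {version: "7.0"})-[:HAS_BROADER*]->(n)
RETURN labels(n) AS labels, n.prop1 AS prop1;
```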

Just to mention: although MERGE avoids duplicate nodes and relationships, it can take several times longer than CREATE when your data is large. If your data is large, try to avoid MERGE, or change your data schema so that duplicates cannot occur.
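A large part of that cost is often the lookup MERGE performs to check whether the node already exists, so a uniqueness constraint (which is backed by an index) on the merged property usually helps. A sketch, assuming Neo4j 4.4 or later syntax and one Version node per version string; the constraint name is made up for illustration:

```cypher
// Constraint name `version_unique` is illustrative.
// With it in place, MERGE (a:Version {version: ...}) can use an index lookup
// instead of scanning all Version nodes.
CREATE CONSTRAINT version_unique IF NOT EXISTS
FOR (v:Version) REQUIRE v.version IS UNIQUE
```

The same idea would apply to the other properties used as MERGE keys, such as prop1 on BC, if those values are unique in your data.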

Yes, I got this result. How do I call an API?
Thanks a lot for helping.
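On calling an API from Cypher: one common route is the APOC library's apoc.load.jsonParams procedure, which fetches JSON over HTTP and lets you send request headers. The sketch below assumes APOC is installed, that the API expects a bearer token, and that $url and $token are query parameters holding the version-changes endpoint and an access token taken from the Lightcast documentation; neither value is given in this thread.

```cypher
// $url and $token are placeholders supplied as query parameters.
CALL apoc.load.jsonParams(
  $url,
  {Authorization: 'Bearer ' + $token},  // HTTP request headers
  null                                   // null payload: no request body
) YIELD value
RETURN value
```

Returning the raw value first makes it easier to inspect the response shape before writing MERGE statements against it.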

Shashika_0-1672999451482.png

 

I just checked: MERGE is used.
Example:

LOAD CSV WITH HEADERS FROM 'file:///BCM_ProductReq_LM.csv' AS row
WITH row
MERGE (lob:LoB {name: row.LoB})
    ON CREATE
        SET lob.LOB_ID = row.`LoB ID`,
            lob.type_id = row.`LoB Type ID`
MERGE (ba:BA {name: row.`Business Area`})
    ON CREATE
        SET ba.BA_ID = row.`Business Area ID`
MERGE (bc:BC {name: row.`Business Capability`})
    ON CREATE
        SET bc.BC_ID = row.`Business Capability ID`
MERGE (sc:SC {name: row.`Solution Capability`})
    ON CREATE
        SET sc.SC_ID = row.`Solution Capability ID`,
            sc.industry = row.`Industry / Segment`,
            sc.product = row.`Solution Capability Required Product Name`,
            sc.PPMS_ID = row.`Solution Capability Required Product ID`
    ON MATCH
        SET sc.industry = row.`Industry / Segment`,
            sc.product = row.`Solution Capability Required Product Name`,
            sc.PPMS_ID = row.`Solution Capability Required Product ID`
MERGE (p:Product {name: row.`Solution Capability Required Product Name`})
    ON CREATE
        SET p.PPMS_ID = row.`Solution Capability Required Product ID`
MERGE (p)-[rp:HAS_RELATED]->(sc)
MERGE (sc)-[rsc:HAS_BROADER]->(bc)
MERGE (bc)-[rbc:HAS_BROADER]->(ba)
MERGE (ba)-[rba:HAS_BROADER]->(lob)
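To tie an import like this to a version, one possibility is to pass the version as a parameter and link each imported node to a Version node. This is a sketch rather than the poster's actual import: $version, the CONTAINS relationship type, and the first_seen and last_seen properties are all assumptions, and only the SC node is shown.

```cypher
// $version is a query parameter, e.g. "7.1" (assumed, not part of the original import).
LOAD CSV WITH HEADERS FROM 'file:///BCM_ProductReq_LM.csv' AS row
MERGE (v:Version {version: $version})
MERGE (sc:SC {name: row.`Solution Capability`})
    ON CREATE
        SET sc.first_seen = $version   // version in which the node first appeared
SET sc.last_seen = $version            // refreshed by every import that contains the node
MERGE (v)-[:CONTAINS]->(sc)            // hypothetical relationship type
```

Nodes whose last_seen is older than the latest Version were absent from the most recent feed, which gives a simple view of removals over time.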