

Ability to determine deleted nodes/relationships?

mbandor
Graph Voyager

I've been working on an APOC-based import script that reads a customer-built Excel spreadsheet and builds the nodes and relationships. So far that has gone well for the initial import and creation of the graph. Because the customer updates this spreadsheet monthly, I originally planned to use a modified version of the import script to update the graph. I should be able to check for the existence of nodes and relationships just fine (MERGE with ON CREATE, ON MATCH, etc.), but how should I handle a node or relationship that no longer exists in the update (a likely possibility)? It almost sounds like the better (easier) option is just to rebuild the graph from each monthly update from the customer. Otherwise this is basically a reverse-parsing situation (parse the graph and compare it with the spreadsheet).

Your thoughts?

5 REPLIES

dkm1006
Node Clone

If you don’t want to delete all nodes and relationships and then create them all again, you could add a property lastUpdated that is set during the update import. After the update import you could then delete only the nodes that were not updated within the last 24 hours or so.
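
A minimal sketch of what the stamping step might look like during the import. The Product label, the name key, and the $rows parameter (a list of maps parsed from the spreadsheet) are all assumptions for illustration; none of them come from the original script.

```cypher
// Hypothetical: $rows holds one map per spreadsheet row, each with a `name` key.
UNWIND $rows AS row
MERGE (p:Product {name: row.name})
// Only runs when the node did not exist before this import.
ON CREATE SET p.created = datetime()
// Runs for every row, created or matched, so unchanged rows still count as "seen".
SET p.lastUpdated = datetime()
```

The point of the unconditional SET is that the property records when a row was last present in the spreadsheet, not when its data last changed.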

I do have a Last_Updated property on the nodes. Where things get interesting is that information may not change for months or years, so I don't want to risk false positives in queries caused by old information still in the graph (e.g., a product no longer being tracked for obsolescence).

dkm1006
Node Clone

Maybe we had a misunderstanding. To make clear what I meant, let’s call the proposed property last_imported. As the last step of your import script, you would then run something like:

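// Remove every node that the most recent import run did not touch.
// CALL { ... } IN TRANSACTIONS needs an auto-commit transaction (prefix with :auto in Neo4j Browser).
MATCH (n) WHERE n.last_imported < datetime() - duration({hours:24})
CALL {
    WITH n
    DETACH DELETE n
} IN TRANSACTIONS OF 1000 ROWS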
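
Since the question also covers relationships: DETACH DELETE only removes the relationships of deleted nodes, so a relationship that disappeared from the spreadsheet while both of its end nodes survived would stay in the graph. A sketch of the same cleanup for relationships, assuming the import also sets last_imported on every relationship it creates or matches (that is an assumption, not something the thread states):

```cypher
// Assumes relationships are stamped with last_imported during the import, like nodes.
MATCH ()-[r]->()
WHERE r.last_imported < datetime() - duration({hours: 24})
CALL {
    WITH r
    DELETE r
} IN TRANSACTIONS OF 1000 ROWS
```

Note that in both queries, anything without a last_imported property is skipped, because a comparison against a missing property evaluates to null rather than true.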

Hmm, I hadn't considered this as the last step. You might be onto something. Thanks for the suggestion!

dkm1006
Node Clone

You’re welcome 🙂 Don’t forget to put an index on last_imported if you go along with this solution. That will speed up the deletion step considerably, I think.
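
A sketch of the index, assuming the imported nodes carry a label. Product and the index name product_last_imported are hypothetical, since the thread does not name any labels. Indexes are defined per label, so the cleanup query would also need to name that label for the planner to consider the index.

```cypher
// Hypothetical label and index name; substitute your own. Run as a separate statement.
CREATE INDEX product_last_imported IF NOT EXISTS
FOR (n:Product) ON (n.last_imported);

// Same cleanup as before, restricted to the indexed label.
MATCH (n:Product)
WHERE n.last_imported < datetime() - duration({hours: 24})
CALL {
    WITH n
    DETACH DELETE n
} IN TRANSACTIONS OF 1000 ROWS;
```

With several labels in the graph, this would mean one index and one cleanup query per label.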