cancel
Showing results for 
Search instead for 
Did you mean: 

Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.

Increasingly complex CYPHER: better methods to UPDATE graph data?

pdrangeid
Graph Voyager

I am using Neo4j (with APOC) to regularly update properties and relationships.
I have to create new nodes/relationships that are not existent, and also modified those that have changed.

I got feedback from a colleague that the CYPHER was complex, and hard to follow the intricacies of the business logic (if someone else had to ever modify or work with this code). I can't necessarily say I disagree, so I'm posting here to see if there are some better methods that folks recommend.

I have ROW data from the relational database (the transactional real-world source) and the analytical data/relationships created and stored witin neo4j.

It is very often logic like this:
IF row.propertyA = node.propertyB then create a relationship, if not, then remove the relationship
IF row.propertyC IS NULL then REMOVE node.propertyC
IF row.propertyC is DIFFERENT than node.propertyC SET the new value

Can anyone suggest a more succinct way to manage this than a growing list of in CASE WHEN statements to modify these nodes, properties, and relationships as the source data changes?

Here's a sample of a growingly complex modification CYPHER transaction

// Section Update any tickets MODIFIED (BUT NOT CREATED) after the last ticket ingestion
MATCH (cdl:Crmdataload {name:'CommitCRM'})
OPTIONAL MATCH (lt:Ticket)
WITH cdl,coalesce(max(lt.createdon),0) as lastnewticket
WITH cdl,apoc.date.format(coalesce(lastnewticket,0),'ms','yyyy-MM-dd HH:mm:ss.sss') AS lastnewticket,timestamp() AS howsoonisnow,apoc.date.format(coalesce(cdl.ticketrefresh,0),'ms','yyyy-MM-dd HH:mm:ss.sss', 'CST') AS lastticketrefresh
CALL apoc.load.driver("com.extendedsystems.jdbc.advantage.ADSDriver")
call apoc.load.jdbc('jdbc:extendedsystems:advantage://commitcrm:6262;catalog=//commitcrm.mydomain.com/CommitCRM/Db','SELECT t.RECID,t.TICKETNO,t.CARDID,t.CONTACTID,
t.WORKERID,t.PRIORITY,t.CATEGORY,t.OPENDATETIME,t.CLOSEDATETIME,t.STATUS,t.PRIORITY,t.KIND,t.CREATEDATE,t.CREATEUSER,t.UPDATEDATE,t.UPDATEUSER,t.SOURCE,t.ASSETRECID,t.USER1,t.USER2,
t.DUEDATETIME,t.BCRECID,
ASCII(TRIM(SUBSTRING(l.LABELS,1,1))) AS L1,ASCII(TRIM(SUBSTRING(l.LABELS,2,1))) AS L2,ASCII(TRIM(SUBSTRING(l.LABELS,3,1))) AS L3,
ASCII(TRIM(SUBSTRING(l.LABELS,4,1))) AS L4,ASCII(TRIM(SUBSTRING(l.LABELS,5,1))) AS L5,ASCII(TRIM(SUBSTRING(l.LABELS,5,1))) AS L6,ASCII(TRIM(SUBSTRING(l.LABELS,5,1))) AS L7,
ASCII(TRIM(SUBSTRING(l.LABELS,5,1))) AS L8,l.LABELSLENGTH FROM Tickets as t LEFT OUTER JOIN LabelLinks as l on l.RECID = t.RECID WHERE t.CREATEDATE < \''+lastnewticket+'\' AND UPDATEDATE > \''+lastticketrefresh+'\'') YIELD row
MATCH (t:Ticket {ticketnumber:row.TICKETNO})
WITH *,COLLECT ([row.L1,row.L2,row.L3,row.L4,row.L5,row.L6,row.L7,row.L8]) as LABEL
OPTIONAL MATCH (e:Crmemployee {acctrecid:row.WORKERID})
OPTIONAL MATCH (a1:Company {acctrecid:row.CARDID})
OPTIONAL MATCH (a:Company {acctrecid:coalesce(a1.acctrecid,'None Provided')})
OPTIONAL MATCH (c1:Contact {acctrecid:row.CONTACTID})
OPTIONAL MATCH (c:Contact {acctrecid:coalesce(c1.acctrecid,'None Provided')})
OPTIONAL MATCH (ca1:Crmasset {recid:row.ASSETRECID})
OPTIONAL MATCH (ca:Crmasset {recid:coalesce(ca1.recid,'None Provided')})
OPTIONAL MATCH (tc:Crmcontract {recid:row.BCRECID})
OPTIONAL MATCH (stat1:Ticketstatus {statcode:row.STATUS})
OPTIONAL MATCH (stat:Ticketstatus {statcode:coalesce(stat1.statcode,'None Provided')})
OPTIONAL MATCH (cat1:Ticketcategory {name:row.CATEGORY})
OPTIONAL MATCH (cat:Ticketcategory {name:coalesce(cat1.name,'None Provided')})
OPTIONAL MATCH (prio1:Ticketpriority {statcode:row.PRIORITY})
OPTIONAL MATCH (prio:Ticketpriority {statcode:coalesce(prio1.statcode,'None Provided')})
OPTIONAL MATCH (t)-[mbr:MANAGED_BY]-(oe:Crmemployee)
OPTIONAL MATCH (t)-[tor:TICKET_OF_CLIENT]-(oa:Company)
OPTIONAL MATCH (t)-[tfr:TICKET_FOR]-(oc:Contact)
OPTIONAL MATCH (t)-[tsr:TICKET_SOURCE]-(ots:Ticketsource)
OPTIONAL MATCH (t)-[tcr:TICKET_CATEGORY]-(otc:Ticketcategory)
OPTIONAL MATCH (t)-[ttr:TICKET_TYPE]-(ott:Tickettype)
OPTIONAL MATCH (t)-[tcr:TICKET_CONTRACT]-(otc:Crmcontract)
OPTIONAL MATCH (t)-[tstr:TICKET_STATUS]-(otst:Ticketstatus)
OPTIONAL MATCH (t)-[tpr:TICKET_PRIORITY]-(otp:Ticketpriority)
OPTIONAL MATCH (t)-[tar:TICKET_ASSET]-(ota:Crmasset)
OPTIONAL MATCH (t)-[tlr:TICKET_HAS_LABEL]->(otl:Ticketlabel)
OPTIONAL MATCH (L1:Ticketlabel {labelcode:row.L1})
OPTIONAL MATCH (L2:Ticketlabel {labelcode:row.L2})
OPTIONAL MATCH (L3:Ticketlabel {labelcode:row.L3})
OPTIONAL MATCH (L4:Ticketlabel {labelcode:row.L4})
OPTIONAL MATCH (L5:Ticketlabel {labelcode:row.L5})
OPTIONAL MATCH (L6:Ticketlabel {labelcode:row.L6})
OPTIONAL MATCH (L7:Ticketlabel {labelcode:row.L7})
OPTIONAL MATCH (L8:Ticketlabel {labelcode:row.L8})
MERGE (src:Ticketsource {name:coalesce(row.SOURCE,'None Provided')})
MERGE (tcat:Ticketcategory {name:coalesce(row.CATEGORY,'None Provided')})
MERGE (tt:Tickettype {name:coalesce(row.KIND,'None Provided')})
WITH *
// Next, we remove relationships of the (:Ticket) that are no longer valid
FOREACH (ignoreMe in CASE WHEN not(exists(oe.acctrecid)) or (row.WORKERID <> oe.acctrecid) THEN [1] ELSE [] END | DELETE mbr)
FOREACH (ignoreMe in CASE WHEN not(exists(oa.acctrecid)) or (row.CARDID <> oa.acctrecid) THEN [1] ELSE [] END | DELETE tor)
FOREACH (ignoreMe in CASE WHEN not(exists(oc.acctrecid)) or (row.CONTACTID <> oc.acctrecid)  THEN [1] ELSE [] END | DELETE tfr)
FOREACH (ignoreMe in CASE WHEN not(exists(ots.name)) or (row.SOURCE <> ots.name)  THEN [1] ELSE [] END | DELETE tsr)
FOREACH (ignoreMe in CASE WHEN not(exists(otc.name)) or (row.BCRECID <> otc.recid)  THEN [1] ELSE [] END | DELETE tcr)
FOREACH (ignoreMe in CASE WHEN not(exists(ott.name)) or (row.KIND <> ott.name)  THEN [1] ELSE [] END | DELETE ttr)
FOREACH (ignoreMe in CASE WHEN not(exists(otst.statcode)) or (stat.statcode <> otst.statcode)  THEN [1] ELSE [] END | DELETE tstr)
FOREACH (ignoreMe in CASE WHEN not(exists(otp.statcode)) or (row.STATUS <> otp.statcode)  THEN [1] ELSE [] END | DELETE tpr)
FOREACH (ignoreMe in CASE WHEN not(exists(otc.name)) or (row.CATEGORY <> otc.name)  THEN [1] ELSE [] END | DELETE tcr)
FOREACH (ignoreMe in CASE WHEN not(exists(ota.recid)) or (row.ASSETRECID <> ota.recid)  THEN [1] ELSE [] END | DELETE tar)
FOREACH (ignoreMe in CASE WHEN NOT (otl.labelcode in (LABEL)) THEN [1] ELSE [] END | DELETE tlr)
// Next, we modify properties on the (:Ticket) node that have changed
FOREACH (ignoreMe in CASE WHEN coalesce(t.manager,'isnull') <> coalesce(e.name,'isnull') THEN [1] ELSE [] END | SET t.manager=e.name)
//FOREACH (ignoreMe in CASE WHEN coalesce(t.recid,'isnull') <> coalesce(row.RECID,'isnull') THEN [1] ELSE [] END | SET t.recid=row.RECID)
FOREACH (ignoreMe in CASE WHEN coalesce(t.openedon,'isnull') <> coalesce(row.OPENDATETIME,'isnull') THEN [1] ELSE [] END | SET t.openedon=datetime(row.OPENDATETIME).epochmillis)
FOREACH (ignoreMe in CASE WHEN coalesce(t.closedon,'isnull') <> coalesce(row.CLOSEDATETIME,'isnull') THEN [1] ELSE [] END | SET t.closedon=datetime(row.CLOSEDATETIME).epochmillis)
FOREACH (ignoreMe in CASE WHEN coalesce(t.createdon,'isnull') <> coalesce(row.CREATEDATE,'isnull') THEN [1] ELSE [] END | SET t.createdon=datetime(row.CREATEDATE).epochmillis)
FOREACH (ignoreMe in CASE WHEN coalesce(t.updatedon,'isnull') <> coalesce(row.UPDATEDATE,'isnull') THEN [1] ELSE [] END | SET t.updatedon=datetime(row.UPDATEDATE).epochmillis)
FOREACH (ignoreMe in CASE WHEN coalesce(t.service,'isnull') <> coalesce(row.USER1,'isnull') THEN [1] ELSE [] END | SET t.service=row.USER1)
FOREACH (ignoreMe in CASE WHEN coalesce(t.host,'isnull') <> coalesce(row.USER2,'isnull') THEN [1] ELSE [] END | SET t.host=row.USER2)
FOREACH (ignoreMe in CASE WHEN coalesce(t.account,'isnull') <> coalesce(a.name,'isnull') THEN [1] ELSE [] END | SET t.account=a.name)
FOREACH (ignoreMe in CASE WHEN coalesce(t.contact,'isnull') <> coalesce(c.name,'isnull') THEN [1] ELSE [] END | SET t.contact=c.name)
FOREACH (ignoreMe in CASE WHEN coalesce(t.source,'isnull') <> coalesce(src.name,'isnull') THEN [1] ELSE [] END | SET t.source=src.name)
FOREACH (ignoreMe in CASE WHEN coalesce(t.type,'isnull') <> coalesce(tt.name,'isnull') THEN [1] ELSE [] END | SET t.type=tt.name)
FOREACH (ignoreMe in CASE WHEN coalesce(t.status,'isnull') <> coalesce(stat.name,'isnull') THEN [1] ELSE [] END | SET t.status=stat.name)
FOREACH (ignoreMe in CASE WHEN coalesce(t.priority,'isnull') <> coalesce(prio.name,'isnull') THEN [1] ELSE [] END | SET t.priority=prio.name)
// Finally, we create any missing relationships for the (:Ticket)
FOREACH (ignoreMe in CASE WHEN exists(e.acctrecid) THEN [1] ELSE [] END | MERGE (t)-[:MANAGED_BY]->(e))
FOREACH (ignoreMe in CASE WHEN exists(a.acctrecid) THEN [1] ELSE [] END | MERGE (t)-[:TICKET_OF_CLIENT]->(a))
FOREACH (ignoreMe in CASE WHEN exists(c.acctrecid) THEN [1] ELSE [] END | MERGE (t)-[:TICKET_FOR]->(c))
FOREACH (ignoreMe in CASE WHEN exists(src.name) THEN [1] ELSE [] END | MERGE (t)-[:TICKET_SOURCE]->(src))
FOREACH (ignoreMe in CASE WHEN exists(tt.name) THEN [1] ELSE [] END | MERGE (t)-[:TICKET_TYPE]->(tt))
FOREACH (ignoreMe in CASE WHEN exists(stat.statcode) THEN [1] ELSE [] END | MERGE (t)-[:TICKET_STATUS]->(stat))
FOREACH (ignoreMe in CASE WHEN exists(tcat.name) THEN [1] ELSE [] END | MERGE (t)-[:TICKET_CATEGORY]->(tcat))
FOREACH (ignoreMe in CASE WHEN exists(tc.recid) THEN [1] ELSE [] END | MERGE (t)-[:TICKET_CONTRACT]->(tc))
FOREACH (ignoreMe in CASE WHEN exists(prio.statcode) THEN [1] ELSE [] END | MERGE (t)-[:TICKET_PRIORITY]->(prio))
FOREACH (ignoreMe in CASE WHEN exists(ca.name) THEN [1] ELSE [] END | MERGE (t)-[:TICKET_ASSET]->(ca))
FOREACH (ignoreMe in CASE WHEN row.L1<>0 THEN [1] ELSE [] END | MERGE (t)-[:TICKET_HAS_LABEL]->(L1))
FOREACH (ignoreMe in CASE WHEN row.L2<>0 THEN [1] ELSE [] END | MERGE (t)-[:TICKET_HAS_LABEL]->(L2))
FOREACH (ignoreMe in CASE WHEN row.L3<>0 THEN [1] ELSE [] END | MERGE (t)-[:TICKET_HAS_LABEL]->(L3))
FOREACH (ignoreMe in CASE WHEN row.L4<>0 THEN [1] ELSE [] END | MERGE (t)-[:TICKET_HAS_LABEL]->(L4))
FOREACH (ignoreMe in CASE WHEN row.L5<>0 THEN [1] ELSE [] END | MERGE (t)-[:TICKET_HAS_LABEL]->(L5))
FOREACH (ignoreMe in CASE WHEN row.L6<>0 THEN [1] ELSE [] END | MERGE (t)-[:TICKET_HAS_LABEL]->(L6))
FOREACH (ignoreMe in CASE WHEN row.L71<>0 THEN [1] ELSE [] END | MERGE (t)-[:TICKET_HAS_LABEL]->(L7))
FOREACH (ignoreMe in CASE WHEN row.L8<>0 THEN [1] ELSE [] END | MERGE (t)-[:TICKET_HAS_LABEL]->(L8))
// SET the timestamp for when this refresh procedure last ran
SET cdl.ticketrefresh=howsoonisnow
RETURN count(t) AS `Updated (:Ticket) when data or relationships have changed`;
5 REPLIES 5

accounts
Node Clone

To me this sounds like an excellent example of when to write your own procedures.
( with all due respect to cypher 🙂 )

@pdrangeid clone this repo and you have a working plugin with a couple procedures that may or may not be useful for you.
that + the apoc repo and from there you'll be copy paste build test ( rinse repeat ) If you know any other lang it should be a breeze. feel free to ping for more specific stuff ( dm, slack, this community, stack overflow...... ) you'll be done in no time

pdrangeid
Graph Voyager

I am intrigued! But also have exactly 0 experience as a JAVA coder, so any advice on how to approach this?

I took a quick look here: https://neo4j.com/blog/intro-user-defined-procedures-apoc/

I should probably spend a bit more time reviewing the list of functions in the new release of APOC 4, as maybe there's already something there that would help me.

yyyguy
Graph Buddy

There are some approaches that I have taken when it comes to complex CYPHER scripts. One of the ones that I currently favour is using NiFi to coordinate the data flow including the many elements of the business logic. I chose this recently when NiFi started to support graph queries.

If you are open to using something like this, it is more simple than learning how to write your own custom procedures.

yyyguy

I have recently been learning/using kettle ... It's not always clear where this sort of business logic should live. (ETL, cypher, custom code or scripts)..

I'd like to start moving to some more well-thought best practices, just not always sure what they are!

Thanks for the suggestion.

Nodes 2022
Nodes
NODES 2022, Neo4j Online Education Summit

All the sessions of the conference are now available online