Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
01-28-2020 01:11 PM
I am using Neo4j (with APOC) to regularly update properties and relationships.
I have to create new nodes/relationships that are not existent, and also modified those that have changed.
I got feedback from a colleague that the CYPHER was complex, and hard to follow the intricacies of the business logic (if someone else had to ever modify or work with this code). I can't necessarily say I disagree, so I'm posting here to see if there are some better methods that folks recommend.
I have ROW data from the relational database (the transactional real-world source) and the analytical data/relationships created and stored witin neo4j.
It is very often logic like this:
IF row.propertyA = node.propertyB then create a relationship, if not, then remove the relationship
IF row.propertyC IS NULL then REMOVE node.propertyC
IF row.propertyC is DIFFERENT than node.propertyC SET the new value
Can anyone suggest a more succinct way to manage this than a growing list of in CASE WHEN statements to modify these nodes, properties, and relationships as the source data changes?
Here's a sample of a growingly complex modification CYPHER transaction
// Section Update any tickets MODIFIED (BUT NOT CREATED) after the last ticket ingestion
MATCH (cdl:Crmdataload {name:'CommitCRM'})
OPTIONAL MATCH (lt:Ticket)
WITH cdl,coalesce(max(lt.createdon),0) as lastnewticket
WITH cdl,apoc.date.format(coalesce(lastnewticket,0),'ms','yyyy-MM-dd HH:mm:ss.sss') AS lastnewticket,timestamp() AS howsoonisnow,apoc.date.format(coalesce(cdl.ticketrefresh,0),'ms','yyyy-MM-dd HH:mm:ss.sss', 'CST') AS lastticketrefresh
CALL apoc.load.driver("com.extendedsystems.jdbc.advantage.ADSDriver")
call apoc.load.jdbc('jdbc:extendedsystems:advantage://commitcrm:6262;catalog=//commitcrm.mydomain.com/CommitCRM/Db','SELECT t.RECID,t.TICKETNO,t.CARDID,t.CONTACTID,
t.WORKERID,t.PRIORITY,t.CATEGORY,t.OPENDATETIME,t.CLOSEDATETIME,t.STATUS,t.PRIORITY,t.KIND,t.CREATEDATE,t.CREATEUSER,t.UPDATEDATE,t.UPDATEUSER,t.SOURCE,t.ASSETRECID,t.USER1,t.USER2,
t.DUEDATETIME,t.BCRECID,
ASCII(TRIM(SUBSTRING(l.LABELS,1,1))) AS L1,ASCII(TRIM(SUBSTRING(l.LABELS,2,1))) AS L2,ASCII(TRIM(SUBSTRING(l.LABELS,3,1))) AS L3,
ASCII(TRIM(SUBSTRING(l.LABELS,4,1))) AS L4,ASCII(TRIM(SUBSTRING(l.LABELS,5,1))) AS L5,ASCII(TRIM(SUBSTRING(l.LABELS,5,1))) AS L6,ASCII(TRIM(SUBSTRING(l.LABELS,5,1))) AS L7,
ASCII(TRIM(SUBSTRING(l.LABELS,5,1))) AS L8,l.LABELSLENGTH FROM Tickets as t LEFT OUTER JOIN LabelLinks as l on l.RECID = t.RECID WHERE t.CREATEDATE < \''+lastnewticket+'\' AND UPDATEDATE > \''+lastticketrefresh+'\'') YIELD row
MATCH (t:Ticket {ticketnumber:row.TICKETNO})
WITH *,COLLECT ([row.L1,row.L2,row.L3,row.L4,row.L5,row.L6,row.L7,row.L8]) as LABEL
OPTIONAL MATCH (e:Crmemployee {acctrecid:row.WORKERID})
OPTIONAL MATCH (a1:Company {acctrecid:row.CARDID})
OPTIONAL MATCH (a:Company {acctrecid:coalesce(a1.acctrecid,'None Provided')})
OPTIONAL MATCH (c1:Contact {acctrecid:row.CONTACTID})
OPTIONAL MATCH (c:Contact {acctrecid:coalesce(c1.acctrecid,'None Provided')})
OPTIONAL MATCH (ca1:Crmasset {recid:row.ASSETRECID})
OPTIONAL MATCH (ca:Crmasset {recid:coalesce(ca1.recid,'None Provided')})
OPTIONAL MATCH (tc:Crmcontract {recid:row.BCRECID})
OPTIONAL MATCH (stat1:Ticketstatus {statcode:row.STATUS})
OPTIONAL MATCH (stat:Ticketstatus {statcode:coalesce(stat1.statcode,'None Provided')})
OPTIONAL MATCH (cat1:Ticketcategory {name:row.CATEGORY})
OPTIONAL MATCH (cat:Ticketcategory {name:coalesce(cat1.name,'None Provided')})
OPTIONAL MATCH (prio1:Ticketpriority {statcode:row.PRIORITY})
OPTIONAL MATCH (prio:Ticketpriority {statcode:coalesce(prio1.statcode,'None Provided')})
OPTIONAL MATCH (t)-[mbr:MANAGED_BY]-(oe:Crmemployee)
OPTIONAL MATCH (t)-[tor:TICKET_OF_CLIENT]-(oa:Company)
OPTIONAL MATCH (t)-[tfr:TICKET_FOR]-(oc:Contact)
OPTIONAL MATCH (t)-[tsr:TICKET_SOURCE]-(ots:Ticketsource)
OPTIONAL MATCH (t)-[tcr:TICKET_CATEGORY]-(otc:Ticketcategory)
OPTIONAL MATCH (t)-[ttr:TICKET_TYPE]-(ott:Tickettype)
OPTIONAL MATCH (t)-[tcr:TICKET_CONTRACT]-(otc:Crmcontract)
OPTIONAL MATCH (t)-[tstr:TICKET_STATUS]-(otst:Ticketstatus)
OPTIONAL MATCH (t)-[tpr:TICKET_PRIORITY]-(otp:Ticketpriority)
OPTIONAL MATCH (t)-[tar:TICKET_ASSET]-(ota:Crmasset)
OPTIONAL MATCH (t)-[tlr:TICKET_HAS_LABEL]->(otl:Ticketlabel)
OPTIONAL MATCH (L1:Ticketlabel {labelcode:row.L1})
OPTIONAL MATCH (L2:Ticketlabel {labelcode:row.L2})
OPTIONAL MATCH (L3:Ticketlabel {labelcode:row.L3})
OPTIONAL MATCH (L4:Ticketlabel {labelcode:row.L4})
OPTIONAL MATCH (L5:Ticketlabel {labelcode:row.L5})
OPTIONAL MATCH (L6:Ticketlabel {labelcode:row.L6})
OPTIONAL MATCH (L7:Ticketlabel {labelcode:row.L7})
OPTIONAL MATCH (L8:Ticketlabel {labelcode:row.L8})
MERGE (src:Ticketsource {name:coalesce(row.SOURCE,'None Provided')})
MERGE (tcat:Ticketcategory {name:coalesce(row.CATEGORY,'None Provided')})
MERGE (tt:Tickettype {name:coalesce(row.KIND,'None Provided')})
WITH *
// Next, we remove relationships of the (:Ticket) that are no longer valid
FOREACH (ignoreMe in CASE WHEN not(exists(oe.acctrecid)) or (row.WORKERID <> oe.acctrecid) THEN [1] ELSE [] END | DELETE mbr)
FOREACH (ignoreMe in CASE WHEN not(exists(oa.acctrecid)) or (row.CARDID <> oa.acctrecid) THEN [1] ELSE [] END | DELETE tor)
FOREACH (ignoreMe in CASE WHEN not(exists(oc.acctrecid)) or (row.CONTACTID <> oc.acctrecid) THEN [1] ELSE [] END | DELETE tfr)
FOREACH (ignoreMe in CASE WHEN not(exists(ots.name)) or (row.SOURCE <> ots.name) THEN [1] ELSE [] END | DELETE tsr)
FOREACH (ignoreMe in CASE WHEN not(exists(otc.name)) or (row.BCRECID <> otc.recid) THEN [1] ELSE [] END | DELETE tcr)
FOREACH (ignoreMe in CASE WHEN not(exists(ott.name)) or (row.KIND <> ott.name) THEN [1] ELSE [] END | DELETE ttr)
FOREACH (ignoreMe in CASE WHEN not(exists(otst.statcode)) or (stat.statcode <> otst.statcode) THEN [1] ELSE [] END | DELETE tstr)
FOREACH (ignoreMe in CASE WHEN not(exists(otp.statcode)) or (row.STATUS <> otp.statcode) THEN [1] ELSE [] END | DELETE tpr)
FOREACH (ignoreMe in CASE WHEN not(exists(otc.name)) or (row.CATEGORY <> otc.name) THEN [1] ELSE [] END | DELETE tcr)
FOREACH (ignoreMe in CASE WHEN not(exists(ota.recid)) or (row.ASSETRECID <> ota.recid) THEN [1] ELSE [] END | DELETE tar)
FOREACH (ignoreMe in CASE WHEN NOT (otl.labelcode in (LABEL)) THEN [1] ELSE [] END | DELETE tlr)
// Next, we modify properties on the (:Ticket) node that have changed
FOREACH (ignoreMe in CASE WHEN coalesce(t.manager,'isnull') <> coalesce(e.name,'isnull') THEN [1] ELSE [] END | SET t.manager=e.name)
//FOREACH (ignoreMe in CASE WHEN coalesce(t.recid,'isnull') <> coalesce(row.RECID,'isnull') THEN [1] ELSE [] END | SET t.recid=row.RECID)
FOREACH (ignoreMe in CASE WHEN coalesce(t.openedon,'isnull') <> coalesce(row.OPENDATETIME,'isnull') THEN [1] ELSE [] END | SET t.openedon=datetime(row.OPENDATETIME).epochmillis)
FOREACH (ignoreMe in CASE WHEN coalesce(t.closedon,'isnull') <> coalesce(row.CLOSEDATETIME,'isnull') THEN [1] ELSE [] END | SET t.closedon=datetime(row.CLOSEDATETIME).epochmillis)
FOREACH (ignoreMe in CASE WHEN coalesce(t.createdon,'isnull') <> coalesce(row.CREATEDATE,'isnull') THEN [1] ELSE [] END | SET t.createdon=datetime(row.CREATEDATE).epochmillis)
FOREACH (ignoreMe in CASE WHEN coalesce(t.updatedon,'isnull') <> coalesce(row.UPDATEDATE,'isnull') THEN [1] ELSE [] END | SET t.updatedon=datetime(row.UPDATEDATE).epochmillis)
FOREACH (ignoreMe in CASE WHEN coalesce(t.service,'isnull') <> coalesce(row.USER1,'isnull') THEN [1] ELSE [] END | SET t.service=row.USER1)
FOREACH (ignoreMe in CASE WHEN coalesce(t.host,'isnull') <> coalesce(row.USER2,'isnull') THEN [1] ELSE [] END | SET t.host=row.USER2)
FOREACH (ignoreMe in CASE WHEN coalesce(t.account,'isnull') <> coalesce(a.name,'isnull') THEN [1] ELSE [] END | SET t.account=a.name)
FOREACH (ignoreMe in CASE WHEN coalesce(t.contact,'isnull') <> coalesce(c.name,'isnull') THEN [1] ELSE [] END | SET t.contact=c.name)
FOREACH (ignoreMe in CASE WHEN coalesce(t.source,'isnull') <> coalesce(src.name,'isnull') THEN [1] ELSE [] END | SET t.source=src.name)
FOREACH (ignoreMe in CASE WHEN coalesce(t.type,'isnull') <> coalesce(tt.name,'isnull') THEN [1] ELSE [] END | SET t.type=tt.name)
FOREACH (ignoreMe in CASE WHEN coalesce(t.status,'isnull') <> coalesce(stat.name,'isnull') THEN [1] ELSE [] END | SET t.status=stat.name)
FOREACH (ignoreMe in CASE WHEN coalesce(t.priority,'isnull') <> coalesce(prio.name,'isnull') THEN [1] ELSE [] END | SET t.priority=prio.name)
// Finally, we create any missing relationships for the (:Ticket)
FOREACH (ignoreMe in CASE WHEN exists(e.acctrecid) THEN [1] ELSE [] END | MERGE (t)-[:MANAGED_BY]->(e))
FOREACH (ignoreMe in CASE WHEN exists(a.acctrecid) THEN [1] ELSE [] END | MERGE (t)-[:TICKET_OF_CLIENT]->(a))
FOREACH (ignoreMe in CASE WHEN exists(c.acctrecid) THEN [1] ELSE [] END | MERGE (t)-[:TICKET_FOR]->(c))
FOREACH (ignoreMe in CASE WHEN exists(src.name) THEN [1] ELSE [] END | MERGE (t)-[:TICKET_SOURCE]->(src))
FOREACH (ignoreMe in CASE WHEN exists(tt.name) THEN [1] ELSE [] END | MERGE (t)-[:TICKET_TYPE]->(tt))
FOREACH (ignoreMe in CASE WHEN exists(stat.statcode) THEN [1] ELSE [] END | MERGE (t)-[:TICKET_STATUS]->(stat))
FOREACH (ignoreMe in CASE WHEN exists(tcat.name) THEN [1] ELSE [] END | MERGE (t)-[:TICKET_CATEGORY]->(tcat))
FOREACH (ignoreMe in CASE WHEN exists(tc.recid) THEN [1] ELSE [] END | MERGE (t)-[:TICKET_CONTRACT]->(tc))
FOREACH (ignoreMe in CASE WHEN exists(prio.statcode) THEN [1] ELSE [] END | MERGE (t)-[:TICKET_PRIORITY]->(prio))
FOREACH (ignoreMe in CASE WHEN exists(ca.name) THEN [1] ELSE [] END | MERGE (t)-[:TICKET_ASSET]->(ca))
FOREACH (ignoreMe in CASE WHEN row.L1<>0 THEN [1] ELSE [] END | MERGE (t)-[:TICKET_HAS_LABEL]->(L1))
FOREACH (ignoreMe in CASE WHEN row.L2<>0 THEN [1] ELSE [] END | MERGE (t)-[:TICKET_HAS_LABEL]->(L2))
FOREACH (ignoreMe in CASE WHEN row.L3<>0 THEN [1] ELSE [] END | MERGE (t)-[:TICKET_HAS_LABEL]->(L3))
FOREACH (ignoreMe in CASE WHEN row.L4<>0 THEN [1] ELSE [] END | MERGE (t)-[:TICKET_HAS_LABEL]->(L4))
FOREACH (ignoreMe in CASE WHEN row.L5<>0 THEN [1] ELSE [] END | MERGE (t)-[:TICKET_HAS_LABEL]->(L5))
FOREACH (ignoreMe in CASE WHEN row.L6<>0 THEN [1] ELSE [] END | MERGE (t)-[:TICKET_HAS_LABEL]->(L6))
FOREACH (ignoreMe in CASE WHEN row.L71<>0 THEN [1] ELSE [] END | MERGE (t)-[:TICKET_HAS_LABEL]->(L7))
FOREACH (ignoreMe in CASE WHEN row.L8<>0 THEN [1] ELSE [] END | MERGE (t)-[:TICKET_HAS_LABEL]->(L8))
// SET the timestamp for when this refresh procedure last ran
SET cdl.ticketrefresh=howsoonisnow
RETURN count(t) AS `Updated (:Ticket) when data or relationships have changed`;
01-28-2020 11:32 PM
To me this sounds like an excellent example of when to write your own procedures.
( with all due respect to cypher 🙂 )
01-29-2020 03:22 PM
@pdrangeid clone this repo and you have a working plugin with a couple procedures that may or may not be useful for you.
that + the apoc repo and from there you'll be copy paste build test ( rinse repeat ) If you know any other lang it should be a breeze. feel free to ping for more specific stuff ( dm, slack, this community, stack overflow...... ) you'll be done in no time
01-29-2020 07:12 AM
I am intrigued! But also have exactly 0 experience as a JAVA coder, so any advice on how to approach this?
I took a quick look here: https://neo4j.com/blog/intro-user-defined-procedures-apoc/
I should probably spend a bit more time reviewing the list of functions in the new release of APOC 4, as maybe there's already something there that would help me.
01-29-2020 01:06 PM
There are some approaches that I have taken when it comes to complex CYPHER scripts. One of the ones that I currently favour is using NiFi to coordinate the data flow including the many elements of the business logic. I chose this recently when NiFi started to support graph queries.
If you are open to using something like this, it is more simple than learning how to write your own custom procedures.
yyyguy
01-29-2020 02:04 PM
I have recently been learning/using kettle ... It's not always clear where this sort of business logic should live. (ETL, cypher, custom code or scripts)..
I'd like to start moving to some more well-thought best practices, just not always sure what they are!
Thanks for the suggestion.
All the sessions of the conference are now available online