Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
11-11-2018 02:01 PM
I am importing four json files using a process similar to that described by Michael Hunger.
Three of the files load successfully but one is giving me trouble because I can't get the cypher logic right. I feel like I need to use a case statement but that doesn't seem to be useful for choosing between different merge strategies. The file describes dataflows in a computer system. The json is an array of objects where each object describes a flow of one of three types: a database connection, a restful api, and a async-queue producer/consumer. The particular file is available online:
http://context.asia.wiki.org/plugin/json/dataflow-diagram
My Cypher code tries to use CASE to choose which merges to apply.
unwind {json} as obj
case obj.type
when 'store' then
create (d:DB {store: obj.store})
merge (f:PROGRAM {name: obj.from})
merge (f)-[:STORE]->(d)
when ...
This fails with the message:
Invalid input 's': expected 'l/L' (line 2, column 3 (offset: 23))
"case obj.type"
^
Now having studied this grammar:
https://s3.amazonaws.com/artifacts.opencypher.org/M12/railroad/CaseExpression.html
I realize case was never intended to choose between merge queries. I'm thinking now I have to create incomplete nodes with only the type field and then run a series of match-merge sequences, one for each type in my json. Or is there a better way?
I should mention that I'm running these Cypher scripts as part of a Docker build.
11-11-2018 04:22 PM
I think I have acceptable logic now but it appears to be surprisingly inefficient running 100x slower than my other cypher for similarly sized json file. Here is my approach (be sure to scroll to see every line):
create (s:SOURCE {
name:'dataflow-diagram',
title:'Dataflow Diagram'})
with s
unwind {json} as obj
create (c:CASE)
set c=obj
with s
match (c:CASE {type:'store'})
merge (d:DB {name:c.store})
merge (g:PROGRAM {name:c.from})
merge (g)-[:QUERIES {source:s.name}]->(d)
with s
match (c:CASE {type:'rest'})
merge (f:PROGRAM {name:c.from})
merge (t:PROGRAM {name:c.to})
merge (f)-[:CALLS {source:s.name}]->(t)
with s
match (c:CASE {type:'queue'})
create (q:QUEUE {name:c.queue})
merge (r:PROGRAM {name:c.read})
merge (r)-[:READS {source:s.name}]->(q)
merge (w:PROGRAM {name:c.write})
merge (w)-[:WRITES {source:s.name}]->(q)
I've included the source attribution that I omitted in my previous code as a simplification. Now I include it so that I am directly cutting and pasting my work. This is the terminal output I get while loading my data one file at a time.
organization-chart
real 0m 1.46s
user 0m 0.00s
sys 0m 0.00s
source-code-control
real 0m 1.19s
user 0m 0.00s
sys 0m 0.00s
dataflow-diagram
Cannot merge node using null property value for name
real 1m 49.67s
user 0m 0.00s
sys 0m 0.00s
service-traffic-reports
real 0m 0.50s
user 0m 0.00s
sys 0m 0.00s
I can't explain my "null property value" error either. I've inspected my data with jq and haven't found anything amiss. I could be struggling too long. Time to set this project aside.
11-13-2018 08:00 AM
11-13-2018 03:04 PM
Hi Ward -
There is a bit of a hack you can use for this sort of conditional logic that involves creating dummy lists using CASE
and then iterating over them. Something like this:
create (s:SOURCE {
name:'dataflow-diagram',
title:'Dataflow Diagram'})
with s
UNWIND {json} AS obj
WITH *,
CASE WHEN obj.type == 'store' THEN [1] ELSE [] END AS stores,
CASE WHEN obj.type == 'rest' THEN [1] ELSE [] END AS rests,
CASE WHEN obj.type == 'queue' THEN [1] ELSE [] END AS queues
FOREACH (_ IN stores |
MERGE (d:DB {name: obj.store})
MERGE (g:PROGRAM {name: obj.from})
MERGE (g)-[:QUERIES {source: s.name}]->(d)
)
FOREACH (_ IN rests |
MERGE (f:PROGRAM {name: obj.from})
MERGE (t:PROGRAM {name: obj.to})
MERGE (f)-[:CALLS {source: s.name}]->(t)
)
FOREACH (_ IN queues |
CREATE (q:QUEUE {name: obj.queue})
MERGE (r:PROGRAM {name: obj.read})
MERGE (r)-[:READS {source: s.name}]->(q)
MERGE (w:PROGRAM {name: obj.write})
MERGE (w)-[:WRITES {source: s.name}]->(q)
)
Also, do you have uniqueness constraints created for any of the properties you are merging on?
11-13-2018 10:49 PM
Thanks for the tip William. The decoding section that I finally used distinguished queue reads from queue writes. This was my source of null values since only one or the other was present. There was also a fourth type, calls to external sites.
CASE WHEN obj.type='store'
THEN [1] ELSE [] END AS stores,
CASE WHEN obj.type='rest'
THEN [1] ELSE [] END AS rests,
CASE WHEN obj.type='queue'
and obj.read is not null
THEN [1] ELSE [] END AS reads,
CASE WHEN obj.type='queue'
and obj.write is not null
THEN [1] ELSE [] END AS writes,
CASE WHEN obj.type='site'
THEN [1] ELSE [] END AS sites
This version ran in a half second which was consistent with the other files. I wonder what my version was doing with all of those seconds?
dataflow-diagram
real 0m 0.59s
user 0m 0.00s
sys 0m 0.00s
All the sessions of the conference are now available online