Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
03-25-2021 08:02 PM
I am chasing a pattern of data like this:
I can write the query for it pretty easily with the following:
MATCH (from:Tank)<-[:DRAINS]-(l: Transfer)-[:FEEDS]->(to: Tank)
MATCH (l)-[:IS]->(o:Code)
WHERE from.community_id = 1043
RETURN from, to, l, o
But when I try to aggregate on the data, because tank 3063 in the center is both a from
and a to
tank, I am getting duplicate results for it in any of my aggregations. I try to WITH DISTINCT
it back to one, but because it is connected to two legs it is coming back twice still. Is there something obviously wrong with the query I have written? I feel like I have tried a hundred different versions at this point...
Here is the most recent attempt at a query:
MATCH (from:Tank)<-[:DRAINS]-(l:Transfer)-[:FEEDS]->(to: Tank)
WHERE from.community_id = 1043
WITH DISTINCT COLLECT(from) + COLLECT(to) as tanks, l
MATCH (l)-[:IS]->(o:Code)
UNWIND tanks as tank
WITH sum(tank.flow) AS netFlow, sum(o.weight) as codeWeight, tank.community_id as community,
RETURN community, netFlow, codeWeight
I appreciate any second set of eyes looking at my query!
Solved! Go to Solution.
03-26-2021 05:08 PM
One minor change and we have a winner!
The only issue was when you had more items in the graph, it wasn't grouping each subgraph by the community_id. So I added a.commiunity_id to the first WITH statement and it worked perfectly
match (l:Leg)-[]-(a:Tank)
with collect(distinct a.name) as a1, a.community_id as cid
match (l:Leg)-[]-(b:Tank)
where b.name in a1
match (l)-[]-(o:Code)
with distinct b.community_id as cid, sum( distinct b.flow) AS netFlow, sum(distinct o.weight) as codeWeight, a1, COLLECT(DISTINCT o.name) as opcode
return cid, a1, netFlow, codeWeight, opcode
I adjusted it so that each leg has a different code as well and it still outputs as expected!
Thank you all so much for the help!
03-26-2021 11:45 AM
I'm not entirely sure what you want (or what you are getting that isn't correct).
could you provide a small example dataset (CREATE statements), and the text of desired output?
03-26-2021 01:13 PM
This will create the sub-graph of data I am working with (There will actually be tons of these little subgraphs identified by the community_id, I am just working with one as an example)
CREATE (:Tank {name: 6016, flow: -150, community_id: 1})<-[:DRAINS]-(l1:Transfer {name: 5, community_id:1})-[:FEEDS]->(:Tank {name: 3063, flow: 200, community_id: 1})<-[:FEEDS]-(l2: Transfer {name: 3, community_id: 1})-[:DRAINS]-> (:Tank {name: 3013, flow: -50, community_id: 1}), (l1) -[:IS]->(o:Code {name: 6206, weight:100})<-[:IS]-(l2)
My desired output is the following
╒═══════════╤═════════════════════════════╤═════════════════╤══════════════╤═════════════════════════════╕
│"community"│"equipment" │"netFlow" │"opCodeWeight"│"opcodes" │
╞═══════════╪═════════════════════════════╪═════════════════╪══════════════╪═════════════════════════════╡
│1043 │["3013","3063","6016"]. │0.0 │100 │["6206"] │
└───────────┴─────────────────────────────┴─────────────────┴──────────────┴─────────────────────────────┘
This is what I am currently getting:
╒═══════════╤═════════════════════════════╤═════════════════╤══════════════╤═════════════════════════════╕
│"community"│"equipment" │"netFlow" │"opCodeWeight"│"opcodes" │
╞═══════════╪═════════════════════════════╪═════════════════╪══════════════╪═════════════════════════════╡
│1043 │["3013","3063","6016","3063"] │-100 │400 │["6206","6206","6206","6206"] │
└───────────┴─────────────────────────────┴─────────────────┴──────────────┴─────────────────────────────┘
also I don't know if there is an easier way to put a table in here with mark up, but Geeze formatting that correctly (even after copying it out of neo4j) sucked... lol
Thanks for any help!
03-26-2021 01:12 PM
Try this:
MATCH (from:Tank)<-[:DRAINS]-(l:Transfer)-[:FEEDS]->(to: Tank)
WHERE from.community_id = 1043
MATCH (l)-[:IS]->(o:Code)
MATCH (from1:Tank)<-[:DRAINS]-(l1:Transfer))-[:FEEDS]->(to)
WHERE from1.community_id <> from.community_id
MATCH (l1)-[:IS]->(o)
WITH COLLECT(DISTINCT from) + COLLECT(DISTINCT from1) as tanks, o
UNWIND tanks as tank
WITH sum(tank.flow) AS netFlow, sum(o.weight) as codeWeight, tank.community_id as community
RETURN community, netFlow, codeWeight
03-26-2021 01:20 PM
This one fails because "tanks" contains: ["3013","6016","6016","3013"] and I end up with duplicates, I essentially need the "tanks" collection to include ["3013", "3063", "6016"]
EDIT:
Also, this part WHERE from1.community_id <> from.community_id
breaks it too because all three tanks and both legs should ALL have the same community ID, I am identifying them as a group with gds.wcc.write and that show I am grouping on them to identify the flow of the whole group.
03-26-2021 02:07 PM
It's not easier, but to get a clean table I export to csv, then copy paste from the app I open the csv with (e.g. libreoffice, vi)
03-26-2021 04:04 PM
Try this:
match (l:Transfer)-[]-(a:Tank)
with collect(distinct a.name) as a1
match (l1:Transfer)-[]-(b:Tank)
where b.name in a1
match (l1)-[]-(o:Code)
with distinct b.community_id as cid, sum( distinct b.flow) AS netFlow, sum(distinct o.weight) as codeWeight, a1, o.name as opcode
return cid, a1, netFlow, codeWeight, opcode
Result:
03-26-2021 05:08 PM
One minor change and we have a winner!
The only issue was when you had more items in the graph, it wasn't grouping each subgraph by the community_id. So I added a.commiunity_id to the first WITH statement and it worked perfectly
match (l:Leg)-[]-(a:Tank)
with collect(distinct a.name) as a1, a.community_id as cid
match (l:Leg)-[]-(b:Tank)
where b.name in a1
match (l)-[]-(o:Code)
with distinct b.community_id as cid, sum( distinct b.flow) AS netFlow, sum(distinct o.weight) as codeWeight, a1, COLLECT(DISTINCT o.name) as opcode
return cid, a1, netFlow, codeWeight, opcode
I adjusted it so that each leg has a different code as well and it still outputs as expected!
Thank you all so much for the help!
03-26-2021 03:02 PM
I'm pretty sure I still don't understand what we're doing here, but I imagine you could rapidly narrow the search space by also adding group id to the query, so you only consider the group you are working with. Here is a first attempt, I didn't know why there was a second path search out to code so I just dropped it.
MATCH (from:Tank)<-[:DRAINS]-(l:Transfer)-[:FEEDS]->(to: Tank)
WHERE from.community_id = 1
MATCH (l)-[:IS]->(o:Code)
WITH o, collect(distinct from)+collect(distinct to) as tanks
UNWIND tanks as t
return t.community_id as community, collect(distinct t.name) as equipment, sum(t.flow) as netFlow, sum(o.weight) as opCodeWeight, collect(distinct o.name) as opcodes
output
community equipment netFlow opCodeWeight opcodes
1 [6016,3013,3063] 0 300 [6206]
Or from the longer path.
MATCH (from:Tank)<-[:DRAINS]-(l:Transfer)-[:FEEDS]->(to: Tank)<-[:FEEDS]-(l1:Transfer)-[:DRAINS]->(from1:Tank)
WHERE from.community_id = 1
MATCH (l)-[:IS]->(o:Code)
WITH o, collect(distinct from)+collect(distinct to)+collect(from1) as tanks
UNWIND tanks as t
return t.community_id as community, collect(distinct t.name) as equipment, sum(t.flow) as netFlow, sum(o.weight) as opCodeWeight, collect(distinct o.name) as opcodes
1 [6016, 3013, 3063] -200 500 [6206]
03-26-2021 03:14 PM
That looks like its getting what I need!
Essentially, I am grouping all connected Tanks via their bridge Leg nodes. Each Leg is associated with a Code, that is weighted. I am using the Weakly Connected Communities algorithm to identify each of the groups with an ID, and then using that ID to group them in the end output query.
I need to look at each community, and get the sum of community flow, each Tank in the community, and the weight associated with the leg Code. I need to identify flows that are either too high or too low as my end output.
All the sessions of the conference are now available online