Neo4j

TonyOD27 · ‎11-20-2020

Hello,

How would I count the number of relationships per each group of nodes? Here's a proxy of my data model:

The result that I want here is the table:
|City| Total Stores| First_Access| Second_Access| Third_Access|
|London|89|100|30|50|
|New York City|74|200|50|87|
|Munich|60|450|67|32|
|Chicago|50|239|58|25|
|Paris|30|140|31|32|
|Tokyo|20|20|22|12|
|Beijing|10|30|10|8|

I tried this Cypher query but it doesn't quite work:

MATCH(c:City)<-[:LOCATED_IN]-(s:Store)
RETURN DISTINCT
		c.name as City,
		count(DISTINCT s) AS `Total Stores`,
		size((s)-[:FIRST_ACCESS]->()) AS First_Access,
		size((s)-[:SECOND_ACCESS]->()) AS Second_Access,
		size((s)-[:THIRD_ACCESS]->()) AS Third_Access
ORDER BY `Total Stores` DESC

This gives me duplicated rows like this:

|City| Total Stores| First_Access| Second_Access| Third_Access|
|London|49|0|0|0|
|London|49|100|30|50|
|New York City|74|200|50|87|
|Munich|30|0|0|0|
|Munich|30|450|67|32|
|Chicago|50|239|58|25|
|Paris|30|140|31|32|
|Tokyo|20|20|22|12|
|Beijing|10|0|0|8|
|Beijing|0|30|10|0|

Then I tried the following query:

MATCH(c:City)<-[:LOCATED_IN]-(s:Store)
RETURN DISTINCT
		c.name as City,
		count(DISTINCT s) AS `Total Stores`,
		size(collect((s)-[:FIRST_ACCESS]->())) AS First_Access,
		size(collect((s)-[:SECOND_ACCESS]->())) AS Second_Access,
		size(collect((s)-[:THIRD_ACCESS]->())) AS Third_Access
ORDER BY `Total Stores` DESC

But this cypher query brings my laptop to a crawl and I'm waiting hours for this to finish executing. Is anyone aware of an apoc procedure that can help with this? Or is there a better way to write this query? I'm reading through the cypher cyntax guide and apoc docs, but I can't find what I need.

Here are my neo4j specifications:

dbms.memory.heap.initial_size=5G
dbms.memory.heap.max_size=5G
dbms.memory.pagecache.size=7G

neo4j version: Community 4.2.0
desktop version: 1.3.11

michael_hunger · ‎11-21-2020

You first statement was already ok, you just need to use an aggregation function too for your size() columns.

I.e. sum(size((s)-[:FIRST_ACCESS]->())) as firstAccess

View solution in original post

ameyasoft · ‎11-20-2020

The .csv file as the delimiter, '|' in the beginning and at the end of each row. 

This is not right and this may be causing you some issues. Also, some column headers have blank spaces and this I have corrected in many posts posted here.

I used your data (without first and last '|'). Here are the results:

City| Total Stores| First_Access|Second_Access| Third_Access
London|89|100|30|50
New York City|74|200|50|87
Munich|60|450|67|32
Chicago|50|239|58|25
Paris|30|140|31|32
Tokyo|20|20|22|12
Beijing|10|30|10|8

LOAD CSV WITH HEADERS FROM "file:///tony.csv" AS row FIELDTERMINATOR '|'
with row

merge (a:City {name: row.City})
merge (b:Store {total: toInteger(row.` Total Stores`)})
merge (c:FirstLaw {total1: toInteger(row.` First_Access`)})
merge (c1:SecondLaw {total2: toInteger(row.`Second_Access`)})
merge (c2:ThirdLaw {total3: toInteger(row.` Third_Access`)})

merge (b)-[:LOCATED_IN]->(a)
merge (b)-[:FIRST_ACCESS]->(c)
merge (b)-[:SECOND_ACCESS]->(c1)
merge (b)-[:THIRD_ACCESS]->(c2);

Result:

MATCH(c:City)<-[:LOCATED_IN]-(s:Store)
match (s)-[:FIRST_ACCESS]-(d1)
match (s)-[:SECOND_ACCESS]-(d2)
match (s)-[:THIRD_ACCESS]-(d3)
RETURN DISTINCT
		c.name as City,
		count(DISTINCT s) AS `Total Stores`,
		d1.total1 AS First_Access,
		d2.total2 AS Second_Access,
		d3.total3 AS Third_Access
ORDER BY `Total Stores` DESC

Result:

michael_hunger · ‎11-21-2020

You first statement was already ok, you just need to use an aggregation function too for your size() columns.

I.e. sum(size((s)-[:FIRST_ACCESS]->())) as firstAccess

Neo4j

Counting Number of Relationships Per Grouped Nodes