Neo4j

groverjatin17 · 12-05-2018

Hi All,

I am working on a use case where we merge all nodes that share a common property (for example, all nodes with Year = "2016" in the Movie.csv table below) into a single node, so that all relationships point to that one node rather than to 3 different nodes.

For example:-
I have 3 csv files with below type of data.

Actor.csv:-

Actor Id	Name
1	Sam

Movie.csv:-

Movie Id	Movie Name	Year	Actor Id	Director Id
45	Avengers	2016	1	10
23	Movie 2	2016	1	10
12	Movie 3	2016	1	10

Director.csv:-

Director ID	Director Name
10	Danny Morgan

I then merged all the Movie nodes into 1 node using apoc.refactor.mergeNodes from the APOC library.

Now when I request the data as a TABLE (not Graph) like:-

Match (a:Actor)-[:ACTED_IN]->(m:Movie)-[:DIRECTED_BY]->(d:Director)
return a.Name, m.Movie_Name,d.Director_Name

Ideally I should get only 3 rows with above mentioned info.
But I get more rows than I am supposed to.

Please help me identify what is the reason and how to fix it.

(I think the issue is that each actor-to-movie relationship is producing a Cartesian product with the 3 movie-to-director relationships.)
NOTE:-I am using Neo4j Browser version 3.1.4

Thank you

andrew_bowman · 12-06-2018

You're correct. If the merge resulted in multiple relationships between the same pairs of nodes (so 3 :ACTED_IN relationships between the movie node and the same actor, and 3 :DIRECTED_BY relationships between the movie and the same director), every :ACTED_IN match is paired with every :DIRECTED_BY match, and you'll end up with 3 x 3 = 9 rows.
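
To confirm that this is what happened, one option is to count the relationships between each pair of nodes. This is only a sketch using the labels and property names from your query; the rels alias is just a name picked for illustration:

```cypher
// List actor/movie pairs connected by more than one :ACTED_IN relationship
MATCH (a:Actor)-[r:ACTED_IN]->(m:Movie)
WITH a, m, count(r) AS rels
WHERE rels > 1
RETURN a.Name, m.Movie_Name, rels
```

Any row returned here is a pair with duplicate relationships. The same check should work for :DIRECTED_BY by swapping in the (m:Movie)-[r:DIRECTED_BY]->(d:Director) pattern.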

We have a kb article on understanding cardinality in Cypher which covers this.

The quick fix is to get DISTINCT nodes before you return:

MATCH (a:Actor)-[:ACTED_IN]->(m:Movie)-[:DIRECTED_BY]->(d:Director) 
WITH DISTINCT a, m, d
RETURN a.Name, m.Movie_Name,d.Director_Name

The better fix is to delete the duplicate relationships, since I don't think you need those in your graph:

MATCH (a:Actor)-[r:ACTED_IN]->(m:Movie)
// keep the first relationship per actor/movie pair, collect the rest
WITH a, m, tail(collect(r)) AS toDelete
WHERE size(toDelete) > 0
UNWIND toDelete AS rel
DELETE rel

and then a similar one for :DIRECTED_BY.
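
For reference, the :DIRECTED_BY version might look like the sketch below. It assumes the relationship points from the movie to the director, as in the query from the question; the rel alias is just an illustrative name:

```cypher
// Keep one :DIRECTED_BY relationship per movie/director pair, delete the rest
MATCH (m:Movie)-[r:DIRECTED_BY]->(d:Director)
WITH m, d, tail(collect(r)) AS toDelete
UNWIND toDelete AS rel
DELETE rel
```

Pairs with a single relationship produce an empty list, so UNWIND yields no rows for them and nothing is deleted.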
