Neo4j returning incorrect number of rows

Hi All,

I am working on a use case where we merge all nodes that share a common property (for example, all nodes with year = "2016" in the Movie.csv table below) into a single node, so that all relationships point to that one node rather than to 3 different nodes.

For example, I have 3 CSV files with the following kind of data.

Actor.csv:-

Actor Id | Name
1        | Sam

Movie.csv:-

Movie Id | Movie Name | Year | Actor Id | Director Id
45       | Avengers   | 2016 | 1        | 10
23       | Movie 2    | 2016 | 1        | 10
12       | Movie 3    | 2016 | 1        | 10

Director.csv:-

Director Id | Director Name
10          | Danny Morgan

I then merged all the Movie nodes into 1 node using apoc.refactor.mergeNodes from the APOC library.

Now when I request the data as a table (not a graph) with a query like this:

MATCH (a:Actor)-[:ACTED_IN]->(m:Movie)-[:DIRECTED_BY]->(d:Director)
RETURN a.Name, m.Movie_Name, d.Director_Name

Ideally I should get only 3 rows with the information above, but I get more rows than expected.

Please help me identify the reason and how to fix it.

(I think the issue is that each actor-to-movie relationship forms a Cartesian product with the 3 movie-to-director relationships.)
NOTE: I am using Neo4j Browser version 3.1.4.

Thank you

1 ACCEPTED SOLUTION

You're correct. If the merge resulted in multiple relationships between the same pairs of nodes (3 :ACTED_IN relationships between the same actor and the movie node, and 3 :DIRECTED_BY relationships between the movie and the same director), every :ACTED_IN match is combined with every :DIRECTED_BY match, so you'll end up with 3 x 3 = 9 rows.

We have a KB article on understanding cardinality in Cypher which covers this.
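
To confirm that duplicate relationships are the cause, a query along these lines can help. It is only a sketch, assuming the labels and property names used in the question. It lists each actor and movie pair connected by more than one :ACTED_IN relationship:

```cypher
// Count the :ACTED_IN relationships per (actor, movie) pair
MATCH (a:Actor)-[r:ACTED_IN]->(m:Movie)
WITH a, m, count(r) AS relCount
// Keep only the pairs that have duplicates
WHERE relCount > 1
RETURN a.Name AS actor, m.Movie_Name AS movie, relCount
```

If this returns no rows for :ACTED_IN, the same check can be run against :DIRECTED_BY by swapping the pattern.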

The quick fix is to get DISTINCT nodes before you return:

MATCH (a:Actor)-[:ACTED_IN]->(m:Movie)-[:DIRECTED_BY]->(d:Director)
WITH DISTINCT a, m, d
RETURN a.Name, m.Movie_Name, d.Director_Name

The better fix is to delete the duplicate relationships, since I don't think you need those in your graph:

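MATCH (a:Actor)-[r:ACTED_IN]->(m:Movie)
// Collect the relationships per (actor, movie) pair and keep all but the first
WITH a, m, tail(collect(r)) AS toDelete
WHERE size(toDelete) > 0
// UNWIND needs an alias; each duplicate relationship is then deleted
UNWIND toDelete AS rel
DELETE rel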

and then a similar one for :DIRECTED_BY.
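
A sketch of that second query, assuming the :DIRECTED_BY relationships run from Movie to Director as in the MATCH pattern from the question (the alias rel is just an illustrative name):

```cypher
MATCH (m:Movie)-[r:DIRECTED_BY]->(d:Director)
// Keep the first relationship per (movie, director) pair, collect the rest
WITH m, d, tail(collect(r)) AS toDelete
WHERE size(toDelete) > 0
// Expand the list and delete each duplicate relationship
UNWIND toDelete AS rel
DELETE rel
```

Once both cleanup queries have run, the original three-hop MATCH should return one row per distinct actor, movie, and director combination without needing DISTINCT.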
