12-05-2018 08:51 PM
Hi All,
I am working on a use case where we merge all nodes that share a common property (for example, all nodes with year = "2016" in the Movie.csv table below) into a single node, so that all relationships point to that one node rather than to 3 different nodes.
For example:-
I have 3 csv files with below type of data.
Actor.csv:-
Actor Id | Name |
---|---|
1 | Sam |
Movie.csv:-
Movie Id | Movie Name | Year | Actor Id | Director Id |
---|---|---|---|---|
45 | Avengers | 2016 | 1 | 10 |
23 | Movie 2 | 2016 | 1 | 10 |
12 | Movie 3 | 2016 | 1 | 10 |
Director.csv:-
Director ID | Director Name |
---|---|
10 | Danny Morgan |
Now I merged all the Movie nodes into 1 node using apoc.refactor.mergeNodes from the APOC library.
Now when I request the data as a TABLE (not Graph) like:-
MATCH (a:Actor)-[:ACTED_IN]->(m:Movie)-[:DIRECTED_BY]->(d:Director)
RETURN a.Name, m.Movie_Name, d.Director_Name
Ideally I should get only 3 rows with above mentioned info.
But I get more rows than I am supposed to.
Please help me identify what is the reason and how to fix it.
(I think the issue is that each actor-to-movie relationship is producing a Cartesian product with the 3 movie-to-director relationships.)
NOTE:-I am using Neo4j Browser version 3.1.4
Thank you
12-06-2018 08:55 AM
You're correct. If the merge left multiple relationships between the same pairs of nodes (3 :ACTED_IN relationships between the same actor and the merged movie node, and 3 :DIRECTED_BY relationships between that movie and the same director), the pattern matches once per combination of relationships, so you'll end up with 3 x 3 = 9 rows.
We have a kb article on understanding cardinality in Cypher which covers this.
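To make the row count concrete, here is a minimal Python sketch that imitates how the pattern match pairs up relationships. It is only a simulation, not how Neo4j evaluates the query internally. The names `acted_in`, `directed_by`, `match_paths`, and the label "Movie(2016)" are hypothetical stand-ins for the merged graph described in the question.

```python
from itertools import product

# Hypothetical stand-ins for the merged graph: one actor, one merged movie node,
# one director, with three parallel (duplicate) relationships on each side.
acted_in = [("Sam", "Movie(2016)")] * 3              # 3 duplicate :ACTED_IN
directed_by = [("Movie(2016)", "Danny Morgan")] * 3  # 3 duplicate :DIRECTED_BY


def match_paths(acted_in, directed_by):
    """Return one row per (ACTED_IN, DIRECTED_BY) pair that shares a movie."""
    return [
        (actor, movie, director)
        for (actor, movie), (movie2, director) in product(acted_in, directed_by)
        if movie == movie2
    ]


rows = match_paths(acted_in, directed_by)
distinct_rows = set(rows)  # rough analogue of WITH DISTINCT a, m, d

print(len(rows), len(distinct_rows))  # prints "9 1"
```

In this sketch the distinct set has a single row because there is only one merged movie node, so the same reasoning suggests that deduplicating on the nodes gives one row per actor, movie, and director combination rather than one per original CSV line.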
The quick fix is to get DISTINCT nodes before you return:
MATCH (a:Actor)-[:ACTED_IN]->(m:Movie)-[:DIRECTED_BY]->(d:Director)
WITH DISTINCT a, m, d
RETURN a.Name, m.Movie_Name,d.Director_Name
The better fix is to delete the duplicate relationships, since I don't think you need those in your graph:
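MATCH (a:Actor)-[r:ACTED_IN]->(m:Movie)
// keep the first relationship per (actor, movie) pair, collect the rest
WITH a, m, tail(collect(r)) AS toDelete
WHERE size(toDelete) > 0
// UNWIND needs an alias; delete each duplicate relationship individually
UNWIND toDelete AS rel
DELETE rel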
and then a similar one for :DIRECTED_BY.