find the duplicate nodes

In the sample graph, a few movies have received reviews. This is indicated by a “REVIEWED” relationship between a movie node and a person node. Write a query to find the actors that have played in at least two movies that received at least one review. For each of these actors, return the actor’s name and the title of each movie with at least one review that this actor has played in. Two actors satisfy this condition: one has played in two movies while the other has played in three movies.

1 ACCEPTED SOLUTION

The query matches each person to the movies they acted in and to the people who reviewed those movies. If a match is found, at least one person reviewed the movie, which satisfies the second requirement. Next, the query collects the distinct list of movie titles for each person. DISTINCT is needed because several people could have reviewed the same movie, producing multiple rows with the same movie. Finally, the query keeps only the persons with more than one movie in this list, which satisfies the first requirement. Hope this helps.

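// Actors paired with the reviewed movies they acted in
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)<-[:REVIEWED]-(:Person)
// One row per actor; DISTINCT collapses rows repeated per reviewer
WITH p, collect(DISTINCT m.title) AS titles
WHERE size(titles) > 1
RETURN p.name, titles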
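
A possible variation, offered only as a sketch and not as part of the answer above: since the requirement is just that a review exists, the reviewers do not have to be expanded into rows at all. The version below assumes a Neo4j release that supports EXISTS subqueries (4.0 or later) and the standard Movies sample graph, where ACTED_IN points from Person to Movie. The `actor` alias is introduced here for readability.

```cypher
// Sketch: check that a review exists instead of matching every reviewer.
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE EXISTS { (m)<-[:REVIEWED]-(:Person) }
// DISTINCT is kept as a safeguard in case an actor has several
// ACTED_IN relationships to the same movie.
WITH p, collect(DISTINCT m.title) AS titles
WHERE size(titles) > 1
RETURN p.name AS actor, titles
```

Because the existence check yields at most one row per actor and movie, the number of reviewers no longer multiplies the rows that reach collect().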

4 REPLIES

The database is the default database in Neo4j.

Thanks, it works.

I thought it would be interesting to make it more complicated by requiring each movie to be reviewed by at least two people as well. I think this should work. The query now groups by person and movie to get the reviewers of each movie the person acted in, then filters out all movies that did not have at least two reviewers. The rest of the query is the same as above.

MATCH (p:Person)-[:ACTED_IN]->(m:Movie)<-[:REVIEWED]-(r:Person)
// Group by actor and movie to gather that movie's reviewers
WITH p, m, collect(DISTINCT r.name) AS reviewers
// Keep only movies with at least two reviewers
WHERE size(reviewers) > 1
WITH p, collect(DISTINCT m.title) AS titles
WHERE size(titles) > 1
RETURN p.name, titles
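
One caveat with the query above: it counts distinct reviewer names, so two different reviewers who happen to share a name would be counted once. A hedged alternative, assuming the same sample graph, is to count the reviewer nodes themselves with count(DISTINCT r). The names `reviewerCount` and `actor` are illustrative aliases, not taken from the thread.

```cypher
// Sketch: count reviewer nodes rather than reviewer names.
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)<-[:REVIEWED]-(r:Person)
WITH p, m, count(DISTINCT r) AS reviewerCount
// Keep only movies reviewed by at least two people
WHERE reviewerCount >= 2
// Each (p, m) pair is now a single row, so one title is collected per movie
WITH p, collect(m.title) AS titles
WHERE size(titles) >= 2
RETURN p.name AS actor, titles
```

Counting nodes also avoids building a list of names that is only used for its size.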