cancel
Showing results for 
Search instead for 
Did you mean: 

Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.

Pairs of Actors with their Movies where they acted in more than one movie together

I'm just getting familiar with Neo4j/Cypher and have been playing with the provided example graphs in the browser.

How can I return pairs of actors and their common movies when they have acted in at least two movies together?

The graph for this seems to look OK ...

match (p1:Person)-[:ACTED_IN]->(m1:Movie)<-[:ACTED_IN]-(p2:Person)-[:ACTED_IN]->(m2:Movie)<-[:ACTED_IN]-(p1)
return p1,m1,m2,p2

... but I see duplication in the text result.

I'd be very interested to see alternative solutions with comments on how path uniqueness plays a role and comments on their relative merit/performance.

1 ACCEPTED SOLUTION

Hi Rick, for this one it will be easier to make the pattern simpler, using just one movie variable, then collect the movies in common for each pair of people and filter by the size of the common movies collection.

We can also add in a predicate to ensure we only consider a pair of actors only once, rather than multiple times with swapped variables.

MATCH (p1)-[:ACTED_IN]->(m:Movie)<-[:ACTED_IN]-(p2)
WHERE id(p1) < id(p2)
WITH p1, p2, collect(m) as commonMovies
WHERE size(commonMovies) >= 2
RETURN p1, p2, commonMovies

View solution in original post

3 REPLIES 3

Here's another attempt. I think it can be better.

match (p1:Person)-[:ACTED_IN]->(m1:Movie)<-[:ACTED_IN]-(p2:Person)-[:ACTED_IN]->(m2:Movie)<-[:ACTED_IN]-(p1)
with p1, p2
match (p1)-[r1:ACTED_IN]->(m1:Movie)<-[r2:ACTED_IN]-(p2)
where id(r1) < id(r2)
with p1, p2, collect(distinct m1.title) as commonMovies
return p1.name, p2.name, commonMovies
order by p1.name, p2.name

Hi Rick, for this one it will be easier to make the pattern simpler, using just one movie variable, then collect the movies in common for each pair of people and filter by the size of the common movies collection.

We can also add in a predicate to ensure we only consider a pair of actors only once, rather than multiple times with swapped variables.

MATCH (p1)-[:ACTED_IN]->(m:Movie)<-[:ACTED_IN]-(p2)
WHERE id(p1) < id(p2)
WITH p1, p2, collect(m) as commonMovies
WHERE size(commonMovies) >= 2
RETURN p1, p2, commonMovies

Interesting, thanks!