Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
07-13-2022 07:12 AM - edited 07-29-2022 08:43 AM
Let's see what we can learn about movies for the first weekly challenge. Specifically, movies that contain "Life" in the title. Is there a genre or director you think came up with the best movie on this list? Let us know!
Here is what we want you to do:
Create a recommendations sandbox at sandbox.neo4j.com.
For this sandbox, write a query that returns one row for every movie in the graph that contains "Life" in its title.
Use a parameter named testString with a value of "Life" for the predicate.
The rows will be returned in movie year order.
Each row will contain:
Please post your solution code in your response and a screenshot of the solution results. We will be giving away a T-shirt to one of the submissions between 7-29-2022 and 8-5-2022, submit your solution to be included in the drawing!
Solved! Go to Solution.
08-09-2022 02:16 AM - edited 08-09-2022 02:19 AM
Here is my new query:
MATCH (m:Movie)
WHERE m.title CONTAINS $testString
WITH m
MATCH (m)<-[:DIRECTED]-(d)
WITH m, collect(d.name) AS directors
MATCH (m)-[:IN_GENRE]->(g)
RETURN m.title AS title, m.year AS year, directors, collect(g.name) AS genres
ORDER BY year
Regards,
Cobra
08-07-2022 11:55 PM
Hello @TrevorS 😊
Here is my query:
:param testString => 'Life';
MATCH (m:Movie)
WHERE m.title CONTAINS $testString
RETURN m.title AS title, m.year AS year, [(m)-[:IN_GENRE]->(g) | g.name] AS genres, [(m)<-[:DIRECTED]-(d) | d.name] AS directors
ORDER BY year;
Here is the screenshot:
Regards,
Cobra
08-08-2022 08:52 AM
@Cobra it is a great query. The solution is one where the elapsed ms goes down to as low as 26 ms
08-08-2022 09:10 AM
I see, do you have any hints?
08-08-2022 09:46 AM
I was able to reduce to 18ms with this query:
MATCH (m:Movie)
WHERE m.title CONTAINS $testString
RETURN m.title AS title, m.year AS year, [(m)-[:IN_GENRE]->(g) | g.name] AS genres, [(m)<-[:DIRECTED]-(d:Director) | d.name] AS directors
ORDER BY year;
Do you have db hits info?
Regards,
Cobra
08-08-2022 10:30 AM
Your query returns 81 rows because there is a row with no directors. I can get your query down to 23 ms. Total db hits of 2518.
The solution I have has 17ms with total db hits of 2499.
Did you add any indexes to the graph? You should not.
08-09-2022 01:46 AM
The query must return movies who have at least 1 director and 1 genre?
What is the expected number of rows for the query: 80, 81 or other?
No, I didn't add any indexes, I just relaunched multiple times the query 😊
08-09-2022 02:16 AM - edited 08-09-2022 02:19 AM
Here is my new query:
MATCH (m:Movie)
WHERE m.title CONTAINS $testString
WITH m
MATCH (m)<-[:DIRECTED]-(d)
WITH m, collect(d.name) AS directors
MATCH (m)-[:IN_GENRE]->(g)
RETURN m.title AS title, m.year AS year, directors, collect(g.name) AS genres
ORDER BY year
Regards,
Cobra
08-09-2022 05:10 AM
Great job @Cobra !
Your query performed as well as the solution that we had which is:
PROFILE
MATCH (movie:Movie)-[:IN_GENRE]->(g)
WHERE movie.title CONTAINS $testString
WITH movie, collect(g.name) AS genres
MATCH (movie)<-[:DIRECTED]-(director)
WITH movie, genres, collect(director.name) AS directors
RETURN movie.title AS Movie, movie.year AS Year, genres AS Genres, directors AS Directors ORDER BY movie.year
11-29-2022 08:43 PM
Hi!
I just gave this query a go and while I got an answer quickly I was struggling to understand why I was getting higher DB hits than both of you (not by much though). I widdled it down to this specific difference in the example below
Using just (g) vs (g:Genre) results in fewer DB hits (1325 vs 1482). I would have expected the answers to be the same as there aren't any [:IN_GENRE] relationships with anything other than a genre right? I tried to validate this assumption by using the following where I replaced (g:Genre) with (g), both returned the same 20340 count but using (g) results in 228475 db hits where (g:Genre) results in 20400. I'm now left even more confused because this would imply g:genre is the more efficient method, so why is it worse in the weekly challenge query?
Appreciate any insight!
Thanks,
Josh
All the sessions of the conference are now available online