Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
06-02-2021 03:48 PM
Hi, I'm trying to build v1 of a query (pagination and other stuff will come later) that will generate a social media timeline for a project I'm working on.
Here is the query I currently have. Right now, I'm able to get the most recent user that liked the site. However, like the title says, I want the most recent 3 users.
I'm having trouble wrapping my head around how collect/unwind work. Specifically, how does the apoc.agg.maxItems know to only look at the LIKED_SITE relationships that are tied to the current site it's looking at? If I do a collect on the ls inside that WITH, how does it know to group them based off the site they're pointing to? I thought I understood how this worked, but as soon as I UNWIND the collections to perform operations on them, I can't collect them back into lists for each site.
Sorry if this is confusing... Just having a hard time wrapping my head around how neo4j/cypher work. Would really appreciate just any general advice as well.
MATCH (u:User {id: $cypherParams.userId})-[:IS]->(pu:PublicUser)-[:FOLLOWS]->(following_pu:PublicUser)-[:HAD_SESSION]->(sesh:Session)-[ls:LIKED_SITE]->(site:Site)
WITH
site,
duration.inSeconds(max(ls.lastLikedOn), datetime()) AS dur,
max(ls.lastLikedOn) AS lastLikedOn,
apoc.agg.maxItems(ls, ls.lastLikedOn, 1) as most_recent_LIKED_SITE
MATCH (pu:PublicUser)-[:HAD_SESSION]->(:Session)-[:LIKED_SITE {id: most_recent_LIKED_SITE.items[0].id}]->(:Site)
RETURN {
url: site.url,
siteName: site.siteName,
title: site.title,
description: site.description,
image: site.image,
lastLikedBy: pu.username,
lastLikedOn: lastLikedOn,
secondsSinceLastLike: dur.seconds
} ORDER BY dur
PS: This will generate the same dataset that I have
CREATE (pu1:PublicUser) set pu1.username = "1"
CREATE (u1)-[:IS]->(pu1)
CREATE (u2:User) SET u2.username = "2", u2.id = randomUUID()
CREATE (pu2:PublicUser) set pu2.username = "2"
CREATE (u2)-[:IS]->(pu2)
CREATE (u3:User) SET u3.username = "3", u2.id = randomUUID()
CREATE (pu3:PublicUser) set pu3.username = "3"
CREATE (u3)-[:IS]->(pu3)
CREATE (u4:User) SET u4.username = "4", u4.id = randomUUID()
CREATE (pu4:PublicUser) set pu4.username = "4"
CREATE (u4)-[:IS]->(pu4)
CREATE (u5:User) SET u5.username = "5", u5.id = randomUUID()
CREATE (pu5:PublicUser) set pu5.username = "5"
CREATE (u5)-[:IS]->(pu5)
CREATE (jim:User) SET jim.username = "jim", jim.id = randomUUID()
CREATE (pujim:PublicUser) set pujim.username = "jim"
CREATE (jim)-[:IS]->(pujim)
with u5 as poop
MATCH (pu:PublicUser {username: "1"})
MERGE (pu)-[:HAD_SESSION]->(sesh:Session {date: date({year: 2017, month: 11, day: 11})})
MERGE (site:Site {url: "twitter.com"})
MERGE (sesh)-[r:LIKED_SITE]->(site)
ON CREATE
SET
r.id = randomUUID(),
r.timesLiked = 1,
r.lastLikedOn = datetime({year:2017, month:11, day:11, hour:12, minute:31, second:14, nanosecond: 645876123, timezone: '+01:00'})
ON MATCH
SET
r.timesLiked = r.timesLiked + 1,
r.lastLikedOn = datetime()
WITH pu as poop
MATCH (pu:PublicUser {username: "2"})
MERGE (pu)-[:HAD_SESSION]->(sesh:Session {date: date({year: 2017, month: 11, day: 11})})
MERGE (site:Site {url: "twitter.com"})
MERGE (sesh)-[r:LIKED_SITE]->(site)
ON CREATE
SET
r.id = randomUUID(),
r.timesLiked = 1,
r.lastLikedOn = datetime({year:2017, month:11, day:11, hour:13, minute:31, second:14, nanosecond: 645876123, timezone: '+01:00'})
ON MATCH
SET
r.timesLiked = r.timesLiked + 1,
r.lastLikedOn = datetime()
WITH pu as poop
MATCH (pu:PublicUser {username: "3"})
MERGE (pu)-[:HAD_SESSION]->(sesh:Session {date: date({year: 2017, month: 11, day: 11})})
MERGE (site:Site {url: "twitter.com"})
MERGE (sesh)-[r:LIKED_SITE]->(site)
ON CREATE
SET
r.id = randomUUID(),
r.timesLiked = 1,
r.lastLikedOn = datetime({year:2017, month:11, day:11, hour:14, minute:31, second:14, nanosecond: 645876123, timezone: '+01:00'})
ON MATCH
SET
r.timesLiked = r.timesLiked + 1,
r.lastLikedOn = datetime()
WITH pu as poop
MATCH (pu:PublicUser {username: "3"})
MERGE (pu)-[:HAD_SESSION]->(sesh:Session {date: date({year: 2017, month: 11, day: 11})})
MERGE (site:Site {url: "nastyboys.com"})
MERGE (sesh)-[r:LIKED_SITE]->(site)
ON CREATE
SET
r.id = randomUUID(),
r.timesLiked = 1,
r.lastLikedOn = datetime({year:2017, month:11, day:11, hour:14, minute:51, second:14, nanosecond: 645876123, timezone: '+01:00'})
ON MATCH
SET
r.timesLiked = r.timesLiked + 1,
r.lastLikedOn = datetime()
WITH pu as poop
MATCH (pu:PublicUser {username: "4"})
MERGE (pu)-[:HAD_SESSION]->(sesh:Session {date: date({year: 2017, month: 11, day: 11})})
MERGE (site:Site {url: "twitter.com"})
MERGE (sesh)-[r:LIKED_SITE]->(site)
ON CREATE
SET
r.id = randomUUID(),
r.timesLiked = 1,
r.lastLikedOn = datetime({year:2017, month:11, day:11, hour:15, minute:31, second:14, nanosecond: 645876123, timezone: '+01:00'})
ON MATCH
SET
r.timesLiked = r.timesLiked + 1,
r.lastLikedOn = datetime()
WITH pu as poop
MATCH (pu:PublicUser {username: "4"})
MERGE (pu)-[:HAD_SESSION]->(sesh:Session {date: date({year: 2017, month: 11, day: 11})})
MERGE (site:Site {url: "pornhub.com"})
MERGE (sesh)-[r:LIKED_SITE]->(site)
ON CREATE
SET
r.id = randomUUID(),
r.timesLiked = 1,
r.lastLikedOn = datetime({year:2017, month:11, day:11, hour:16, minute:31, second:14, nanosecond: 645876123, timezone: '+01:00'})
ON MATCH
SET
r.timesLiked = r.timesLiked + 1,
r.lastLikedOn = datetime()
with pu as poop
MATCH (pu:PublicUser {username: "5"})
MERGE (pu)-[:HAD_SESSION]->(sesh:Session {date: date({year: 2017, month: 11, day: 11})})
MERGE (site:Site {url: "twitter.com"})
MERGE (sesh)-[r:LIKED_SITE]->(site)
ON CREATE
SET
r.id = randomUUID(),
r.timesLiked = 1,
r.lastLikedOn = datetime({year:2017, month:11, day:11, hour:11, minute:31, second:14, nanosecond: 645876123, timezone: '+01:00'})
ON MATCH
SET
r.timesLiked = r.timesLiked + 1,
r.lastLikedOn = datetime()
return pu as poop
MATCH (pu:PublicUser {username: "jim"})
WITH pu
MATCH(u1:PublicUser{ username: "1"})
MERGE (pu)-[:FOLLOWS]->(u1)
WITH pu
MATCH(u2:PublicUser{ username: "2"})
MERGE (pu)-[:FOLLOWS]->(u2)
WITH pu
MATCH(u3:PublicUser{ username: "3"})
MERGE (pu)-[:FOLLOWS]->(u3)
WITH pu
MATCH(u4:PublicUser{ username: "4"})
MERGE (pu)-[:FOLLOWS]->(u4)
WITH pu
MATCH(u5:PublicUser{ username: "5"})
MERGE (pu)-[:FOLLOWS]->(u5)
WITH pu
return pu
type or paste code here
Solved! Go to Solution.
06-02-2021 04:59 PM
I can help clarify some things here.
Aggregations in Cypher use the non-aggregation terms as their grouping key, which provides the context for the aggregation (there are some exceptions to this, but they don't apply here). This applies to both built-in aggregation functions as well as custom ones, such as apoc.agg.maxItems().
Expressions that are based on an aggregation are tied to the aggregation, but to avoid more complex situations it's best to keep all elements of an aggregation as simple as possible.
In your WITH clause, the only non-aggregation term is site
(since the expression for dur
builds off the max() aggregation term), so site
becomes the grouping key: per distinct site, the aggregations will be applied.
In order to get the last 3 who liked each site, we can no longer use the max() or maxItems() aggregations, since those will only get the last liked relationship. Instead, we need to use ORDER BY to get the last 3 :LIKED_SITE relationships based on the lastLikedOn relationship, and then based on the limited result set, get the rest of what we want.
While there are several ways to do this, here's one approach:
MATCH (u:User {id: $cypherParams.userId})-[:IS]->(pu:PublicUser)-[:FOLLOWS]->(following_pu:PublicUser)-[:HAD_SESSION]->(sesh:Session)-[ls:LIKED_SITE]->(site:Site)
WITH site, ls
ORDER BY ls.lastLikedOn DESC
WITH site, collect(ls)[..3] as last3
CALL {
WITH last3
UNWIND last3 as liked
MATCH (pu:PublicUser)-[:HAD_SESSION]->(:Session)-[liked]->()
WITH pu, pu.username as lastLikedBy, liked.lastLikedOn as lastLikedOn, duration.inSeconds(liked.lastLikedOn, dateTime()).seconds as secondsSinceLastLike
ORDER BY secondsSinceLastLike
RETURN collect(pu {lastLikedBy, lastLikedOn, secondsSinceLastLike}) as last3Likes
}
RETURN site {.url, .siteName, .title, .description, .image, last3Likes} as siteInfo
We use ORDER BY followed by a collect() with a slice of that list to get the last 3 liked relationships per site.
We then use a subquery (executed per site) to work with each site's top 3 results, MATCHing back to the user who originated the like, and collecting the relevant info into a list (since the subquery is executing per site, that forms the scope, and this ensures that the collect() is with respect to the site, even if there is no grouping key explicitly present).
We then use map projection to get the properties we want from the site, including the list of the last 3 likes.
06-02-2021 04:59 PM
I can help clarify some things here.
Aggregations in Cypher use the non-aggregation terms as their grouping key, which provides the context for the aggregation (there are some exceptions to this, but they don't apply here). This applies to both built-in aggregation functions as well as custom ones, such as apoc.agg.maxItems().
Expressions that are based on an aggregation are tied to the aggregation, but to avoid more complex situations it's best to keep all elements of an aggregation as simple as possible.
In your WITH clause, the only non-aggregation term is site
(since the expression for dur
builds off the max() aggregation term), so site
becomes the grouping key: per distinct site, the aggregations will be applied.
In order to get the last 3 who liked each site, we can no longer use the max() or maxItems() aggregations, since those will only get the last liked relationship. Instead, we need to use ORDER BY to get the last 3 :LIKED_SITE relationships based on the lastLikedOn relationship, and then based on the limited result set, get the rest of what we want.
While there are several ways to do this, here's one approach:
MATCH (u:User {id: $cypherParams.userId})-[:IS]->(pu:PublicUser)-[:FOLLOWS]->(following_pu:PublicUser)-[:HAD_SESSION]->(sesh:Session)-[ls:LIKED_SITE]->(site:Site)
WITH site, ls
ORDER BY ls.lastLikedOn DESC
WITH site, collect(ls)[..3] as last3
CALL {
WITH last3
UNWIND last3 as liked
MATCH (pu:PublicUser)-[:HAD_SESSION]->(:Session)-[liked]->()
WITH pu, pu.username as lastLikedBy, liked.lastLikedOn as lastLikedOn, duration.inSeconds(liked.lastLikedOn, dateTime()).seconds as secondsSinceLastLike
ORDER BY secondsSinceLastLike
RETURN collect(pu {lastLikedBy, lastLikedOn, secondsSinceLastLike}) as last3Likes
}
RETURN site {.url, .siteName, .title, .description, .image, last3Likes} as siteInfo
We use ORDER BY followed by a collect() with a slice of that list to get the last 3 liked relationships per site.
We then use a subquery (executed per site) to work with each site's top 3 results, MATCHing back to the user who originated the like, and collecting the relevant info into a list (since the subquery is executing per site, that forms the scope, and this ensures that the collect() is with respect to the site, even if there is no grouping key explicitly present).
We then use map projection to get the properties we want from the site, including the list of the last 3 likes.
06-02-2021 05:31 PM
Andrew, thank you so much! I have a much better understanding of how this all works now. Using a subquery was the final piece of the puzzle.
Thanks for taking the time to explain how the grouping keys work with aggregations.
EDIT: just realized some of the stuff I put in that dummy dataset....
06-02-2021 05:41 PM
1 more question:
Would you mind explaining (or linking somewhere) how the "collect(pu {lastLikedBy, lastLikedOn, secondsSinceLastLike})" works? I don't think I've seen that syntax before. Specifically, I'm not sure what the pu does (although I do know that taking it out seems to break it, haha)
06-02-2021 06:20 PM
Sure, so this is a neat little feature called map projection, which lets us more easily select the properties we want for our map. When the map is based upon another map, or upon a node's properties, we can refer to existing properties and it will do the work of using those properties for the key and value. If we include a variable, then the variable name will become the key and the variable's value becomes the value. We can also add our own custom key/values the way we can with map literals.
Documentation here:
06-02-2021 06:23 PM
That is neat. Thank you!
All the sessions of the conference are now available online