01-13-2020 08:28 AM
I have a database with articles and tags, related as (tag)<-[:CONTAINS]-(article). I would like to find the most popular tags connected to a certain topic. For example, what are the 10 most popular tags such that at least one connected article contains the text "python"? I have tried the following query, but I seem to be getting duplicate results:
Match (t:Tag)<-[:CONTAINS]-(a:Article)
Where a.text Contains "python"
With t, Size((t)<-[:CONTAINS]-()) as s
Return t.tag, s
Order by s DESC Limit 10
What's the right way to do this?
01-13-2020 01:14 PM
In your query, you first find all tags of a certain topic, but the Size(...) expression then counts every article connected to each tag, so you end up ranking by popularity across all topics. You also need to use distinct.
Try this:
Match (t:Tag)<-[:CONTAINS]-(a:Article)
Where a.text Contains "python"
With distinct t as t, count(distinct a) as s
Return t.tag, s
Order by s DESC Limit 10
01-13-2020 01:32 PM
That can work, though that query will only give you the count of articles containing 'python'. Also, the WITH distinct isn't needed; WITH alone will work, since when you aggregate, the non-aggregation variables (the grouping key) become distinct automatically (so t will become distinct because of the count() aggregation).
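To illustrate the grouping-key point, here is a minimal sketch of the same query with plain WITH, assuming the Tag/Article model from the question. It has not been run against your data, so treat it as a starting point:

```cypher
MATCH (t:Tag)<-[:CONTAINS]-(a:Article)
WHERE a.text CONTAINS "python"
// t is the only non-aggregated variable, so it acts as the grouping key:
// each tag produces exactly one row, with no DISTINCT needed on t
WITH t, count(DISTINCT a) AS s   // s counts only the matching ("python") articles
RETURN t.tag, s
ORDER BY s DESC LIMIT 10
```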
To get the degree of your tag nodes, though, you WILL need the distinct keyword before you compute the degree, since otherwise each tag appears once per matching article:
Match (t:Tag)<-[:CONTAINS]-(a:Article)
Where a.text Contains "python"
With distinct t
With t, Size((t)<-[:CONTAINS]-()) as s
Order by s DESC Limit 10
Return t.tag, s
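If you want both numbers side by side (how many "python" articles use each tag, and the tag's overall degree), something like the following might work. This is only a sketch assuming the same Tag/Article/CONTAINS model as above; the aliases pythonArticles and totalArticles are made up for illustration.

```cypher
MATCH (t:Tag)<-[:CONTAINS]-(a:Article)
WHERE a.text CONTAINS "python"
// Aggregating makes t the grouping key, so each tag yields one row
WITH t, count(DISTINCT a) AS pythonArticles
// Degree over all connected articles, computed once per tag
WITH t, pythonArticles, size((t)<-[:CONTAINS]-()) AS totalArticles
ORDER BY totalArticles DESC LIMIT 10
RETURN t.tag, pythonArticles, totalArticles
```

Note that size() over a pattern is the syntax used in this thread; newer Neo4j releases may expect a COUNT { ... } subquery instead, so check the documentation for your version.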