cancel
Showing results for 
Search instead for 
Did you mean: 

Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.

Filtering Cypher Results

I hope I am not being a bother, I have data in Neo4j and have filtered it as shown on the attached screenshot, I trying to structure the query so that I can identify cases were more than one c.names were at the same location (l.name) during the same month and year. How can I change my query to show such results?

11 REPLIES 11

I am trying to filter for locations where I have more than one collector ID, i.e c.id>1, where would I fit this in the query, all attempts so far have resulted in errors

MATCH (c:Event)-[r:AT_PLACE]->(l:Place) WHERE NOT r.year IS null
WITH DISTINCT l.name as location, r.Year as year, r.Month as month, COUNT(DISTINCT c) AS eventNum WHERE eventNum > 1
RETURN *

is possible to add another line to say where not r.Month is null, or does cypher allow just one such within a query

Sure just use an AND in your WHERE clause:

...
WHERE r.year IS NOT NULL AND r.Month IS NOT NULL
...

I worked my query around and I am now producing results that are making sense, thanks to everyone that helped, however, I have one last question

Is there a way I make the query only return unique rows, for example there are multiple rows showing Jenyns Leonard, Henslow J.S , ELY, 7, 1821 and I only want to maintain one of these and remove duplicates.

The query that I used to get to this result is:
MATCH (c:Collector)-[:COLLECTED_WITH]->(c1:Collector)
WHERE c.name <>c1.name AND c.Year IS NOT NULL AND c.Month IS NOT NULL AND c.name <>"na" AND c1.name <>"na"
WITH DISTINCT c.place as location, c.Year as year, c.Month as month, c.name as Collector1, c1.name AS Collector2
RETURN * ORDER BY year

Are you ok with keeping both c and c1 in one column? If so, you use this:

MATCH (c:Collector)-[:COLLECTED_WITH]->(c1:Collector)
WHERE c.name <>c1.name AND c.Year IS NOT NULL AND c.Month IS NOT NULL AND c.name <>"na" AND c1.name <>"na"
WITH DISTINCT c.place as location, c.Year as year, c.Month as month, apoc.coll.sort([c.name, c1.name]) as Collectors
RETURN * ORDER BY year

Your query does return unique rows, you made sure of that by your usage of DISTINCT in the WITH clause before your return.

Your c and c1 may not be unique, but you can see there are differences in the month and/or year values for the row.

If you want to have unique c1 and c2, then you will need to perform aggregation for some of the remaining values, collecting month and years for example, such that the non-aggregation variables become the distinct grouping key (which will happen automatically when you perform the aggregation).

EDIT

Ah, I see what you mean with the mirrored results for c and c1.

Normally we'd recommend using something like WHERE id(c) < id(c1), but that wouldn't apply to your data, as the :Collector nodes seem to represent both entity and locality and time, and not a distinct entity.

If you drop the direction (this assumes that relationships are not always reciprocal), then you can filter based on the name:

MATCH (c:Collector)-[:COLLECTED_WITH]-(c1:Collector)
WHERE c.name < c1.name ...

When I use

MATCH (c:Collector)-[:COLLECTED_WITH]->(c1:Collector)
WHERE c.name <>c1.name AND c.Year IS NOT NULL AND c.Month IS NOT NULL AND c.name <>"na" AND c1.name <>"na"
WITH DISTINCT c.place as location, c.Year as year, c.Month as month, apoc.coll.sort(c.name, c1.name) as Collectors
RETURN * ORDER BY year

I am getting an error message

I will try and investigate

Put c.name and c1.name in a list and call apoc.coll.sort([c.name, c1.name]).
Sorry I missed that myself in my answer.

can I be able to call to DISTINCTS within a statement, I think that the query below will give me the result I need if it works, however, I need to only retain the distinct Collabo and the query work to the put where I add the second distinct ..

MATCH (c:Collector)-[r:VISITED]->(p:Place) WHERE NOT p.name='na'AND NOT c.name='na' AND NOT r.Month IS null
WITH c,p,r
WITH DISTINCT p.name as location, r.Year as year, r.Month as month, c.name as Collector, apoc.coll.sort([c.name, p.name]) as Collabo, WHERE DISTINCT Collbo,
RETURN * ORDER BY year

DISTINCT is for the whole row. You don’t need an extra WHERE clause.

MATCH (c:Collector)-[r:VISITED]->(p:Place) WHERE NOT p.name='na'AND NOT c.name='na' AND NOT r.Month IS null
WITH c,p,r
WITH DISTINCT p.name as location, r.Year as year, r.Month as month, c.name as Collector, apoc.coll.sort([c.name, p.name]) as Collabo 
RETURN * ORDER BY year