Return edgelist for subgraph efficiently

JaHo
Node Clone

Hello everyone,

I would like to return a subgraph in edgelist form for a relationship (:CITES) linking nodes of the same label (Family) where nodes are filtered by several conditions.

I have a query which mostly does what I want but seems somewhat inefficient as it duplicates the filter steps:

MATCH (sc:SubclassCPC)<-[:CLASSIFIED_AS]-(a1:Application)-[:BELONGS_TO]->(f1:Family)
MATCH (sc:SubclassCPC)<-[:CLASSIFIED_AS]-(a2:Application)-[:BELONGS_TO]->(f2:Family)
MATCH (a1)-[:FILED_IN]-(c:Country)
MATCH (a2)-[:FILED_IN]-(c:Country)
WHERE 
   sc.code IN ["D01F"] AND 
   a1.granted = true AND a2.granted = true AND
   a1.filing_year >= 1990 AND a2.filing_year >= 1990 AND
   a1.filing_year <= 2014 AND a2.filing_year <= 2014 AND
   c.code IN ["EU", "US"]
MATCH (f1)-[:CITES]->(f2)
RETURN DISTINCT
    f1.family_id AS citing_id, 
    f2.family_id AS cited_id

I'm sure it is possible to filter families once and then return all the :CITES relationships among the filtered set of families (in edgelist form) but I couldn't find the syntax. Does anyone have any suggestions?

5 REPLIES

ameyasoft
Graph Maven
Try this:
MATCH (sc:SubclassCPC)<-[:CLASSIFIED_AS]-(a1:Application)-[:FILED_IN]-(c:Country)
WHERE 
   sc.code IN ["D01F"] AND 
   a1.granted = true AND
   a1.filing_year >= 1990 AND
   a1.filing_year <= 2014 AND
   c.code IN ["EU", "US"]

WITH a1

MATCH (a1)-[:BELONGS_TO]->(f1:Family)-[:CITES]->(f2:Family)
RETURN DISTINCT
    f1.family_id AS citing_id, 
    f2.family_id AS cited_id

Now that you typed it, it seems so obvious. Thanks a lot, this worked like a charm!

On a second note, doesn't this mean that only the citing families (f1) are filtered, and that the returned cited families (f2) may include families to which the filters do not apply?

If the Family nodes also have filing_year and country code properties, then you can add those filters before the RETURN statement.

The problem is that the filtering is done on the applications, not on the families directly. I would like to extract the families to which the filtered applications belong, and then obtain all citing relationships between these families, such that both the citing and the cited family come from the set of families selected indirectly through the application filters. That's why my original code duplicated much of the filtering with the variables a1 and a2.
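
One possible way to express "filter once, then keep only citations inside the filtered set" is to collect the families into a list and test both endpoints against it. The query below is an untested sketch, not a confirmed solution from this thread. It reuses the labels, relationship types and properties from the queries above; the variable names a, f and families are introduced here for illustration. Note one assumed difference from the original query: there, both applications had to match the same Country node c, whereas here each family only needs at least one qualifying application in either country.

```cypher
// Step 1: apply the application filters once and collect the
// distinct families that the matching applications belong to.
MATCH (sc:SubclassCPC)<-[:CLASSIFIED_AS]-(a:Application)-[:FILED_IN]-(c:Country)
WHERE
   sc.code IN ["D01F"] AND
   a.granted = true AND
   a.filing_year >= 1990 AND
   a.filing_year <= 2014 AND
   c.code IN ["EU", "US"]
MATCH (a)-[:BELONGS_TO]->(f:Family)
WITH collect(DISTINCT f) AS families

// Step 2: expand :CITES from each filtered family and keep only
// the edges whose cited family is also in the filtered set.
UNWIND families AS f1
MATCH (f1)-[:CITES]->(f2:Family)
WHERE f2 IN families
RETURN DISTINCT
    f1.family_id AS citing_id,
    f2.family_id AS cited_id
```

Because the list membership test `f2 IN families` is a linear scan, this may become slow for very large family sets; in that case it could be worth comparing its PROFILE output against the original two-sided query.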