expandConfig - Exclude nodes with specific connections

Hi folks.

TL;DR: Is it possible with APOC to include a node label in labelFilter but exclude nodes with that label that are also connected to another type of node?

I have a schema that describes the relationships between projects, operations and materials. I want to write a query that finds finalised materials that 'belong to' a particular project. This seems simple but is complicated by the fact that sometimes intermediate materials are 're-used' by secondary projects to produce more outputs and I want to exclude these from the material results of the initial project.

A simplified example of my problem can be created with the following:

create (p1:PROJECT:EXAMPLE{name:'p1'})
with p1
create (p2:PROJECT:EXAMPLE{name:'p2'})
with p1, p2
create (p1)<-[:BELONGS_TO]-(o1:OPERATION:EXAMPLE{name:'o1'})-[:OUT]->(m1:MATERIAL:EXAMPLE{name:'m1'})-[:IN]->(o2:OPERATION:EXAMPLE{name:'o2'})-[:OUT]->(m2:MATERIAL:EXAMPLE{name:'m2'})-[:IN]->(o3:OPERATION:EXAMPLE{name:'o3'})-[:OUT]->(m3:MATERIAL:USEFUL:EXAMPLE{name:'m3'})
create (p2)<-[:BELONGS_TO]-(o4:OPERATION:EXAMPLE{name:'o4'})<-[:IN]-(m2)
create (o4)-[:OUT]->(m4:MATERIAL:USEFUL:EXAMPLE{name:'m4'})

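This gives two chains that share 'm2': (o1)-[:OUT]->(m1)-[:IN]->(o2)-[:OUT]->(m2)-[:IN]->(o3)-[:OUT]->(m3), plus (m2)-[:IN]->(o4)-[:OUT]->(m4), with 'o1' belonging to 'p1' and 'o4' belonging to 'p2'.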

Note that operations produce materials and also belong to projects. Note also that material 'm2' has been re-used by project 'p2' to produce material 'm4'.

My first attempt at the query just uses apoc.path.expandConfig:

match (p:PROJECT{name:'p1'})<-[:BELONGS_TO]-(op:OPERATION)
call apoc.path.expandConfig(op, {relationshipFilter:'IN>|OUT>', 
labelFilter:'OPERATION|MATERIAL|>USEFUL',
uniqueness: 'NODE_GLOBAL'
}) yield path
with last(nodes(path)) as usefulMaterials
return usefulMaterials.name as name

This does return 'm3', but it also incorrectly returns 'm4', the material that results from project 'p2' re-using 'm2'.
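To see why 'm4' is reachable, it can help to list the route the traversal takes. The query below is only a diagnostic sketch in plain Cypher (no APOC), assuming the example data created above is the only data carrying these names:

```cypher
// List the node names along every outgoing IN/OUT path from 'o1' to 'm4'.
MATCH path = (:OPERATION {name: 'o1'})-[:IN|OUT*]->(:MATERIAL {name: 'm4'})
RETURN [n IN nodes(path) | n.name] AS route
```

On the example graph the only route is o1, m1, o2, m2, o4, m4. It passes through 'o4', which belongs to 'p2', and nothing in the labelFilter distinguishes 'o4' from 'o2' or 'o3'.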

My second attempt uses blacklistNodes to exclude a pre-matched set of operation nodes that belong to projects:

match (o:OPERATION)-[:BELONGS_TO]->(p:PROJECT)
with collect(o) AS projectOps
match (p:PROJECT{name:'p1'})<-[:BELONGS_TO]-(op:OPERATION)
call apoc.path.expandConfig(op, {relationshipFilter:'IN>|OUT>', 
labelFilter:'OPERATION|MATERIAL|>USEFUL',
uniqueness: 'NODE_GLOBAL',
blacklistNodes: projectOps
}) yield path
with last(nodes(path)) as usefulMaterials
return usefulMaterials.name as name

This approach correctly returns only 'm3' for 'p1', and if I swap to 'p2', it correctly returns 'm4'.

So it works, but seems very inefficient. As my DB grows larger, the set of projectOps nodes is going to get very large.

Is it possible with APOC to include the OPERATION label in labelFilter but exclude OPERATION nodes that are also connected to a project, i.e. exclude any node matching:

(o:OPERATION)-[:BELONGS_TO]->(p:PROJECT)

ACCEPTED SOLUTION

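I have had another crack at this. It's not elegant, but I can use two APOC traversals as shown below. The blacklist now only holds owned operations that are reachable from the starting project, so this should not blow out as the DB gets larger. I would still like a more concise solution though.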

//First traversal to find child ops that belong to other projects
match (p:PROJECT{name:'p1'})<-[:BELONGS_TO]-(op:OPERATION)
call apoc.path.expandConfig(op, {relationshipFilter:'IN>|OUT>', 
	labelFilter:'>OPERATION|MATERIAL',
	uniqueness: 'NODE_GLOBAL'
	}) yield path
with last(nodes(path)) as o
match(o)-[:BELONGS_TO]->(pi:PROJECT)
with collect(o) AS projectOps

//Second traversal avoiding 'owned' ops
match (p:PROJECT{name:'p1'})<-[:BELONGS_TO]-(op:OPERATION)
call apoc.path.expandConfig(op, {relationshipFilter:'IN>|OUT>', 
	labelFilter:'OPERATION|MATERIAL|>USEFUL',
	uniqueness: 'NODE_GLOBAL',
	blacklistNodes: projectOps
	}) yield path
with last(nodes(path)) as usefulMaterials
return usefulMaterials.name as name
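For comparison, the same exclusion can be sketched as a single query in plain Cypher, using a variable-length pattern and a predicate over the nodes of each path. This is an untested sketch rather than a drop-in replacement. It assumes IN and OUT relationships only ever connect OPERATION and MATERIAL nodes, so the label whitelist can be dropped:

```cypher
// Start from the operations owned by the project of interest.
MATCH (p:PROJECT {name: 'p1'})<-[:BELONGS_TO]-(op:OPERATION)
// Follow outgoing IN/OUT relationships to any USEFUL material.
MATCH path = (op)-[:IN|OUT*]->(m:USEFUL)
// Reject paths that pass through a downstream operation owned by a project.
// nodes(path)[1..] skips the start operation, which is owned by construction.
WHERE none(n IN nodes(path)[1..]
           WHERE n:OPERATION AND (n)-[:BELONGS_TO]->(:PROJECT))
RETURN DISTINCT m.name AS name
```

On the example data this should return only 'm3' for 'p1' and 'm4' for 'p2'. One trade-off to keep in mind: an unbounded variable-length pattern enumerates every matching path, so on a large or highly branching graph it may be slower than the APOC traversals with NODE_GLOBAL uniqueness.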

I have reviewed this with a colleague and we have decided that this solution is not too horrible.

Hello @mdausmann.cmri

Neo4j supports multiple labels per node, so you can add an extra label to the nodes you want to exclude.

Regards,
Cobra

Hi @Cobra

Thanks for taking the time to look at my question.

I agree. If the operation nodes that belong to a project had an additional label, e.g.

(o4:OPERATION:OWNED:EXAMPLE{name:'o4'})

then I could use this syntax in my labelFilter:

labelFilter:'-OWNED|OPERATION|MATERIAL|>USEFUL',

But I kind of think that the relationship between the :OPERATION and :PROJECT nodes already represents the 'owned' concept in my schema and that adding a label would be redundant and require extra work to keep up to date. I was really hoping to solve this without adding additional labels or attributes.

Regards,
Michael
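If the extra label were adopted, the upkeep mentioned above could be handled by two small maintenance queries. This is a sketch only: :OWNED is the hypothetical label from the post above, and the queries would need to be re-run (or wired into whatever creates and deletes BELONGS_TO relationships) to stay current.

```cypher
// Add the hypothetical :OWNED label to every operation that belongs to a project.
MATCH (o:OPERATION)-[:BELONGS_TO]->(:PROJECT)
SET o:OWNED;

// Remove the label from operations that no longer belong to any project.
MATCH (o:OPERATION:OWNED)
WHERE NOT (o)-[:BELONGS_TO]->(:PROJECT)
REMOVE o:OWNED;
```

This trades some write-time bookkeeping for a simpler traversal, which is exactly the redundancy the post above wants to avoid.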