How to save the result of a query (sub graph) in cypher

Hi all, this is my query:

match (n)-[r]->(m)
where not m.name in ['JANSSEN','PFIZER\BIONTECH','ANOFI PASTEUR','Hospital','Recovered']
  and not n.name in ['JANSSEN','PFIZER\BIONTECH','ANOFI PASTEUR','Hospital','Recovered']
return n,r,m

The result of the query will be a subgraph. How can I save this subgraph for future use or use it in some other query?

Thanks in advance

1 ACCEPTED SOLUTION

Here is a solution that I used: I collected the ids of the nodes, created a new node, and stored the list of ids on it as a property.

Step 1:

match (n)-[r]->(m)
where not m.name in ['JANSSEN','PFIZER\BIONTECH','ANOFI PASTEUR','Hospital','Recovered']
  and not n.name in ['JANSSEN','PFIZER\BIONTECH','ANOFI PASTEUR','Hospital','Recovered']
with collect(distinct id(n)) as n1, collect(distinct id(m)) as m1
with apoc.coll.union(n1, m1) as res1
with apoc.coll.sort(res1) as final
// merge on the name only and then set the ids, so that re-running updates the same node
merge (a:SubGraph {name: "xyz"})
set a.ids = final

Step 2:

match (c:SubGraph) where c.name = "xyz"
match (x) where id(x) in c.ids
return x

This displays the subgraph. Another option is to use apoc.export.json.query to write the query result to a JSON file.
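A minimal sketch of the export option, assuming APOC is installed and file export is enabled (apoc.export.file.enabled=true). The file name subgraph.json and the $excluded parameter (holding the list of names from the original query) are placeholders, not something from the thread:

```cypher
// Export the rows of the query to a JSON file in the import directory.
// $excluded is assumed to be supplied by the client as a list of names.
CALL apoc.export.json.query(
  "MATCH (n)-[r]->(m) WHERE NOT n.name IN $excluded AND NOT m.name IN $excluded RETURN n, r, m",
  "subgraph.json",
  {params: {excluded: $excluded}}
)
YIELD file, rows
RETURN file, rows;
```

The file is a snapshot: it keeps the property values as they were at export time, which the stored id list does not.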


8 REPLIES

Hi @abhishekjayant1111 ,

Unfortunately, there's no way to save a sub-graph in Neo4j itself. While this may become possible some day, once multi-graph handling is available, today you have to manage it manually.

There are two main approaches:

  1. mark the query results with a label on the nodes, and a special property on the relationships
  2. save the graph on the client-side, using it as the starting point for subsequent queries

Neo4j Bloom, for example, uses the second approach.
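The first approach could look roughly like the sketch below. The label :SavedXyz, the relationship property savedXyz and the $excluded parameter (the list of names from the original query) are hypothetical names chosen for illustration:

```cypher
// Mark every node and relationship that the query returns.
MATCH (n)-[r]->(m)
WHERE NOT n.name IN $excluded AND NOT m.name IN $excluded
SET n:SavedXyz, m:SavedXyz, r.savedXyz = true;

// Later: read the marked subgraph back, or use it as the start of another query.
MATCH (n:SavedXyz)-[r]->(m:SavedXyz)
WHERE r.savedXyz = true
RETURN n, r, m;
```

The two statements are meant to be run separately (or in a client that accepts multiple statements, such as cypher-shell). Note that this writes to the data itself, so each saved subgraph needs its own label and property name.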

Hope that helps.

Best,
ABK

Hey @ameyasoft, thanks a lot for the reply. I hadn't thought of doing it this way; it will definitely help me out.

I'm only replying here for the benefit of those who stumble across this later.

There's no problem if the data in your graph never changes once it's loaded. But be careful here, for two reasons:

1 - Internal node ids returned by the id() function are not stable outside of a transaction. They can be reused after a node is deleted, so the node with internal id 100 later on may not be the node you saved. See the docs here.

2 - The above does not take relationships into account. You might be making some assumptions based on the way the Neo4j Browser behaves when you have the "connect result nodes" option enabled. It prefetches and returns data "in the neighborhood" as a convenience feature. See what happens when you disable it and run the above.

A solution would be to create your own uuid for each node and relationship, then use something similar to the above to recreate the results. Of course, if anything in your db changes (e.g. a property value on a returned node, removed or added nodes / relationships, etc.) you will not be able to recreate what was seen in the original query. The state and existence of nodes and relationships are not preserved.
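To make the second point concrete, here is a sketch of Step 2 that asks for the relationships explicitly instead of relying on the Browser to fetch them. It assumes the :SubGraph node and its ids property from the accepted solution:

```cypher
// Return the saved nodes together with the relationships between them.
MATCH (c:SubGraph {name: "xyz"})
MATCH (x)-[r]->(y)
WHERE id(x) IN c.ids AND id(y) IN c.ids
RETURN x, r, y;
```

This returns every relationship whose two endpoints are both in the saved list. That matches the original query in this thread, but it can return more than the original result if the original query also filtered on relationship type or properties.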
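A rough sketch of the uuid idea, assuming a Neo4j version that provides the built-in randomUUID() function. The property names uuid and relUuids and the $excluded parameter (the list of names from the original query) are hypothetical:

```cypher
// One-time setup: give every node and relationship its own stable identifier.
MATCH (n) WHERE n.uuid IS NULL SET n.uuid = randomUUID();
MATCH ()-[r]->() WHERE r.uuid IS NULL SET r.uuid = randomUUID();

// Save: store the uuids of the matched relationships on a :SubGraph node.
MATCH (n)-[r]->(m)
WHERE NOT n.name IN $excluded AND NOT m.name IN $excluded
WITH collect(DISTINCT r.uuid) AS relUuids
MERGE (s:SubGraph {name: "xyz"})
SET s.relUuids = relUuids;

// Restore: each relationship brings its two end nodes with it.
MATCH (s:SubGraph {name: "xyz"})
MATCH (n)-[r]->(m)
WHERE r.uuid IN s.relUuids
RETURN n, r, m;
```

Each statement is meant to be run on its own. Storing only relationship uuids is enough here because the original query returns nothing but relationships with their end nodes; a query that also returns isolated nodes would need a second list for node uuids. On a large graph an index on the uuid property would probably be needed for the restore to be fast.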

Hey @abk, thanks for your reply. The first method is easy but also naive. Can you please explain the second method in some detail? I would appreciate the help 🙂

Abhishek - you might be able to use the procedure apoc.refactor.cloneSubgraphFromPaths.

match path=(n)-->(m)
where not m.name in ['JANSSEN','PFIZER\BIONTECH','ANOFI PASTEUR','Hospital','Recovered']
  and not n.name in ['JANSSEN','PFIZER\BIONTECH','ANOFI PASTEUR','Hospital','Recovered']
// collect all paths first so that each node is cloned once, not once per path
WITH collect(path) AS paths
CALL apoc.refactor.cloneSubgraphFromPaths(paths, {})
YIELD input, output, error
RETURN input, output, error;

You will need some way of telling the original nodes apart from their clones. See the documentation for some ideas. Or add a property that indicates "original data" and then skip that property when cloning.

The problem here is not cloning, but saving the query results so that they can be reproduced as and when required. Many users use the database, run their own analytics, and want to save the query results.

I think for that purpose saving the query is better than duplicating data, unless the query takes a very long time to run.
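One way to save the query itself inside the database is sketched below, assuming APOC is installed. The :SavedQuery label, its text property and the $excluded parameter (the list of names from the original query) are hypothetical names:

```cypher
// Store the query text under a name.
MERGE (q:SavedQuery {name: "xyz"})
SET q.text = "MATCH (n)-[r]->(m) WHERE NOT n.name IN $excluded AND NOT m.name IN $excluded RETURN n, r, m";

// Later: look the query up by name and run it again.
MATCH (q:SavedQuery {name: "xyz"})
CALL apoc.cypher.run(q.text, {excluded: $excluded}) YIELD value
RETURN value.n AS n, value.r AS r, value.m AS m;
```

The two statements are run separately. Re-running the query always reflects the current state of the graph, so this reproduces the query, not the result as it was on the day it was first run.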