Neo4j

igordata · ‎08-18-2020

Hi everyone,

I have a node and I want to get all nodes connected to it within several steps.

MATCH (n:tx)-[:tx2tx*1..3]-(m:tx)
WHERE ID(m)=1791789334
RETURN *

What I get is all possible paths containing node m, about 2k paths in total. I don't need the path information at all. There are only about 40 nodes and 60 relationships involved. How can I get just a list of nodes and a list of relationships?

Adding DISTINCT doesn't help, and neither does collect().

Cobra · ‎08-18-2020

Hello @igordata and welcome to the Neo4j community

You are not too far away:

MATCH (n:tx)-[r:tx2tx*1..3]-(m:tx)
WHERE ID(m)=1791789334
RETURN collect(DISTINCT m) AS nodes, collect(DISTINCT r) AS relations

Regards,
Cobra

igordata · ‎08-18-2020

Thanks, but it returns the same nodes several times, and all the relations come in arrays of 3. They look like the relationships of each path, in the order the path has them. Is there a way to get only the nodes and relations, as just two lists?

Cobra · ‎08-18-2020

Can you show me what it returns with the query and what you would like to get please (a little example)?

igordata · ‎08-18-2020

This is part of the data I get. The relationship with id 1546728594 is included about 16 times in my data.

,
    "relations": [
      [
        {
          "identity": 1546728594,  <-- Here it goes
          "start": 1927833568,
          "end": 1791789334,
          "type": "tx2tx",
          "properties": {

          }
        }
      ],
      [
        {
          "identity": 1546728593,
          "start": 1927833568,
          "end": 1785591632,
          "type": "tx2tx",
          "properties": {

          }
        },
        {
          "identity": 1546728594, <-- and here
          "start": 1927833568,
          "end": 1791789334,
          "type": "tx2tx",
          "properties": {

          }
        }
      ],
      [
        {
          "identity": 1803520488,
          "start": 2005583212,
          "end": 1785591632,
          "type": "tx2tx",
          "properties": {

          }
        },
        {
          "identity": 1546728593,
          "start": 1927833568,
          "end": 1785591632,
          "type": "tx2tx",
          "properties": {

          }
        },
        {
          "identity": 1546728594,  <-- and again
          "start": 1927833568,
          "end": 1791789334,
          "type": "tx2tx",
          "properties": {

          }
        }
      ],
      [
        {
          "identity": 1679121836,
          "start": 1968412308,
          "end": 1785591632,
          "type": "tx2tx",
          "properties": {

          }
        },
        {
          "identity": 1546728593,
          "start": 1927833568,
          "end": 1785591632,
          "type": "tx2tx",
          "properties": {

          }
        },
        {
          "identity": 1546728594,  <-- and so on...
          "start": 1927833568,
          "end": 1791789334,
          "type": "tx2tx",
          "properties": {

          }
        }
      ],

Sorry if I didn't explain the problem clearly enough. I need a list of unique relationships so that I can go more steps from my starting node. With 0..5 from m everything becomes extremely huge and slow. If I could get rid of the paths, that would save me lots of time and memory.
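For readers following along: the nested arrays appear because a variable bound to a variable-length relationship pattern holds one list of relationships per matched path, so collect(DISTINCT r) deduplicates whole lists rather than individual relationships. A minimal diagnostic sketch, assuming the same labels and node id as above (the aliases are made up for illustration), that compares the number of paths with the number of distinct relationships and nodes behind them:

```cypher
// Bind the whole path, then unwind its relationships one by one.
MATCH p = (n:tx)-[:tx2tx*1..3]-(m:tx)
WHERE id(m) = 1791789334
UNWIND relationships(p) AS rel
RETURN count(DISTINCT p)   AS paths,               // one entry per matched path
       count(DISTINCT rel) AS uniqueRelationships, // each relationship counted once
       count(DISTINCT n)   AS reachableNodes       // each end node counted once
```

On data like that described in the first post, the first number would be expected to be far larger than the other two.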

Cobra · ‎08-18-2020

MATCH (n:tx)-[r:tx2tx*1..3]-(m:tx)
WHERE ID(m)=1791789334
RETURN apoc.coll.toSet(apoc.coll.flatten(collect(m))) AS nodes,
       apoc.coll.toSet(apoc.coll.flatten(collect(r))) AS relations

igordata · ‎08-18-2020

I got an error about an unknown APOC function. I'll read up on how to enable it; it looks like a must-have. Thanks!

Fanny · ‎12-03-2021

Thank you very much! APOC is so powerful! Your solution downsized my JSON export from 73 MB (with so many duplicates) to 500 KB.

Cobra · ‎08-18-2020

Yeah, you must install the APOC plugin.

If you can't do it in Cypher, APOC maybe can.
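A quick way to check whether APOC is available is to call one of its functions. This is only a sketch: apoc.version() is an APOC function, but how the plugin gets installed depends on your Neo4j version and deployment, so treat the comments as assumptions.

```cypher
// Returns the installed APOC version as a string.
// If the plugin is missing, this fails with an "Unknown function" error,
// the same kind of error reported above.
RETURN apoc.version() AS apocVersion
```

On a self-managed server the APOC jar usually goes into the plugins directory, followed by a restart; Neo4j Desktop offers it as an installable plugin.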

igordata · ‎08-18-2020

I found this kind of solution:

// uniq
MATCH ptx=(n:tx)-[:tx2tx*1..3]-(m:tx)
MATCH po=(o:output)-[rtxo]-(n)
WHERE ID(m)=1791789334
UNWIND nodes(ptx) as node
UNWIND relationships(ptx) as relationship
UNWIND nodes(po) as outputs
RETURN collect(distinct node) as nodes,
       collect(distinct relationship) as relationships,
       collect(distinct outputs) as outputs

Not sure how beautiful it is, but it works fine for now

igordata · ‎08-18-2020

It's not a perfect solution because I'm still getting duplicates from po that are already in ptx, but I'll stick with it for this week.
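One likely source of those duplicates: nodes(po) returns every node on the path po, so it contains the tx node n as well as the output node o. A minimal sketch that collects o directly instead, assuming the same labels and node id as the query above (the alias outputRelationships is made up here, and this was not run against the original data):

```cypher
MATCH ptx = (n:tx)-[:tx2tx*1..3]-(m:tx)
WHERE id(m) = 1791789334
// OPTIONAL MATCH keeps tx nodes that have no output attached
OPTIONAL MATCH (o:output)-[rtxo]-(n)
UNWIND nodes(ptx) AS node
UNWIND relationships(ptx) AS relationship
RETURN collect(DISTINCT node) AS nodes,
       collect(DISTINCT relationship) AS relationships,
       // collect the output nodes themselves, not nodes(po),
       // so no tx nodes leak into this list
       collect(DISTINCT o) AS outputs,
       collect(DISTINCT rtxo) AS outputRelationships
```

collect() skips null values, so tx nodes without outputs add nothing to the last two lists.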

Cobra · ‎08-18-2020

You could merge the lists with a list comprehension.

igordata · ‎08-18-2020

Sorry, how? Could you please provide an example?

I have already grown my query into this:

// uniq with outputs with fringe
MATCH ptx=(m:tx)-[:tx2tx*1..3]-(n:tx)-[:tx2tx]-(f:tx) WHERE ID(m)=1791789334
MATCH po=(o:output)-[rtxo]-(n)
MATCH pa=(a:addr)-[ra]-(o)
UNWIND n as txnode
UNWIND f as fnode
UNWIND relationships(ptx) as txrel
UNWIND o as onode
UNWIND relationships(po) as orel
UNWIND a as anode
UNWIND ra as arel
RETURN
collect(distinct txnode) as txnodes,
collect(distinct txrel) as txrels,
collect(distinct onode) as onodes,
collect(distinct orel) as orels,
collect(distinct fnode) as fnodes,
collect(distinct anode) as anodes,
collect(distinct arel) as arels

I get the n nodes in the f nodes list, which is bad. Is there a way to politely ask for the f nodes to be only those that are not already n nodes? I mean, I get 30 n nodes and 300 f nodes, and those 30 n nodes are included in f.

Thank you, btw
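For reference, a list comprehension can do that filtering once both lists have been collected. The following is a minimal sketch, assuming the same labels and node id as above; the names tx, candidates and c are made up for illustration, and the outputs and addresses are left out to keep it short:

```cypher
MATCH (m:tx)-[:tx2tx*1..3]-(n:tx)
WHERE id(m) = 1791789334
// inner nodes: everything reachable within 1..3 hops of m
WITH m, collect(DISTINCT n) AS txnodes
UNWIND txnodes AS tx
// fringe candidates: any tx neighbour of an inner node
MATCH (tx)-[:tx2tx]-(f:tx)
WITH m, txnodes, collect(DISTINCT f) AS candidates
RETURN txnodes,
       // list comprehension: keep only candidates that are neither
       // the start node nor already among the inner nodes
       [c IN candidates WHERE c <> m AND NOT c IN txnodes] AS fnodes
```

The comprehension has the form [x IN list WHERE predicate]; here the predicate is a membership test against the first list.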

ameyasoft · ‎08-18-2020

Try this:

MATCH (c) WHERE id(c) = 1791789334 CALL apoc.path.subgraphAll(c, {}) YIELD nodes, relationships
UNWIND nodes as n1
UNWIND relationships as r1
RETURN distinct type(r1) as rel, count(r1) as Cnt2, labels(n1) as lbl, count(n1) as Cnt order by rel

igordata · ‎08-20-2020

APOC rules.
I ended up with this, and it works great and damn fast:


MATCH (t:tx) where id(t) = 1791789334
CALL apoc.path.subgraphAll(t, {
    relationshipFilter: "tx2tx",
    minLevel: 0,
    maxLevel: 4
})
YIELD nodes, relationships
UNWIND nodes as node
WITH node, nodes, relationships
OPTIONAL MATCH po=(a:addr)--(o:output)--(node)
OPTIONAL MATCH pf=(tf:tx)--(o) WHERE NOT tf IN nodes
RETURN
nodes,
relationships,
apoc.coll.toSet(apoc.coll.flatten(collect(o))) AS onodes,
apoc.coll.toSet(apoc.coll.flatten(collect(relationships(po)))) AS orels,
apoc.coll.toSet(apoc.coll.flatten(collect(a))) AS anodes,
apoc.coll.toSet(apoc.coll.flatten(collect(tf))) AS fnodes,
apoc.coll.toSet(apoc.coll.flatten(collect(relationships(pf)))) AS frels

Thank you guys for your help!

How to avoid getting paths and get only list of unique nodes and relationships?