Neo4j

12kunal34 · ‎05-26-2020

Hi Everyone,

I am running the query below, which deals with a large volume of data, and it never finishes.

match (t1)-[:IN|OUT]-(a1:add) where a1.add= 'abc' return collect(DISTINCT t1) AS tx

Is there any way to make this work?
I have an index on the add property. The database has around 700,000,000 nodes and around 2,400,000,000 relationships.

Cobra · ‎05-26-2020

Hello @12kunal34

How much RAM did you give to Neo4j?

Regards,
Cobra

12kunal34 · ‎05-26-2020

It is a 64 GB machine for Neo4j.

intouch_vivek · ‎05-26-2020

I hope you have an index on a1.add in place.
Try adding a label for t1. In case you have multiple labels, try them one at a time.
Are you sure you need to collect the complete node? If possible, just take the primary property of t1.
As this is just a MATCH statement, let's try parallel processing using apoc.periodic.commit().
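
To illustrate the label and property suggestions, here is a minimal sketch. The label tx and the property txid are hypothetical names, since the thread does not say what t1 is actually called; substitute the real label and key property.

```cypher
// Assumption: t1 nodes carry the label :tx and a key property txid
// (both names are made up for this sketch).
MATCH (a1:add {add: 'abc'})-[:IN|OUT]-(t1:tx)
// Collect only the key property instead of the whole node.
RETURN collect(DISTINCT t1.txid) AS txIds
```

Returning a single property should keep the result smaller than returning whole nodes, though it may not change how long the traversal itself takes.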

12kunal34 · ‎05-27-2020

@intouch.vivek

Yes, I have an index on a1.add.
I have only one label for t1 and tried the query with that label as well, but got the same result.
Yes, I need to collect the whole node, since I need to pass the nodes as a list to another query and use some of their properties there.
I believe apoc.periodic.commit() doesn't yield the results, so could you please suggest the Cypher syntax for returning results with apoc.periodic.commit()?

intouch_vivek · ‎05-27-2020

Oh yes, it should be apoc.periodic.iterate() rather than apoc.periodic.commit().

As you have two types of relationships, try executing one at a time.
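
A sketch of what running one relationship type at a time could look like, using only the names from the original query. The UNION form is one possible way to merge the two results afterwards, not something confirmed to be faster on this data.

```cypher
// Run each relationship type on its own to see which one is slow.
MATCH (a1:add {add: 'abc'})-[:IN]-(t1)
RETURN DISTINCT t1;

MATCH (a1:add {add: 'abc'})-[:OUT]-(t1)
RETURN DISTINCT t1;

// Optionally combine both; UNION (without ALL) removes duplicate rows.
MATCH (a1:add {add: 'abc'})-[:IN]-(t1)
RETURN t1
UNION
MATCH (a1:add {add: 'abc'})-[:OUT]-(t1)
RETURN t1;
```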

12kunal34 · ‎05-27-2020

@intouch.vivek
I tried with commit and got the screen below:

Could you please tell me why I am getting this?
The new query is:

CALL apoc.periodic.commit("match (t1)-[:IN]-(a1:add) where a1.add= 'abc'  WITH collect(DISTINCT t1) AS tx LIMIT $limit return tx", 
  {limit:1000}
)
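
A note on the call above: as far as the APOC documentation describes it, apoc.periodic.commit reruns its statement in separate transactions until the statement returns 0, so it is meant for batched updates rather than for handing a list back to the caller. If the aim is only to read the nodes in chunks, a plain paged query may be a simpler starting point. This is a rough sketch: the parameters $skip and $limit are assumed to be supplied by the client, and ordering by id(t1) is an assumption made here to keep the pages stable.

```cypher
// Sketch of client-driven paging; call repeatedly with
// $skip = 0, 1000, 2000, ... and $limit = 1000.
MATCH (a1:add {add: 'abc'})-[:IN]-(t1)
WITH DISTINCT t1
ORDER BY id(t1)          // stable order so pages do not overlap
SKIP $skip LIMIT $limit
RETURN collect(t1) AS tx
```

Each page still repeats the match and the sort, so this limits the size of each result rather than the total work done.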

12kunal34 · ‎05-28-2020

Yes, I tried with one relationship at a time and still got the same result.
Note: if I run both relationships together, the distinct count would be 44,000 nodes.

llpree · ‎05-27-2020

You might first try running this in two segments: first match just a1, then use WITH a1 and include the rest of the query. My guess is that the issue is with accessing the attribute, although you may need to run two queries to handle IN and OUT separately. First see whether a simple match on (a1 {add: 'abc'}) returns results in a reasonable time.

You may also want to reconsider using collect(). Why is that needed, versus just DISTINCT t1?
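
The two-segment idea could be sketched roughly as follows. It assumes the index is on the add property of the add label, as stated earlier in the thread, which is why the label is included in the lookup.

```cypher
// Step 1: check the lookup alone. With PROFILE, the plan should show
// a NodeIndexSeek on :add(add) if the index is being used.
PROFILE
MATCH (a1:add {add: 'abc'})
RETURN count(a1);

// Step 2: anchor on a1 first, then expand, returning rows
// instead of one collected list.
MATCH (a1:add {add: 'abc'})
WITH a1
MATCH (a1)-[:IN|OUT]-(t1)
RETURN DISTINCT t1;
```

If step 1 is fast and step 2 is not, the time is probably going into expanding the relationships of a1 rather than into the property lookup.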

HTH

12kunal34 · ‎05-28-2020

I tried your suggestions as well, but got the same result.
I am using collect because I need a list of nodes to pass to apoc.map.parallel2.


Query running for forever