How to periodic commit apoc.load.json + can apocs be nested?

I am trying to import an 80 MB JSON file with around 2 million lines,
and I am running into an out-of-heap-space error.

To prevent the error I wanted to use periodic commit, but I was not successful.

This is the command I tried:

CALL apoc.periodic.commit(
"CALL apoc.load.json('file:/master.json')
YIELD value
MERGE (p:Product {name:value.name})
WITH p, value 
UNWIND value.categories as cat
MERGE (c:Categorie {name:cat})
MERGE (c)-[r:INHERITS]->(p)",
{batchSize:100, parallel:true})

Is it even possible to nest two APOC calls?
And how can I achieve a batch import of JSON files in Neo4j?

1 ACCEPTED SOLUTION

I think I found it: there was a sneaky extra comma at the end of the UNWIND line.
I also now close the first query right after "YIELD value".

The final working code is the following:

CALL apoc.periodic.iterate(
	"CALL apoc.load.json('file:/master_small.json')
	YIELD value",
	"MERGE (p:Product {name:value.name})
	WITH value, p 
	UNWIND value.categories as cat
	MERGE (c:Categorie {name:cat})
	MERGE (c)-[r:INHERITS]->(p)",
	{batchSize:100, parallel:true})

Thank you @Cobra for sticking with me!
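
A possible refinement, offered only as a sketch and not as part of the accepted answer: because many products share the same Categorie nodes, running the batches with parallel:true can lead to lock contention or deadlocks on those nodes, so parallel:false may be the safer setting for this import. Indexes on the merged properties should also keep each MERGE from scanning all nodes. The index names below are made up for illustration, and the CREATE INDEX syntax assumes Neo4j 4.x or later.

```cypher
// Run each statement separately.

// Index-backed lookups for the two MERGE patterns (index names are hypothetical).
CREATE INDEX product_name IF NOT EXISTS FOR (p:Product) ON (p.name);
CREATE INDEX categorie_name IF NOT EXISTS FOR (c:Categorie) ON (c.name);

// First statement: only streams rows from the file.
// Second statement: does all writes, committed every batchSize rows.
CALL apoc.periodic.iterate(
  "CALL apoc.load.json('file:/master_small.json') YIELD value RETURN value",
  "MERGE (p:Product {name: value.name})
   WITH value, p
   UNWIND value.categories AS cat
   MERGE (c:Categorie {name: cat})
   MERGE (c)-[:INHERITS]->(p)",
  {batchSize: 100, parallel: false})
YIELD batches, total, failedBatches, errorMessages
RETURN batches, total, failedBatches, errorMessages;
```

Checking failedBatches and errorMessages in the result is worthwhile, since apoc.periodic.iterate() reports batch failures there instead of aborting the whole call.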


6 REPLIES

Hello @daniel.muellner

Yes, it's possible.

Did you have a look at apoc.periodic.iterate()?

Regards,
Cobra

Thank you for your confirmation.
Although I read the docs, I was confused because they do not show a nested APOC call, and I ran into weird errors because I mixed the syntax of periodic.commit() and periodic.iterate().

The corrected script is:
(I added the limit and removed the batchSize params)

CALL apoc.periodic.commit(
"CALL apoc.load.json('file:/master_small.json')
YIELD value
MERGE (p:Product {name:value.name})
WITH value, p limit $limit
UNWIND value.categories as cat
MERGE (c:Categorie {name:cat})
MERGE (c)-[r:INHERITS]->(p)",
{limit:100})

Running this code throws this error:

{
  "No element found in java.util.Arrays$ArrayItr@224f659a": 1
}

When I run it without apoc.periodic.commit(), it works with no errors though:

CALL apoc.load.json('file:/master_small.json')
YIELD value
MERGE (p:Product {name:value.name})
WITH value, p 
UNWIND value.categories as cat
MERGE (c:Categorie {name:cat})
MERGE (c)-[r:INHERITS]->(p)

I have never seen this error, but it looks like something is empty.

Can you try with apoc.periodic.iterate()?

I don't think commit is the right function for what you want to do.
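
For context, a hedged sketch of what apoc.periodic.commit() expects: it re-runs the same statement, each time in a new transaction, until the statement returns 0. The statement therefore needs a LIMIT, a final RETURN with the number of rows it handled, and a condition that excludes data handled in earlier passes. The attempt above has no RETURN, which would plausibly explain the "No element found" error, and it would re-read the whole file on every pass. The `imported` property below is hypothetical and only illustrates the pattern.

```cypher
// Each pass handles at most $limit nodes in its own transaction.
// p.imported is a hypothetical marker that keeps finished nodes out of later passes.
// RETURN count(*) tells apoc.periodic.commit when to stop: 0 means nothing is left.
CALL apoc.periodic.commit(
  "MATCH (p:Product)
   WHERE p.imported IS NULL
   WITH p LIMIT $limit
   SET p.imported = true
   RETURN count(*)",
  {limit: 1000})
```

This shape suits updates on data that is already in the graph; for streaming rows out of a file, apoc.periodic.iterate() is the closer fit.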

I will give you some more insight into my plans:

I want to import a JSON file with the following structure:
[image: screenshot of the JSON structure]
(just with a lot more entries and properties... you get the point)

My goal is to import all products as nodes and all categories as nodes.
As you can see, the categories are repetitive and hierarchical, so I want no duplicate category nodes.
Without the periodic import and with a small JSON file, the result is this:

[image: screenshot of the resulting graph]

The top-level category "Garten & Freizeit" is in the middle, the subcategories are the next-level nodes, and at the outer end there should be the products ("name" inside the JSON).


Back to your idea: periodic.iterate()
I tried the following code:

CALL apoc.periodic.iterate(
	"CALL apoc.load.json('file:/master_small.json')
	YIELD value
	MERGE (p:Product {name:value.name})",
	"WITH value, p 
	UNWIND value.categories as cat,
	MERGE (c:Categorie {name:cat})
	MERGE (c)-[r:INHERITS]->(p)",
	{batchSize:100, parallel:true})

I received the following error:
(I am not entirely sure where to split the Cypher queries, as iterate() needs two?!)

I appreciate your help a lot.
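
On the question of where to split, a minimal sketch that needs no file: the first statement of apoc.periodic.iterate() should only produce rows, and the second statement is executed for those rows in batches of batchSize, each batch in its own transaction, so all writes belong in the second statement. The `Demo` label and the `id` values are made up for illustration, and the sketch assumes an APOC version in which variables returned by the first statement can be used by name in the second.

```cypher
// Statement 1: produce the rows to work on (reads only).
// Statement 2: runs once per row, committed in batches of batchSize.
CALL apoc.periodic.iterate(
  "UNWIND range(1, 1000) AS id RETURN id",
  "MERGE (:Demo {id: id})",
  {batchSize: 100, parallel: false})
YIELD batches, total, errorMessages
RETURN batches, total, errorMessages
```

Putting a MERGE into the first statement would run it in the outer, unbatched transaction, which works against the goal of keeping memory use low.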

CALL apoc.periodic.iterate('
    CALL apoc.load.json("file:/master_small.json")
    YIELD value
    MERGE (p:Product {name:value.name})
    RETURN value, p
    ','
    WITH value, p
    UNWIND value.categories AS cat,
    MERGE (c:Categorie {name:cat})
    MERGE (c)-[r:INHERITS]->(p)
    ', {batchSize:100, parallel:true})
