09-30-2018 04:05 AM
Hello,
I tried the apoc.periodic.iterate.sub-batching.cypher example code from https://gist.github.com/jexp/caeb53acfe8a649fecade4417fb8876a, but it failed with the error below.
Failed to invoke procedure apoc.periodic.iterate
: Caused by: org.neo4j.cypher.internal.util.v3_4.SyntaxException: Unknown function 'apoc.coll.partition' (line 2, column 7 (offset: 67))
apoc.periodic.iterate.sub-batching.cypher
CALL apoc.periodic.iterate(
"LOAD CSV WITH HEADERS FROM 'FILE:///dropd_noun.csv' AS line
WITH apoc.coll.partition(collect(line),10000) AS batchesOfLines
UNWIND batchesOfLines as batch
RETURN batch",
"UNWIND {batch} AS word
MERGE (w:Word {word: word.sentence_noun})",
{batchSize: 1, parallel: true});
I wrote Cypher similar to the apoc.periodic.iterate.sub-batching.cypher example above, without APOC, and it works. dropd_noun.csv has a single column, sentence_noun.
LOAD CSV WITH HEADERS FROM 'FILE:///dropd_noun.csv' AS line
WITH collect(line) AS nounlists
UNWIND nounlists AS nounlist
CREATE (w:Word {word:nounlist.sentence_noun} )
Thank you,
09-30-2018 05:27 AM
Which apoc version do you have?
Does it find the function otherwise?
Why do you do it so complicated? It's all built into periodic iterate.
CALL apoc.periodic.iterate(
"LOAD CSV WITH HEADERS FROM 'FILE:///dropd_noun.csv' AS line
RETURN line",
"MERGE (w:Word {word: line.sentence_noun})",
{batchSize: 10000, iterateList:true, parallel: true});
09-30-2018 11:36 AM
Here are the answers.
~/neo4j/plugins$ ls
apoc-3.4.0.3-all.jar
neo4j> CALL apoc.coll.partition([1,2,3,4,5,6], 5) YIELD value
RETURN value;
+-----------------+
| value |
+-----------------+
| [1, 2, 3, 4, 5] |
| [6] |
+-----------------+
2 rows available after 6 ms, consumed after another 1 ms
3. Why do you do it so complicated? It's all built into periodic iterate.
I am testing apoc.periodic.iterate because I read the comment below on GitHub:
https://github.com/neo4j-contrib/neo4j-apoc-procedures/issues/714
3. Use apoc.periodic.iterate
* the biggest benefit of using iterate() is you don't need a large Heap memory anymore;
* iterate() can split large update into smaller batches to execute and submit, which keeps the Heap memory usage low;
* iterate() can also leverage the CPU power by running updates in parallel (set parallel:true). This works particularly well for SSD, but avoid using it on mechanical HD;
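As a rough sketch of the batching described in the points above (not taken from the linked issue), the call below commits in batches of 10000 rows and keeps parallel set to false, which may suit a mechanical disk. It assumes the same dropd_noun.csv file used in this thread, and that the installed APOC version exposes the batches, total, and errorMessages result columns.

```cypher
// Sketch only: sequential batched import, checking the summary afterwards.
CALL apoc.periodic.iterate(
  // Outer statement: stream one row per CSV line.
  "LOAD CSV WITH HEADERS FROM 'file:///dropd_noun.csv' AS line RETURN line",
  // Inner statement: runs once per row, committed every batchSize rows.
  "MERGE (w:Word {word: line.sentence_noun})",
  {batchSize: 10000, iterateList: true, parallel: false})
YIELD batches, total, errorMessages
RETURN batches, total, errorMessages;
```

If errorMessages comes back non-empty, some batches failed and are worth inspecting before switching parallel to true.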
Thank you,
Sun
10-01-2018 04:58 AM
So partition is a procedure, not a function, so it would have to be called differently in your first example.
But I wouldn't recommend that kind of use anyway, unless you really know why you need that approach. I suggest using the built-in batching instead.
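For illustration only, here is a sketch of how the first query might look with partition called as a procedure (CALL ... YIELD value), the same form that worked in the shell session earlier in the thread. It has not been run against this dataset, and the aliases lines and batch are just names chosen here.

```cypher
// Sketch only: sub-batching with apoc.coll.partition called as a procedure.
CALL apoc.periodic.iterate(
  // Outer statement: collect all rows, then split them into lists of 10000.
  "LOAD CSV WITH HEADERS FROM 'file:///dropd_noun.csv' AS line
   WITH collect(line) AS lines
   CALL apoc.coll.partition(lines, 10000) YIELD value AS batch
   RETURN batch",
  // Inner statement: each outer row is one list of lines, unwound here.
  "UNWIND {batch} AS word
   MERGE (w:Word {word: word.sentence_noun})",
  {batchSize: 1, parallel: true});
```

Note that collect(line) holds the whole file in memory before partitioning, which is one reason the built-in batchSize and iterateList options are usually the simpler choice.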