02-26-2019 02:12 PM
Hey,
I am currently trying to build a test set for an algorithm problem that I have.
My goal is to build a team-building app. Suppose we have a pool of 10,000 persons; each person can like or dislike other persons.
For my test set I am generating 10,000 people, each with 50-100 like relationships and 50-100 dislike relationships.
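For reference, here is a minimal sketch of how such a test set could be generated in JavaScript before anything is sent to the database. The function names, the `{ from, to, type }` row shape, and the choice to keep likes and dislikes disjoint are assumptions for illustration, not taken from the linked gist; person ids are assumed to be unique.

```javascript
// Random integer in [min, max], inclusive.
function randomInt(min, max) {
  return min + Math.floor(Math.random() * (max - min + 1));
}

// Hypothetical helper: pick up to `count` distinct ids other than `from`.
// Uses rejection sampling, which is cheap when `count` is much smaller than
// the pool. Assumes `personIds` contains no duplicates.
function pickOthers(personIds, from, count) {
  const picked = new Set();
  const limit = Math.min(count, personIds.length - 1);
  while (picked.size < limit) {
    const candidate = personIds[Math.floor(Math.random() * personIds.length)];
    if (candidate !== from) picked.add(candidate);
  }
  return [...picked]; // a Set keeps insertion order, so this is already random
}

// Build one row per relationship for every person.
function generateRelationships(personIds, min = 50, max = 100) {
  const rows = [];
  for (const from of personIds) {
    const likeCount = randomInt(min, max);
    const dislikeCount = randomInt(min, max);
    // Draw both groups from a single sample so that nobody ends up
    // both liked and disliked by the same person.
    const picked = pickOthers(personIds, from, likeCount + dislikeCount);
    picked.forEach((to, i) => {
      rows.push({ from, to, type: i < likeCount ? "LIKES" : "DISLIKES" });
    });
  }
  return rows;
}
```

For 10,000 ids this yields roughly 1.5 million rows on average, which can then be split by type and sent to the database in batches.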
While creating the test set, I noticed that there does not seem to be a way to bulk-create relationships for faster imports and test set generation (I know my test set is large, but I don't think it is huge).
The ways I have tried so far: sending each relationship as a single query (slow), and using transactions to batch multiple persons into one transaction.
Are there other ways to do this?
I am using JavaScript with the neo4j-driver package.
You can take a look at my code here: https://gist.github.com/phumberdroz/7ab207f852235f97007d7e3a19e7f7e5#file-test-js-L52-L82
02-26-2019 03:00 PM
You could try using APOC Procedures, namely apoc.periodic.iterate() to do batch processing.
Assuming you have 10k :Person nodes in your graph, this query will create 50-100 :LIKES and 50-100 :DISLIKES relationships (all outgoing) from each of those nodes to random :Person nodes in the graph. If you don't mind that this can result in both a :LIKES and a :DISLIKES relationship to the same person (which we could fix with a quick list subtraction before picking the random dislikes), this sounds like what you want:
CALL apoc.periodic.iterate("
MATCH (p:Person)
WITH collect(p) as persons
UNWIND persons as p
RETURN p, persons
",
"
WITH p, apoc.coll.removeAll(persons, [p]) as persons, toInteger(rand() * 50) + 50 as likesCount, toInteger(rand() * 50) + 50 as dislikesCount
WITH p, apoc.coll.randomItems(persons, likesCount) as likes, apoc.coll.randomItems(persons, dislikesCount) as dislikes
FOREACH (other IN likes | CREATE (p)-[:LIKES]->(other))
FOREACH (other IN dislikes | CREATE (p)-[:DISLIKES]->(other))
",
{batchSize:100}) YIELD batches, total, errorMessages
RETURN batches, total, errorMessages
This completed in 41 seconds on my MacBook Pro for 10k :Person nodes, and resulted in 1,489,204 relationships being created.
02-27-2019 06:56 AM
Hey Andrew,
Thanks for your assistance here in the community forums as well as in Slack.
I ended up using parameters and UNWIND, and it runs in a similar time span.
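For anyone finding this later, here is a minimal sketch of what the parameters-plus-UNWIND approach can look like with neo4j-driver. The actual code was not posted in this thread, so the `id` property on :Person, the batch size, and the helper names are assumptions for illustration. An index on the assumed `id` property would likely help the MATCH lookups.

```javascript
// Split an array into consecutive chunks of at most `size` items.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Cypher cannot take a relationship type as a parameter,
// so this sketch keeps one query per type.
const CREATE_QUERY = {
  LIKES: `
    UNWIND $rows AS row
    MATCH (a:Person {id: row.from})
    MATCH (b:Person {id: row.to})
    CREATE (a)-[:LIKES]->(b)`,
  DISLIKES: `
    UNWIND $rows AS row
    MATCH (a:Person {id: row.from})
    MATCH (b:Person {id: row.to})
    CREATE (a)-[:DISLIKES]->(b)`,
};

// Send `rows` (objects with `from` and `to` ids) in batches, so each query
// creates many relationships from a single $rows parameter.
// `session` is expected to be a neo4j-driver session.
async function createRelationships(session, type, rows, batchSize = 10000) {
  const query = CREATE_QUERY[type];
  if (!query) throw new Error(`Unsupported relationship type: ${type}`);
  let sent = 0;
  for (const batch of chunk(rows, batchSize)) {
    await session.run(query, { rows: batch });
    sent += batch.length;
  }
  return sent;
}

// Usage (assumes a reachable database; URI and credentials are placeholders):
//   const neo4j = require("neo4j-driver");
//   const driver = neo4j.driver(uri, neo4j.auth.basic(user, password));
//   const session = driver.session();
//   await createRelationships(session, "LIKES", likeRows);
//   await session.close();
//   await driver.close();
```

The ids in this sketch are assumed to be strings. Plain JavaScript numbers are sent by the driver as floats, so integer ids may need `neo4j.int()` to match stored integer values.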