06-05-2020 05:44 PM
I'm new to graph and I'm evaluating if neo4j would fit my use case.
I have 2 CSV files as follows:-
(I anticipate >50 million nodes and >20 billion relationships.)
I have been able to create nodes from the Persons file and relationships from the Calls file through neo4j-admin import.
The challenge comes when deleting the relationships for a certain callDate so that I can add newer ones. It is painfully slow for large datasets.
match ()-[r: {callDate:20200101}]->() delete r;
I found out I can't index relationship properties.
Is there a way to optimize this Cypher query? How could I re-model my CSVs?
06-06-2020 03:17 AM
Hello @DanielGittx,
Yes, it's possible. This query should work:
CALL apoc.periodic.iterate('MATCH ()-[r:{callDate:20200101}]->() RETURN r', 'DELETE r', {batchSize:1000, iterateList:true})
It deletes the relationships in batches of 1000.
Regards,
Cobra
06-06-2020 11:53 AM
Hi @Cobra,
Thanks very much. The APOC call you shared does work (I just refactored the syntax a bit), but it's a bit slow for the roughly 10 billion relationships I'm working with (6 months of data).
I came across "db.index.fulltext.createRelationshipIndex" as a way of indexing a relationship property.
The index is currently populating. Hopefully the Cypher query will gain some speed once it's done.
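For reference, a minimal sketch of how such an index might be created and then used to drive the batched delete on Neo4j 3.5. These are not the exact commands used here: the index name `callsByDate` and the relationship type `CALLED` are hypothetical, since the real type is not shown in this thread. As far as I know, full-text indexes only index string values, so the sketch also assumes callDate is stored as a string; with an integer callDate the relationships would likely not be picked up by the index.

```cypher
// Hypothetical names: index 'callsByDate', relationship type CALLED.
// Assumption: callDate is a string property such as "20200101".
CALL db.index.fulltext.createRelationshipIndex('callsByDate', ['CALLED'], ['callDate']);

// Once the index is online, let it find the relationships,
// and delete them in batches of 1000 instead of scanning the whole graph.
CALL apoc.periodic.iterate(
  "CALL db.index.fulltext.queryRelationships('callsByDate', '20200101') YIELD relationship RETURN relationship AS r",
  "DELETE r",
  {batchSize: 1000, iterateList: true}
);
```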
06-06-2020 12:13 PM
Nice, happy to hear this 🙂
The apoc procedure and the index should really speed up your query
Regards,
Cobra
06-08-2020 04:09 AM
Just an update...
The indexing process is very slow.
Versions:
Neo4j Browser version: 4.0.3
Neo4j Server version: 3.5.15
It has taken 3 hours just to get to 12% (index populating).
Why is this, and is it possible to speed it up?
06-08-2020 04:18 AM
Hello @DanielGittx,
Yes, because it has to index your whole database. That's why it's better to create the index when you create the database.
Regards,
Cobra
06-08-2020 05:14 AM
Agreed. However, I had initially done a bulk import (neo4j-admin import).
Will neo4j-admin import preserve indexes if I create them in advance and then do a bulk import?
06-08-2020 05:34 AM
If I'm right, the index is set during the import.
Regards,
Cobra
06-08-2020 05:40 AM
I don't think so, especially for relationship indexes.
I marked one of your messages as the solution because I tested it with a subset of the graph and it worked (it was fast), and also because I'm solving a different issue now.
06-08-2020 05:45 AM
I don't know much more about this topic, but I think you're right. According to the docs:
"Full-text indexes are powered by the Apache Lucene indexing and search library"
so it must be pre-computed already.
Regards,
Cobra
06-14-2020 08:59 AM
Daniel,
I would suggest changing your data model so that the day of the call is a node. It would look like this:
(:Person) -[:CALLED_ON]->(:DayOfCall) <-[:RECEIVED_CALL]- (:Person)
Then you can index the day of the call on its date property, and the DELETE query will run much faster. Please note that you will still need to use apoc.periodic.iterate().
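A rough sketch of what this could look like, assuming the property on :DayOfCall is named `date` and holds the same integer format as callDate in the original question (both are assumptions). The index statement uses Neo4j 3.5 syntax; on 4.x the equivalent would be `CREATE INDEX FOR (d:DayOfCall) ON (d.date)`.

```cypher
// Assumption: :DayOfCall nodes carry a `date` property, e.g. 20200101.
// Neo4j 3.5 syntax for a regular index on a node property.
CREATE INDEX ON :DayOfCall(date);

// Look up the single day node through the index, then delete every
// relationship attached to it (CALLED_ON and RECEIVED_CALL) in batches.
CALL apoc.periodic.iterate(
  "MATCH (:DayOfCall {date: 20200101})-[r]-() RETURN r",
  "DELETE r",
  {batchSize: 1000, iterateList: true}
);
```

The day node itself is kept, so newer relationships for the same date can be attached to it afterwards.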