Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
05-28-2020 09:47 PM
Hello,
I have to have an Unique Identifier on one of the node is a 38 character string, eg - "37300097819223399953200236708848436046".
Is there anyway to optimize this kind of Unique Identifier. I have an index on this. This fails to load the 9M rows.
I have around 9M records which I am loading into graph with below command -
CALL apoc.periodic.iterate("
UNWIND ['activity.csv'] AS file CALL apoc.load.csv('file:///dcm/' + file, {header:true, sep:',', quoteChar:'"', nullValues:['']}) YIELD
map as row WHERE row.ecid is not null return row
","
WITH row
MERGE (ec:ec_id {ec_id:row.ecid})
ON CREATE SET ec+={create_dt:datetime()}
", {batchSize: 10000, iterateList: true, parallel: false})
Community Edition Server -
Server Size - 16 Core 128GB
Server is running on 70% CPU when loading data
Memory Configs -
dbms.memory.heap.initial_size=31g
dbms.memory.heap.max_size=31g
dbms.memory.pagecache.size=73g
dbms.tx_state.max_off_heap_memory=8g
I am seeing below logs in debug.log and database becomes inaccessible.
2020-05-29 04:34:54.697+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=5933, gcTime=5972, gcCount=2}
2020-05-29 04:35:01.850+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=5951, gcTime=5963, gcCount=2}
2020-05-29 04:35:08.773+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=5910, gcTime=5916, gcCount=2}
2020-05-29 04:35:15.831+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=6021, gcTime=6102, gcCount=2}
2020-05-29 04:35:22.785+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=6131, gcTime=6173, gcCount=2}
2020-05-29 04:35:29.434+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=5847, gcTime=5935, gcCount=3}
2020-05-29 04:35:36.352+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=6082, gcTime=6097, gcCount=2}
2020-05-29 04:35:42.847+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=5680, gcTime=5773, gcCount=2}
2020-05-29 04:35:49.551+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=6101, gcTime=6196, gcCount=2}
2020-05-29 04:35:56.490+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=6137, gcTime=6181, gcCount=2}
2020-05-29 04:36:03.019+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=5927, gcTime=5974, gcCount=3}
2020-05-29 04:36:09.480+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=5960, gcTime=6023, gcCount=2}
2020-05-29 04:36:15.959+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=5978, gcTime=6049, gcCount=3}
2020-05-29 04:36:22.346+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=5885, gcTime=5933, gcCount=2}
2020-05-29 04:36:28.693+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=5945, gcTime=5987, gcCount=2}
2020-05-29 04:36:35.028+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=5807, gcTime=5881, gcCount=2}
2020-05-29 04:36:41.493+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=6063, gcTime=6132, gcCount=3}
2020-05-29 04:36:47.641+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=5747, gcTime=0, gcCount=0}
2020-05-29 04:36:53.606+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=5764, gcTime=5796, gcCount=2}
2020-05-29 04:36:59.946+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=6039, gcTime=6083, gcCount=3}
2020-05-29 04:37:05.837+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=5590, gcTime=5618, gcCount=2}
2020-05-29 04:37:11.823+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=5669, gcTime=5759, gcCount=2}
05-30-2020 01:38 AM
Do you have a constraint on the label and key you merge on?
What neo4j version do you use?
06-01-2020 04:39 PM
Hi @michael.hunger,
Sorry for the late response. Here is how I created the index. Yes, I do have a Unique constraint and I do not have a key as I was using Community Edition.
CREATE CONSTRAINT UQ_EC_ID
ON (n:ec_id)
ASSERT n.ec_id IS UNIQUE;
I am on Community Edition Neo4j 4.0.4.
06-01-2020 05:21 PM
This is really weird, could you try to use LOAD CSV instead of apoc.load.csv and see if it persists?
06-03-2020 10:35 AM
Hi @michael.hunger,
I tried using LOAD CSV as per your suggestion and I am seeing those errors just once in a while but they are going away ultimately. Attaching log below. We've used APOC for all our ingestion jobs and seeing this error in other places too makinf DB unresponsive. Especially when the CSV file size is big (>9M rows around). Is LOAD CSV recommended over APOC procedures for such cases?
Also, it will be helpful if we know what actually this error mean.
2020-06-03 17:35:49.315+0000 INFO [o.n.c.i.ExecutionEngine] [idgraph] Discarded stale query from the query cache after 239 seconds: CALL dbms.listConfig()
2020-06-03 17:35:54.921+0000 INFO [o.n.k.d.Database] [idgraph] Rotated to transaction log [/neo4jdrive/data/transactions/idgraph/neostore.transaction.db.4] version=3, last transaction in previous log=5466, rotation took 35 millis, started after 87672 millis.
2020-06-03 17:37:28.525+0000 INFO [o.n.k.d.Database] [idgraph] Rotated to transaction log [/neo4jdrive/data/transactions/idgraph/neostore.transaction.db.5] version=4, last transaction in previous log=7202, rotation took 37 millis, started after 93567 millis.
2020-06-03 17:38:57.762+0000 INFO [o.n.k.d.Database] [idgraph] Rotated to transaction log [/neo4jdrive/data/transactions/idgraph/neostore.transaction.db.6] version=5, last transaction in previous log=9176, rotation took 34 millis, started after 89203 millis.
2020-06-03 17:40:29.268+0000 INFO [o.n.k.d.Database] [idgraph] Rotated to transaction log [/neo4jdrive/data/transactions/idgraph/neostore.transaction.db.7] version=6, last transaction in previous log=11420, rotation took 36 millis, started after 91470 millis.
2020-06-03 17:41:58.480+0000 INFO [o.n.k.d.Database] [idgraph] Rotated to transaction log [/neo4jdrive/data/transactions/idgraph/neostore.transaction.db.8] version=7, last transaction in previous log=13593, rotation took 33 millis, started after 89179 millis.
2020-06-03 17:43:27.346+0000 INFO [o.n.k.d.Database] [idgraph] Rotated to transaction log [/neo4jdrive/data/transactions/idgraph/neostore.transaction.db.9] version=8, last transaction in previous log=15650, rotation took 34 millis, started after 88832 millis.
2020-06-03 17:44:49.275+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=429, gcTime=0, gcCount=0}
2020-06-03 17:45:10.743+0000 INFO [o.n.k.d.Database] [idgraph] Rotated to transaction log [/neo4jdrive/data/transactions/idgraph/neostore.transaction.db.10] version=9, last transaction in previous log=17688, rotation took 34 millis, started after 103363 millis.
2020-06-03 17:46:43.655+0000 INFO [o.n.k.d.Database] [idgraph] Rotated to transaction log [/neo4jdrive/data/transactions/idgraph/neostore.transaction.db.11] version=10, last transaction in previous log=19630, rotation took 32 millis, started after 92880 millis.
2020-06-03 17:46:48.741+0000 INFO [o.n.c.i.ExecutionEngine] [idgraph] Discarded stale query from the query cache after 659 seconds:
CALL db.labels() YIELD label
RETURN {name:'labels', data:COLLECT(label)[..1000]} AS result
UNION ALL
CALL db.relationshipTypes() YIELD relationshipType
RETURN {name:'relationshipTypes', data:COLLECT(relationshipType)[..1000]} AS result
UNION ALL
CALL db.propertyKeys() YIELD propertyKey
RETURN {name:'propertyKeys', data:COLLECT(propertyKey)[..1000]} AS result
UNION ALL
CALL dbms.functions() YIELD name, signature, description
RETURN {name:'functions', data: collect({name: name, signature: signature, description: description})} AS result
UNION ALL
CALL dbms.procedures() YIELD name, signature, description
RETURN {name:'procedures', data:collect({name: name, signature: signature, description: description})} AS result
UNION ALL
MATCH () RETURN { name:'nodes', data:count() } AS result
UNION ALL
MATCH ()-[]->() RETURN { name:'relationships', data: count()} AS result
Thanks,
Sai
06-03-2020 04:05 PM
Generally not, it seems to be something particular with your file. not sure if you can share it.
The error means that the virtual machine is trying to free memory and it takes quite long and might not be successful.
06-03-2020 09:04 PM
I am including a row from the file, it has header too just do not want to include it here. Bolded value is the one I created the index on -
"2019-01-18T17:39:12",237249526,,,,,"66690879344007303623597366135687768876","aaklshin",8084382,"C","mathij0"
06-10-2020 07:50 AM
Hi @michael.hunger - I am still seeing this issue and was reaching out to see if you have any suggestions on this.
All the sessions of the conference are now available online