04-21-2020 01:17 PM
Hello,
I am using Neo4j Community 3.5.17. I want to find the closest :Fraud node to a :Person node for each fid (the unique Person identifier) in a list, and I am using the following query. Please note that fid is currently stored as a string on the nodes, whereas the input and output files contain fid as an integer.
profile cypher runtime=interpreted load csv with headers from 'file:///shortest_path_data/test.csv' as line with line.fid as fid match (n:Person) where n.fid=toString(fid) with n call apoc.path.expandConfig(n,{labelFilter:'/Fraud', maxLevel:10, optional:true, limit:1}) yield path return toInteger(n.fid) as fid,length(path)/2 as distance;
This query worked for an input file of 30K fids, but for 300K fids it runs endlessly and triggers GC. Here is the debug log:
2020-04-21 19:55:57.554+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=417944, gcTime=313357, gcCount=11}
2020-04-21 19:59:39.207+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=322338, gcTime=322433, gcCount=10}
2020-04-21 20:05:03.770+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=327987, gcTime=328082, gcCount=9}
The current heap size is 31g and the page cache size is 100g.
Please advise on how to proceed.
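For a sense of scale, the pause figures in the log lines above can be totalled with a short script. This is only a sketch: it assumes that pauseTime and gcTime are reported in milliseconds, and the names `parse_pauses` and `total_pause_minutes` are made up for illustration.

```python
import re

# Pause lines copied verbatim from the debug log above.
LOG_LINES = [
    "2020-04-21 19:55:57.554+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=417944, gcTime=313357, gcCount=11}",
    "2020-04-21 19:59:39.207+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=322338, gcTime=322433, gcCount=10}",
    "2020-04-21 20:05:03.770+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=327987, gcTime=328082, gcCount=9}",
]

PAUSE_RE = re.compile(r"pauseTime=(\d+), gcTime=(\d+), gcCount=(\d+)")


def parse_pauses(lines):
    """Return a (pause_ms, gc_ms, gc_count) tuple for every pause line found."""
    pauses = []
    for line in lines:
        match = PAUSE_RE.search(line)
        if match:
            pauses.append(tuple(int(group) for group in match.groups()))
    return pauses


def total_pause_minutes(pauses):
    """Sum the pause times and convert to minutes (assumes milliseconds)."""
    return sum(pause_ms for pause_ms, _, _ in pauses) / 60000.0


if __name__ == "__main__":
    pauses = parse_pauses(LOG_LINES)
    print(f"{len(pauses)} pauses, {total_pause_minutes(pauses):.1f} minutes in total")
```

If the millisecond assumption holds, each of the three entries is a pause of several minutes, so the JVM spends most of that window in garbage collection rather than running the query.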
Thanks and Regards,
Kevin
04-21-2020 04:47 PM
If you rerun and include a PERIODIC COMMIT ( https://neo4j.com/docs/cypher-manual/4.0/clauses/load-csv/#load-csv-setting-the-rate-of-periodic-com... ) so that you commit every 5k records, for example, does this provide any improvement?
04-21-2020 05:35 PM
In addition to Dana's suggestion, make sure you have an index on :Person(fid).
04-21-2020 10:29 PM
Hi Team,
I do have an index on :Person(fid). When I use periodic commit, I get the following error:
Cannot use periodic commit in a non-updating query (line 1, column 36 (offset: 35))
"using periodic commit 5000 load csv with headers from 'file:///shortest_path_data/test.csv' as line with line.fid as fid match (n:Person) where n.fid=toString(fid) with n call apoc.path.expandConfig(n,{labelFilter:'/Fraud', maxLevel:10, optional:true, limit:1}) yield path return toInteger(n.fid) as fid,length(path)/2 as distance;"
Thanks and Regards,
Kevin
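Since the same query completed for a 30K-row input, one workaround that does not depend on periodic commit is to split the 300K-row file into chunks of about that size and run the query once per chunk. The following is a minimal sketch, not a tested fix for this case: it assumes the CSV has a single header row, and the function name `split_csv`, the output directory, and the chunk file naming are all hypothetical.

```python
import csv
from pathlib import Path


def split_csv(src, out_dir, rows_per_file=30000):
    """Split src into chunks that each repeat the header; return the chunk paths."""
    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths = []
    with src.open(newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)
        if header is None:
            # Empty input file: nothing to split.
            return chunk_paths
        out = None
        writer = None
        try:
            for index, row in enumerate(reader):
                if index % rows_per_file == 0:
                    # Start a new chunk file and repeat the header in it.
                    if out is not None:
                        out.close()
                    path = out_dir / f"{src.stem}_{len(chunk_paths):03d}.csv"
                    out = path.open("w", newline="")
                    writer = csv.writer(out)
                    writer.writerow(header)
                    chunk_paths.append(path)
                writer.writerow(row)
        finally:
            if out is not None:
                out.close()
    return chunk_paths
```

Each chunk can then be fed to the original LOAD CSV query by changing only the file name in the `file:///` URL. The chunks need to be written somewhere Neo4j is allowed to read from, which by default is the import directory.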
04-22-2020 12:14 AM
Ah, you need to prefix the query with :auto for this to work in the browser or cypher-shell.
For the explanation why, see here:
04-22-2020 03:05 AM
Hi Andrew,
I am getting the following error:
Invalid input ':': expected <init> (line 1, column 36 (offset: 35))
":auto using periodic commit 5000 load csv with headers from 'file:///shortest_path_data/test.csv' as line with line.fid as fid match (n:Person) where n.fid=toString(fid) with n call apoc.path.expandConfig(n,{labelFilter:'/Fraud', maxLevel:10, optional:true, limit:1}) yield path return toInteger(n.fid) as fid,length(path)/2 as distance;"
Regards,
Kevin
04-22-2020 04:42 PM
How are you running this? Via the Neo4j Browser (if so, which version)? Cypher-shell? Client code via a driver?
04-22-2020 09:54 PM
I am using cypher-shell.