09-13-2021 09:42 AM
Hello, everyone.
I'm trying to load a large CSV file (around 100 MB) with 1M lines. The load succeeds, but the query takes around two and a half hours to complete.
I tried creating indexes on certain columns, but then I ran into an issue with one of my node properties, which is an array of strings.
The error I received is "Property value is too large to index". I think it is tied to the property "parent_item", which can contain more than 500 string values in the file.
Query:
CREATE INDEX ON :BOM_COMP(item_id);
CREATE INDEX ON :BOM_COMP(item);
CREATE INDEX ON :BOM_COMP(parent_id);
CREATE INDEX ON :BOM_COMP(parent_item);

:auto USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file:///items.csv' AS row
WITH DISTINCT row.Child_Item_ID AS id,
     row.Child_Item AS item,
     COLLECT(DISTINCT row.Parent_Item) AS parent_item,
     COLLECT(DISTINCT row.Parent_Item_ID) AS parent_item_id
MERGE (i:BOM_COMP {item_id: id})
ON CREATE SET i.item = item,
              i.parent_item = parent_item,
              i.parent_item_id = parent_item_id;
Is there a way to optimize this load?
09-13-2021 11:03 AM
Hi @david.casillas,
You don't really need an index on parent_item yet. (We can discuss the model of this property later.) The one we do need is the index on item_id, since that is the property the MERGE matches on.
Try removing the parent_item index for now and let me know how it goes.
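For reference, here is a minimal sketch of what that suggestion could look like. It assumes the same legacy index syntax used in the question (so Neo4j 4.x or earlier) and that the parent_item index was actually created before the error appeared. If it was not, the DROP statement can be skipped.

```cypher
// Assumption: the parent_item index exists; skip this statement otherwise.
// This is the index that hits "Property value is too large to index",
// because parent_item holds a large array of strings.
DROP INDEX ON :BOM_COMP(parent_item);

// Keep the index on item_id: the MERGE in the load query looks nodes up
// by this property, so it is the one that matters for load speed.
CREATE INDEX ON :BOM_COMP(item_id);

// Then re-run the LOAD CSV statement from the question unchanged.
```

Whether the indexes on item and parent_id are worth keeping depends on later queries. They are not used by the MERGE itself.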
09-22-2021 12:02 PM
Thanks, @Bennu.
Using only the index for parent_id solved the performance issue.