08-30-2022 11:28 PM
Hi, I've been using Neo4j for quite a while now, so I'm familiar with the basics.
My dataset consists of multiple CSVs (node and edge files), which I import using the bulk loader.
The files contain various columns with integer and string values. Now I need to index several of those columns.
Before that, here are some stats:
Total server RAM: 128 GB
Total CSV size: 34 GB
### neo4j.conf -> configured as per results from neo4j-admin memrec
Initial & max heap size: 31 GB
Page Cache size: 78 GB
When I load the data and start the Neo4j server, RAM usage initially rises to about 32 GB, which is reasonable given the heap size. But when I index a column from the node CSV (it contains integers ranging from 1 to 4):
CREATE INDEX word FOR (t:TOK) ON (t.word);
RAM usage jumps to ~65 GB. When I then index other columns of the same datatype, RAM usage grows by only 3 GB or less for each new column.
I tried changing the order in which I index the columns, but the general pattern is the same: the first index takes the most RAM, while the subsequent ones take significantly less.
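For reference, here is a minimal sketch of how the index state could be checked between steps, so that each RAM reading is taken only after the index has finished populating. It assumes a Neo4j version where `SHOW INDEXES` supports `YIELD` (4.2 or later, as far as I know) and uses the index name `word` from the statement above; the 300-second timeout is an arbitrary value chosen for illustration.

```cypher
// List every index with its state and population progress.
SHOW INDEXES YIELD name, state, populationPercent, labelsOrTypes, properties;

// Block until the index named 'word' is ONLINE (timeout in seconds, arbitrary).
CALL db.awaitIndex('word', 300);
```

Running the first statement before and after each `CREATE INDEX` should show whether the RAM jump coincides with the population phase or persists once the index is online.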
Now here are my questions: