11-04-2019 09:55 PM
Hi, I am trying to compare the storage size of Neo4j and Protege for the same amount of information. According to much of the literature, the LPG format should be about ten times smaller than RDF triples. However, in my case, which has only 130 nodes and 130 edges, the Neo4j size (300 kB) is much larger than the Protege size (100 kB). Even if I ignore the logical log (100 kB), it is still larger. My questions are:
Thanks.
11-06-2019 05:57 AM
Hey,
I think those different Neo4j stores store things in blocks of 8KB, and you can see that a lot of the sizes are a multiple of 8. That means you can likely add a lot more data without the sizes shown by :sysinfo changing.
I don't think (off the top of my head) that there's a way you can have it flush to disk the exact amount of data that is there - it's always done in blocks of 8KB.
But if you load in more data, that 8KB block size stops being as much of an issue as it is when you've only loaded a hundred or so nodes and relationships.
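To make the block-size effect concrete, here is a rough Python sketch of the arithmetic. It assumes each store file is simply rounded up to a whole number of 8KB blocks, and the per-record sizes (15 bytes per node, 34 bytes per relationship) are illustrative assumptions, not figures taken from this thread. The function and constant names are hypothetical too.

```python
BLOCK_SIZE = 8 * 1024  # the 8KB block size mentioned above

# Assumed per-record sizes in bytes, for illustration only.
NODE_RECORD_BYTES = 15
REL_RECORD_BYTES = 34


def on_disk_size(raw_bytes: int, block_size: int = BLOCK_SIZE) -> int:
    """Round raw_bytes up to a whole number of blocks."""
    blocks = (raw_bytes + block_size - 1) // block_size
    return blocks * block_size


def store_report(count: int, record_bytes: int) -> tuple[int, int]:
    """Return (raw bytes, bytes after rounding up to whole blocks)."""
    raw = count * record_bytes
    return raw, on_disk_size(raw)


if __name__ == "__main__":
    for label, count, size in [
        ("130 nodes", 130, NODE_RECORD_BYTES),
        ("130 relationships", 130, REL_RECORD_BYTES),
        ("100000 nodes", 100_000, NODE_RECORD_BYTES),
    ]:
        raw, disk = store_report(count, size)
        print(f"{label}: raw={raw} B, on disk={disk} B, padding={disk - raw} B")
```

Under these assumptions, 130 nodes need under 2KB of raw records but still occupy a full 8KB block, while at 100,000 nodes the padding is well under 1% of the file. That is why the overhead looks large on a tiny graph and fades as the data grows.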
Cheers, Mark
11-05-2019 01:36 AM
Hi!
Perhaps some general observations might help triage what's going on:
I don't know if you've been using the Neosemantics plugin to import your data (https://neo4j.com/labs/nsmtx-rdf/). You may also find the following post useful when thinking about how you might want to tackle reified nodes, which would reduce the amount of storage required (and likely be the more appropriate model for LPG):
11-05-2019 06:23 PM
Thanks Lju. Actually, I didn't import anything between them; I built the nodes and relationships one by one, separately, in the two tools. I believe they now hold the same range of information.
11-06-2019 05:12 PM
Appreciate it, Mark! It does help.