12-09-2020 10:13 PM
I have a use case where we download files from different cloud vendors like Google Drive, OneDrive, Box, etc.
So each file can have the label File plus a source label (Dropbox, Box, OneDrive, etc.).
This way, if I want to query all files irrespective of source, I can query using the File label.
Now I want to understand the performance impact if I query using the individual labels, listing all cloud sources like Dropbox, Box, etc., instead of using the File label.
If there is no difference, I can remove the File label, right? I also want to understand the impact of adding a label in terms of memory and performance.
Can someone explain its impact in detail?
12-12-2020 11:59 AM
As I understand it, a Node Label is a set-like collection of all Nodes that share that label. (Actually pointers to the Nodes.) So when you do:
MATCH (n:MyLabel)
instead of
MATCH (n)
Cypher only has to do a linear scan through the smaller subset of `MyLabel` nodes vs. all nodes. The former is obviously quicker.
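To check this on your own data, you can ask Neo4j for the query plan. The sketch below reuses `MyLabel` from above; `PROFILE` runs the query and reports the operators and database hits. The operator names in the comments are what one would expect to see, not a guarantee for every version.

```cypher
// Expected to start from a label scan (NodeByLabelScan): only MyLabel nodes are visited.
PROFILE MATCH (n:MyLabel) RETURN n;

// Expected to start from AllNodesScan: every node in the store is visited.
PROFILE MATCH (n) RETURN n;
```

Use `EXPLAIN` instead of `PROFILE` to see the plan without actually running the query.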
The overhead of keeping the `File` label is that Neo4j has to keep an extra data structure (something like a Set) to track all the `File` Nodes.
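To get a feel for how many nodes each label (or label combination) actually covers, something like the following can be run. It is only a sketch: the second query touches every node, so it can be slow on a large graph.

```cypher
// List every label currently in use in the database.
CALL db.labels() YIELD label RETURN label;

// Count nodes per label combination (for example File together with Dropbox).
MATCH (n)
RETURN labels(n) AS labels, count(*) AS nodes
ORDER BY nodes DESC;
```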
Whether this makes sense for you or not depends on your individual situation. Some things to consider:
- [...] `File`s of some type?
- Do you `MATCH` on all `File`s? Or may need to?
- Might you need a new `File` Label (node type) in the future?
If you do want to match all files in the future and you don't have a `File` type, then you might have to do something like:
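// One branch per source label; UNION merges the branches and drops duplicates.
MATCH (f:Dropbox) RETURN f
UNION
MATCH (f:Box) RETURN f
UNION
MATCH (f:OneDrive) RETURN f
...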
Instead of a simpler `MATCH (f:File)`. The latter should be faster.
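If you drop `File` but still want a single `MATCH`, label predicates can be combined in a `WHERE` clause. This is a sketch using the source labels from the question. Whether the planner answers it with per-label scans or with a scan over all nodes plus a filter depends on the Neo4j version, so it is worth comparing both forms with `PROFILE`.

```cypher
// One row per node that carries at least one of the source labels.
PROFILE
MATCH (f)
WHERE f:Dropbox OR f:Box OR f:OneDrive
RETURN f;

// Same result, assuming every file node (and nothing else) carries File.
PROFILE
MATCH (f:File)
RETURN f;
```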
And if you are required to support a new `File` type, then you will probably have to modify your Cypher code to deal with the new type (which is a pain and a potential source of errors, because you might forget to fix some of your code).
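For example, with a shared `File` label, supporting a new source only changes how nodes are created. The `GoogleDrive` label and the `name` property below are made up for illustration.

```cypher
// New source: add its own label next to the shared File label.
CREATE (:File:GoogleDrive {name: 'report.pdf'});

// Existing queries keep working unchanged and now include the new source.
MATCH (f:File)
RETURN f.name, labels(f);
```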
If you are really short of memory, then consider compressing the File names/paths.
I vote for keeping the `File` type for now, as it will give you greater flexibility and make your code more robust to unforeseen changes. Maybe after a while, when you discover you don't really need the `File` Label, you can then safely `REMOVE` it.
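Adding or dropping the shared label later is a single statement each way. A minimal sketch, assuming the source labels from the question; on a very large graph you would probably want to run this in batches rather than in one transaction.

```cypher
// Add File to every node that has one of the source labels.
MATCH (f)
WHERE f:Dropbox OR f:Box OR f:OneDrive
SET f:File;

// Later, if the shared label turns out to be unnecessary, drop it again.
MATCH (f:File)
REMOVE f:File;
```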