08-14-2020 04:38 PM
Hello,
I apologize in advance if this has been asked already.
I have a dataset which has different entities and sub-entities.
Conceptually, I have for example the following node labels:
User
ForumUser
EmailUser
SubForum1User
SubForum2User
Now, all nodes will have the label User.
All nodes with label SubForumXUser will have labels ForumUser and User.
Would there be any disadvantage in keeping a global my_id field on all nodes that have label User and using that for indexing, as opposed to using a different id field for each label type?
There are queries I will want to perform to retrieve a subset of nodes that have label ForumUser.
And for other queries, other subsets of other labels.
From my understanding, specifying the label in the query will already reduce the range of nodes to be considered because the label defines an anchoring point.
But for the search within a specific label, would I have a performance penalty for having a global my_id for all nodes and indexing it with label User?
I ask this because such a solution would allow me to express queries more simply.
Or would I get better performance by using a label_X_id for indexing each label X, at the expense of having more indices?
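A minimal sketch of how this could be checked rather than guessed, assuming Neo4j 4.x syntax and a single index on User(my_id). The index name user_my_id and the value 42 are placeholders, not part of my dataset. EXPLAIN only prints the plan, so it shows whether the lookup is an index seek on User(my_id) with the ForumUser label applied as a filter.

```cypher
// Hypothetical index name; skip this statement if a uniqueness
// constraint on User.my_id already exists, since that provides the index.
CREATE INDEX user_my_id FOR (u:User) ON (u.my_id);

// EXPLAIN returns the query plan without executing the query.
// 42 is a placeholder id value.
EXPLAIN
MATCH (u:User:ForumUser)
WHERE u.my_id = 42
RETURN u;
```

Running the same EXPLAIN against a per-label index (for example on ForumUser with its own id property) would let the two plans be compared side by side.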
Thank you for your time.
08-15-2020 12:12 AM
Hello @mcbr and welcome to the Neo4j community
Regards,
Cobra
08-15-2020 05:08 AM
Thanks for the information.
So for example, let's imagine I go with a global id called my_id for all entities (they all have the label User) and create a UNIQUE CONSTRAINT like this:
CREATE CONSTRAINT global_id_constraint
ON (user:User) ASSERT user.my_id IS UNIQUE
This way I have a global ID field for users, no matter the labels they have.
From what I understand, this constraint also creates a single-property index for my_id.
So now, for example, if I create a full-text index over the field full_name that all nodes with label User have, I could do:
CALL db.index.fulltext.createNodeIndex("nameTextIndex",["User"],["full_name"])
If I wanted to do a text query only on nodes that have label SubForum1User, would it be efficient with that nameTextIndex, which was only defined for label User, and the single global ID my_id?
Would the query planner be smart enough to still use the nameTextIndex efficiently even though we are anchoring to another label, SubForum1User (remember that the nodes of SubForum1User also have label User)?
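To make the question concrete, here is a sketch of the kind of query I have in mind, assuming nameTextIndex exists as created above and that every SubForum1User node also carries the User label. The search term "john" is a placeholder. As far as I can tell, the label check here is a filter applied to the hits the index returns, not a narrower index.

```cypher
// "john" is a placeholder search term.
// The WHERE clause keeps only hits that also carry the SubForum1User label.
CALL db.index.fulltext.queryNodes("nameTextIndex", "john")
YIELD node, score
WHERE node:SubForum1User
RETURN node.my_id AS my_id, node.full_name AS full_name, score
ORDER BY score DESC;
```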
Thanks again.
08-15-2020 05:24 AM
This way I have a global ID field for users, no matter the labels they have.
From what I understand, this constraint also creates a single-property index for my_id.
Yes, you have a global ID for all users, but you must specify which property has to be unique. It can be an id or something else, but it must already be present in your data. And of course, it is an ordinary property, so you can still use it however you want.
If you look here, you can define a global search index (one that includes all your labels, for example), and afterwards, when you call it, you only specify the label and the property you need to search on.
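A rough sketch of that idea, assuming the Neo4j 4.x full-text procedures already used in this thread. The index name userSearchIndex, the choice of labels, and the search term are only illustrative. The property is selected with Lucene's field:term syntax inside the query string, while the label is narrowed with a WHERE on the yielded node.

```cypher
// userSearchIndex is a hypothetical index name covering several labels.
CALL db.index.fulltext.createNodeIndex(
  "userSearchIndex",
  ["ForumUser", "EmailUser"],
  ["full_name"]
);

// "full_name:john" restricts the search to the full_name property;
// "john" is a placeholder term. The label filter is applied to the hits.
CALL db.index.fulltext.queryNodes("userSearchIndex", "full_name:john")
YIELD node, score
WHERE node:EmailUser
RETURN node.full_name AS full_name, score;
```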