When to create an Index / Uniqueness Constraint

admin3
Node Clone

Hi,

First things first: I am a newbie. I have never worked with databases before, so I have no experience in the matter.

I would like to ask some questions.

I am assuming you cannot index everything, since indexes use memory, which is not infinite. So I am a bit hesitant to create indexes, and I would prefer to wait and see where an issue arises.

These are my questions for now:

  1. At how many nodes does it make sense to create an index? For example, now that we are in our startup phase, we only have 1000 nodes (which we MATCH on the 'uuid' property). I am thinking that I can wait a bit before creating the index, let's say until I reach a million nodes.
  2. Some labels just partition a "larger" label. E.g. I have a label called "Main" which is on 2000 nodes. These 2000 nodes also have a secondary label: some are "SUV" and others are "Estate". Should I create a uniqueness constraint (a) on the "Main" label, (b) on the two other labels "SUV" and "Estate", or (c) on all three, "Main", "SUV" and "Estate"? This matters for how I write my MATCH clauses. Usually I use the "smaller" labels, but there are cases where I have to use the "larger" label.
  3. Is it better to wait for something to become slow before creating an index?
  4. How can you know that you have over-indexed your app?

Thanks

1 REPLY

intouch_vivek
Graph Steward

Hi Alex

From my point of view:

  1. There is no fixed rule that says you need a certain node count for a specific label before an index comes into the picture. It all depends on the use case and experience. You can try testing once the count exceeds 0.1M (100,000) nodes.
  2. If the primary key of the larger label and the smaller labels is the same, then a uniqueness constraint on the larger label should be sufficient. Otherwise, you need to keep uniqueness constraints on the respective properties.
  3. You do not have to wait for it to become slow, but do test on a bigger data set before incorporating an index.
  4. Same as above.
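
To make points 2 and 3 a bit more concrete, here is a minimal sketch in Python that only builds the Cypher text (it does not connect to a database). The labels "Main" and the 'uuid' property come from the question; the helper names and the constraint name are made up for illustration, and the Cypher assumes the `FOR ... REQUIRE` constraint syntax of recent Neo4j versions (4.4 and later), so adjust it if you are on an older release.

```python
# Sketch only: builds Cypher text; it does not talk to a database.
# "Main" and "uuid" come from the question; the helper names and the
# generated constraint name are hypothetical, chosen for illustration.


def uniqueness_constraint(label: str, prop: str) -> str:
    """Cypher for a uniqueness constraint (assumes Neo4j 4.4+ syntax)."""
    name = f"{label.lower()}_{prop}_unique"
    return (
        f"CREATE CONSTRAINT {name} IF NOT EXISTS "
        f"FOR (n:{label}) REQUIRE n.{prop} IS UNIQUE"
    )


def profiled_lookup(label: str, prop: str) -> str:
    """Cypher that profiles a lookup so you can inspect the query plan."""
    return f"PROFILE MATCH (n:{label} {{{prop}: $value}}) RETURN n"


statements = [
    uniqueness_constraint("Main", "uuid"),  # one constraint on the larger label
    profiled_lookup("Main", "uuid"),        # check the plan on a test data set
    "SHOW INDEXES",                         # list what is currently indexed
]

for statement in statements:
    print(statement + ";")
```

A uniqueness constraint is backed by an index, so the first statement also covers lookups by 'uuid'. As far as I know, the planner only considers that index when the "Main" label appears in the pattern, so a query that matches on "SUV" alone would likely need to be written as `MATCH (n:Main:SUV {uuid: $value})` to benefit from it. Running the PROFILE query on a larger test data set, before and after creating the constraint, is one way to see whether the index actually helps.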