Cypher: Node key or exists + unique

s_decoux
Hi!

A quick question (maybe the answer won't be so simple) about the node key constraint:

Suppose I have a node with a property that must be unique and must exist (e.g., a social security number). Do I need to create a NODE KEY constraint on that one property

CREATE CONSTRAINT ON (p:Person) ASSERT (p.ssn) IS NODE KEY

or do I need to create two constraints (EXISTS + UNIQUE)?

CREATE CONSTRAINT ON (p:Person) ASSERT EXISTS(p.ssn)
CREATE CONSTRAINT ON (p:Person) ASSERT p.ssn IS UNIQUE

Is there a difference between the two solutions in the core engine?

Note that the question is about a single property only, not a list of properties.

Thanks

Sebastien

1 ACCEPTED SOLUTION

There isn't really a difference between these two approaches; in the end the result is the same.

If you think of SSN as your node key, I'd say it's better to use NODE KEY because it's shorter and expresses what you mean, rather than implying what you mean with constraints.

https://neo4j.com/docs/developer-manual/current/cypher/schema/constraints/#query-constraint-node-key
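
For illustration, here is a minimal sketch of how writes should behave once either setup from the question is in place. It assumes Neo4j 3.x Enterprise Edition (node key and property existence constraints are Enterprise features) and uses a hypothetical `name` property that is not part of the question:

```cypher
// Assumes one of the two constraint setups from the question already exists.

// Accepted: ssn is present and not yet used by another Person.
CREATE (:Person {ssn: '123-45-6789', name: 'Alice'});

// Rejected: another Person already has this ssn (uniqueness).
CREATE (:Person {ssn: '123-45-6789', name: 'Bob'});

// Rejected: ssn is missing (existence).
CREATE (:Person {name: 'Carol'});
```

Run the statements one at a time; the second and third should fail with a constraint violation under either approach.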

Separately -- just a tip from someone who has done data modeling for years -- SSN is not a good key for database systems, because it isn't guaranteed to be unique. This is surprising, but true. If you have a small customer base, maybe you won't ever run into this, but as you grow, you almost certainly will.

https://www.aol.com/2010/08/12/your-social-security-number-may-not-be-unique-to-you/

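Another weirdness of SSNs that trips a lot of people up -- they can show up with letters behind them. People think the value must always look like 123-45-6789, but that's not correct. A value such as 123-45-6789aa can appear in real data, for example as a Medicare or benefits claim number built from an SSN plus a suffix.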

http://rothkofflaw.com/what-do-the-letters-after-a-social-security-or-medicare-number-mean/

The reason I mention this is that using SSN as a key in a database is a rather classic "gotcha" problem that crops up later.
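
One common way around this, sketched here under stated assumptions, is to key on a generated identifier and keep the SSN as an ordinary indexed property. The `personId` name is hypothetical and not from the question; the sketch assumes Neo4j 3.4 or later for `randomUUID()` and Enterprise Edition for the node key constraint:

```cypher
// personId is a hypothetical surrogate key, not part of the original question.
CREATE CONSTRAINT ON (p:Person) ASSERT (p.personId) IS NODE KEY;

// Plain (non-unique) index so lookups by ssn stay fast.
CREATE INDEX ON :Person(ssn);

// randomUUID() supplies the key, so duplicate or suffixed SSN values do not block the write.
CREATE (:Person {personId: randomUUID(), ssn: '123-45-6789'});
```

With this layout, two people who share an SSN can both be stored, and the application decides how to handle the collision instead of the constraint rejecting the data.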

Hi David!

Thank you for your quick response.

The SSN used in my question is not my use case and was used as an example.

I didn't know an SSN is not unique, though. You learn something every day.