Neo4j

Hugo_Mendes · ‎02-09-2020

So..
Let's suppose I want to create a Followers table in a relational database.
We'd have id, follower_id, followee_id, created_at.
If a user has 1 million followers, we could run into performance issues depending on the queries we run against that table.
The idea then is to have a "read structure" that could be served from a cache, e.g. Redis.
That way I'd have my relational DB as the source of truth, but I'd also have a structure that returns my followers quickly. It could cache my most recent followers, or cache them based on geolocation, etc.
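
To make that setup concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and columns follow the description above. The index, the helper function names, and the plain dict standing in for Redis are assumptions made only for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the relational source of truth.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE followers (
    id          INTEGER PRIMARY KEY,
    follower_id INTEGER NOT NULL,
    followee_id INTEGER NOT NULL,
    created_at  TEXT    NOT NULL,
    UNIQUE (follower_id, followee_id)
);
-- Assumed index so "most recent followers of X" does not scan the whole table.
CREATE INDEX idx_followee_recent ON followers (followee_id, created_at DESC);
""")

# Plain dict standing in for Redis: (followee_id, limit) -> list of follower ids.
recent_cache = {}


def follow(follower_id, followee_id, created_at):
    """Write to the source of truth, then drop cached reads for that followee."""
    conn.execute(
        "INSERT INTO followers (follower_id, followee_id, created_at) VALUES (?, ?, ?)",
        (follower_id, followee_id, created_at),
    )
    for key in [k for k in recent_cache if k[0] == followee_id]:
        del recent_cache[key]


def recent_followers(followee_id, limit=2):
    """Read the newest followers straight from the relational table."""
    rows = conn.execute(
        "SELECT follower_id FROM followers WHERE followee_id = ? "
        "ORDER BY created_at DESC LIMIT ?",
        (followee_id, limit),
    ).fetchall()
    return [row[0] for row in rows]


def cached_recent_followers(followee_id, limit=2):
    """Read-through cache: hit the table only on a cache miss."""
    key = (followee_id, limit)
    if key not in recent_cache:
        recent_cache[key] = recent_followers(followee_id, limit)
    return recent_cache[key]
```

The invalidation inside `follow` is the part that becomes hard to keep correct once the cache lives in a separate system, which is the staleness concern behind the question.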

My question is... Should I use Neo4j as the source of truth in this case? Or is it better to keep a relational DB as the source of truth and have the graph DB synced with it, but holding only a limited amount of data?

I hope the question is clear.

Thanks.

mike_r_black · ‎02-09-2020

I would have Neo4j as the source of truth. To scope this out as a micro-service: this would be a social micro-service that owns all social interaction data. Otherwise, if you go to an RDBMS first and add Neo4j later, your RDBMS is going to become a huge monolithic database that holds data for all your other micro-services. Split up each service's domain and responsibility. Neo4j is an ACID-compliant database, so once a transaction commits you don't have to worry about losing that data.

I think you'll find that if you use Neo4j as the source of truth for this social data, you'll eliminate the need for caching layers such as Redis altogether. That reduces the complexity of your stack: no worrying about stale data in a cache, or about latency when replicating data from the RDBMS to Neo4j.

Hugo_Mendes · ‎02-09-2020

Thanks for the answer, Mike!
Good to know! Do you know if I'd run into any performance problems? Or whether there's any known use case of someone using it as the source of truth for a social service? (Not sure if I should ask this as a separate topic.)
My app would start with a few million followers from day 1. It's a migration from a different system.

Thanks again for the help!

mike_r_black · ‎02-09-2020

Adobe switched to Neo4j for its social platform and saw huge benefits: better performance, and easier maintenance from an operations standpoint.

http://www.odbms.org/blog/2018/07/on-using-graph-database-technology-at-behance-interview-with-david...

A social service is a prime use case for a graph database. Just keep in mind that much of your experience will depend on how you model the data: the data model needs to support the queries you are going to write. One of my all-time favorite videos explaining this is https://youtu.be/oALqiXDAYhc. After that, it's worth taking the time to read the several blog posts about Twitter's graph data model if you're going to implement any sort of news feed in the social media service you're building.
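
As a rough illustration of "the data model needs to support the queries", here is a small in-memory sketch in Python, not Neo4j itself. The `FollowGraph` class and its method names are hypothetical. It stores each follow relationship in both directions so that the two questions a social service asks most often ("who follows X?" and "who might X want to follow?") become direct traversals rather than table scans. In Neo4j the same shape would typically be a relationship between two user nodes, for example `(:User)-[:FOLLOWS]->(:User)`, where the label and relationship type are assumed names.

```python
from collections import defaultdict


class FollowGraph:
    """Toy adjacency-set model of FOLLOWS relationships (hypothetical names)."""

    def __init__(self):
        self._following = defaultdict(set)  # user -> users they follow
        self._followers = defaultdict(set)  # user -> users who follow them

    def follow(self, follower, followee):
        # Store the edge in both directions so either lookup is a direct hop.
        self._following[follower].add(followee)
        self._followers[followee].add(follower)

    def followers_of(self, user):
        """One hop along incoming edges."""
        return set(self._followers.get(user, ()))

    def suggestions_for(self, user):
        """Two hops out: accounts followed by the people `user` follows,
        minus `user` and anyone they already follow."""
        already = self._following.get(user, set())
        candidates = set()
        for friend in already:
            candidates |= self._following.get(friend, set())
        return candidates - already - {user}
```

If the queries you need are mostly one-hop or two-hop traversals like these, that is a sign the graph model fits. If they are mostly aggregations over all rows, the modeling deserves a second look.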

Hugo_Mendes · ‎02-09-2020

Thanks, Mike! You really helped me here! Thank you!!

Should GraphDB be used alongside Relational DBs?