02-22-2019 07:36 AM
In this blog post, we give some insight into our data pipeline and the optimisations we made to make it possible to load the entire bitcoin blockchain in a single day:
Let us know if you have any questions!
02-25-2019 09:00 AM
Thanks for submitting!
I’ve added a tag that allows your blog to be displayed on the community home page!
02-25-2019 05:09 PM
Looking forward to seeing this in the online meetup!!!!! 🙂
04-04-2019 07:40 AM
Ah! nice!
I read your article and had a few questions.
I don't have experience with Kafka. Why and how do you use it?
Also, you import the past data via admin import, OK, but the incoming data, do you create it manually with Python, using the Kafka data? Manually, like:
CREATE (n:Node {info: "bla-bla-bla"})
CREATE (n)-[:RELATION]->(some_node);
I'm working on a similar problem where I have past data (which I import with neo4j-admin) and new data coming in (which I'm not sure how to handle... haha).
If you could talk a bit about it, that would be great!
05-24-2019 09:54 AM
Here's the Online Meetup if anyone is interested!
05-28-2019 02:12 AM
Hi ppedra! Sorry I didn't see your question before.
Your assumption is correct: we first use the CSV import tool to bootstrap the database, but to keep it updated we use regular Cypher statements from an application written in Scala. At that point, if you use MERGE statements, it's important to keep indexes in your database, because the lookups (i.e. the matching part of a MERGE statement) can take a looong time otherwise. There's a rough sketch of that pattern below.
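Roughly, the update loop looks like the sketch below. This is shown in Python with kafka-python and the official neo4j driver rather than our actual Scala code, and the topic name, message fields, and connection details are just placeholders for illustration:

import json

from kafka import KafkaConsumer   # kafka-python
from neo4j import GraphDatabase   # official Neo4j Python driver

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

# A uniqueness constraint (which also creates an index) on the merged property
# keeps the matching part of each MERGE fast.
with driver.session() as session:
    session.run(
        "CREATE CONSTRAINT tx_id IF NOT EXISTS "
        "FOR (t:Transaction) REQUIRE t.txid IS UNIQUE"
    )

# Placeholder topic and message shape: one JSON transaction per message.
consumer = KafkaConsumer(
    "bitcoin-transactions",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
)

for message in consumer:
    tx = message.value
    with driver.session() as session:
        # MERGE instead of CREATE so replayed or duplicate messages don't
        # create duplicate nodes or relationships.
        session.run(
            """
            MERGE (b:Block {hash: $block_hash})
            MERGE (t:Transaction {txid: $txid})
            MERGE (t)-[:INCLUDED_IN]->(b)
            """,
            block_hash=tx["block_hash"],
            txid=tx["txid"],
        )

The important parts are the constraint/index and the parameterized MERGE, so the same message can be processed twice without duplicating data.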
It would be interesting to know how you solved the problem!
Also, Kafka gives us many advantages, but it would take too long to go through them here. We mention some in the blog post as well as in the online meetup. If you have any specific questions in this regard, don't hesitate to contact me again!
Cesar