12-28-2020 05:34 AM
Customer would like to migrate a legacy Neo4j database/app (2.1.4) with 1M+ nodes and 3M+ relationships to Oracle. I'm trying to determine whether this is reasonably feasible given the lack of Neo4j/Cypher expertise. Customer would like to harvest this data, move it into a relational data mart, and join it with other corporate data sets. I've been investigating whether the APOC library could be used against this much older version of Neo4j. Any assistance would be appreciated.
12-28-2020 09:34 AM
What's the reason the customer wants to migrate the DB? Is it just because they don't understand it? (It seems like it hasn't been maintained if it's back on 2.1.4, so maybe no one knows how it works or what to do with it.)
Neo4j has a big advantage when there are a lot of many-to-many relationships, plus the ease of querying them.
You don't mention how many Node Labels (which you can think of as Node types) there are. Labels are roughly akin to RDBMS Tables. The more different kinds of Labels, the more onerous the conversion. Another advantage of Neo4j is that nodes can have multiple Labels, which is like multiple inheritance. If the existing Neo4j database has a lot of nodes with multiple Labels, that will be hard to convert because an RDBMS doesn't play well with overlapping characteristics. In an RDBMS, the Table columns must be fleshed out ahead of time. These columns correspond to Node Properties, which can overlap or be missing (which can be thought of as Nulls).
You also don't mention how many Relationship Types there are. Two nodes can connect to each other through one or more relationships. Relationships can also have properties, which will be another sticking point. Since there isn't a fundamental "relationship" type in a Relational DB, it will be a challenge to work out which RDBMS table each relationship's properties should go into.
Another possible hurdle is that properties in Neo4j are more flexible than in an RDBMS. The same property name could contain a string or a float in different Nodes that have the same Label. Furthermore, a property can contain an array, which typically doesn't work very well in an RDBMS.
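To make the Label and Relationship points concrete, here is a minimal sketch of one possible relational layout, not a recommendation for this particular database. It assumes the nodes and relationships have already been pulled out of Neo4j as plain Python dicts; the dict keys (`id`, `labels`, `start`, `end`, `type`, `properties`) and the function name `to_relational_rows` are hypothetical. Multiple Labels go into a separate (node_id, label) table, and each Relationship Type gets its own table that carries the relationship's properties.

```python
from collections import defaultdict


def to_relational_rows(nodes, relationships):
    """Split graph records into rows for three hypothetical kinds of tables."""
    node_rows = []                 # one row per node
    label_rows = []                # one row per (node, label) pair
    rel_rows = defaultdict(list)   # one list of rows per relationship type

    for node in nodes:
        node_rows.append({"node_id": node["id"]})
        # A node with several labels yields several rows here,
        # so no single table has to model the overlap.
        for label in node["labels"]:
            label_rows.append({"node_id": node["id"], "label": label})

    for rel in relationships:
        # Relationship properties become extra columns next to the two keys.
        row = {"start_id": rel["start"], "end_id": rel["end"]}
        row.update(rel.get("properties", {}))
        rel_rows[rel["type"]].append(row)

    return node_rows, label_rows, dict(rel_rows)
```

With 11 Labels and 15 Relationship Types, a layout like this would mean roughly 15 relationship tables plus the node and label tables. Whether that suits the data mart is a design decision the sketch does not settle.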
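One way to find out how much of this flexibility the existing data actually uses is to profile it before designing any tables. The following is a rough sketch under the assumption that nodes are available as Python dicts with hypothetical `labels` and `properties` keys. It records which Python types each property takes per Label and reports the ones with more than one type. An array property shows up as `list`.

```python
from collections import defaultdict


def profile_property_types(nodes):
    """Map (label, property name) to the set of value type names seen."""
    profile = defaultdict(set)
    for node in nodes:
        for label in node["labels"]:
            for key, value in node["properties"].items():
                profile[(label, key)].add(type(value).__name__)
    return dict(profile)


def mixed_type_properties(profile):
    """Keep only the properties that hold more than one type."""
    return {key: sorted(types) for key, types in profile.items() if len(types) > 1}
```

Any property that `mixed_type_properties` reports would need a decision (cast, split into two columns, or store as text) before it can become a typed Oracle column.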
The real challenge of a port from Neo4j to an RDBMS is not the amount of data, but rather the complexity of the implicit schema of the Neo4j data, which has to be translated (and semi-frozen) into an explicit relational schema. This task could very well be non-trivial, depending on the richness of the Neo4j database. It has to be done carefully: if you get the schema wrong, you may have to redo the schema or, worse, reimport the data.
I suspect it might be better to connect the traditional RDBMS with Neo4j using one of the various APIs (see: How do I use Cypher to connect to a RDBMS using JDBC - Knowledge Base ). This is because a good DB architect would have chosen a Graph DB over a Relational DB because of the complexity of the data. (On the other hand, it could be that a DB architect chose Neo4j because they thought it would be cool to try it out.)
One of the great ironies of Relational DBs is they aren't very good with (complex) relationships!
In any case, I think you should upgrade Neo4j to 4.2.1 and go from there. The APOC library has advanced a lot since 2.1.4.
It also might be easier to export the Neo4j data to CSV or JSON, and then import it into Oracle.
In any case, it's hard to say how hard or easy this will be with the little info you provided...
I hope this helps.
12-28-2020 10:08 AM
Hi @enelias, coming from an Oracle and MS SQL Server data architect background, I have seen both the NoSQL and RDBMS worlds.
To be honest, Oracle and MS SQL Server still dominate the world because of their "Enterprise" architecture (sorry, Neo4j).
Oracle also has graph support (see: What Is a Graph Database? | Oracle), and I have designed and implemented 20+ enterprise-scale Oracle Graph solutions.
I totally agree about the cost of licensing and so on, but ACID and normal forms still rule.
Before I give a further lecture, what is your use case? Is your database analytical, a data warehouse, transactional, or a data mart?
12-28-2020 10:30 AM
Thanks for the insight. More background: the government customer no longer has any in-house Neo4j expertise. The application and DB continue to run with little or no support. They would like to make this data more accessible and combine it with other current datasets already residing in Oracle. The number of node labels is 11, and yes, some nodes do have multiple labels. The number of relationship types is 15. The original design/implementation team is long gone from this effort. As the new guy on this project, I'm just trying to determine the level of effort to migrate the data, if that's possible.
12-28-2020 01:23 PM
Neo4j 2.1.4 is some 6 years old ( Neo4j 2.1.4 - Neo4j Graph Database Platform ), as it was released in 2014. APOC was not made available until Neo4j 3.x, so the two are not compatible.
12-29-2020 04:10 AM
Great info. So no APOC and, as previously thought, no direct method for migrating Neo4j (graph) to Oracle (relational). Last question: is there a Cypher command or CLI to dump the entire contents of the DB to CSV/JSON?
12-29-2020 04:47 AM
Hi @enelias,
Export to CSV is supported in APOC, but since you are using Neo4j 2.x, it can't be used.
Each node will have different properties, and so will each relationship. You can execute each query in the browser and export to CSV, but that is NOT a great idea (because of the data volume, among other things).
I have created a Python program that uses pandas to load the data into a dataframe more dynamically.
Python -> neo4j/neo4j_pivot.ipynb at master · dominicvivek06/neo4j · GitHub
But it doesn't export nodes or relationships to a CSV file. I can create another Python program for you that can dynamically export to a CSV file. Let me know if you need further assistance.
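As a rough idea of what such a dynamic export could look like, here is a minimal sketch using only the standard library. It is not the notebook linked above. It assumes the nodes have already been fetched as Python dicts with hypothetical `id`, `labels` and `properties` keys, and the function name `nodes_to_csv` is made up for illustration. Because each node can have different properties, the header is built from the union of all property names, and missing values are written as empty cells.

```python
import csv


def nodes_to_csv(nodes, out):
    """Write nodes with differing properties to one CSV with a shared header."""
    # Union of every property name seen, sorted for a stable column order.
    keys = sorted({key for node in nodes for key in node["properties"]})
    writer = csv.DictWriter(out, fieldnames=["node_id", "labels"] + keys, restval="")
    writer.writeheader()
    for node in nodes:
        # Multiple labels are joined with ';' into a single cell (an assumption).
        row = {"node_id": node["id"], "labels": ";".join(node["labels"])}
        row.update(node["properties"])
        writer.writerow(row)
    return keys
```

Here `out` is any writable text file object, for example one opened with `open("nodes.csv", "w", newline="")`. The sketch holds all nodes in memory to compute the header, writes array properties as their Python string form, and does not handle a property that is itself named `node_id` or `labels`, so those cases would need extra care at 1M+ nodes.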