02-14-2020 10:44 AM
I'm trying to understand the basic concepts of Neo4j. To my surprise, there is not a single word about this in the official tutorials. It is always the same "tutorial" where you happily create some data without any thought about where you are really putting it.
If I use "Create Graph" in Neo4j Desktop, do I create a new instance of a database (is a "graph" the equivalent of a database in relational databases)? Does this database contain a standalone runtime? I see it creates a directory structure similar to what comes with neo4j-community-4.0.0. Can I simply copy this directory to another host and run it?
Is one "graph" just a bag of balls (nodes)? Is the only way to organize this chaos to create relationships between them? So if I need to analyze car dealers and also need to study connections between different types of flowers, would I be better off creating two graphs?
Does Neo4j use one-directional relationships (for instance 'town=>dealer=>brand'), or must relationships be created in both directions?
Regards,
G
02-19-2020 11:39 AM
Good questions, especially relevant with the newly-released 4.0. I can supplement some answers.
Historically, with Neo4j, database and graph were interchangeable, in that for a Neo4j instance you could only run a single database, and that database consisted of a single graph. As Neo4j and its tools have evolved, we need to refine these terms somewhat. This is still ongoing, and things are still rather fuzzy.
Neo4j Desktop still uses "Add Graph" for each of those squares, but that's no longer quite accurate. Each of these is actually its own dbms, or instance of Neo4j (since you can choose which version of Neo4j to set up for each). They are completely independent of each other, each having a separate Neo4j instance as well as its own data directories, and via Desktop only 1 of these can be running at a time. The Desktop application should only give you visibility at the dbms level; I don't believe it currently shows info on the databases present on each.
Prior to 4.0, a dbms could only have a single database, with a single graph.
With 4.0, we introduced multi-database capability. A single dbms can now simultaneously run multiple databases. By default and at minimum, one dbms will run 2 databases: the default database (neo4j), and the system database (system).
The system database is unique in that it is only for administration, and allows a set of special administrative commands instead of regular Cypher. This is for user, role, and database administration, as well as role based security configuration.
Through the system database, new databases can be created and started. You can switch which database you're working with via special commands (in browser and cypher shell) or on session creation (via a driver).
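As a rough sketch of the switching described above, this is roughly what it might look like in Neo4j Browser or cypher-shell on 4.0. Note that `:use` is a client command rather than Cypher, and `neo4j` is simply the default database name mentioned earlier.

```cypher
// Switch to the system database to run administrative commands.
:use system

// List the databases on this dbms and their current status.
SHOW DATABASES;

// Switch back to the default database for regular Cypher.
:use neo4j

// Count the nodes in the currently selected database.
MATCH (n) RETURN count(n) AS nodes;
```

With a driver there is no `:use`; the database is chosen when the session is created instead.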
Each database has its own graph, 1 graph per database (but that is something we're likely going to be revisiting in the future).
Each database (and even the directories per database) is separate from the others, so there isn't ever any common data between them (at least, not the same node...but there may be separate nodes that have the same property values), and relationships cannot span databases.
Neo4j Fabric is a federated/sharded approach to querying across multiple databases. You configure a proxy database (which you connect to and query through) along with the databases that can be queried via Fabric. These can be local or remote to the system, and they can be single instances or clusters; none of them need any extra configuration to be used by Fabric. After querying across the databases you want, you can continue to work with the combined results to get what you need.
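To give a feel for what a Fabric query might look like, here is a minimal sketch. It assumes a Fabric proxy database named `fabric` with two graphs configured under the names `dealers` and `flowers`; those names, labels, and the setup behind them are hypothetical and depend entirely on your own Fabric configuration.

```cypher
// Run while connected to the (hypothetical) Fabric proxy database "fabric".

// Query the first configured graph.
CALL {
  USE fabric.dealers
  MATCH (d:Dealer)
  RETURN count(d) AS dealerCount
}
// Query the second configured graph.
CALL {
  USE fabric.flowers
  MATCH (f:Flower)
  RETURN count(f) AS flowerCount
}
// Work with the combined results from both graphs.
RETURN dealerCount, flowerCount;
```

Each `CALL { USE ... }` subquery runs against one graph, and the outer query then combines what comes back.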
It's up to you how you want to distribute your data, whether to use multiple databases running on the same dbms, or to separate these out to separate Neo4j instances/clusters and use Fabric to query across them (if needing queries that need to work with some combined sets of data from the separate databases).
02-14-2020 11:41 AM
Hello,
I'll see if I can start answering your questions...
For the first one, I believe it would be just the data in the directory structure, not the run time, but you could point another instance of Neo4j at that directory, and be able to access that data... someone else can correct me here.
For the second, yes, two graphs might be better, if there's nothing you want them connected by. It depends on how big each data set is though... there's nothing really stopping you from having it all in one.
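If you do keep everything in one graph, labels are another way to keep unrelated data apart, besides relationships. A small sketch, with made-up labels and property values:

```cypher
// Two unrelated data sets in one graph, kept apart by labels.
CREATE (:Dealer {name: 'Example Motors'})
CREATE (:Flower {name: 'Tulip'});

// Each query only touches nodes with the label it asks for.
MATCH (d:Dealer) RETURN d.name;
MATCH (f:Flower) RETURN f.name;
```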
For the third, relationships don't need to be created in both directions, unless it's important to your data. A reciprocal relationship between two famous people for example like so:
(person1)-[:HAS_KNOWN_OF_SINCE {date: $date}]->(person2)
could have two different dates and so both directions would be there.
But for (dealer)-[:IN_TOWN]->(town) one direction would be enough, and you could query for towns or dealers, depending on what you need. I hope that helped a bit.
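To illustrate the dealer and town case, here is a minimal sketch (the names and property values are invented for the example). The relationship is stored once, in one direction, and can still be traversed either way:

```cypher
// Create the relationship once, in one direction.
CREATE (d:Dealer {name: 'Example Motors'})-[:IN_TOWN]->(t:Town {name: 'Exampleville'});

// Follow it in the stored direction: which town is this dealer in?
MATCH (:Dealer {name: 'Example Motors'})-[:IN_TOWN]->(t:Town)
RETURN t.name;

// Follow it against the stored direction: which dealers are in this town?
MATCH (:Town {name: 'Exampleville'})<-[:IN_TOWN]-(d:Dealer)
RETURN d.name;

// Ignore direction entirely by leaving out the arrowhead.
MATCH (d:Dealer)-[:IN_TOWN]-(t:Town)
RETURN d.name, t.name;
```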
02-19-2020 06:52 AM
Thank you very much, Oleg! Do you know how I can have two graphs (two databases) and switch between them?
02-19-2020 11:30 AM
This is a new feature in 4.0, but I haven't used it yet...
02-19-2020 06:59 AM
I've just checked, and it looks like the directory that contains a database created in Neo4j Desktop is a database ready to run. I was able to start it.
02-20-2020 02:16 AM
Thank you very much indeed for finding time in your productive day to answer my question! I'm starting to like this product.
It's clearer now. Just one last question (if you could answer): I cannot find any documentation on how to create a new database (alongside the existing ones, neo4j and system). Is there a link to the documentation?
For instance, this page says nothing about it (or I'm blind):
https://neo4j.com/docs/operations-manual/current/manage-databases/introduction/
02-20-2020 12:04 PM
Hello,
In the very next section, on Administration and configuration, you can see the commands you can use to create a new database from the system db. All of section 5 in the operations manual docs is about managing databases.
You may also want to review section 10 on Authentication and authorization, specifically the access control part, as that's how you'll configure access and grants/restrictions for your roles and users across databases.
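As a rough sketch of what those commands might look like when run against the system database (on Enterprise edition), assuming a new database called `flowers`, a role called `florist`, and a user called `alice`, all of which are made-up names:

```cypher
// Administrative commands are run against the system database.
:use system

// Create the new database and check that it shows up as online.
CREATE DATABASE flowers;
SHOW DATABASES;

// Create a role and a user (hypothetical names and password).
CREATE ROLE florist;
CREATE USER alice SET PASSWORD 'changeme' CHANGE REQUIRED;

// Let the role connect to the new database and assign it to the user.
// Privileges to read or write data are granted separately.
GRANT ACCESS ON DATABASE flowers TO florist;
GRANT ROLE florist TO alice;
```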
02-25-2020 01:06 PM
And this is only possible in Enterprise edition
Thanks!
02-25-2020 01:21 PM
Yes, multi-database is an enterprise-only feature, although every Neo4j 4.0 instance, regardless of edition, will have two databases: the system database in addition to the default.
02-25-2020 01:38 PM
And it looks like we've already clarified some of the terminology here; I missed it in the 4.0 docs:
https://neo4j.com/docs/cypher-manual/current/introduction/neo4j-databases-graphs/
We'll see if there's terminology that needs adjusting in Neo4j Desktop to match.