Tips: Evaluating neo4j in a restricted environment

jhowes

I am evaluating neo4j for a development need at a VFX studio. The security policies we must adhere to are tight, there is only extremely restricted internet access, and developers have limited permissions and quotas to deal with. Naturally this makes evaluating ANY software challenging, but I thought it might be helpful to share my workarounds for standing up neo4j Desktop in case anyone else comes at it from a similar context.

In this case the OS is Ubuntu 18.

Disc Space

The first thing the Neo4j Desktop installer does is create a directory ~/.config/Neo4j Desktop and dump gigabytes of data there. If your home directory is on NFS and subject to a quota, the installer will hang and start dumping "Unknown Error" messages.

In this case I recommend symlinking ~/.config/Neo4j Desktop to a directory on a different filesystem before starting the installer.
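Here is a minimal sketch of that symlink step, written as a small shell function so it is safe to re-run. The function name relocate_config_dir is made up for illustration, and the target path is an assumption; use whatever non-quota filesystem you have.

```shell
# Sketch: point a config directory at storage outside the quota-limited home.
# relocate_config_dir is a hypothetical helper, not part of Neo4j Desktop.
relocate_config_dir() {
    link_path=$1    # where the application expects its directory
    target_dir=$2   # real directory on a filesystem with enough space

    # Already symlinked: nothing to do, so re-running is harmless.
    if [ -L "$link_path" ]; then
        echo "already a symlink, nothing to do: $link_path" >&2
        return 0
    fi

    # A real file or directory is in the way: stop rather than clobber it.
    if [ -e "$link_path" ]; then
        echo "refusing to replace existing path: $link_path" >&2
        return 1
    fi

    # Create the real directory and the parent of the link, then link them.
    mkdir -p "$target_dir" "$(dirname "$link_path")" || return 1
    ln -s "$target_dir" "$link_path"
}
```

Before the first launch I would call it as relocate_config_dir "$HOME/.config/Neo4j Desktop" "/scratch/$USER/neo4j-desktop", where /scratch is only a placeholder for your own local disk. The quotes matter because the directory name contains a space.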

Installation Fault Tolerance

I found that if anything goes wrong during installation, the install state is left broken. When that happens, rm -rf everything in the Neo4j Desktop directory and start over. The setup process doesn't take long.

Network

Even when you choose offline mode, you may get error messages in the GUI. When Desktop starts you may see something like

An error occurred while processing Graph Application Neo4j Etl Ui ... Please check your internet connection.

These errors are not showstoppers. You can dismiss them and continue using Desktop.

Plugins

We are evaluating GRANDstack, which relies on APOC. In a network-restricted environment you cannot install plugins with the GUI. There appears to be no solid notion of $NEO4J_HOME with Desktop on Ubuntu 18. I know where the Neo4j Desktop directory is on my box, but it has none of the shape described in the documentation on manually installing APOC.

There are no directories called plugins or labs in this directory. There ARE two plugins directories buried elsewhere in the tree (e.g. Application/neo4jDatabases/database-b969c2ef-dbc6-4d56-a006-5d150de5669f/installation-4.1.0/plugins), but putting the APOC jar there does not "register" it with Desktop, even after updating the neo4j.conf file (which was also buried in the tree and not found in the home directory).

Continuing

Unfortunately I am still working on figuring out how to get APOC going. I'll let you know if I figure that out!

4 Replies

A few notes:

  • You should not run Neo4j on NFS. Most NFS implementations do not provide proper POSIX-compliant locking, which makes them a pretty bad choice for database storage.
  • Desktop is mostly designed for connected mode, either directly or via a proxy. If your environment is not connected at all, you will probably find the tar.gz or docker installation method (see docs) easier. In any case you need to download any plugins manually and transfer them to your instance by whatever means you have.
  • APOC: there's a version compatibility matrix - take a look at https://neo4j.com/docs/labs/apoc/4.1/installation/
  • GraphQL: the java based plugin for graphql has been abandoned, use https://grandstack.io/docs/neo4j-graphql-js/ instead - which is part of grandstack.

(side comment: I'd immediately update my CV if being forced to work in such an environment but YMMV)

Hey stefan,

Thank you for pointing me to the tar.gz installation method. I found the documentation for it no problem! Also thanks for the APOC matrix, that will definitely help.

(side comment: I work in VFX for a studio that revolutionised the film industry with some of the most incredible technicians and artists on earth. Our security commitments to the MPAA and TPN are a fact of life I'm happy to navigate in exchange for working in this amazing environment, which is a lifelong dream come true for me.)

jhowes

I now have the complete GRANDstack running in our secured environment. Thanks @stefan.armbruster for the boost!

Standing Up Neo4j Enterprise Evaluation

As stefan points out above, just download the tarball, fill out the form, and follow the instructions. I recommend running it in the foreground with bin/neo4j console while you're setting things up as the server logs a bunch of useful information to stdout. It even lists all the essential directories you'll be interested in.

I also recommend setting a $NEO4J_HOME envvar in your shell. Super convenient.

Installing APOC is easy from this point. I'll copy the instructions here for convenience:

  • Consult the compatibility matrix through the link in stefan's message, download the appropriate JAR, and copy it to the $NEO4J_HOME/plugins directory.
  • GRANDstack needs an APOC permission apoc.schema.* not typically listed with APOC instructions. Here's what I used in neo4j.conf:
    dbms.security.procedures.unrestricted=apoc.trigger.*,apoc.meta.*,apoc.schema.*
  • At the bottom of the neo4j.conf you'll need the following for later: apoc.import.file.enabled=true
  • start or restart neo4j
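
For anyone who wants to script the steps above, here is a rough sketch. It assumes the tarball layout where the config lives at $NEO4J_HOME/conf/neo4j.conf, and install_apoc is a name I made up for illustration. The jar you pass in must be the one the compatibility matrix says matches your Neo4j version; the function does not check that.

```shell
# Sketch: copy an APOC jar into plugins/ and add the two settings from the
# list above to neo4j.conf. install_apoc is a hypothetical helper name.
install_apoc() {
    neo4j_home=$1   # root of the unpacked tarball (assumed layout)
    apoc_jar=$2     # APOC jar downloaded and transferred by hand
    conf="$neo4j_home/conf/neo4j.conf"

    [ -f "$apoc_jar" ] || { echo "no such jar: $apoc_jar" >&2; return 1; }
    [ -f "$conf" ] || { echo "no neo4j.conf at: $conf" >&2; return 1; }

    mkdir -p "$neo4j_home/plugins" || return 1
    cp "$apoc_jar" "$neo4j_home/plugins/" || return 1

    # Append each setting only when no uncommented line already sets that key,
    # so running the function twice does not duplicate entries.
    for line in \
        'dbms.security.procedures.unrestricted=apoc.trigger.*,apoc.meta.*,apoc.schema.*' \
        'apoc.import.file.enabled=true'
    do
        key=${line%%=*}
        if ! grep -q "^${key}=" "$conf"; then
            # Leading newline guards against a file without a final newline.
            printf '\n%s\n' "$line" >> "$conf"
        fi
    done
}
```

Call it as install_apoc "$NEO4J_HOME" /path/to/your-apoc.jar and then start or restart neo4j yourself. If neo4j.conf already has an uncommented dbms.security.procedures.unrestricted line, the sketch leaves it alone, so you would need to add apoc.schema.* to it by hand.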

The service should start with no errors. If you do get an error there's a good chance you're using a version of APOC incompatible with your Neo4j version (not that I made that mistake!! 🙂 ).

Create A New Database And User

Out of the box, the GRANDstack setup is configured to use the neo4j user and the neo4j database. In both Desktop and EE this caused the grandstack starter to fail at startup for me. I created a new database and a new user and that worked.

In the neo4j web UI:

$ CREATE DATABASE grandtest
$ CREATE USER granddad SET PASSWORD 'grandpassword' CHANGE NOT REQUIRED SET STATUS ACTIVE
$ GRANT ROLE architect TO granddad

Now in the GRANDstack project directory open api/.env and make sure you have the settings below:

NEO4J_USER=granddad
NEO4J_PASSWORD=grandpassword
NEO4J_DATABASE=grandtest

Seed the Database

The GRANDstack's database seed operation will fail without access to the open internet, as it needs to start by downloading a CSV file from neo4jlabs.com. You will need to seed the database manually.

I got around this by having neo4j Desktop on my personal machine on an open network and seeding as instructed by grandstack-starter:

cd api
npm run seedDb

Once the database on my personal machine was seeded, I needed to export it. In order to export a complete neo4j database, you need to use APOC to export a cypher file (which is super small for the GRANDstack test data). First you need to give APOC permission to write to your local filesystem. For me, that meant going to neo4j Desktop, clicking the "..." on my GRAND database, and going to Manage | Settings. This brings up a tidy little text editor with your neo4j.conf file right there. Go to the very bottom of the file and paste:

apoc.export.file.enabled=true

Then in the neo4j browser window for your database, run the following Cypher query:

CALL apoc.export.cypher.all("all.cypher", {
    format: "cypher-shell",
    useOptimizations: {type: "UNWIND_BATCH", unwindBatchSize: 20}
})
YIELD file, batches, source, format, nodes, relationships, properties, time, rows, batchSize
RETURN file, batches, source, format, nodes, relationships, properties, time, rows, batchSize

This dumps a file called all.cypher to the local filesystem. Back in the Manage screen for the database, click the Open Folder button and you'll see your database's home directory. The all.cypher is in the import directory.

Copy all.cypher to your secured internal network. Neo4j documentation recommends putting the file in $NEO4J_HOME/import for security reasons. You will import the data into your GRAND database using a utility called cypher-shell. The following command is how I imported the data:

% cd $NEO4J_HOME
% cat import/all.cypher | ./bin/cypher-shell -d grandtest -u granddad -p grandpassword

Your GRAND db is now seeded.
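
A variation on the import, in case you would rather not have the password in your shell history or process list: as far as I know cypher-shell also reads credentials from the NEO4J_USERNAME and NEO4J_PASSWORD environment variables, so treat that part as an assumption and check cypher-shell --help on your version. The function name import_cypher_file is made up for illustration.

```shell
# Sketch: feed a Cypher export to cypher-shell on stdin.
# import_cypher_file is a hypothetical helper name.
# Credentials are expected in NEO4J_USERNAME / NEO4J_PASSWORD (assumption:
# your cypher-shell version reads these; otherwise pass -u and -p as above).
import_cypher_file() {
    neo4j_home=$1    # root of the Neo4j installation
    database=$2      # target database, e.g. grandtest
    cypher_file=$3   # export produced by apoc.export.cypher.all

    [ -r "$cypher_file" ] || { echo "cannot read: $cypher_file" >&2; return 1; }

    # Redirecting stdin does the same job as piping through cat.
    "$neo4j_home/bin/cypher-shell" -d "$database" < "$cypher_file"
}
```

Typical use would be to export NEO4J_USERNAME=granddad and NEO4J_PASSWORD=grandpassword in the shell first, then run import_cypher_file "$NEO4J_HOME" grandtest "$NEO4J_HOME/import/all.cypher".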

You Should Be Good To Go!!

With the database seeded, you should now be able to run grandstack-starter and see the application complete with seed data:

% cd <grandstack starter directory>
% npm run start

Note that while the GraphQL server comes up straight away, it might take 20-30 seconds before the webserver comes up.

Thanks for helping us get here, @stefan.armbruster! Glad to know GRANDstack will work for the highly-security-paranoid film industry 🙂

Thanks for the write up.