Neo4j

mark3 · ‎05-17-2020

I am trying to run a query against a Neo4j database from an AWS Lambda function.

The Lambda appears to get a driver and session OK, but then freezes waiting for the transaction to start. Eventually the Lambda times out and returns a failure.

I have tried numerous combinations of session/driver and transaction, none of which gets past starting the transaction. I've even tried compiling with JDK 8 and 11, to no avail.

This is the simplest form I've tried:

    public LambdaResponse handleRequest(Map<String, Object> i, Context cntxt) {
        try {
            
            System.out.println("handleRequest begins...");
            System.out.println("Request id: " + cntxt.getAwsRequestId());

            Driver driver = GraphDatabase.driver("bolt://xxx.xxx.xxx.xxx:7687", AuthTokens.basic("xxxxxxxxxxxxxx", "xxxxxxxxxxxxxxxx"));
            System.out.println("Driver aquired: " + driver.toString());
            Session session = driver.session();
            System.out.println("Session aquired: " + session.toString());
            Result queryResult = session.run("MATCH (n:Version) "
                    + "RETURN n.date AS Date, n.loaded AS Loaded");

            System.out.println("Processing result set...");
            while (queryResult.hasNext()) {
                Record record = queryResult.next();
                System.out.println("record " + record.toString());
            }

        } catch (Exception e) {
            e.printStackTrace(System.out);
        }

        LambdaResponse responseL = new LambdaResponse(ppGetVersion());
        responseL.setStatusCode(200);

        System.out.println("...ends");
        return responseL;
    }

The Lambda journal looks like this:

14:52:52 INFO: Direct driver instance 1032986144 created for server address xxx.xxx.xxx.xxx:7687
14:52:52 Driver aquired: org.neo4j.driver.internal.InternalDriver@3d921e20
14:52:52 Session aquired: org.neo4j.driver.internal.InternalSession@27808f31
14:53:01 END RequestId: aac2f2e1-039e-4614-9fa3-f089f5daafc6
14:53:01 REPORT RequestId: aac2f2e1-039e-4614-9fa3-f089f5daafc6 Duration: 10010.21 ms Billed Duration: 10000 ms Memory Size: 2048 MB Max Memory Used: 117 MB Init Duration: 324.79 ms
14:53:01 2020-05-17T14:53:01.609Z aac2f2e1-039e-4614-9fa3-f089f5daafc6 Task timed out after 10.01 seconds

As you can see, it never gets as far as reporting the query results before the lambda times out. There is no exception thrown and nothing logged in any of the Neo4j log files - absolutely zip.

I've tried this with more granular code that explicitly starts a transaction, and it never gets past that point.

This code runs perfectly well on an EC2 instance. It's only when called from the Lambda that it hangs.

I'm running Neo4j Enterprise 4.0.0 and using the 4.0.0 Java driver (via Maven).

Even if no one has an answer, I would be interested to hear from anyone who has had success with Neo4j and Java Lambdas.

Regards,

Mark

david_allen · ‎05-17-2020

The difference between EC2 and lambdas is that in the lambda case, you can't guarantee that your code is long-lived. Neo4j's drivers tend to create a connection pool which they assume will be reused by many calls, and this might not be a good fit for the lambda case.

I'd check the connection configuration parameters for the Java driver and experiment with them.

First, it looks like in your lambda that you're creating a new driver for every single function invocation. This should work, but it will be expensive and take longer. Typically serverless functions using the driver will have a driver instance stored by a shared object so that each function invocation can reuse the driver if it already exists and not create it every time. Drivers are "expensive" to create in that they require network round-tripping, authentication and so on before you can run a query, which you might not want to run every single time.
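The reuse pattern described above can be sketched roughly as follows. This is only an illustration using the JDK: `SharedResource` and `HandlerSketch` are hypothetical names, and a plain string stands in for the driver so the sketch runs without the Neo4j library. In a real handler the supplier would wrap the `GraphDatabase.driver(...)` call from the question.

```java
import java.util.function.Supplier;

/**
 * Holds one expensive resource per Lambda container and creates it lazily.
 * Assumption: in a real handler the supplier would build the Neo4j driver.
 */
final class SharedResource<T> {
    private final Supplier<T> factory;
    private T instance; // kept for as long as the container stays warm

    SharedResource(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the cached instance, creating it on first use only. */
    synchronized T get() {
        if (instance == null) {
            instance = factory.get();
        }
        return instance;
    }
}

final class HandlerSketch {
    // Counts how often the stand-in "driver" was built (illustration only).
    static int created = 0;

    // Static field: set up once per container, not once per invocation.
    static final SharedResource<String> DRIVER =
            new SharedResource<>(() -> "driver-" + (++created));

    /** Stand-in for handleRequest: every call reuses the same instance. */
    static String handle(String requestId) {
        return requestId + " handled by " + DRIVER.get();
    }
}
```

Whether the cached instance actually survives between invocations depends on the container being reused, which Lambda does not guarantee, so the code still has to cope with a cold start.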

The timeout of 10 seconds is suspicious. The driver's connection timeout is 60 seconds, so it seems it might not be the driver doing it to you. Make sure to check that your lambda timeout is larger than that.
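Because the Lambda is killed at its own timeout without the code logging anything, one option is to run the query under a deadline that is shorter than the Lambda timeout, so that the failure shows up as an exception. The following is a minimal sketch using only the JDK. `DeadlineRunner` and `runWithDeadline` are hypothetical names; in a real handler the budget could be derived from `Context.getRemainingTimeInMillis()` and the task would wrap the `session.run(...)` call.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class DeadlineRunner {
    /**
     * Runs the task on a worker thread and gives up after budgetMillis.
     * Returns the task's result, or throws TimeoutException if the budget runs out.
     */
    static <T> T runWithDeadline(Callable<T> task, long budgetMillis)
            throws TimeoutException, ExecutionException, InterruptedException {
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "query-worker");
            t.setDaemon(true); // do not keep the JVM alive for a stuck call
            return t;
        });
        Future<T> future = worker.submit(task);
        try {
            return future.get(budgetMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker; a blocked network call may ignore this
            throw e;
        } finally {
            worker.shutdownNow();
        }
    }
}
```

This does not fix a hang. It only turns a silent Lambda timeout into a `TimeoutException` that can be logged with a stack trace before the function returns.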

mark3 · ‎05-19-2020

Hi David,

Thanks for getting back to me with these pointers.

I've not had time yet to look at the driver connection parameters, but will do ASAP.

I had the lambda set up to preserve the connection between calls (as far as you can in a lambda) and that made no difference. I put everything into the handler to cut down the code in the question. The actual code design is significantly different, but with the same problem.

The 10 second timeout is coming from the lambda. I had it set to 30 seconds (max time allowed by the gateway) and it still timed out waiting on the transaction. If I can't get the average response time down to a couple of seconds or less, this serverless route won't be worth pursuing.

I thought it might be memory bound, so I bumped that up to 2 GB, but that didn't help either. The lambda reports max memory usage as 117 MB at timeout.

I'll have a look at the driver properties and see if there is anything in there.

Mark

dadan · ‎05-08-2021

Hi @mark3,

It would be awesome if you could share what you found. It looks like resources on running Neo4j from Lambda aren't easy to find. Did you end up finding good connection parameters? Did you keep the driver/session open, or close them on each Lambda run?

ffoschi · ‎04-20-2021

I'm replying to this thread in case someone with the same problem can take inspiration from it. In my case, the problem was that the Lambda function couldn't communicate with the internet.
To fix that, you need to create a VPC and connect the Lambda to a private subnet that has NAT access.

Hope this can help someone out there.
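A quick way to test for this kind of network problem from inside the Lambda is to try a plain TCP connection to the Bolt port with a short timeout before creating the driver. The sketch below uses only the JDK; `BoltReachability` and `canConnect` are hypothetical names, and the host and port would be those of your own server (7687 in the original question).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class BoltReachability {
    /**
     * Tries to open a plain TCP connection to host:port within timeoutMillis.
     * Returns true if the connect succeeds, false on refusal, timeout or any other I/O error.
     * This only shows that the network path is open; it does not speak the Bolt protocol.
     */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

If a call such as `BoltReachability.canConnect(host, 7687, 2000)` returns false from the Lambda but true from an EC2 instance, the cause is more likely the Lambda's networking (VPC, subnet, NAT, security groups) than the driver or the query.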

Transaction not starting in AWS Lambda