cancel
Showing results for 
Search instead for 
Did you mean: 

Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.

Certificate issue while retrieving open data using apoc.load.json

neo4joe
Node Clone

The following open data URL in any browser will retrieve a sample of air pollution in JSON format, and display it in your browser:

https://open.stavanger.kommune.no/api/3/action/package_search?q=luftmaling-stavanger&rows=1

I would like to include this URL in a Neo4j query to retrieve the data directly in Neo4j. So I turned to APOC. Below is a query that calls apoc.load.json, which you can paste into your Neo4j Desktop query window for testing:

WITH "https://open.stavanger.kommune.no/api/3/action/package_search?q=luftmaling-stavanger&rows=1" AS url
CALL apoc.load.json(url) YIELD value
RETURN value

It fails for me. Does it fail for you? It seems to be related to the JVM used by Neo4j and the website's self assigned certification. Here is the error message:

Neo.ClientError.Procedure.ProcedureCallFailed: Failed to invoke procedure apoc.load.json: Caused by: java.lang.RuntimeException: Can't read url or key https://open.stavanger.kommune.no/api/3/action/package_search?q=luftmaling-stavanger&rows=1 as json: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

Is there an easy work-around?

1 ACCEPTED SOLUTION

neo4joe
Node Clone

I found a work-around for using Neo4j to directly pull the data from the Open Data site, by using an alternative url. The original url I used when stating the question invokes an API provided by the organization CKAN. And that requires a certificate. But near the api link there is also a download button that fetches it in .csv format. That link does not require Neo4j to negotiate with the above API. So here is the working cypher query with the download url.

CALL apoc.load.csv('https://opencom.no/dataset/d473bc01-8d6e-4771-a946-e19d3fff3691/resource/97c4b945-6bb9-477e-a394-fae534dff195/download/nilu.csv',{sep:","}) YIELD map
RETURN map

I use apoc.load.csv instead of apoc.load.json. To see it in JSON format I just 'YIELD map'.

Okay, so it is a work-around; it does not solve the API certification problem as I defined it. Stephan's guidance is surely on the correct path for that. Thanks Stephan.

View solution in original post

6 REPLIES 6

Don't remember 100% but I think `apoc.load.json´ relies on the JVM for certifcate checking. What JVM in which version are you using?

I remember some glitches on non-existent root certificates in some JVM versions.

https://stackoverflow.com/a/25953317/158701 should point you how to use JVM's keytool to add a specific certificate.

checking on my command line...
$ java -version
java version "1.8.0_191"
Java(TM) SE Runtime Environment (build 1.8.0_191-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.191-b12, mixed mode)

But I don't think that's the version used by Neo4j Desktop, which I am using here. Neo4j Desktop installs Neo4j Enterprise, and Neo4j Enterprise supplies its own JDK, which I read somewhere is Zulu OpenJDK. (I can't remember the source.)

I tried finding that jdk but didn't manage to find it.
Someone else apparently tried too: Confused on version and installing Java JDK for Mac to support Enterprise for OnLine learning class

An answer to the question in that post would help answer this one. If I can find how to swap out Neo4j Enterprise's JDK I would experiment with different JDKs and the website certificate issue of this post.

You're right, Neo4j Desktop is bundled with Zulu JVM. On my Linux box it's located in $HOME/.config/Neo4j Desktop/Application/distributions/java/zulu8.21.0.1-jdk8.0.131. Inside that folder there's a bin/keytool - I guess this is the one to be used for adding certificates.

If you download Neo4 server (both enterprise and community) from https://neo4j.com/download-center/ it will use the system's JVM.

neo4joe
Node Clone

I found a work-around for using Neo4j to directly pull the data from the Open Data site, by using an alternative url. The original url I used when stating the question invokes an API provided by the organization CKAN. And that requires a certificate. But near the api link there is also a download button that fetches it in .csv format. That link does not require Neo4j to negotiate with the above API. So here is the working cypher query with the download url.

CALL apoc.load.csv('https://opencom.no/dataset/d473bc01-8d6e-4771-a946-e19d3fff3691/resource/97c4b945-6bb9-477e-a394-fae534dff195/download/nilu.csv',{sep:","}) YIELD map
RETURN map

I use apoc.load.csv instead of apoc.load.json. To see it in JSON format I just 'YIELD map'.

Okay, so it is a work-around; it does not solve the API certification problem as I defined it. Stephan's guidance is surely on the correct path for that. Thanks Stephan.

For anyone else having this issue with Neo4j Desktop Community Edition. Steps for resolving on macOS:

  1. Go to $HOME/Library/Application Support/Neo4j Desktop/ and look for cacerts.
  2. You'll probably get one for jdk and the jre. Examples:
  • $HOME/Library/Application Support/Neo4j Desktop/Application/distributions/java/zulu8.42.0.21-ca-jdk8.0.232/jre/lib/security/cacerts
  • $HOME/Library/Application Support/Neo4j Desktop/Application/distributions/java/zulu11.35.13-ca-jdk11.0.5/lib/security/cacerts

Adding the missing CA certs to those trust stores should fix the issue. Example commands, executed relative to the security directory.

../../bin/keytool -import -v -alias SOME_ALIAS -trustcacerts -keystore ./cacerts -file LOCATION_OF_CA_CERT.

I faced this error when trying to access the below page as part of a training video on the Neo4J channel. : https://raw.githubusercontent.com/cj2001/nodes2021_kg_workshop/main/json_files/wiki.json

  • The StackOverflow link talks of downloading and adding the certificate for the URL.
  • Searching google for how to save a certificate for a URL returns instructions from this link
  • However those steps assume a "padlock" icon will appear in the address bar (in Chrome), but there is no such icon
  • Tried inspecting the page in chrome but do not see any security tab, so after failing to find other options I finally decided to ask the question here even though it's an old thread. Since the error is a hurdle (and sadly a distraction )for many similar use cases

Would like to add that I am able to download the wiki.json file and import it using CALL apoc.load.json(file///<path to file>) but this doesn't really fix the problem as it fails when trying to fetch a file in another Neo4J example shown here https://neo4j.com/labs/neosemantics/tutorial/

I am unable to run

CALL n10s.rdf.preview.fetch(
  'https://raw.githubusercontent.com/neo4j-labs/neosemantics/3.5/docs/rdf/nsmntx.ttl',
  'Turtle'
)

as it has the exact same error as seen with the CALL apoc.load.json(url)