11-09-2020 04:14 AM
Hi,
I need help troubleshooting a rather weird error. I have set up a (single-instance) Azure VM running Neo4j, following the official documentation, to feed data to an Azure Databricks cluster running Spark. I connected to the Neo4j VM via HTTP on port 7474 to populate it with some data. For the Databricks cluster, I installed the connector and followed this documentation, basically just setting the connection address and login credentials as Spark parameters.
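For reference, the settings look roughly like this in the cluster's Spark config (keys as documented for the old neo4j-spark-connector; the address and password are placeholders):
spark.neo4j.bolt.url bolt://<vm-public-ip>:7687
spark.neo4j.bolt.user neo4j
spark.neo4j.bolt.password <password>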
When I run a sample query via the spark connector on the Databricks cluster, I can successfully establish a connection - however, it only returns empty data:
%scala
import org.neo4j.spark._
val neo = Neo4j(sc)
// => neo: org.neo4j.spark.Neo4j = org.neo4j.spark.Neo4j@7c444d23
%scala
val rdd = neo.cypher("MATCH (n:Person) RETURN id(n) as id").loadRowRdd
rdd.count
// => rdd: org.apache.spark.rdd.RDD[org.apache.spark.sql.Row] = Neo4jRDD partitions Partitions(1,9223372036854775807,9223372036854775807,None) MATCH (n:Person) RETURN id(n) as id using Map()
// => res1: Long = 0
The same happens for .loadDataFrame, .loadGraphFrame, etc.:
// => java.lang.RuntimeException: Cannot infer schema-types from empty result, please use loadDataFrame(schema: (String,String)*)
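As an aside, passing an explicit schema as the error message suggests does avoid the inference step, though it just returns an empty DataFrame instead (the "long" type name for id(n) is my assumption):
%scala
val df = neo.cypher("MATCH (n:Person) RETURN id(n) as id").loadDataFrame("id" -> "long")
df.count
// => res2: Long = 0, so the schema only sidesteps the inference error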
I can confirm that the query should in fact not return an empty DataFrame: connecting to the remote VM from my local Neo4j Desktop and running the same query there returns the expected Person nodes (screenshot omitted).
Where is my mistake here? Thanks in advance!
(Logs and specs, see below)
11-12-2020 04:28 AM
You are using the old driver, which works differently and supports a different set of Spark versions.
Please consider having a look at the new Neo4j Connector for Apache Spark - it's easier to use, more modern, and under active development: https://neo4j.com/developer/spark/
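With the new connector, reading nodes by label looks roughly like this (option names from the new connector's DataSource API; the URL and credentials are placeholders):
%scala
val df = spark.read
  .format("org.neo4j.spark.DataSource")
  .option("url", "bolt://<vm-public-ip>:7687")
  .option("authentication.basic.username", "neo4j")
  .option("authentication.basic.password", "<password>")
  .option("labels", "Person")
  .load()
df.show()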