Neo4j community edition - Can it integrate with Apache Spark

Can Neo4j Community Edition integrate with Apache Spark?
I created a SparkSession like this and a Neo4j(sc) object, but when I run Cypher queries I always get 0 rows back, with a message that the Cypher query returned nothing.

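import org.apache.spark.sql.SparkSession
import org.neo4j.spark.Neo4j

// The legacy connector reads its Bolt settings from the Spark config
val spark = SparkSession
  .builder().master("local")
  .config("spark.neo4j.bolt.url", "bolt://xxx.xx.xx.xxx:xxxx")
  .config("spark.neo4j.bolt.user", "neo4j")
  .config("spark.neo4j.bolt.password", "###")
  .appName("SparkIntegrationNeo4j")
  .getOrCreate()
// Build the connector from the session's SparkContext
val neo = Neo4j(spark.sparkContext)
val rdd = neo.cypher("Query").loadRowRdd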


Looks like you're using the old driver - there is a newer driver in beta that is used in a different way. Please have a look at this documentation for the new connector.

Thanks so much for the reply. We are on Spark 2.4.x right now.

Could you please share a link to documentation or an example of using "pattern" queries like the one below?

val graphFrame = neo.pattern(("User","id")).loadGraphFrame

Thanks
VG

We just announced the new connector today (as in, hours ago): https://neo4j.com/blog/announcing-neo4j-connector-for-apache-spark/

GraphFrames in Spark will not be supported by the connector going forward, because that part of the Spark ecosystem is not well supported, and Neo4j is a better environment for graph operations and algorithms. If you want to know how to load anything from the graph into a DataFrame, consult the documentation for the new connector.
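As a rough illustration of loading nodes into a DataFrame with the new connector (version 4.0), here is a minimal sketch. It assumes the connector jar is on the classpath and a Neo4j instance is reachable over Bolt. The URL, credentials, and the `Person` label are placeholders, not values from this thread.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ReadFromNeo4j {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("ReadFromNeo4j")
      .getOrCreate()

    // Placeholder connection details: replace with your own instance.
    val people: DataFrame = spark.read
      .format("org.neo4j.spark.DataSource")
      .option("url", "bolt://localhost:7687")
      .option("authentication.basic.username", "neo4j")
      .option("authentication.basic.password", "secret")
      .option("labels", "Person") // hypothetical label; one row per matching node
      .load()

    people.show()
    spark.stop()
  }
}
```

The result is an ordinary DataFrame, so no GraphFrames or GraphX dependency is involved.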

David, this is what I see on the Neo4j website:

import org.graphframes._

val graphFrame = neo.pattern(("Person","id"),("KNOWS",null), ("Person","id")).partitions(3).rows(1000).loadGraphFrame

But when I use this, it doesn't work; it gives an error message like

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/graphx/VertexRDD

Thanks David, I am able to read from Neo4j and create an RDD or a DataFrame...
I am trying to save a DataFrame into Neo4j; that is where I thought GraphFrames would work.

Thanks
VG

A little confused: the latest version of Apache Spark that I see is 3.0.1, so how come this Neo4j Spark connector shows it is for Apache Spark 4.0.0?

Thanks
Vishwas

The documentation you're looking at that references GraphFrames is certainly the wrong, old documentation.

Neo4j Connector for Apache Spark version 4.0.0 refers to the version of the connector, not the version of Spark itself. The compatibility information is listed here: https://neo4j.com/developer/spark/overview/#_spark_compatibility

Use the documentation linked in this post, which is connector version 4.0 - and as mentioned above, it does not involve the use of graphframes.

I see... Thanks David. Do you have a link to the documentation on how to load data into Neo4j from Apache Spark?

Thanks
VG

Hey David,
I am using the Spark Neo4j driver version "2.4.5-M1". What are the options in this version to save a DataFrame into Neo4j?

Thanks
VG

@VGVG I can't help you with 2.4.5-M1, partially because I don't know. I recommend you upgrade to 4.0.0 and use the "write" instructions in the documentation that I linked to save a dataframe to Neo4j. At this point, 2.4.5 is old code that I myself don't use that much, and it won't be receiving much attention going forward.
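For reference, a minimal sketch of writing a DataFrame to Neo4j with the 4.0 connector might look like the following. It assumes the connector jar is on the classpath and a reachable Neo4j instance. The URL, credentials, sample rows, `Person` label, and `name` key are all illustrative placeholders, so check the connector's write documentation for the options that apply to your version.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object WriteToNeo4j {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("WriteToNeo4j")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data; each row becomes one node.
    val people = Seq(("Alice", 42), ("Bob", 35)).toDF("name", "age")

    people.write
      .format("org.neo4j.spark.DataSource")
      // Overwrite together with node.keys merges on the key instead of duplicating nodes
      .mode(SaveMode.Overwrite)
      .option("url", "bolt://localhost:7687") // placeholder URL
      .option("authentication.basic.username", "neo4j")
      .option("authentication.basic.password", "secret")
      .option("labels", ":Person")  // hypothetical label
      .option("node.keys", "name")  // column used to match existing nodes
      .save()

    spark.stop()
  }
}
```

Using `SaveMode.ErrorIfExists` instead would create new nodes for every row, in which case `node.keys` is not needed.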

Thanks David, and good morning.

Could you please provide the Maven repository details for this latest version?

Thanks
VG