
How to pass Query Parameters in Neo4j Spark Connector

As per the official guide, a graph can be integrated with Spark directly to read and write data. I see a few options like 'query', 'labels', 'relationship', etc., to query a graph.

Is there a way to provide query parameters to a custom query?

 var query = " MATCH (account:Account ) " +
      " where account.appId in ['App 1', 'App 2'] " + **//PARAMETERIZE THIS INPUT**
      " return account ";

PFA the complete class.
Neo4jConnectorSpec.scala.txt (886 Bytes)

4 REPLIES

I have zero experience with this, but I reviewed the link you provided. In the 'script option' section within the 'custom cypher query' section, it describes setting a 'script' option before your 'query' option; the results of the script are then passed on to the query. The example shows a script returning a literal value as 'val', which is then referenced in the 'query' using scriptResult[0].val. I am guessing the '0' index refers to the zeroth script result. In that example only one script is executed; I assume each subsequent one could be accessed with an increasing index. If that assumption is correct, you could get your parameters into the query by returning them from a script and then referencing them in your query with the scriptResult object.
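If I'm reading the docs right, it might look roughly like this. This is only a sketch: the connection options are assumed to be configured elsewhere, and the returned column is my own choice.

    val df = sparkSession.read
      .format("org.neo4j.spark.DataSource")
      // the 'script' runs first; its rows are exposed to the query as 'scriptResult'
      .option("script", "RETURN ['App 1', 'App 2'] AS val")
      // reference the first (zeroth) script row with scriptResult[0]
      .option("query",
        "MATCH (account:Account) " +
        "WHERE account.appId IN scriptResult[0].val " +
        "RETURN account.appId AS appId")
      .load()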

This does seem like a convoluted way to make it work, though.

@glilienfield Thank you for the response. Can you also share an example for an Array/List type of return value?

I tried parameterising a List of strings, but there is no direct way to pass the list as-is. We would need to generate a STRING, since the script option also only accepts a STRING.
- But then we could just as easily update the 'query' and append the list to it directly.

        //.option("script", "RETURN " + appIds + " AS val") // how to parameterise list?
        .option("script", "RETURN ['App 1', 'App 2'] AS val")
        .option("query", query)

Instead of trying to use the script results to inject your parameters, I thought of creating the cypher statement with the parameters embedded in it instead. I am not a Scala guy, but this seems correct from some documentation I read. I used the list's mkString() method to convert the list into a string that can be inserted into the cypher statement. The expression appIds.mkString("['", "', '", "']") should convert an appIds list such as List("a", "b", "c") to ['a', 'b', 'c'], which represents a cypher literal list. This is then used to construct the cypher statement sent to spark.

Just a thought.

def readFromDB(appIds: List[String]) = {

  // build a Cypher literal list, e.g. List("App 1", "App 2") -> ['App 1', 'App 2']
  val listAsString = appIds.mkString("['", "', '", "']")

  val query = " MATCH (account:Account) " +
    " where account.appId in " + listAsString +
    " return account "

  // connection options (url, authentication, ...) are assumed to be set in the Spark config
  sparkSession
    .read
    .format("org.neo4j.spark.DataSource")
    .option("query", query)
    .load()
    .toDF()
}

BTW - the documentation recommends not returning entities when using the 'query' option; instead, return individual properties.
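For example, the query in the snippet above could return properties instead (the name property here is just a placeholder):

    val query = " MATCH (account:Account) " +
      " where account.appId in " + listAsString +
      " return account.appId as appId, account.name as name "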

@poem.daga @glilienfield The spark connector doesn't give you any way to directly pass query parameters to Cypher queries. But you can still do what you want to do by approaching the problem differently.

The native ability of the connector is to take rows in a data frame, bind them as event data, and then run cypher against that event data. So in the cypher query examples you'll notice references to "event". Each column of a spark data frame is bound to the event, so you can refer to any aspect of the data frame in your write query. This is the way to pass query parameters: if you need more context or metadata sent to the query, just add a column to your dataframe and pass it this way.
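A rough sketch of a write along those lines. The labels, property names, and DataFrame contents here are my own assumptions, connection options are assumed to be configured in the Spark config, and the exact save-mode semantics for 'query' writes are worth checking in the docs:

    import org.apache.spark.sql.SaveMode
    import sparkSession.implicits._

    // each row of this DataFrame becomes an 'event' visible to the write query
    val accounts = Seq(("App 1", "Account A"), ("App 2", "Account B")).toDF("appId", "name")

    accounts.write
      .format("org.neo4j.spark.DataSource")
      .option("query", "MERGE (a:Account {appId: event.appId}) SET a.name = event.name")
      .mode(SaveMode.Append)
      .save()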

For read queries, there is no dataframe being sent, so this is not possible. The spark connector isn't intended to be a full Neo4j client, though, but rather an integration between spark & Neo4j. If you need full client query capabilities, the JAR file that comes with the spark connector already includes Neo4j's official java driver. So, for example, you could use Scala or Java in a spark context to use the Neo4j driver just as you would in any other application. In this way, if you have the spark connector, you already have the bits you need to do query parameter passing at the driver layer.
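A minimal sketch of that approach, using the Java driver classes bundled with the connector JAR. The URI and credentials are placeholders:

    import java.util.Arrays
    import org.neo4j.driver.{AuthTokens, GraphDatabase, Values}

    // placeholder connection details
    val driver = GraphDatabase.driver("bolt://localhost:7687", AuthTokens.basic("neo4j", "password"))
    val session = driver.session()
    try {
      // $appIds is a real query parameter, bound via the driver
      val result = session.run(
        "MATCH (account:Account) WHERE account.appId IN $appIds RETURN account.appId AS appId",
        Values.parameters("appIds", Arrays.asList("App 1", "App 2")))
      while (result.hasNext) {
        println(result.next().get("appId").asString())
      }
    } finally {
      session.close()
      driver.close()
    }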

hope this helps.