cancel
Showing results for 
Search instead for 
Did you mean: 

Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.

Preventing SQL injection when using neo4j full text search

Hi,

I am using Neo4j full text search index. I am playing with a query in one of the example in Christophe Willemsen's demo.
My use case is as follows - User has friends(other user nodes) and a user owns books.
Through the UI, a user searches another user by name. In the search result, I want to show users matching the name and at the top should be those users with whom I have common friends.
My query is as follows:

MATCH (User)-[:FRIENDS*1..]-(friend:User) where user.id={id}
WITH "(" + apoc.text.join(collect(friend.id), ' OR ')  + ") AS queryPart
CALL db.index.fulltext.queryNodes('searchUser', '(' + {searchName} + ' AND id:' + queryPart + ')^2 OR ' + {searchName})
YIELD node
RETURN node

I am using SDN so the above query is within @Query
List<User> searchBookOwners(@Param("id") String id, @Param("searchName") String searchName);
id and searchName are passed as parameters.

I would like to know you if the above use of string concatenating for creating search term make the code open to SQL injection attacks?
If it does, what's the best approach to achieve the desired result or implement it.

Regards,
Varun

1 ACCEPTED SOLUTION

@andrew.bowman replied to this question on slack. Pasting his response here as he was kind enough to give a detailed response.

I don't believe an attack would be possible here. When you're using parameters (or even variables) in most places of the query, there's no way to escape from that context and have it take over the context of the query (such as ending a string, closing parens, executing arbitrary Cypher, and commenting out the rest)
The exception is when you're appending into a Cypher query string, such as when using certain APOC procedures like apoc.periodic.iterate() , apoc.cypher.run() , apoc.when() , and so on
Those procedures in APOC (and similar procs when using string appending into a Cypher query string) CAN be open to Cypher injection with well-formatted inputs and some knowledge of the query it's appending into
For your query, if there was such a vulnerability, you should be able to submit a searchName parameter that includes a ' character, and that would cause a syntax error. But that doesn't happen, showing that characters in a parameter (or a variable, if you were just using a searchName variable) can't be used to escape the context (ending the string, or closing parens, etc).So any param or variable couldn't escape from the string you're constructing for the queryNodes() call.

However, the string being constructed is for a Lucene query, and to that end input could be shaped to influence that Lucene query (tweaking what results will be returned), but that won't let you execute arbitrary Cypher, and it won't let you write any changes to the graph.
Though, do be aware of what can be done with a custom Lucene query, because while this isn't Cypher injection, it is open to lucene injection
So this could return more nodes/rels from that index than expected.Take a more simple case, if we're using the movies graph (from :play movies ) and we set all :Persons as :Users (say to simulate user accounts) and we had a fulltext index on users. Then you could have a very simple query like this:

CALL db.index.fulltext.queryNodes('searchUser', {searchName})
YIELD node
return node

Looks fine. But the second parameter is a lucene query string. If the passed in parameter was 'Keanu Reeves' OR * , now the index lookup will return all nodes in that index (or of course just a simple * as the param will do the same thing)

Damage is limited...as mentioned you couldn't escape out to the Cypher query, so no Cypher injection. And you couldn't escape out of the index you're using for lookup, so that limits what data can be returned. But it is possible for a lucene query injection that will return all nodes in that index. Now if there is any additional filtering in the Cypher after this lookup, that might limit what ultimately gets returned, but it is a good idea to be aware of the inputs you're passing, and sanitize if you really need to.
(you can encase the string in backticks and it will count as string token, that can help)

View solution in original post

1 REPLY 1

@andrew.bowman replied to this question on slack. Pasting his response here as he was kind enough to give a detailed response.

I don't believe an attack would be possible here. When you're using parameters (or even variables) in most places of the query, there's no way to escape from that context and have it take over the context of the query (such as ending a string, closing parens, executing arbitrary Cypher, and commenting out the rest)
The exception is when you're appending into a Cypher query string, such as when using certain APOC procedures like apoc.periodic.iterate() , apoc.cypher.run() , apoc.when() , and so on
Those procedures in APOC (and similar procs when using string appending into a Cypher query string) CAN be open to Cypher injection with well-formatted inputs and some knowledge of the query it's appending into
For your query, if there was such a vulnerability, you should be able to submit a searchName parameter that includes a ' character, and that would cause a syntax error. But that doesn't happen, showing that characters in a parameter (or a variable, if you were just using a searchName variable) can't be used to escape the context (ending the string, or closing parens, etc).So any param or variable couldn't escape from the string you're constructing for the queryNodes() call.

However, the string being constructed is for a Lucene query, and to that end input could be shaped to influence that Lucene query (tweaking what results will be returned), but that won't let you execute arbitrary Cypher, and it won't let you write any changes to the graph.
Though, do be aware of what can be done with a custom Lucene query, because while this isn't Cypher injection, it is open to lucene injection
So this could return more nodes/rels from that index than expected.Take a more simple case, if we're using the movies graph (from :play movies ) and we set all :Persons as :Users (say to simulate user accounts) and we had a fulltext index on users. Then you could have a very simple query like this:

CALL db.index.fulltext.queryNodes('searchUser', {searchName})
YIELD node
return node

Looks fine. But the second parameter is a lucene query string. If the passed in parameter was 'Keanu Reeves' OR * , now the index lookup will return all nodes in that index (or of course just a simple * as the param will do the same thing)

Damage is limited...as mentioned you couldn't escape out to the Cypher query, so no Cypher injection. And you couldn't escape out of the index you're using for lookup, so that limits what data can be returned. But it is possible for a lucene query injection that will return all nodes in that index. Now if there is any additional filtering in the Cypher after this lookup, that might limit what ultimately gets returned, but it is a good idea to be aware of the inputs you're passing, and sanitize if you really need to.
(you can encase the string in backticks and it will count as string token, that can help)

Nodes 2022
Nodes
NODES 2022, Neo4j Online Education Summit

All the sessions of the conference are now available online