Proper Cypher regular expression examples

tideon
Graph Buddy

I have spent weeks reading and trying to figure out how to use regular expressions in Cypher. The website documentation has next to nothing on it, and the site everyone sends me to doesn't actually show the Neo4j Cypher way of doing it.

So my question is: what is the exact Cypher equivalent of this regular expression: \s[a-zA-Z]

Can someone write some normal regular expressions with the Cypher version next to each one? The documentation has nothing that gives you any idea how to write meaningful expressions.

Kind regards,
Jeffrey

29 REPLIES

Hello @tideon

In the documentation, there is a little example you can transform:

MATCH (n:Person)
WHERE n.name =~ '\s[a-zA-Z]'
RETURN n.name, n.age

But I think you already tried that so:

  • can you tell us what you are trying to achieve with the regex?
  • can you tell us what the query should do?
  • can you give us your model?

Regards,
Cobra

tideon
Graph Buddy

Hello Cobra,

I think the regex isn't working, because I copy-pasted the WHERE statement above and got the following error.

Here is the query
match (t:Toy)
WHERE t.ModelNr=~ '\s[a-zA-Z]'
return t

Invalid input 's': expected '\', ''', '"', 'b', 'f', 'n', 'r', 't', UTF16 or UTF32 (line 2, column 21 (offset: 34))
"WHERE t.ModelNr=~ '\s[a-zA-Z]'"

I am attempting to find the toys where the model number field contains letters, so I can identify the ones that need to be cleaned up.

So let's start with this and go from there. If what you gave me doesn't work, then something is fundamentally wrong.

Are you from Neo4J?

The internet is filled with people who can't get regular expressions to work in Neo4j. I spent over six hours yesterday reading every post I could find. So many people cannot get a clear understanding from the manual, because it gives very little explanation.

I think there is a lot of confusion about how regex works in Neo4j.

Here is a good example that highlights what I have read on multiple forums, including here and Stack Overflow

That very short excerpt in the manual doesn't go into what flags there are, what each one means, or how to write a query that gets back digits.

I watched and learned from this very good tutorial on regular expressions, but I can't translate that knowledge into Cypher

Many / all things are not working.

Here is another example

People refer to Java, but there are no robust examples of actual uses that show how it is actually written. Even the O'Reilly book "Graph Databases" doesn't cover the topic. It is a very powerful option to have, but next to no attention is given to it.

Yeah, it's weird. All I know is that Neo4j's regular expressions come from Java. No, I don't work for Neo4j.

I do not see the syntax error to be honest

Hello Cobra,

I figured it out.

The backslash in each metacharacter needs to be escaped. So "\s" needs to be written as "\\s".

The manual doesn't address this at all.
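For anyone landing here with the original question, here is a small sketch of a few plain regular expressions next to their Cypher string-literal form. The sample model numbers are made up for illustration. One more thing to keep in mind is that =~ has to match the entire string, so a "contains" test needs .* on both sides.

```cypher
// Plain regex        Cypher string literal
// \s[a-zA-Z]         '\\s[a-zA-Z]'
// \d+                '\\d+'
// [a-zA-Z]           '[a-zA-Z]'      (no backslash, nothing to double)

// =~ must match the whole string, so ".*" is added on both sides
// to turn the pattern into a "contains" test.
WITH ['42020', '42020 B', 'A-100'] AS samples   // made-up model numbers
UNWIND samples AS s
RETURN s,
       s =~ '.*\\s[a-zA-Z].*' AS spaceThenLetter,
       s =~ '.*[a-zA-Z].*'    AS hasLetter,
       s =~ '\\d+'            AS digitsOnly
```

Applied to the query from the thread, the filter for model numbers that contain a letter would be WHERE t.ModelNr =~ '.*[a-zA-Z].*', assuming ModelNr is stored as a string.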

So now I understand how to use regular expressions. Is there a way to extract the part of a string that matches a pattern and put it into another field?

So for instance: Technic 42020: Twin-Rotor Helicopter
From that I would extract "42020" and put it in a property ModelNr

SET t.ModelNr = 42020

You should find what you need here:

I read everything there, but nothing looked like something I could use to achieve my goal.

Do you have an example of how it could be achieved?

I now know how to use regex in Cypher, but I have no way to extract the pattern I have found and put it in a property field.

Thanks in advance,
Tideon

You need to use the APOC regex functions.

This is what you want:

I'm reading its documentation; this is exactly what I needed.
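The link itself did not survive in this copy of the thread, but the later posts use apoc.text.regexGroups, so here is a minimal sketch of the extraction with that function. It assumes the APOC plugin is installed.

```cypher
// Pull the first run of digits out of the product title.
WITH 'Technic 42020: Twin-Rotor Helicopter' AS title
// regexGroups returns a list of matches; each match is itself a list of groups.
WITH apoc.text.regexGroups(title, '\\d+') AS matches
// matches[0][0] is the whole text of the first match, still a string.
RETURN toInteger(matches[0][0]) AS modelNr
```

To store the value instead of returning it, the same expression can go into a SET clause, for example MATCH (t:Toy) WITH t, apoc.text.regexGroups(t.Name, '\\d+') AS matches WHERE size(matches) > 0 SET t.ModelNr = toInteger(matches[0][0]). The property name Name is an assumption here, since the thread never shows where the title is stored.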

Hello Clem,

This is how far I have got:

// Apoc JSON ADD STORE & INVENTORY
CALL apoc.load.json("file:///lego3-5.json") YIELD value 
WITH value AS v
WITH apoc.text.regexGroups ( v.Product_Price, '\\d{1,3},\\d{1,2}' ) as price

RETURN price

Output

I get an output (see attachment), but I can't use UNWIND to get the values out of the nested list. I was at it for about 6 hours getting to this point and trying everything I could. The manual didn't explain how to use it beyond the RETURN example.

I keep asking myself why the manual is so bad. It never really explains a subject the way you would expect a manual to; it is often just a brief overview.

[edit because I misread the screen shot....]

Try returning price[0] to get the first element of the list of price(s) instead of the list of the single price. (European style of numbers with comma instead of decimal point.)

Hello Clem,

I uploaded a file with a sample of the data.
As you will see, the price has a comma in it (Europe), so the regex is written to capture that. It is one price.

Oops. My eyes aren't so good, so I misread it.

price[0] will give you a string. But if you want to store it as a float, you'll need to replace the "," with a "." and then call toFloat()

The issue still remains that I can't get the value out of this nested list, and I don't understand why APOC is doing that. The manual is very limited.
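On why the result is nested: as far as I understand apoc.text.regexGroups, the outer list has one entry per match found in the text, and each inner list holds the groups of that match, with index 0 being the whole match. A sketch with a made-up string containing two prices, and capture groups added to the pattern from the thread, shows both levels:

```cypher
// Made-up input with two prices; the parentheses add two capture groups.
WITH apoc.text.regexGroups('Was 149,99 now 123,50',
                           '(\\d{1,3}),(\\d{1,2})') AS matches
// UNWIND walks the outer list: one row per match.
UNWIND matches AS m
// Index 0 is the whole match, 1 and 2 are the capture groups.
RETURN m[0] AS wholeMatch, m[1] AS beforeComma, m[2] AS afterComma
```

With a single price and no capture groups, as in the query above, that leaves one match with one group, which is why the value sits at price[0][0].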

Hello Clem,

I missed that you said that I needed to replace the "," before I can apply the float function to the returned values. It really is the little details. I am guessing that is why I was getting "NULL" as return below.

You might experiment with this:

I think it can do what you want.... It doesn't have a lot of examples, plus you have to delve into the Java implementation of how the conversions are done to see if it works the way you want.

In particular:

RETURN apoc.number.parseFloat('12.345,67', '#,##0.00;(#,##0.00)', 'it')

returns 12345.67

Hello Clem,

This is how I solved it

// Apoc JSON ADD STORE & INVENTORY
CALL apoc.load.json("file:///lego3-5.json") YIELD value
WITH value AS v
WITH apoc.text.regexGroups(v.Product_Price, '\\d{1,3},\\d{1,2}') AS price
WITH price[0][0] AS pr
WITH replace(pr, ",", ".") AS prnew
RETURN toFloat(prnew)

output

I do like the link you sent; it really is something new to learn. I find APOC a bit daunting at times, in that the docs hardly give any explanation, and since I am still learning Cypher, I often can't get the context right away.

What is APOC really based on? Does it come from the Java point of view, so that if I knew Java I would get the APOC stuff quicker?

You know what I would really like to understand? The double colon, " :: ".
I see it in explanations in the APOC documentation and also in the XPath documentation. Where does it come from, and what does it mean?

Java is the underlying code for Neo4J (including APOC) but that is essentially transparent. For us, Neo4J could have been programmed in anything else and 99% of time, we wouldn't be able to tell the difference. (There are some JVM related configuration parameters.)

The key thing for Neo4J is that APOC allows new functionality to be easily added to Cypher without changing Neo4J itself. That also means the APOC libraries can be updated without having to make a new Neo4J release. In addition, APOC can leverage the existing Java libraries. It's just a simple matter of putting an APOC wrapper around the Java call. That means when somebody says, "I wish Neo4J had a function to do X", it can be easily added if it already exists in the JVM world.

The :: notation is new to me but I can tell that the token after the :: is the type of a Cypher thing expected in the APOC function or procedure call, like a STRING or NODE. My guess is the ? means the argument is optional.

The weird thing is YIELD which I don't quite understand.... but I just follow the documentation.
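On YIELD, a short sketch may help. A procedure called with CALL produces rows with named output columns, and YIELD picks which of those columns become variables in the rest of the query. The built-in db.labels procedure has a single output column named label, so it makes a small example that needs no data file:

```cypher
// db.labels() emits one row per node label in the database.
// YIELD binds its output column "label" so later clauses can use it.
CALL db.labels() YIELD label
RETURN label
ORDER BY label
```

The apoc.load.json call in this thread follows the same pattern: the procedure emits a column named value, and YIELD value makes it available to the WITH clause that follows.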

APOC is nothing to fear: just experiment with data you don't care about, or you can do something like:

WITH '12.345,67' AS cost
RETURN apoc.number.parseFloat(cost, '#,##0.00;(#,##0.00)', 'it')

Hope that helps! (I'm somewhat new to Neo4J, but I have experience with lots of languages and systems and some with SQL, so I can make decent guesses about what's going on!)

Thank you clem,

I am new to Neo4J (2019) and am learning Python at the same time. The last time I programmed, it was with Borland Pascal. I am really liking Python. I just learned regex last week because I needed it for this situation. I'm learning XPath now, and by Monday I should have it under control. XPath is a funny thing: it can do some really complex selects, and at the same time you can do a lot with the simple stuff.

I don't see much about Neo4J and deep learning and machine learning, it's always showcased for fraud detection and network related stuff.

I come from Filemaker Pro, and reached a point where I started to have to make some crazy join tables just to do some simple queries, so I started looking around, and found Neo4J. I started to hate the query builder in Filemaker, it is cumbersome.

Neo4J has this library that might be relevant to you. It imports XML so maybe it would be better to import into Neo4J and do your queries that way.

Since you're interested in Python, this might be interesting to you too:

https://py2neo.org/

There is some stuff about Neo4J and ML.

My suspicion is (for now) you need to export the data into something a ML algo can ingest (e.g. a table, or .CSV file.). I suspect that most ML people think in terms of tables and not graphs (yet).

I believe if somebody is motivated enough, they would create an APOC to interface with a ML algo to handle specific use cases.

I think for Fraud, Neo4J is more of a browsing and hunting tool.

Just my opinions...

Best wishes in your learning of Neo4J!

It is one price, for instance 123,50
In Europe we use commas, and I also want to keep the comma, because all prices are written that way.

I have tweaked it in every way I can to see if something would work.

So, there is a "display" form of the data and the "float" form if you want to do math (e.g. multiply quantity x price).

Did you try RETURN price[0]?

I think that will return the string.

Hello @tideon

Could you provide us the JSON file?

Regards,
Cobra

Hey Cobra,

Here is a sample of the data.
I had to give it the .txt extension so I could upload it.
sample_lego.json.txt (389 Bytes)

clem
Graph Steward

Here's a fragment to show you what you can do (and I'm still not quite clear what you want...)

WITH [["120,24"]] as price  // Sets the variable price
 RETURN price[0] // Returns the first element in the List of Lists, which is a List

will return a list

["120,24"]
WITH [["120,24"]] as price // Sets the variable price
RETURN price[0][0] // Returns the string

will return "120,24"

Hello Clem,

Yes, I have gotten this far with the input from you and Cobra. But when I try to use toFloat(price[0][0]), I get all values back as NULL.

See screen shot.

So I do the following:

with price[0][0] as pr

RETURN toFloat(pr)
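The NULL values are consistent with how toFloat behaves: it returns null for a string it cannot parse as a number, and a price written with a decimal comma is such a string. A minimal sketch, using the literal value from the earlier post rather than the JSON file:

```cypher
// Same nested-list shape that apoc.text.regexGroups returned above.
WITH [['120,24']] AS price
WITH price[0][0] AS pr
RETURN pr                               AS display,   // keeps the comma for display
       toFloat(pr)                      AS unparsed,  // null: "120,24" is not a valid float literal
       toFloat(replace(pr, ',', '.'))   AS parsed     // 120.24
```

Keeping both columns matches the display form versus float form distinction made elsewhere in the thread: the string with the comma for showing prices, the float for doing math.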