Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
12-28-2020 10:27 AM
Hi,
I'm writing a LOAD CSV clause to import a data set from a csv file. There is a column (called ORIGINAL) including some values within this column are like:
J'irai où tu iras (I'll Go Where You Go)
I want to split the string into two or more values based on the separator "(". So I pass the clause
split(row.ORIGINAL, "(") AS titolOriginal,
As a result, I get a set of lists containing the splitted titles, like:
["J'irai où tu iras ", "I'll Go Where You Go)"]
I want to know how can I get a result like
["J'irai où tu iras", "I'll Go Where You Go"]
That is: erasing the blank at the end of the first value, and erasing the ending ")" at the end of the second value.
TIA
Miquel Centelles
12-28-2020 11:16 AM
Try this:
with "J'irai où tu iras (I'll Go Where You Go)" as p1
with replace(p1, ")", "") as p2
with split(p2, "(") AS titolOriginal
return titolOriginal
result: ["J'irai où tu iras ", "I'll Go Where You Go"]
12-28-2020 04:44 PM
A more general solution is to use APOC regular expression groups (which is similar to regular expression groups in other regex libraries).
WITH "J'irai où tu iras (I'll Go Where You Go)" AS text
WITH apoc.text.regexGroups(text, "([^(]*)\\(([^)]*)" ) AS ListOfList
with ListOfList[0] as groups
return groups[1], groups[2]
Some explanation:
([^(]*)
is group 1. The outer open/close parens surround the text to be grouped (the French text). The [^(]*
matches text includes everything up to but not including the first open paren.
\\(
indicates the literal open paren (you need to have \\ before it so it doesn't get interpreted as a grouping open paren.
([^)]*)
is group 2. The outer open/close parens surround the text to be grouped (the English text). The [^)]*
matches text includes everything up to but not including the first close paren.
I'm a little surprised/annoyed by the return value of the APOC. It's a list of a list (which is not pointed out in the documentation.)
[["J'irai où tu iras (I'll Go Where You Go","J'irai où tu iras ","I'll Go Where You Go"]]
So, you have to take the first element of the list of the list to get the list of groups. Group 0 is the full match.
You can also trim()
the leading and trailing spaces:
trim(group[1])
All the sessions of the conference are now available online