11-07-2021 10:05 AM
Neo4j Community Server 4.3.3 on Ubuntu 20.04
I'm going crazy with this! I have a string like the one in the query below.
The string can contain anything, and the pattern [:xx] is used as a separator.
I'm trying to use
WITH '[:it]Per me “Riserva” significa tradizione. Perché è espressione massima.
[:en]For me “Riserva” means tradition. Maximum expression of.' AS str
RETURN apoc.text.regexGroups(str,
'\[:it]((([\S \r\n]*)))\[:en]((([\S \r\n]*)))'
) AS output;
This works on https://regex101.com/ (returning exactly the two substrings, as expected), but in Neo4j it returns
[["[:it]Per me “Riserva” significa tradizione. Perché è espressione massima.[:en]For me “Riserva” means tradition. Maximum expression of.", "Per me “Riserva” significa tradizione. Perché è espressione massima.", "Per me “Riserva” significa tradizione. Perché è espressione massima.", "Per me “Riserva” significa tradizione. Perché è espressione massima.", "For me “Riserva” means tradition. Maximum expression of.", "For me “Riserva” means tradition. Maximum expression of.", "For me “Riserva” means tradition. Maximum expression of."]]
Can someone help?
11-16-2021 06:04 PM
Hi @paolodipietro58 ,
The APOC regexGroups function returns, for each match, a list whose first element is the entire matched text, followed by the text captured by each group.
Each pair of parentheses () defines one group, so the triple-nested parentheses in your pattern create three groups that all capture the same text. To receive only three elements as output (the matched text and the two captures), use a single pair of parentheses around each part, like this:
WITH '[:it]Per me “Riserva” significa tradizione. Perché è espressione massima.
[:en]For me “Riserva” means tradition. Maximum expression of.' AS str
RETURN apoc.text.regexGroups(str,
'\[:it\]([\S \r\n]*)\[:en\]([\S \r\n]*)'
) AS output;
which will have the following output:
[["[:it]Per me “Riserva” significa tradizione. Perché è espressione massima.
[:en]For me “Riserva” means tradition. Maximum expression of.", "Per me “Riserva” significa tradizione. Perché è espressione massima.
", "For me “Riserva” means tradition. Maximum expression of."]]
Hope this helps
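To see why the original pattern yielded seven elements, here is a minimal sketch in Python rather than Cypher. It assumes that Python's re module numbers capture groups the same way as the Java regex engine that APOC runs on, which holds for this pattern as far as I know. The helper regex_groups is hypothetical: it only mimics the output shape of apoc.text.regexGroups and is not part of APOC.

```python
import re

# Sample text from the question; a newline separates the two translations.
text = (
    "[:it]Per me “Riserva” significa tradizione. Perché è espressione massima.\n"
    "[:en]For me “Riserva” means tradition. Maximum expression of."
)


def regex_groups(pattern, s):
    """Hypothetical stand-in for apoc.text.regexGroups (not the APOC code).

    Returns one list per match: the whole match (group 0) first,
    then the text captured by each pair of parentheses, in order.
    """
    return [[m.group(0), *m.groups()] for m in re.finditer(pattern, s)]


# Original pattern: three nested pairs of parentheses per part = six groups.
nested = r"\[:it]((([\S \r\n]*)))\[:en]((([\S \r\n]*)))"
# Suggested pattern: one pair of parentheses per part = two groups.
flat = r"\[:it\]([\S \r\n]*)\[:en\]([\S \r\n]*)"

nested_result = regex_groups(nested, text)
flat_result = regex_groups(flat, text)

print(len(nested_result[0]))  # 7: whole match + 6 groups (each capture repeated 3 times)
print(len(flat_result[0]))    # 3: whole match + 2 groups
```

If parentheses are needed only for grouping and not for capturing, a non-capturing group (?:...) keeps them out of the result.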
11-25-2021 04:40 AM
Well, so I was on the right track, but I didn't know that the entire matched string is returned as the first element!
Nice to know! Thank you.