10-04-2020 04:47 AM
Dear everyone,
I need your help implementing a proper data quality layer for a Neo4j data model, so that the data is cleaned, standardized, format-validated, and prepared before it is loaded into the database.
Here are some examples:
1- Input: « james et company s.à.r.l » or « james COMPANY sa.r.l »
Output: « James et Company sarl »
2- Input: « 28/09/2020 » or « 28092020 »
Output: « 28/09/2020 »
3- Input: « 12, golfstreet - 6753 frankfurt » or « golfstreet 12. 6753 frankfurt »
Output: street nbr: 12, street name: golfstreet, zipcode: 6753, city: frankfurt
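To make the three cases concrete, here is a minimal sketch in Python of what such rules could look like as a pre-load step. It only illustrates the kind of processing I have in mind, not an existing Neo4j feature. The function names, the regular expressions, the list of words kept in lowercase, and the output keys are all assumptions made up for this example. Rule 1 only normalizes the legal form and the casing; turning « james COMPANY sa.r.l » into « James et Company sarl » also requires adding the missing « et », which would need matching against reference data and is not covered here.

```python
import re
from datetime import datetime

# Hypothetical rule: spelling variants of the legal form "sarl" at the end of a name.
LEGAL_FORM = re.compile(r"\bs\.?[aà]\.?r\.?l\.?$", re.IGNORECASE)
# Hypothetical list of words that stay lowercase in the cleaned name.
LOWERCASE_WORDS = {"et", "sarl"}

def normalize_company(name: str) -> str:
    """Collapse whitespace, unify the legal form, and fix the casing."""
    text = " ".join(name.split())
    text = LEGAL_FORM.sub("sarl", text)
    words = []
    for word in text.split(" "):
        lower = word.lower()
        words.append(lower if lower in LOWERCASE_WORDS else lower.capitalize())
    return " ".join(words)

def normalize_date(value: str) -> str:
    """Accept DD/MM/YYYY or DDMMYYYY and return DD/MM/YYYY."""
    for fmt in ("%d/%m/%Y", "%d%m%Y"):
        try:
            return datetime.strptime(value.strip(), fmt).strftime("%d/%m/%Y")
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {value!r}")

# Two assumed layouts: "<number>, <street> - <zip> <city>"
# and "<street> <number>. <zip> <city>".
ADDRESS_PATTERNS = [
    re.compile(r"^(?P<number>\d+)\s*,?\s*(?P<street>[^\d,\-.]+?)\s*[-.,]\s*"
               r"(?P<zip>\d+)\s+(?P<city>.+)$"),
    re.compile(r"^(?P<street>[^\d,\-.]+?)\s+(?P<number>\d+)\s*[-.,]\s*"
               r"(?P<zip>\d+)\s+(?P<city>.+)$"),
]

def parse_address(raw: str) -> dict:
    """Split a one-line address into street number, street name, zipcode, city."""
    text = raw.strip()
    for pattern in ADDRESS_PATTERNS:
        match = pattern.match(text)
        if match:
            return {
                "street_nbr": match.group("number"),
                "street_name": match.group("street").strip(),
                "zipcode": match.group("zip"),
                "city": match.group("city").strip(),
            }
    raise ValueError(f"unrecognized address: {raw!r}")

print(normalize_company("james et company s.à.r.l"))
print(normalize_date("28092020"))
print(parse_address("golfstreet 12. 6753 frankfurt"))
```

Rules like these would only cover formats that are known in advance; values that match no rule raise an error, so they can be set aside for manual review instead of being loaded silently.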
Have you already implemented this kind of process to guarantee data quality? I think an MDM (master data management) system may be necessary to cross-check the data.
All my input data is stored in a data lake in CSV or JSON format (60 GB).
Thanks in advance for your help.
BR