02-17-2019 05:01 AM
I have 2 previous topics on this site, "LOAD CSV & Google Sheets" and "UNWIND and apoc.periodic.iterate", where I am working on loading many files currently stored in a Google Drive folder.
I am now at a point where I would like to "combine" these efforts. The issue I now face is that I converted some data from CSV files into Google Sheets, which I did to practice using the Google Sheets method. Now, though, I have many more files (~50 files/folder x 20+ folders), and it doesn't seem efficient to turn all of the comma-delimited ".txt" files into Google Sheets files.
I am currently stuck on importing the ".txt" files from Google Drive. I suspect I am not sharing the files properly from Google Drive.
Also, I am trying to import into a cloud instance from a Chromebook and a MacBook Pro (my 2 devices).
Any thoughts?
02-17-2019 05:54 AM
Just a thought, but I sometimes use the q tool when I'm dealing with masses of CSV. It lets you query text/CSV files directly with SQL, basically as if they were a relational database.
In your situation, what I might do is combine all of the files of like schema so that I end up with one really big CSV file per schema type, and then LOAD CSV each of those with USING PERIODIC COMMIT. That way you'd have relatively few URLs to deal with.
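As a rough sketch of that combining step, assuming every file of a given schema starts with the same header row (the thread doesn't say whether the files have headers), something like the following keeps the header once and appends the data rows from every file. The function name `combine_csv` is just made up for illustration.

```shell
# combine_csv OUTPUT INPUT...
# Concatenate CSV files that share a header row, keeping the header only once.
# Assumption: every input file begins with the same header line.
combine_csv() {
    out=$1
    shift
    # FNR == 1 is the first line of each input file; NR == 1 is the first
    # line overall. Skip every header line except the very first one.
    awk 'FNR == 1 && NR != 1 { next } { print }' "$@" > "$out"
}
```

You would call it as `combine_csv combined.csv folder/*.txt`. Make sure the output file name doesn't match the input pattern, or the output will be read as one of its own inputs.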
The other way to do it is to create a "meta manifest". If you have a pile of files in a directory, you can list them one per line like this:
ls -1
Then what you basically have is another CSV with a single column naming each file that you want to load. So then you call apoc.load.csv("file:///path/to/manifest.csv") and you end up with a dynamic list of files, which you then feed into the LOAD CSV process.
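A slightly more controlled version of the `ls -1` idea might look like the sketch below. It writes only the ".txt" files to the manifest and can optionally put a prefix such as `file:///` in front of each name so the lines are usable as URLs. The function name `write_manifest` and the prefix argument are my own additions for illustration, and file names containing commas or quotes would need extra escaping that this does not do.

```shell
# write_manifest DIR OUTPUT [PREFIX]
# Write one line per .txt file found directly in DIR, in the shell's
# glob (alphabetical) order. PREFIX is optional and defaults to empty.
write_manifest() {
    dir=$1
    out=$2
    prefix=${3:-}
    for f in "$dir"/*.txt; do
        # Skip the unexpanded pattern when the directory has no .txt files.
        [ -f "$f" ] || continue
        # ${f##*/} strips the directory part, leaving just the file name.
        printf '%s%s\n' "$prefix" "${f##*/}"
    done > "$out"
}
```

For example, `write_manifest ./data manifest.csv "file:///"` would produce lines like `file:///somefile.txt`, while leaving the prefix off gives the same bare names as `ls -1`.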
02-17-2019 05:00 PM
@david.allen I will have to check that out. What about just importing a .csv file from Google Drive? Is there a special trick as to what should be used for the URL in the LOAD CSV Cypher query?
02-18-2019 06:00 AM
I don't think there's any special trick, except getting Google Drive to be OK with "link sharing". Doing this manually per file seems like it'd be a pain, unless it were automated with a Google API or something.
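One thing that may help once link sharing is on: the link Google Drive hands out usually points at a preview page rather than at the raw file. A commonly used workaround is to rewrite it into the `uc?export=download&id=...` form. This is an assumption on my part rather than something confirmed in this thread, Google may change it, and large files can return a confirmation page instead of the data. A minimal sketch, with a made-up function name `drive_direct_url`:

```shell
# drive_direct_url SHARE_URL
# Turn a ".../file/d/<ID>/view?usp=sharing" style link into a direct
# download URL. Assumption: the uc?export=download form is accepted by
# Google Drive for files shared with "anyone with the link".
drive_direct_url() {
    # Extract the file ID that follows "/file/d/" in the sharing link.
    id=$(printf '%s\n' "$1" | sed -n 's#.*/file/d/\([^/?]*\).*#\1#p')
    if [ -z "$id" ]; then
        echo "could not find a file ID in: $1" >&2
        return 1
    fi
    printf 'https://drive.google.com/uc?export=download&id=%s\n' "$id"
}
```

The resulting URL is what you would try in the LOAD CSV query in place of the sharing link. It is worth testing with one small file first.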