05-13-2021 02:50 PM
Hello there
It's possible to use apoc.load.csv with Google Cloud Storage by putting a jar file in the plugins folder, which lets apoc.load.csv accept paths like gs://client564/csv/media.csv
Is it possible to do the same with BigQuery? If not, say because BigQuery stores something other than CSV files:
I read about 3 options for BigQuery:
apoc.load.jdbc(): but it implies running SQL queries and then Cypher queries over the same data.
Storage Read API, which uses an RPC-based protocol: is there anything in Neo4j that can use it?
Beginning in early Q3 2021, the BigQuery Storage Read API will start charging for network egress. Does Neo4j APOC have any plans regarding this network egress?
Thank you for your advice
05-17-2021 12:56 PM
I can't speak for APOC development plans. When people use apoc.load.csv with cloud storage, they usually rely on either signed URLs or public files rather than trying to pass a service key or credentials through APOC. At the end of the day, APOC is just loading a URL for you; the auth part is done separately (for example with signed URLs).
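To illustrate the signed-URL idea, here is a minimal Python sketch that pairs an apoc.load.csv call with a signed URL passed as a query parameter. The `Media` label, the `id` and `title` columns, the helper name, and the example URL are all assumptions made up for illustration. Generating the signed URL and sending the query to Neo4j through a driver are left out.

```python
from typing import Dict, Tuple

# Hypothetical import statement: the Media label and the id/title columns
# are placeholders, not taken from the original question.
LOAD_CSV_QUERY = """
CALL apoc.load.csv($url, {header: true}) YIELD map
MERGE (m:Media {id: map.id})
SET m.title = map.title
"""


def build_load_csv_call(signed_url: str) -> Tuple[str, Dict[str, str]]:
    """Return the Cypher text and its parameters for a signed-URL load.

    The signed URL carries its authorization in the query string, so no
    credentials are handed to APOC itself.
    """
    if not signed_url.startswith("https://"):
        raise ValueError("expected an https signed URL")
    return LOAD_CSV_QUERY, {"url": signed_url}


# Placeholder URL: a real signed URL would be generated separately.
query, params = build_load_csv_call(
    "https://storage.googleapis.com/client564/csv/media.csv?X-Goog-Signature=<signature>"
)
```

The resulting `query` and `params` could then be handed to whatever Neo4j client you already use. Keeping the URL as a parameter avoids pasting a short-lived signature into the Cypher text.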
Separately, loading CSV from storage means you're not getting the current state of the BigQuery table. To get that, you might consider using the Neo4j Spark connector together with something like Google Dataproc. This lets you query BigQuery live, transform the result into a graph pattern, and save that to Neo4j. This is a completely different approach that doesn't use APOC at all.
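As a rough sketch of the Spark route, the PySpark-style code below reads a BigQuery table and writes each row as a node. It assumes a Spark session (for example on Dataproc) with both the BigQuery connector and the Neo4j Spark connector available. The function names, table name, label, and key column are hypothetical.

```python
from typing import Dict


def neo4j_node_options(bolt_url: str, user: str, password: str,
                       label: str, key: str) -> Dict[str, str]:
    """Build write options for the Neo4j Spark connector (node mode)."""
    return {
        "url": bolt_url,
        "authentication.basic.username": user,
        "authentication.basic.password": password,
        "labels": label,    # e.g. ":Media" (hypothetical label)
        "node.keys": key,   # column used to match existing nodes
    }


def copy_table_to_neo4j(spark, table: str, options: Dict[str, str]) -> None:
    """Read a BigQuery table and save its rows as Neo4j nodes.

    `spark` is an existing SparkSession; nothing is imported here, so the
    connectors must already be on the cluster's classpath.
    """
    df = spark.read.format("bigquery").option("table", table).load()
    (df.write.format("org.neo4j.spark.DataSource")
        .mode("Overwrite")   # with node.keys set, rows are merged on the key
        .options(**options)
        .save())
```

A call might look like `copy_table_to_neo4j(spark, "my_project.my_dataset.media", neo4j_node_options("bolt://localhost:7687", "neo4j", "secret", ":Media", "id"))`, where every argument is a placeholder to be replaced with your own values.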
05-17-2021 05:54 PM
Thanks
I read some tutorials about BigQuery and Neo4j, and Spark seems to be an option, but I was trying to avoid adding new pieces of technology as much as possible, so that I could handle the whole import process from a single Cypher file.
I guess Spark is the way to go for now.