
Importing a large table from PostgreSQL

I have a table that is ~100 GB in PostgreSQL. When I try to import that table:

/home/user/neo4j-etl-cli-1.2.1/bin/neo4j-etl export --url jdbc:postgresql://127.0.0.1:5432/base --user user --password pass --schema myschema --fs 100 --import-tool /usr/share/neo4j/bin --csv-directory /tmp/csv --force --using cypher:neo4j-shell --neo4j:user neouser --neo4j:password neopass --destination /tmp/import/graph.db/

I get "Command failed due to error (RuntimeException: Execution of CSV export process is failed org.postgresql.util.PSQLException: Ran out of memory retrieving query results.). Rerun with --debug flag for detailed diagnostic information."

I tried experimenting with the fetch-size flag (--fs) and still got "ran out of memory".
Is there any other way to import large tables into Neo4j?

And how can I exclude the table example_table from the export? Something like this?

neo4j-etl-cli-1.2.1/bin/neo4j-etl export --exclusion-mode-tables exclude --tables example_table

7 REPLIES

@michael.hunger or @albertodelazzari may have some additional suggestions. However, the current way of retrieving the data over JDBC might not work for 100 GB.

I'd suggest exporting manually from Postgres using the COPY functionality:
https://www.postgresql.org/docs/current/static/sql-copy.html

There are some examples here:
http://www.postgresqltutorial.com/export-postgresql-table-to-csv-file/
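
For example, a minimal sketch (the database name matches your JDBC URL above; the table name bigtable and output path are placeholders):

psql -d base -c "\copy (SELECT * FROM myschema.bigtable) TO '/tmp/csv/bigtable.csv' WITH (FORMAT csv, HEADER)"

COPY streams rows straight to the file instead of buffering the whole result set in memory, so it sidesteps the JDBC out-of-memory error.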

After exporting, you can use neo4j-admin import to load the data into Neo4j:
https://neo4j.com/docs/operations-manual/current/tutorial/import-tool/
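
As a rough sketch, with hypothetical file names (exact flags vary a bit by Neo4j version):

/usr/share/neo4j/bin/neo4j-admin import --nodes=/tmp/csv/bigtable_nodes.csv --relationships=/tmp/csv/bigtable_rels.csv

Note that the import tool expects its own CSV header format (:ID, :LABEL, :START_ID, :END_ID, :TYPE), so the files coming out of Postgres usually need a small header-mapping step first.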

Hope this helps!

Thanks. I will have to write a custom exporter to split those records into several CSV files.
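
Probably something like this, exporting by primary-key range (the key column id and the range boundaries are just placeholders):

psql -d base -c "\copy (SELECT * FROM myschema.bigtable WHERE id < 10000000) TO '/tmp/csv/bigtable_part1.csv' WITH (FORMAT csv)"
psql -d base -c "\copy (SELECT * FROM myschema.bigtable WHERE id >= 10000000 AND id < 20000000) TO '/tmp/csv/bigtable_part2.csv' WITH (FORMAT csv)"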

Could you tell me how to use the exclude-tables flag in neo4j-etl? I have to skip that one large table during the export/import.

Hi,
you can find some useful information here.
You can use the --exclusion-mode-tables parameter to define the exclusion policy, and then use --tables to specify the list of tables to exclude/include.

Hope this helps.

Thank you.
I've read the ETL documentation that you linked, but I still can't exclude big tables from the import.
Let's say I have a schema myschema in PostgreSQL with many tables, and I want to exclude two tables, firsttable and secondtable. I tried:
--rdbms:schema myschema --exclusion-mode-tables --tables firsttable secondtable
and
--rdbms:schema myschema --exclusion-mode-tables --tables myschema.firsttable myschema.secondtable

The ETL tool ignored those options, generated CSV files for both tables, and imported them.

Did you run that on the command-line?

Can you share your full command line?

Hi, the command should be this one:

--rdbms:schema myschema --exclusion-mode-tables exclude --tables firsttable secondtable

You need to specify which policy to use for the "table exclusion mode": the default is none, so --tables is ignored.
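
Putting it together with the command from the original post, the full invocation might look like this (same credentials and paths as above):

/home/user/neo4j-etl-cli-1.2.1/bin/neo4j-etl export --url jdbc:postgresql://127.0.0.1:5432/base --user user --password pass --rdbms:schema myschema --exclusion-mode-tables exclude --tables firsttable secondtable --fs 100 --import-tool /usr/share/neo4j/bin --csv-directory /tmp/csv --force --using cypher:neo4j-shell --neo4j:user neouser --neo4j:password neopass --destination /tmp/import/graph.db/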

You could also try Talend for loading data into Neo4j.

Regards,
