01-19-2021 07:01 AM
I am using the following script to import data into Neo4j 3.5 (running in a Docker container):
cd /sparkwiki/helpers/
target_db="neo4j"
delim="\t"
data_dir=/wikiout_german
part_template="part-\d{5}-.*.csv.gz"
neo4j-admin import \
--database=$target_db --delimiter=$delim \
--report-file=/tmp/import-wiki.log \
--id-type=INTEGER \
--nodes:Page import/page_header.csv,"$data_dir/page/normal_pages/$part_template" \
--nodes:Page:Category import/page_header.csv,"$data_dir/page/category_pages/$part_template" \
--relationships:LINKS_TO import/pagelinks_header.csv,"$data_dir/pagelinks/$part_template" \
--relationships:BELONGS_TO import/categorylinks_header.csv,"$data_dir/categorylinks/$part_template" \
--ignore-missing-nodes
The script itself should be fine: it worked with Neo4j 4.0 (after minor syntax adjustments).
The import completes without any errors, and I get the following output:
IMPORT DONE in 7m 7s 93ms.
Imported:
8075624 nodes
598327030 relationships
32302496 properties
Peak memory usage: 1.35 GB
There were bad entries which were skipped and logged into /tmp/import-wiki.log
However, when I check in the Neo4j Browser, I don't see any nodes. This is the docker-compose snippet for the Neo4j service:
neo4jwikidevde:
  build:
    context: ./docker
    dockerfile: neo4j/Dockerfile
  environment:
    - NEO4J_AUTH=neo4j/test
  volumes:
    - data_de:/var/lib/neo4j/data
    - logs_de:/logs
    - import_de:/var/lib/neo4j/import
    - ./wikiout_german:/wikiout_german
  networks:
    - internal_t2g
    - external-network
  ports:
    - 7475:7474
    - 7688:7687
Notes:
Some people suggested restarting the Neo4j container after the import completes and checking again. I have already tried that, and it didn't help.
With Neo4j 4.0 the data was imported and visible, but my Python code could not connect to Neo4j 4.0. That is why I am trying 3.5, which works with my other codebases.
I have tried quite a few things and nothing seems to work. Any suggestions would be highly appreciated.
01-26-2021 07:38 AM
My gut tells me you are probably connecting to the empty neo4j database in the container. If you run everything through docker (nothing on host) like I do, then there is an empty default neo4j db in the container, too. If I didn't specify the bulk loaded db by name, docker would use the empty db that was created when the docker container first started.
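To make "specify the db by name" concrete, here is a rough sketch for the 3.5 case. In Neo4j 3.x the database the server opens is chosen by the dbms.active_database setting (default graph.db), and the official Docker image reads settings from environment variables: prefix NEO4J_, double any underscores, turn dots into underscores. I'm assuming the custom Dockerfile in the question builds on the official image; the helper name to_neo4j_env is made up for illustration.

```shell
# Hypothetical helper: convert a neo4j.conf setting name into the
# environment variable name expected by the official Neo4j Docker image.
to_neo4j_env() {
  # Double existing underscores first, then replace dots with underscores.
  printf 'NEO4J_%s\n' "$(printf '%s' "$1" | sed -e 's/_/__/g' -e 's/\./_/g')"
}

to_neo4j_env dbms.active_database   # prints NEO4J_dbms_active__database
```

In the compose file above that would probably mean an extra environment entry like NEO4J_dbms_active__database=neo4j, so the server opens the database that the import wrote with --database=neo4j instead of the empty default.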
Have you shelled into the container to check the Neo4j database folders (names, locations) and see what is going where? Judging by the node and relationship counts, it sounds like the data is all in there.
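A quick way to do that check, as a sketch: list every folder under the databases directory together with its size. An imported store with millions of nodes should be far larger than a freshly created empty one. The function name list_db_dirs is hypothetical, and the path in the comment is inferred from the data volume in the compose file (/var/lib/neo4j/data), so adjust it if your image stores data elsewhere.

```shell
# Hypothetical helper: print each subdirectory of a databases directory
# with its size in KB, one "name<TAB>size" pair per line.
list_db_dirs() {
  dir="$1"
  for d in "$dir"/*/; do
    [ -d "$d" ] || continue   # skip when the glob matched nothing
    printf '%s\t%s\n' "$(basename "$d")" "$(du -sk "$d" | cut -f1)"
  done
}

# Inside the container (e.g. after `docker-compose exec neo4jwikidevde bash`):
#   list_db_dirs /var/lib/neo4j/data/databases
```

If you see both a small graph.db and a large neo4j folder, the server is most likely opening the empty one.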
Other thoughts: you didn't mention it, but I'd guess you have carefully examined the logs (e.g. /tmp/import-wiki.log, I believe)?
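If the report file is long, something like the following gives a first impression without reading it all. This is only a sketch: it makes no assumption about the report's line format, it just counts lines and shows the first few. The function name summarize_report is hypothetical; the path in the comment is the --report-file from the import script.

```shell
# Hypothetical helper: show how many lines the import report has
# and print its first few lines.
summarize_report() {
  report="$1"
  if [ ! -s "$report" ]; then
    echo "report is missing or empty: $report"
    return 0
  fi
  echo "lines in report: $(wc -l < "$report" | tr -d ' ')"
  head -n 5 "$report"
}

# Run wherever the import was executed:
#   summarize_report /tmp/import-wiki.log
```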
You are switching between 3.5 and 4.0, and there are some big changes between them, so checking the database with the browser is a very good idea. What do you see in the browser: an empty database, or an error?