Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
04-26-2022 10:17 PM
Hi
I have retail data in the pandas data frame and I am trying to load the same data into Neo4j but I am facing some challenges while doing it.
I am trying to create the properties for the node which is my column data in the pandas. There are 2 Nodes(Vendor & Product) with 1 Relationship between them but product node has many properties which we can get from the columns. I am trying to integrate the properties and columns for the Product node.
Pandas Data Frame Data :
PRODUCT_GROUP | SUB_CLASSIFICATION | VENDOR | MOLECULE_OR_TEST_TYPE | UNIT_OF_MEASURE_PER_PACK | BRAND | DOSAGE | DOSAGE_FORM | LINE_ITEM_QUANTITY | LINE_ITEM_VALUE | PACK_PRICE | UNIT_PRICE | MANUFACTURING_SITE | WEIGHT_KILOGRAMS | FREIGHT_COST_USD |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
HRDT | HIV test | RANBAXY Fine Chemicals LTD. | HIV. Reveal G3 Rapid HIV-1 Antibody Test | 30 Tests | Reveal | N/A | Test kit | 19 | 551 | 29 | 0.97 | Ranbaxy Fine Chemicals LTD | 13 | 780.34 |
ARV | Pediatric | Aurobindo Pharma Limited | Nevirapine | 240 ml | Generic | 10mg/ml | Oral suspension | 1000 | 6200 | 6.2 | 0.03 | Aurobindo Unit III. India | 358 | 4521.5 |
HRDT | HIV test | Abbott GmbH & Co. KG | HIV 1/2. Determine Complete HIV Kit | 100 Tests | Determine | N/A | Test kit | 500 | 40000 | 80 | 0.8 | ABBVIE GmbH & Co.KG Wiesbaden | 171 | 1653.78 |
ARV | Adult | SUN PHARMACEUTICAL INDUSTRIES LTD | Lamivudine | 60 Tabs | Generic | 150mg | Tablet | 31920 | 127360.8 | 3.99 | 0.07 | Ranbaxy. Paonta Shahib. India | 1855 | 16007.06 |
My Code is :
from py2neo import Graph
graph = Graph("bolt://localhost:7687", user="neo4j", password="1234")
tx = graph.begin()
transaction = graph.begin()
for index, row in df2.iterrows():
tx.evaluate('''
CREATE (b:MOLECULE_OR_TEST_TYPE {property:$MOLECULE_OR_TEST_TYPE}
{PRODUCT_GROUP:PRODUCT_GROUP ,
SUB_CLASSIFICATION:SUB_CLASSIFICATION ,
UNIT_OF_MEASURE_PER_PACK:UNIT_OF_MEASURE_PER_PACK ,
BRAND:BRAND AND DOSAGE_FORM:DOSAGE_FORM ,
LINE_ITEM_QUANTITY:LINE_ITEM_QUANTITY ,
PACK_PRICE:PACK_PRICE AND UNIT_PRICE:UNIT_PRICE ,
MANUFACTURING_SITE:MANUFACTURING_SITE ,
WEIGHT_KILOGRAMS:WEIGHT_KILOGRAMS ,
FREIGHT_COST_USD:FREIGHT_COST_USD})
MERGE (v:VENDOR {property:$VENDOR})
MERGE (b)-[:R_TARGET]->(v)
RETURN v,b
''', parameters = {'MOLECULE_OR_TEST_TYPE': str(row['MOLECULE_OR_TEST_TYPE']),
'PRODUCT_GROUP': str(row['PRODUCT_GROUP']),
'SUB_CLASSIFICATION': str(row['SUB_CLASSIFICATION']),
'UNIT_OF_MEASURE_PER_PACK': str(row['UNIT_OF_MEASURE_PER_PACK']),
'BRAND': str(row['BRAND']),'DOSAGE_FORM': str(row['DOSAGE_FORM']),
'LINE_ITEM_QUANTITY': str(row['LINE_ITEM_QUANTITY']),
'PACK_PRICE': str(row['PACK_PRICE']),
'UNIT_PRICE': str(row['UNIT_PRICE']),
'MANUFACTURING_SITE': str(row['MANUFACTURING_SITE']),
'WEIGHT_KILOGRAMS': str(row['WEIGHT_KILOGRAMS']),
'FREIGHT_COST_USD': str(row['FREIGHT_COST_USD']),
'VENDOR': str(row['VENDOR'])})
graph.commit(tx)
Please do helpful.
05-03-2022 02:38 PM
Hey there!
Maybe I can help you out!
From reading your post and looking at what you have here, it looks like you are simply trying to import this data into Neo4j (and the means of you importing is just via python).
Let me break down the import via Cypher and then we can see how that translates to using Neo4j official driver or in your case py2neo python driver.
Assumptions:
parameters={}
you had set. (This will simplify everything as well)How to Import Data Using Cypher
You have two desired nodes (Product & Vendor) and there should be a relationship that exists from Product
to Vendor
. First, let's begin by importing the Product nodes.
Per the assumption made that all properties will exist as strings, this can easily be done via following command:
LOAD CSV WITH HEADERS FROM 'file:///{filename}.csv' AS row
MERGE (p:Product)
SET p += row;
This above cypher query will load a local csv file with headers, assume all columns are type string and MERGE
will perform the same as CREATE
(it just will not create anything in database that already exists).
Next we will create Vendor nodes:
LOAD CSV WITH HEADERS FROM 'file:///{filename}.csv' AS row
MERGE (:Vendor {vendor : row.vendor}); // This will create `vendor` property for Vendor Node.
Now lastly we can create relationship from Product to Vendor and create a relationship HAS_VENDOR
(just an idea looking at dataset)
LOAD CSV WITH HEADERS FROM 'file:///{filename}.csv' AS row
MATCH (product:Product {id : row.id })
MATCH (vendor:Vendor) WHERE vendor.vendor = row.vendor
MERGE (product)-[:HAS_VENDOR]->(vendor);
These cypher scripts can be executed using python if you'd like. It would look something like the following:
from py2neo import Graph
graph = Graph(uri, auth=(user, password))
# What the first step would look like using py2neo
for index, row in df.iterrows():
graph.run('''
LOAD CSV WITH HEADERS FROM "file:///{filename}.csv" AS row
MERGE (p:Product)
SET p += row;
''')
I hope this is helpful to you!
Cheers,
Rob
All the sessions of the conference are now available online