Neo4j

drajesh · ‎04-26-2022

Hi

I have retail data in the pandas data frame and I am trying to load the same data into Neo4j but I am facing some challenges while doing it.
I am trying to create the properties for the node which is my column data in the pandas. There are 2 Nodes(Vendor & Product) with 1 Relationship between them but product node has many properties which we can get from the columns. I am trying to integrate the properties and columns for the Product node.

Pandas Data Frame Data :

PRODUCT_GROUP	SUB_CLASSIFICATION	VENDOR	MOLECULE_OR_TEST_TYPE	UNIT_OF_MEASURE_PER_PACK	BRAND	DOSAGE	DOSAGE_FORM	LINE_ITEM_QUANTITY	LINE_ITEM_VALUE	PACK_PRICE	UNIT_PRICE	MANUFACTURING_SITE	WEIGHT_KILOGRAMS	FREIGHT_COST_USD
HRDT	HIV test	RANBAXY Fine Chemicals LTD.	HIV. Reveal G3 Rapid HIV-1 Antibody Test	30 Tests	Reveal	N/A	Test kit	19	551	29	0.97	Ranbaxy Fine Chemicals LTD	13	780.34
ARV	Pediatric	Aurobindo Pharma Limited	Nevirapine	240 ml	Generic	10mg/ml	Oral suspension	1000	6200	6.2	0.03	Aurobindo Unit III. India	358	4521.5
HRDT	HIV test	Abbott GmbH & Co. KG	HIV 1/2. Determine Complete HIV Kit	100 Tests	Determine	N/A	Test kit	500	40000	80	0.8	ABBVIE GmbH & Co.KG Wiesbaden	171	1653.78
ARV	Adult	SUN PHARMACEUTICAL INDUSTRIES LTD	Lamivudine	60 Tabs	Generic	150mg	Tablet	31920	127360.8	3.99	0.07	Ranbaxy. Paonta Shahib. India	1855	16007.06

My Code is :

from py2neo import Graph
graph = Graph("bolt://localhost:7687", user="neo4j", password="1234")
tx = graph.begin()
transaction = graph.begin()
for index, row in df2.iterrows():
tx.evaluate('''
CREATE (b:MOLECULE_OR_TEST_TYPE {property:$MOLECULE_OR_TEST_TYPE}
{PRODUCT_GROUP:PRODUCT_GROUP ,
SUB_CLASSIFICATION:SUB_CLASSIFICATION ,
UNIT_OF_MEASURE_PER_PACK:UNIT_OF_MEASURE_PER_PACK ,
BRAND:BRAND AND DOSAGE_FORM:DOSAGE_FORM ,
LINE_ITEM_QUANTITY:LINE_ITEM_QUANTITY ,
PACK_PRICE:PACK_PRICE AND UNIT_PRICE:UNIT_PRICE ,
MANUFACTURING_SITE:MANUFACTURING_SITE ,
WEIGHT_KILOGRAMS:WEIGHT_KILOGRAMS ,
FREIGHT_COST_USD:FREIGHT_COST_USD})
MERGE (v:VENDOR {property:$VENDOR})
MERGE (b)-[:R_TARGET]->(v)
RETURN v,b
''', parameters = {'MOLECULE_OR_TEST_TYPE': str(row['MOLECULE_OR_TEST_TYPE']),
'PRODUCT_GROUP': str(row['PRODUCT_GROUP']),
'SUB_CLASSIFICATION': str(row['SUB_CLASSIFICATION']),
'UNIT_OF_MEASURE_PER_PACK': str(row['UNIT_OF_MEASURE_PER_PACK']),
'BRAND': str(row['BRAND']),'DOSAGE_FORM': str(row['DOSAGE_FORM']),
'LINE_ITEM_QUANTITY': str(row['LINE_ITEM_QUANTITY']),
'PACK_PRICE': str(row['PACK_PRICE']),
'UNIT_PRICE': str(row['UNIT_PRICE']),
'MANUFACTURING_SITE': str(row['MANUFACTURING_SITE']),
'WEIGHT_KILOGRAMS': str(row['WEIGHT_KILOGRAMS']),
'FREIGHT_COST_USD': str(row['FREIGHT_COST_USD']),
'VENDOR': str(row['VENDOR'])})
graph.commit(tx)

Please do helpful.

Rcolinp · ‎05-03-2022

Hey there!

Maybe I can help you out!

From reading your post and looking at what you have here, it looks like you are simply trying to import this data into Neo4j (and the means of you importing is just via python).

Let me break down the import via Cypher and then we can see how that translates to using Neo4j official driver or in your case py2neo python driver.

Assumptions:

I am going to assume that you want to import all data as strings provided the parameters={} you had set. (This will simplify everything as well)
I'll assume you are importing this table via a local file

How to Import Data Using Cypher
You have two desired nodes (Product & Vendor) and there should be a relationship that exists from Product to Vendor. First, let's begin by importing the Product nodes.

Per the assumption made that all properties will exist as strings, this can easily be done via following command:

LOAD CSV WITH HEADERS FROM 'file:///{filename}.csv' AS row
MERGE (p:Product)
SET p += row;

This above cypher query will load a local csv file with headers, assume all columns are type string and MERGE will perform the same as CREATE (it just will not create anything in database that already exists).

Next we will create Vendor nodes:

LOAD CSV WITH HEADERS FROM 'file:///{filename}.csv' AS row
MERGE (:Vendor {vendor : row.vendor}); // This will create `vendor` property for Vendor Node.

Now lastly we can create relationship from Product to Vendor and create a relationship HAS_VENDOR (just an idea looking at dataset)

LOAD CSV WITH HEADERS FROM 'file:///{filename}.csv' AS row
MATCH (product:Product {id : row.id })
MATCH (vendor:Vendor) WHERE vendor.vendor = row.vendor
MERGE (product)-[:HAS_VENDOR]->(vendor);

These cypher scripts can be executed using python if you'd like. It would look something like the following:

from py2neo import Graph

graph = Graph(uri, auth=(user, password))
# What the first step would look like using py2neo
for index, row in df.iterrows():
    graph.run('''
    LOAD CSV WITH HEADERS FROM "file:///{filename}.csv" AS row 
    MERGE (p:Product)
    SET p += row;
    ''')

I hope this is helpful to you!

Cheers,
Rob

Neo4j

Adding the properties(Attributes) from Pandas Data frame into Neo4j using python