05-16-2022 03:13 PM
I have a dataframe that I want to transfer from Python to Neo4j. My dataframe looks like the one below.
I want the words in the text column to be connected by a NEXT relationship, something like below.
I know the Cypher query. My requirement is that the value in the POS column be attached as a property to each word. For example, the node Dog has POS NOUN, so NOUN should be attached as a property to that node, and the NEXT relationship should be maintained as shown above.
How can I write the query in a Python notebook and see the same results in the Neo4j graph? Please help me with the syntax, as I am pretty new to Neo4j and Cypher.
05-18-2022 02:07 AM
Here is what I have done so far.
I have py2neo and neo4j installed on my PC.
I want to run the Cypher query from my Python notebook and have the changes reflected in the Neo4j graph.
I already know some basics, like:
import pandas as pd
from py2neo import Graph,Node,Relationship
from neo4j import GraphDatabase, basic_auth
graph = Graph("http://localhost:7474/browser/", auth=("neo4j", "*****"))
for index, row in df.iterrows():
    tx = graph.begin()
    tx.evaluate('''cypher query goes here''')
    tx.commit()
From the Python notebook, using a dataframe, I want to set the value of the second column (POS) as a property and maintain the NEXT relationship between the words in the first column, as shown above.
05-18-2022 06:08 AM
How about this?
I just added pos.
WITH split(tolower("His dog eats turkey on Tuesday")," ") AS text,
split("PRON NOUN VERB PROPN ADP PROPN"," ") AS pos
UNWIND range(0,size(text)-2) AS i
MERGE (w1:Word {name: text[i], pos: pos[i]})
MERGE (w2:Word {name: text[i+1], pos: pos[i+1]})
MERGE (w1)-[:NEXT]->(w2)
RETURN w1, w2
05-18-2022 06:17 AM
@koji No, this is not what I am looking for. The challenge I am facing is how to perform the same operation from a pandas dataframe in a Jupyter notebook.
import pandas as pd
from py2neo import Graph,Node,Relationship
from neo4j import GraphDatabase, basic_auth
graph = Graph("http://localhost:7474/browser/", auth=("neo4j", "*****"))
for index, row in df.iterrows():
    tx = graph.begin()
    tx.evaluate('''cypher query goes here''')
    tx.commit()
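One way to approach this from a notebook, sketched here under assumptions: rather than running one transaction per row, pass all rows as a single query parameter and let Cypher build the NEXT chain. The list of dicts below stands in for what `df.to_dict(orient="records")` would return, the helper name `load_words` is hypothetical, and the connection details are placeholders. The query follows the same pattern as koji's suggestion, and the official `neo4j` driver is imported inside the helper so the snippet loads even when the package is not installed.

```python
# Stand-in for df.to_dict(orient="records"); in a notebook, use the real dataframe.
rows = [
    {"text": "His", "POS": "PRON"},
    {"text": "dog", "POS": "NOUN"},
    {"text": "eats", "POS": "VERB"},
]

# Walk the parameter list by position and link each word to the one after it.
QUERY = """
WITH $rows AS rows
UNWIND range(0, size(rows) - 2) AS i
WITH rows[i] AS a, rows[i + 1] AS b
MERGE (w1:Word {name: a.text, pos: a.POS})
MERGE (w2:Word {name: b.text, pos: b.POS})
MERGE (w1)-[:NEXT]->(w2)
"""


def load_words(uri, user, password, rows):
    """Hypothetical helper: send all rows to Neo4j in one parameterized query."""
    from neo4j import GraphDatabase  # lazy import; needs `pip install neo4j`

    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session() as session:
            # Keyword arguments to run() are passed as query parameters.
            session.run(QUERY, rows=rows).consume()
    finally:
        driver.close()
```

In a notebook this might be called as `load_words("bolt://localhost:7687", "neo4j", "YOUR_NEO4J_PASSWORD", df.to_dict(orient="records"))`, assuming a local database listening on the default Bolt port.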
05-23-2022 05:35 AM
The neointerface package has a load_df method; however, to create NEXT relationships between words you need to load the dataframe with an index column and run an additional query. Try something like:
#pip install neointerface
import neointerface
import pandas as pd
db = neointerface.NeoInterface(host="neo4j://localhost:7687" , credentials=("neo4j", "YOUR_NEO4J_PASSWORD"))
df = pd.DataFrame(...)
db.load_df(df.reset_index(), label="Word", merge=False)
db.create_index("Word", "index")
db.query("MATCH (w1:Word), (w2:Word) WHERE w2.index = w1.index + 1 MERGE (w1)-[:NEXT]->(w2)")
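To see what the index-based approach above amounts to, here is a small pure-Python model of it. This is only a sketch: the rows and their POS tags are made up for illustration, and it assumes that loading with `merge=False` creates one node per dataframe row, so a word that appears twice ends up as two nodes.

```python
# Illustrative rows standing in for the dataframe; the POS tags are assumptions.
rows = [("The", "DET"), ("wild", "NOUN"), ("is", "AUX"), ("wild", "ADJ")]

# Like df.reset_index(): every row gets its position as an "index" value,
# and every row becomes its own node (repeated words are not merged).
nodes = [{"index": i, "text": text, "POS": pos} for i, (text, pos) in enumerate(rows)]

# Like MATCH ... WHERE w2.index = w1.index + 1: link each row to the next one.
by_index = {n["index"]: n for n in nodes}
next_edges = [(n["index"], n["index"] + 1) for n in nodes if n["index"] + 1 in by_index]

print(next_edges)  # [(0, 1), (1, 2), (2, 3)]
print(sum(1 for n in nodes if n["text"] == "wild"))  # 2
```

The NEXT chain follows row order, but the repeated word is stored as two separate nodes.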
05-29-2022 10:37 AM
First of all, thank you for your response. I was able to use "neointerface".
But my goal is still not achieved. Here is what I have tried.
Then
Output:
Ideally the node "The" should have been created once, but it was created twice.
example
and
What I am looking for:
There should be no duplicate text; each word should have exactly one node.
The POS values for each word should be stored as a list, an array, or any other collection.
Example: "text": "wild" (should be only one node)
"POS": ["NOUN", "ADJ"]
05-30-2022 04:18 AM
load_df will not work for you in this specific use case. I solved your problem as follows. Note that you need to install the APOC library for it to work:
import spacy
import en_core_web_sm
import pandas as pd
nlp = spacy.load("en_core_web_sm")
text = "The wild is dangerous"
doc = nlp(text)
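cols = ("text", "POS")
rows = []
# One row per token: the word text and its part-of-speech tag
for t in doc:
    row = [t.text, t.pos_]
    rows.append(row)
df = pd.DataFrame(rows, columns=cols)
# Clean-up: delete all existing Word nodes
# (db is the neointerface.NeoInterface instance created in the earlier reply)
db.query("MATCH (w:Word) DETACH DELETE w")
db.create_index("Word", "index")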
q = """
UNWIND $data as row
MERGE (w:Word{text:row.text})
ON CREATE SET w.POS = [row.POS]
ON MATCH SET w.POS = CASE WHEN row.POS in w.POS THEN w.POS ELSE w.POS + [row.POS] END
WITH collect(w) as coll
WITH apoc.coll.pairsMin(coll) as pairs
UNWIND pairs as pair
WITH pair[0] as node1, pair[1] as node2
MERGE (node1)-[:NEXT]->(node2)
"""
db.query(q, {'data': df.to_dict(orient='records')})
text = "The rockstar is wild"
doc = nlp(text)
cols = ("text", "POS")
rows = []
for t in doc:
    row = [t.text, t.pos_]
    rows.append(row)
df_2 = pd.DataFrame(rows, columns=cols)
db.query(q, {'data': df_2.to_dict(orient='records')})
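For readers who want to check the logic without a database, the query above can be modeled in plain Python. This is a sketch, not the actual implementation: the dict and set stand in for the graph, `run_query` and `pairs_min` are hypothetical names, and the POS tags are illustrative rather than real spaCy output. `pairs_min` mirrors what `apoc.coll.pairsMin` returns for a list: each element paired with the one that follows it.

```python
def pairs_min(items):
    """Adjacent pairs, like apoc.coll.pairsMin: [a, b, c] -> [[a, b], [b, c]]."""
    return [[a, b] for a, b in zip(items, items[1:])]


def run_query(graph, rows):
    """Apply one batch of rows to an in-memory stand-in for the graph."""
    nodes, edges = graph
    for row in rows:
        tags = nodes.setdefault(row["text"], [])  # MERGE (w:Word {text: row.text})
        if row["POS"] not in tags:                # ON CREATE / ON MATCH: add new tags only
            tags.append(row["POS"])
    for a, b in pairs_min([row["text"] for row in rows]):
        edges.add((a, b))                         # MERGE (node1)-[:NEXT]->(node2)


# Illustrative POS tags (assumptions, not actual spaCy output)
first = [{"text": "The", "POS": "DET"}, {"text": "wild", "POS": "NOUN"},
         {"text": "is", "POS": "AUX"}, {"text": "dangerous", "POS": "ADJ"}]
second = [{"text": "The", "POS": "DET"}, {"text": "rockstar", "POS": "NOUN"},
          {"text": "is", "POS": "AUX"}, {"text": "wild", "POS": "ADJ"}]

graph = ({}, set())
run_query(graph, first)
run_query(graph, second)
nodes, edges = graph
print(nodes["wild"])  # ['NOUN', 'ADJ']
print(len(nodes))     # 5
```

Each word ends up as a single entry, and its tags accumulate across both sentences, which is the behavior requested earlier in the thread.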
06-08-2022 02:09 PM - edited 06-08-2022 02:09 PM
Thank you so much; the solution works for my use case. I appreciate your help, @paltusplintus.