
Self_directed relationship

Reuben
Graph Buddy

I have a tricky situation and would like some guidance on it. I have a CSV file in which two columns share the same name, because each row relates one solar grade to another. However, when I build the graph, I get relationships that point back to the same node, which I did not expect since both columns are merged under the same node label. What am I doing wrong? Why do some nodes get a relationship to themselves, as illustrated below?


def similar(tx,path):
    
    tx.run (
    
    "LOAD CSV WITH HEADERS FROM $path AS row "
    "MERGE (sgd:Solar_grade {name: row.Solar_grade}) "
    "MERGE (sgd1:Solar_grade {name : row.Solar_grade}) "
    "MERGE (sgd)-[:isSimilar_Same {relationship:row.Relationship}]->(sgd1) "
 
    , path=path
    )

illus.jpg

Brief Data from Table

Solar Grade    Relationship    Solar Grade
AC03           isSimilarTo     BC01D
BC04           isSimilarTo     BC01D
DC05           isSimilarTo     BC01D
EC06           isSimilarTo     BC01D
F08            isSameAs        BC01D
G08F           isSameAs        BC01D
HR22           isSameAs        BC01D
I7PH           isSameAs        l17-7XC
SJ             isSameAs        l17-7XC
KSUS           isSameAs        l17-7XC
AISMI          isSimilarTo     l17-7XC
AISIN          isSimilarTo     l17-7XC
PAISI          isSimilarTo     l17-7XC
QI7PH          isSimilarTo     l17-7XC
R0OH           isSimilarTo     l17-7XC

 

#Cypher #CSV #python #Neo4j

2 REPLIES

glilienfield
Ninja

That is expected behavior, as the sgd and sgd1 aliases reference the same node. As such, the relationship created will relate back to the same node. The cause of your confusion is that you think you are using both "Solar Grade" columns. What is actually happening is that your query only uses the value of "Solar Grade" from the second column. This occurs because "load csv with headers" creates a map for each row, with the column names as the map keys. Since you have a duplicate column name, the first value gets replaced when the second value is added to the row's map.
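The overwrite can be reproduced outside Neo4j. The sketch below is only an analogy in plain Python: it uses `csv.DictReader`, which also builds one dictionary per row keyed by header name. The sample rows come from the table in the question, and the header spelling `Solar_grade` is an assumption taken from the query. The helper name `rows_as_maps` is made up for this illustration.

```python
import csv
import io

# Two sample rows from the question, with the duplicated header name.
# The header spelling "Solar_grade" is an assumption taken from the query.
SAMPLE = (
    "Solar_grade,Relationship,Solar_grade\n"
    "AC03,isSimilarTo,BC01D\n"
    "I7PH,isSameAs,l17-7XC\n"
)


def rows_as_maps(text):
    """Read CSV text into one dict per row, keyed by header name."""
    return list(csv.DictReader(io.StringIO(text)))


rows = rows_as_maps(SAMPLE)
for row in rows:
    # Only two keys survive per row: the value from the second
    # "Solar_grade" column has replaced the value from the first.
    print(row)
```

Each printed dictionary has only two keys, so the first-column value (AC03, I7PH) is no longer reachable by name, which matches the self-relationships seen in the graph.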

If you don't want to change the column names, you can avoid this by not using 'with headers'. Each row is then a list of values instead of a map, and you access the columns by index, such as row[0], row[1], and row[2]. You will also need to skip the first row in the file, since your file has header data in the first row.

You can see the behavior with the example below:

Screen Shot 2023-02-08 at 11.40.19 PM.png

Screen Shot 2023-02-08 at 11.38.40 PM.png

Screen Shot 2023-02-08 at 11.39.00 PM.png

Try this instead:

 


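def similar(tx, path):
    # Without WITH HEADERS each row is a list, so both "Solar Grade"
    # columns stay available by position.
    tx.run(
        "LOAD CSV FROM $path AS row "
        "WITH row SKIP 1 "  # skip the header row
        "MERGE (sgd:Solar_grade {name: row[0]}) "
        "MERGE (sgd1:Solar_grade {name: row[2]}) "
        "MERGE (sgd)-[:isSimilar_Same {relationship: row[1]}]->(sgd1) ",
        path=path,
    )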
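As a quick sanity check before loading, the same positional access can be sketched in plain Python. This only illustrates what row[0], row[1], and row[2] would hold for each data row, assuming the file has exactly three columns in the order shown in the question. The helper name `edges_from_csv` is hypothetical, the sample rows are copied from the question's table, and nothing here talks to Neo4j.

```python
import csv
import io

# Sample rows copied from the question; the header spelling is assumed.
SAMPLE = (
    "Solar_grade,Relationship,Solar_grade\n"
    "AC03,isSimilarTo,BC01D\n"
    "F08,isSameAs,BC01D\n"
    "I7PH,isSameAs,l17-7XC\n"
)


def edges_from_csv(text):
    """Return (source, relationship, target) tuples, skipping the header row."""
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # skip the header row, like "WITH row SKIP 1"
    # Ignore short or blank lines so positional access cannot fail.
    return [(row[0], row[1], row[2]) for row in reader if len(row) >= 3]


edges = edges_from_csv(SAMPLE)
for source, rel, target in edges:
    print(f"({source})-[{rel}]->({target})")
```

If the printed triples do not match the expected source, relationship, and target, the column order in the file differs from the indexes used in the query.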

 

Reuben
Graph Buddy

@glilienfield thanks for the education on the map keys. I tried the approach you illustrated. However, instead of creating a relationship from A-[rel]->B, it creates nodes from the relationship column, row[1], and connects them to each other, while row[0] and row[2] are not connected.

graph.png