Neo4j

Reuben · 02-08-2023

I'm in a tricky situation and would like some guidance. I have a CSV file with two columns that share the same name because of the relationship between them. However, when I build the graph, some relationships point back to the same node, which I don't think should happen since I merged their node labels. What am I doing wrong? Why do some nodes get a relationship back to themselves, as illustrated below:


def similar(tx,path):
    
    tx.run (
    
    "LOAD CSV WITH HEADERS FROM $path AS row "
    "MERGE (sgd:Solar_grade {name: row.Solar_grade}) "
    "MERGE (sgd1:Solar_grade {name : row.Solar_grade}) "
    "MERGE (sgd)-[:isSimilar_Same {relationship:row.Relationship}]->(sgd1) "
 
    , path=path
    )

Brief Data from Table

Solar Grade	Relationship	Solar Grade
AC03	isSimilarTo	BC01D
BC04	isSimilarTo	BC01D
DC05	isSimilarTo	BC01D
EC06	isSimilarTo	BC01D
F08	isSameAs	BC01D
G08F	isSameAs	BC01D
HR22	isSameAs	BC01D
I7PH	isSameAs	l17-7XC
SJ	isSameAs	l17-7XC
KSUS	isSameAs	l17-7XC
AISMI	isSimilarTo	l17-7XC
AISIN	isSimilarTo	l17-7XC
PAISI	isSimilarTo	l17-7XC
QI7PH	isSimilarTo	l17-7XC
R0OH	isSimilarTo	l17-7XC

#Cypher #CSV #python #Neo4j

glilienfield · 02-08-2023

That is expected behavior, as the sgd and sgd1 aliases reference the same node. As such, the relationship created will relate back to the same node. The cause of your confusion is that you think you are using both "Solar Grade" columns. What is actually happening is that your query only uses the value of "Solar Grade" from the second column. This occurs because "load csv with headers" creates a map for each row with the column names as the map keys. Since you have a duplicate column name, the first value gets replaced when the second value is added to the row's map.

If you don't want to change the column names, you can avoid this by not using 'with headers'. This will then create an array of values for each row instead of a map. You then access the columns using an index, such as row[0], row[1], and row[2]. You will also need to skip the first row in the file since your file has header data in the first row.

You can see the behavior with the example below:
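As a rough analogy only (this is plain Python, not Neo4j), the standard library's csv.DictReader also builds one map per row keyed by header name, so it shows the same overwrite when a header is repeated. The sketch below assumes a comma-delimited file and uses two rows from the table above; SAMPLE, rows_as_maps, and rows_as_lists are illustrative names, not part of any Neo4j API.

```python
import csv
import io

# Two rows from the table above, assuming a comma-delimited file with the
# duplicated "Solar Grade" header (illustrative data holder, not a real file).
SAMPLE = (
    "Solar Grade,Relationship,Solar Grade\n"
    "AC03,isSimilarTo,BC01D\n"
    "F08,isSameAs,BC01D\n"
)

def rows_as_maps(text):
    """Key each row by header name; a repeated header keeps only the last value."""
    return [dict(r) for r in csv.DictReader(io.StringIO(text))]

def rows_as_lists(text):
    """Read each row positionally and drop the header row, so no column is lost."""
    return list(csv.reader(io.StringIO(text)))[1:]

maps = rows_as_maps(SAMPLE)
lists = rows_as_lists(SAMPLE)

print(maps[0])   # only one "Solar Grade" key survives, holding the second column
print(lists[0])  # all three values remain, reachable as [0], [1], [2]
```

With the map form, both MERGE clauses in the original query see the same name, which is why the relationship loops back to one node. With the list form, index 0 and index 2 stay distinct.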

Try this instead:


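def similar(tx, path):
    # Read rows positionally (no WITH HEADERS) so both "Solar Grade" columns survive.
    tx.run(
        "LOAD CSV FROM $path AS row "
        "WITH row SKIP 1 "  # skip the header row; Cypher variables are case-sensitive
        "MERGE (sgd:Solar_grade {name: row[0]}) "
        "MERGE (sgd1:Solar_grade {name: row[2]}) "
        "MERGE (sgd)-[:isSimilar_Same {relationship: row[1]}]->(sgd1) ",
        path=path,
    )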

Reuben · 02-08-2023

@glilienfield thanks for the education on the map keys. I tried the approach you illustrated. However, instead of creating a relationship of the form A-[rel]->B, it creates nodes from the relationship column (row[1]) of the table and connects them to each other, while row[0] and row[2] are not connected.

Self_directed relationship