09-26-2022 08:46 PM
I am creating a kind of tree diagram of the columns only, from a CSV file. The code I have written so far works, but I am looking for a shorter way to do the same thing. Any suggestions?
09-27-2022 12:53 PM
The one comment I have is that the data warehouse node and its sub-categories are created again for each row in the CSV file. They do not change from row to row, since they are not a function of the row data. As such, that part can be moved above the 'load csv', so it is only executed once.
The rest of the query from line 13 on is correct for an import such as yours.
// Create the warehouse node and its column-category nodes once, before reading the CSV
MERGE (d:Dataframe {name: "Data Ware House"})
MERGE (s1:Company {companyId: "Company Id"})
MERGE (s2:Cname {companyname: "Company Name"})
MERGE (s3:Clocation {companylocation: "Company Location"})
MERGE (s4:Cemail {companyemail: "Company email"})
MERGE (s5:Cbusinesstype {businesstype: "Company Business Type"})
MERGE (s1)-[r1:is_in]->(d)
MERGE (s2)-[r2:is_in]->(d)
MERGE (s3)-[r3:is_in]->(d)
MERGE (s4)-[r4:is_in]->(d)
MERGE (s5)-[r5:is_in]->(d)
WITH d, s1, s2, s3, s4, s5
LOAD CSV WITH HEADERS FROM 'file:///company.csv' AS row
WITH row, d, s1, s2, s3, s4, s5
WHERE row.Id IS NOT NULL AND row.Location IS NOT NULL
// Per row: merge one value node per column and attach it to its category node
MERGE (ci:company {companyId: row.Id})
MERGE (ci)-[r1ci:available_in]->(s1)
MERGE (cn:cname {companyname: row.Name})
MERGE (cn)-[r2cn:available_in]->(s2)
MERGE (cL:cLocation {companyLocation: row.Location})
MERGE (cL)-[r3cL:available_in]->(s3)
MERGE (cE:cEmail {companyEmail: row.Email})
MERGE (cE)-[r4cE:available_in]->(s4)
// Note: this line reuses row.Email as posted; it presumably should read the business-type column instead
MERGE (ct:ctype {companyEmail: row.Email})
MERGE (ct)-[r5ct:available_in]->(s5)
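If the goal is mainly fewer lines, the static part above could probably be condensed along the following lines. This is only a sketch under assumptions: it requires the APOC plugin (for `apoc.merge.node`, since plain Cypher `MERGE` cannot take a label from a variable in older versions), and the list-of-maps layout and the `col` and `node` names are illustrative choices, not part of the original query.

```cypher
// Sketch only: assumes the APOC plugin is installed.
MERGE (d:Dataframe {name: "Data Ware House"})
WITH d
// One map per category node: its label and its identifying property
UNWIND [
  {label: "Company",       props: {companyId: "Company Id"}},
  {label: "Cname",         props: {companyname: "Company Name"}},
  {label: "Clocation",     props: {companylocation: "Company Location"}},
  {label: "Cemail",        props: {companyemail: "Company email"}},
  {label: "Cbusinesstype", props: {businesstype: "Company Business Type"}}
] AS col
// apoc.merge.node(labels, identProps) merges a node with a label chosen at runtime
CALL apoc.merge.node([col.label], col.props) YIELD node
MERGE (node)-[:is_in]->(d)
RETURN count(node) AS categoriesLinked
```

The trade-off is that the explicit version keeps `s1` to `s5` bound as variables for the per-row part of the import, whereas here the category nodes would have to be matched again afterwards.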
09-27-2022 05:01 PM
I got it, Thank You.
09-27-2022 05:50 PM
Please permit me to ask another question.
(1) Assuming I have more than 100 columns and there are many null values in each column, do I always need to write "where row.(item) is not null" for each column, or is there a method similar to apply or applymap for dataframes?
(2) What causes a node to show no name even though one was assigned? I am referring to the example above.
Once again Thank you @glilienfield
09-27-2022 06:23 PM
1. Do you want to skip the entire row if any of the row properties are null? If so, you can use a predicate like this:
where all(i in keys(row) where row[i] is not null)
2. Another option is to use the coalesce() function to replace a null value with a default value. https://neo4j.com/docs/cypher-manual/current/functions/scalar/#functions-coalesce
3. You can use the APOC 'do' family of procedures (for example apoc.do.when) to conditionally execute Cypher statements. In your case, you could check whether the property is null and only set it if it is not.
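As a rough sketch of options 2 and 3 applied to your import: this assumes the same company.csv columns (Id, Name, Email) used earlier in the thread, that the APOC plugin is installed for option 3, and the default value "unknown" and the `email` parameter name are only illustrative. Both options matter because MERGE raises an error when a property in its pattern is null.

```cypher
LOAD CSV WITH HEADERS FROM 'file:///company.csv' AS row
WITH row
WHERE row.Id IS NOT NULL
MERGE (ci:company {companyId: row.Id})
// Option 2: coalesce() substitutes a default, so MERGE never receives a null
MERGE (cn:cname {companyname: coalesce(row.Name, "unknown")})
WITH row
// Option 3: apoc.do.when runs the first query only when the condition is true;
// the empty string means "do nothing" when Email is null
CALL apoc.do.when(
  row.Email IS NOT NULL,
  'MERGE (cE:cEmail {companyEmail: $email}) RETURN cE',
  '',
  {email: row.Email}
) YIELD value
RETURN count(*) AS rowsProcessed
```

Note that option 2 funnels every missing name into a single "unknown" node, which may or may not be what you want; option 3 simply creates nothing for a missing value.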
09-28-2022 12:12 AM
Thanks for the clarification! With regard to the second question, I have attached this image, which comes from the query in my earlier question. Although I assigned names to the nodes, not all of the names were displayed. What might I have missed or done wrong?
09-28-2022 06:16 AM
That looks like it is your s1 (companyId) node. You can select which property is displayed, and the selection is made per node label. Click on the node, and you should get its details shown on the right. Click on the node's label shown there, and you will be presented with a small pop-up window. At the bottom of it, you can select which property gets displayed.
09-28-2022 06:22 AM
Thank you for the help!