Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
06-13-2021 04:54 AM
Hi all,
I have a dataset for ratings , that has ratings from 1 to 5, with different count of ratings on each ratings.
Sample file -
sample.txt (102 Bytes)
Product 1 2 3 4 5 Total
Toyota 10 15 22 40 55 142
BMW 9 91 19 12 11 142
Audi 10 9 15 20 95 149
I have individual nodes created for ratings 1 to 5.
I want to create relationships as
Node = Toyota Connected to Node = Ratings (1) with property of rating count as 10.
Node = Toyota Connected to Node = Ratings (2) with property of rating count as 15.
Node = Toyota Connected to Node = Ratings (2) with property of rating count as 22.
i,e
(Car{name:Toyota})-[:HAS_RATINGS(count:10)]->(Ratings{id:1})
Currently I am manually hardcoded the values of ratings per columns using MATCH car and MATCH ratings, which results in 5 sets of 3 MATCH statements, totalling 15 MATCH statements. The original dataset has 30,000 rows, and it takes a lot of time to load the data.
Any help is appreciated.
06-13-2021 01:50 PM
I've copied your file here.
<NEO4J_HOME>/import/sample.txt
This is my Cypher.
I added a backtick where the item name starts with a number.
Ratings are CREATED for each car, but if they are different, please change them.
LOAD CSV WITH HEADERS FROM 'file:///sample.txt' AS line
CREATE (c:Car {name: line.Product})
CREATE (r1:Ratings {id: 1})
CREATE (r2:Ratings {id: 2})
CREATE (r3:Ratings {id: 3})
CREATE (r4:Ratings {id: 4})
CREATE (r5:Ratings {id: 5})
CREATE (c)-[:HAS_RATINGS {count:line.`1`}]->(r1)
CREATE (c)-[:HAS_RATINGS {count:line.`2`}]->(r2)
CREATE (c)-[:HAS_RATINGS {count:line.`3`}]->(r3)
CREATE (c)-[:HAS_RATINGS {count:line.`4`}]->(r4)
CREATE (c)-[:HAS_RATINGS {count:line.`5`}]->(r5)
06-13-2021 05:02 PM
Thanks, but this will work for small dataset, not for huge dataset with 100K rows. The individual CREATE relationships statements has to run 5 * 100K individually to create.
I would like to iterate 1 entire row, and based on the column name, it should create the relationship with the count as property.
06-13-2021 06:31 PM
How about this one
With either Cypher, you can use the huge data as it is by simply changing the sample.txt file.
CREATE (:Ratings {id: 1}),
(:Ratings {id: 2}),
(:Ratings {id: 3}),
(:Ratings {id: 4}),
(:Ratings {id: 5});
CREATE INDEX index_rating_id FOR (n:Ratings) ON (n.id);
LOAD CSV WITH HEADERS FROM 'file:///sample.txt' AS line
MATCH (r1:Ratings {id: 1}),
(r2:Ratings {id: 2}),
(r3:Ratings {id: 3}),
(r4:Ratings {id: 4}),
(r5:Ratings {id: 5})
CREATE (c:Car {name: line.Product})
CREATE (c)-[:HAS_RATINGS {count:line.`1`}]->(r1),
(c)-[:HAS_RATINGS {count:line.`2`}]->(r2),
(c)-[:HAS_RATINGS {count:line.`3`}]->(r3),
(c)-[:HAS_RATINGS {count:line.`4`}]->(r4),
(c)-[:HAS_RATINGS {count:line.`5`}]->(r5);
However, as a graph design, it is not a good idea to have a large number of relations on a single Rating node.
06-13-2021 11:40 PM
is there any apoc procs to build the relationship more effectively ?
06-14-2021 12:25 AM
Here is my take on the data model:
1. To keep the number of nodes to minimum, you can create one node for each row. Here is the Cypher:
LOAD CSV WITH HEADERS FROM 'file:///amelia.csv' AS line
MERGE (p:Product {name:"Product"})
MERGE (c:Car {name: line.Product, rating1:toInteger(line.`1`), rating2:toInteger(line.`2`), rating3:toInteger(line.`3`), rating4:toInteger(line.`4`), rating5:toInteger(line.`5`), total: toInteger(line.Total)})
MERGE (p)-[:PRODUCT_WITH_RATINGS]->(c)
Result:
2. A slight modification on koji's response:
LOAD CSV WITH HEADERS FROM 'file:///amelia.csv' AS line
MERGE (p:Product {name:"Product"})
MERGE (c:Car {name: line.Product})
CREATE (r:TotalRatings {total: toInteger(line.Total)})
MERGE (p)-[:PRODUCT]->(c)
CREATE (c)-[:TOTAL_RATINGS]->(r)
CREATE (r1:Ratings {id: 1})
CREATE (r2:Ratings {id: 2})
CREATE (r3:Ratings {id: 3})
CREATE (r4:Ratings {id: 4})
CREATE (r5:Ratings {id: 5})
CREATE (r)-[:RATING1 {count:toInteger(line.`1`)}]->(r1)
CREATE (r)-[:RATINGS2 {count:toInteger(line.`2`)}]->(r2)
CREATE (r)-[:RATINGS3 {count:toInteger(line.`3`)}]->(r3)
CREATE (r)-[:RATINGS4 {count:toInteger(line.`4`)}]->(r4)
CREATE (r)-[:RATINGS5 {count:toInteger(line.`5`)}]->(r5)
Result:
All the sessions of the conference are now available online