Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
12-20-2021 06:19 AM
I'm expanding my horizon and currently fiddling around in Neo4j. Followed a few courses on Udemy and I thought I would master it enough to load a custom dataset 🙂
I want to load a PlayerUnknown's Battlegrounds dataset, the files are really big, I made a small version of it: Source: STACK
To show how the dataset looks like:
LOAD CSV WITH HEADERS FROM 'file:///kill_match_stats_smalll_batch.csv' AS row
WITH row.killed_by AS MurderWeapon, row.killer_name AS Murderer, toInteger(row.killer_placement) AS RankMurderer,
row.killer_position_x AS MurderPositionX, row.killer_position_y AS MurderPositionY,
row.map AS Map, row.match_id AS MatchID, toInteger(row.time) AS TimeOfDeathSec,
row.victim_name AS Victim, toInteger(row.victim_placement) AS RankVictim,
row.victim_position_x AS VictimPositionX, row.victim_position_y AS VictimPositionY
RETURN MurderWeapon, Murderer, RankMurderer, MurderPositionX, MurderPositionY, Map, MatchID, TimeOfDeathSec, Victim, RankVictim, VictimPositionX, VictimPositionY
LIMIT 5;
Sometimes a players kills himself or falls to his death. The killer has in this scenario null.
To work with this I thought to replace it with system- and the reason why it killed.
So I create first the nodes, Player
// Add constraint
CREATE CONSTRAINT ON (p:Player) ASSERT p.name IS UNIQUE
// Create nodes:
LOAD CSV WITH HEADERS FROM 'file:///kill_match_stats_smalll_batch.csv' AS row
WITH row
MERGE (:Player{name:
CASE WHEN row.killer_name IS NOT NULL
THEN row.killer_name
ELSE 'System-' + row.killed_by END
})
MERGE (:Player{name:
CASE WHEN row.victim_name IS NOT NULL
THEN row.victim_name
ELSE 'System-' + row.killed_by END
})
Now I have the players I want to created the edges, who killed who.
// Create edges:
LOAD CSV WITH HEADERS FROM 'file:///kill_match_stats_smalll_batch.csv' AS row
WITH row
MERGE (Player)-[killed:Killed{
`Killed With`:row.killed_by,
`KillerX`:
CASE WHEN row.killer_position_x IS NOT NULL
THEN row.killer_position_x
ELSE '0' END,
`KillerY`:
CASE WHEN row.killer_position_y IS NOT NULL
THEN row.killer_position_y
ELSE '0' END,
`Map`:row.map,
`MatchID`:row.match_id,
`Time of Death`:row.time,
`VictimX`:
CASE WHEN row.victim_position_x IS NOT NULL
THEN row.victim_position_x
ELSE '0' END,
`VictimY`:
CASE WHEN row.victim_position_y IS NOT NULL
THEN row.victim_position_y
ELSE '0' END
}]->(Player)
;
I tried this code but then the relationship points to itself. Does my relationship provide not enough information? Do I need to add the killer and victim name as well in the relationship. But if I do so I'm storing data multiple times.
I have the feeling I'm overlooking something. Do I first need to create the relationships and after the import match them?
12-20-2021 07:17 AM
The problem into your query is that the start node and the end node are the same thing.
That is, you execute:
MERGE (Player)-[killed:Killed{ ...... }]->(Player)
but without any additional information, Neo4j consider Player
the same thing,
namely a node without labels and properties.
You could join your 2 query, in this way (note that now (Player)-[killed:Killed{ ...... }]->(Player)
became (player1)-[killed:Killed{ ...... }]->(player2)
😞
LOAD CSV WITH HEADERS FROM 'file:///kill_match_stats_smalll_batch.csv' AS row
WITH row
MERGE (player1:Player{name:
CASE WHEN row.killer_name IS NOT NULL
THEN row.killer_name
ELSE 'System-' + row.killed_by END
})
MERGE (player2:Player{name:
CASE WHEN row.victim_name IS NOT NULL
THEN row.victim_name
ELSE 'System-' + row.killed_by END
})
// HERE WE RETRIEVE START AND END NODES
// RESPECTIVELY WITH VARIABLE NAMES player1 AND player2
WITH player1, player2, row
MERGE (player1)-[killed:Killed{
`Killed With`:row.killed_by,
`KillerX`:
CASE WHEN row.killer_position_x IS NOT NULL
THEN row.killer_position_x
ELSE '0' END,
`KillerY`:
CASE WHEN row.killer_position_y IS NOT NULL
THEN row.killer_position_y
ELSE '0' END,
`Map`:row.map,
`MatchID`:row.match_id,
`Time of Death`:row.time,
`VictimX`:
CASE WHEN row.victim_position_x IS NOT NULL
THEN row.victim_position_x
ELSE '0' END,
`VictimY`:
CASE WHEN row.victim_position_y IS NOT NULL
THEN row.victim_position_y
ELSE '0' END
}]->(player2)
Or even, if you want to split nodes and relationships,
you could change only the 2nd query (same as above but with MATCH instead of MERGE, so that we search the nodes before, and then we create the relationships)
LOAD CSV WITH HEADERS FROM 'file:///kill_match_stats_smalll_batch.csv' AS row
WITH row
MATCH (player1:Player{name:
CASE WHEN row.killer_name IS NOT NULL
THEN row.killer_name
ELSE 'System-' + row.killed_by END
})
MATCH (player2:Player{name:
CASE WHEN row.victim_name IS NOT NULL
THEN row.victim_name
ELSE 'System-' + row.killed_by END
})
WITH player1, player2, row
MERGE (player1)-[killed:Killed{
`Killed With`:row.killed_by,
`KillerX`:
CASE WHEN row.killer_position_x IS NOT NULL
THEN row.killer_position_x
ELSE '0' END,
`KillerY`:
CASE WHEN row.killer_position_y IS NOT NULL
THEN row.killer_position_y
ELSE '0' END,
`Map`:row.map,
`MatchID`:row.match_id,
`Time of Death`:row.time,
`VictimX`:
CASE WHEN row.victim_position_x IS NOT NULL
THEN row.victim_position_x
ELSE '0' END,
`VictimY`:
CASE WHEN row.victim_position_y IS NOT NULL
THEN row.victim_position_y
ELSE '0' END
}]->(player2)
All the sessions of the conference are now available online