Is it possible to write a query to partition the graph

lingvisa

I want to split the graph into two sets, train and test. Each node should get one additional label, either 'train' or 'test'. For example, 1/3 of the nodes go to 'test' and 2/3 go to 'train'. Is it possible to do this with a Cypher query in Neo4j?

MATCH (n)
WITH count(n) as total
UNWIND range(1,total) as range
WITH toInteger(rand() * (total - 1 + 1)) + 1 as randomID
WHERE ID(n) = randomID
SET n:Test

It should be something like this. The nodes need to be divided into test and train sets in a 1:2 ratio.

1 ACCEPTED SOLUTION

We can do this, but it will be a little easier if we leverage APOC Procedures so we can add labels dynamically.

MATCH (n)
WITH n, rand() * 3 as random
WITH n, CASE WHEN random < 1 THEN 'Test' ELSE 'Train' END as label
CALL apoc.create.addLabels([n], [label]) YIELD node
RETURN count(*)
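
If APOC is not installed, the same random assignment can probably be done in plain Cypher. The following is only a sketch, not part of the accepted answer: it assumes the common idiom of using FOREACH over a one-element or empty list as a conditional SET, and the variable names r and ignored are arbitrary. As with the query above, each node lands in Test with probability 1/3, so the split is 1:2 in expectation rather than exactly.

```cypher
// Sketch without APOC: about 1/3 of nodes get :Test, the rest get :Train
MATCH (n)
WITH n, rand() AS r   // one random draw per node, in [0, 1)
// FOREACH over [1] runs the SET once; over [] it does nothing
FOREACH (ignored IN CASE WHEN r < 1.0 / 3 THEN [1] ELSE [] END | SET n:Test)
FOREACH (ignored IN CASE WHEN r >= 1.0 / 3 THEN [1] ELSE [] END | SET n:Train)
RETURN count(n) AS labeled
```

Note that running either query a second time draws new random numbers, so a node that already has one label could end up with both unless the old labels are removed first.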

That's really a smart way to do this task!
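
One caveat on the accepted solution: because every node is assigned independently with rand(), the test set is one third of the graph only on average. If an exact 1:2 split matters, a possible variation is to shuffle the nodes and slice the shuffled list. This is a sketch under assumptions: it collects every node into a single list, so it is only reasonable for graphs that fit comfortably in memory, and the names nodes, testSize and x are made up for illustration.

```cypher
// Sketch: exact split, test set = floor(total / 3) nodes
MATCH (n)
WITH n, rand() AS r
ORDER BY r                                  // random shuffle of all nodes
WITH collect(n) AS nodes                    // collect keeps the shuffled order
WITH nodes, size(nodes) / 3 AS testSize     // integer division
FOREACH (x IN nodes[..testSize] | SET x:Test)
FOREACH (x IN nodes[testSize..] | SET x:Train)
RETURN testSize, size(nodes) - testSize AS trainSize
```

This variant uses only built-in Cypher, since the labels are fixed in the query text rather than chosen dynamically.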