01-23-2020 01:22 PM
I have created a graph with 3 paths as below.
Data:
a,b
b,c
d,e
f,e
g,h
Graph
a->b->c
d->e<-f
g->h
Desired output
a,uuid_1
b,uuid_1
c,uuid_1
d,uuid_2
e,uuid_2
f,uuid_2
g,uuid_3
h,uuid_3
Note : I have 50 million nodes.
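For anyone who wants to reproduce the question on a small scale, the sample data above could be created with something like the following. This is only a sketch: the `Person` label, the `id` property and the `KNOWS` relationship type are assumptions borrowed from the naming suggested later in this thread, not the poster's actual schema.

```cypher
// Sample graph from the question: three disconnected groups.
// Label, property and relationship names are assumed for illustration.
CREATE (a:Person {id: 'a'}), (b:Person {id: 'b'}), (c:Person {id: 'c'}),
       (d:Person {id: 'd'}), (e:Person {id: 'e'}), (f:Person {id: 'f'}),
       (g:Person {id: 'g'}), (h:Person {id: 'h'}),
       (a)-[:KNOWS]->(b), (b)-[:KNOWS]->(c),
       (d)-[:KNOWS]->(e), (f)-[:KNOWS]->(e),
       (g)-[:KNOWS]->(h);
```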
01-23-2020 01:39 PM
Wouldn't g and h return just 78, in keeping with the pattern? You have no 'node 9' from what I can see.
Aside from that, what is the problem you're actually trying to solve? There may be simpler options.
01-23-2020 02:17 PM
Actually, the second column in the desired output can be any unique ID (such as a UUID); it is not related to the node ID.
I just want to assign a unique ID to each separate graph created in the database so I can export the result to CSV.
01-23-2020 03:21 PM
Assigning an id to each node as you create them and ensuring the ids are unique would be straightforward:
// create a unique constraint
CREATE CONSTRAINT unique_id on (n:Node) ASSERT n.id IS UNIQUE
// set id as you create nodes
CREATE (n:Node) set n.id = 1
But I still get the feeling that's not what you're looking for. Can you try rephrasing your question?
01-23-2020 03:37 PM
I want to assign a unique ID at the group level, not at the node level. Here, 3 groups have been created.
Each group will have its own unique ID, and every node in a group will share the unique ID of that group. It can be added as a new property on the nodes.
I have made some changes to my desired output.
Thanks in advance.
01-23-2020 04:34 PM
With 50 million nodes, there must be a large number of groups too I guess?
What defines a group? I mean, what is the underlying logic you use to turn a->b->c into group 1?
Do you have group nodes with properties besides an id, or is it just a way of encapsulating the path a->b->c?
01-23-2020 05:45 PM
With 50 million nodes, there must be a large number of groups too I guess?
yes
What defines a group? I mean, what is the underlying logic you use to turn a->b->c into group 1?
yes
Do you have group nodes with properties besides an id, or is it just a way of encapsulating the path a->b->c?
What do you mean by group nodes?
01-23-2020 06:28 PM
Are there any (:Group) nodes in your graph? What defines a group from an outside perspective?
You could for example have (g:Group) where g.uuid = 1, and relate nodes (a), (b) and (c) to (g) somehow:
(a)-[:BELONGS_TO]->(g)
or alternatively attach a uuid property to the relationships you already have:
(a)-[:IS_GROUPED_WITH {uuid:1}]->(b)-[:IS_GROUPED_WITH {uuid:1}]->(c)
but it really comes down to how you plan out your data model.
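As a rough illustration of the first option, a dedicated group node could look like the sketch below. The `name` property and the sample values are hypothetical; only the `Group` label, the `uuid` property and the `BELONGS_TO` relationship come from the description above.

```cypher
// Hypothetical sample: three nodes attached to one group node.
CREATE (g:Group {uuid: 1}),
       (a:Node {name: 'a'})-[:BELONGS_TO]->(g),
       (b:Node {name: 'b'})-[:BELONGS_TO]->(g),
       (c:Node {name: 'c'})-[:BELONGS_TO]->(g);

// List each node together with the id of the group it belongs to.
MATCH (n:Node)-[:BELONGS_TO]->(g:Group)
RETURN n.name AS node, g.uuid AS group_id
ORDER BY node;
```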
01-23-2020 07:31 PM
I have only one type of node.
My Logic:
USING PERIODIC COMMIT 5000
LOAD CSV WITH HEADERS
FROM 'file:///headers.csv' as line
MERGE (per1:person1 {person1: line.p1})
MERGE (per2:person1 {person1: line.p2})
CREATE (per1)-[:knows]->(per2)
01-23-2020 08:20 PM
Your label and property names are a bit confusing: a 'person1' property on a 'person1' node will be hard to manage. You'll also probably have an easier time if you use the Neo4j naming conventions (node labels begin with an uppercase letter, relationship types are all uppercase):
MERGE (p1:Person {id: line.p1})
MERGE (p2:Person {id: line.p2})
CREATE (p1)-[:KNOWS]->(p2)
But to try and solve your original issue - it looks like each "group" of people appears on 1 line from your CSV? If that's the case, and there is a property on each line to indicate the group number (e.g. p0), you could do:
MERGE (p1:Person {id: line.p1, groupId: line.p0})
MERGE (p2:Person {id: line.p2, groupId: line.p0})
CREATE (p1)-[:KNOWS]->(p2)
If there is no group id available, you could use apoc.load.csv to get a unique line number for each row of your csv, and make that the stand-in group id:
CALL apoc.load.csv('file:///headers.csv')
YIELD lineNo, map as line
MERGE (p1:Person {id: line.p1, groupId: lineNo})
MERGE (p2:Person {id: line.p2, groupId: lineNo})
CREATE (p1)-[:KNOWS]->(p2)
01-24-2020 04:18 AM
Thanks a lot for the solution!
Actually, I have already loaded the file, including the relationships, using the neo4j import tool.
Now I just want to export the data with a group id, as below, in an optimized way.
node,group_id
a,uuid_1
b,uuid_1
c,uuid_1
d,uuid_2
e,uuid_2
f,uuid_2
g,uuid_3
h,uuid_3
The logic you are suggesting would take too long to load.
01-26-2020 02:42 PM
If all of your nodes and relationships already exist in the graph, and you have no existing value for the group ids (they just need to be unique), you can use the apoc.path.subgraphNodes procedure to identify each unique cluster, and then label its nodes with a randomly generated UUID (through the apoc.create.uuid function) to indicate their group:
match (p:Person) where p.groupId is null
with p, apoc.create.uuid() as newGroupId
call apoc.path.subgraphNodes(p, {relationshipFilter:"KNOWS", labelFilter:"Person"}) yield node as sibling
set p.groupId = newGroupId, sibling.groupId = newGroupId
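Once every node carries a groupId, the export asked for earlier could be done along these lines. This is a sketch under a few assumptions: the file name `groups.csv` is made up, the `id` property follows the renamed model suggested above rather than the poster's original `person1` property, and file export has to be enabled for APOC (`apoc.export.file.enabled=true` in neo4j.conf).

```cypher
// Write one row per node: its id and the id of the group it belongs to.
// 'groups.csv' is a hypothetical file name.
CALL apoc.export.csv.query(
  'MATCH (p:Person) RETURN p.id AS node, p.groupId AS group_id',
  'groups.csv',
  {}
);
```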
01-29-2020 03:41 PM
The above solution works fine on a small dataset.
But on a big dataset (50 million nodes) it runs forever.
I can load the CSV again if neo4j has a better option.
Thanks a lot for the reply.
01-29-2020 04:05 PM
Thanks a lot, let me try this one.
02-06-2020 08:04 AM
I really appreciate your help.
I have made a minor change to the query, replacing the uuid with the id of the node.
What is the meaning of the limit clause in the apoc.periodic.commit procedure?
In my case it only runs for the 10000 nodes I passed as the limit size. It is supposed to run for all nodes, in batches of the limit size.
call apoc.periodic.commit(
'match (p:Person) where p.groupId is null with p limit {limit}
call apoc.path.subgraphNodes(p, {relationshipFilter:"KNOWS", labelFilter:"Person"}) yield node as sibling
set p.groupId = id(p), sibling.groupId = id(p)',{limit:10000});
Thanks in advance.
02-06-2020 10:31 AM
Issue resolved..
I just added return clause at the end.
Thanks a lot for the solution.
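The final query was not posted, so the following is only a guess at what the fix looks like. apoc.periodic.commit re-runs the statement in a new transaction until the statement returns 0, so without a RETURN it has no signal to keep going after the first batch. The `count(*)` return value and the `$limit` parameter syntax are assumptions here, not the poster's confirmed query.

```cypher
// Re-run the batch until no ungrouped Person nodes are left.
// RETURN count(*) tells apoc.periodic.commit how many rows the batch touched;
// the loop stops once this reaches 0.
CALL apoc.periodic.commit(
  'MATCH (p:Person) WHERE p.groupId IS NULL
   WITH p LIMIT $limit
   CALL apoc.path.subgraphNodes(p, {relationshipFilter: "KNOWS", labelFilter: "Person"}) YIELD node AS sibling
   SET p.groupId = id(p), sibling.groupId = id(p)
   RETURN count(*)',
  {limit: 10000}
);
```

Because every node in a cluster gets a groupId in the same batch as its start node, clusters that are already labelled are skipped by the `WHERE p.groupId IS NULL` filter on later iterations.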
02-06-2020 02:10 PM
I tried one more thing, and this one also works fine.
But I still need to check the performance of this solution.
CALL algo.unionFind('Person', 'KNOWS',
{write: true,writeProperty: 'groupId'})
yield nodes RETURN nodes
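Whichever approach is used to write groupId, a quick sanity check in plain Cypher might look like this. It is a sketch that assumes the `Person` label, `KNOWS` relationship type and `groupId` property used above.

```cypher
// Largest groups first: shows how many nodes share each groupId.
MATCH (p:Person)
RETURN p.groupId AS groupId, count(*) AS members
ORDER BY members DESC
LIMIT 10;

// Connected nodes should always share a groupId, so this should return 0.
MATCH (a:Person)-[:KNOWS]->(b:Person)
WHERE a.groupId <> b.groupId
RETURN count(*) AS crossGroupRelationships;
```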