11-18-2020 11:24 AM
First, I'm rather new to graphs, so please bear with me.
I want to find groups of people who work for the same groups of companies.
Put differently: which sets of companies share the same employees?
Let me give an example. Below are the people working for several companies.
The final result I want is:
"companies"                            "collect(p.name)"
--------------------------------------------------------------------------------------------------------------------------------
["Company A","Company B"]              ["Kreuk","Snijder","Hordijk","Sepers","Vrijhof","Zwijnenberg"]
["Company A","Company B","Company C"]  ["Kreuk","Snijder","Hordijk","Sepers","Zwijnenberg"]
["Company A","Company C"]              ["Kreuk","Snijder","Hagedoorn","Hordijk","Sepers","Zwijnenberg"]
["Company B","Company C"]              ["Berden","Kreuk","Snijder","Hordijk","Sepers","Zwijnenberg"]
The data set
The code to create this example set is at the bottom.
Best effort
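The query below is my best effort, but the result is not what I want: each person is listed only under the exact set of companies they work for. Everyone who works for Companies A, B and C also works together at Companies A and B, along with Vrijhof, yet only Vrijhof shows up in that row.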
MATCH (c:Company)<-[f1:WORKS_FOR]-(p:Person)
WITH p, apoc.coll.sort(collect(c.name)) AS companies
WHERE SIZE(companies) > 1
RETURN companies, collect(p.name)
ORDER BY companies
"companies"                            "collect(p.name)"
["Company A","Company B"]              ["Vrijhof"]
["Company A","Company B","Company C"]  ["Kreuk","Snijder","Hordijk","Sepers","Zwijnenberg"]
["Company A","Company C"]              ["Hagedoorn"]
["Company B","Company C"]              ["Berden"]
List of the companies each employee works for
MATCH (c:Company)<-[f1:WORKS_FOR]-(p:Person)
return p.name AS name, apoc.coll.sort(collect(c.name)) AS companies
ORDER BY name
"name"         "companies"
---------------------------------------------
"Berden"       ["Company B","Company C"]
"Dobbelaar"    ["Company A"]
"Hagedoorn"    ["Company A","Company C"]
"Hordijk"      ["Company A","Company B","Company C"]
"Kreuk"        ["Company A","Company B","Company C"]
"Sepers"       ["Company A","Company B","Company C"]
"Snijder"      ["Company A","Company B","Company C"]
"Ultee"        ["Company C"]
"Vrijhof"      ["Company A","Company B"]
"Zwijnenberg"  ["Company A","Company B","Company C"]
List of employees for each company
MATCH (c:Company)<-[f1:WORKS_FOR]-(p:Person)
return c.name AS company, apoc.coll.sort(collect(p.name)) AS employee
ORDER BY company
"company" "employee"
"Company A" ["Dobbelaar","Hagedoorn","Hordijk","Kreuk","Sepers","Snijder","Vrijhof","Zwijnenberg"]
"Company B" ["Berden","Hordijk","Kreuk","Sepers","Snijder","Vrijhof","Zwijnenberg"]
"Company C" ["Berden","Hagedoorn","Hordijk","Kreuk","Sepers","Snijder","Ultee","Zwijnenberg"]
CREATE (Kreuk:Person {name: "Kreuk"})
CREATE (Hordijk:Person {name: "Hordijk"})
CREATE (Sepers:Person {name: "Sepers"})
CREATE (Zwijnenberg:Person {name: "Zwijnenberg"})
CREATE (Snijder:Person {name: "Snijder"})
CREATE (Hagedoorn:Person {name: "Hagedoorn"})
CREATE (Vrijhof:Person {name: "Vrijhof"})
CREATE (Berden:Person {name: "Berden"})
CREATE (Ultee:Person {name: "Ultee"})
CREATE (Dobbelaar:Person {name: "Dobbelaar"})
CREATE (companyA:Company {name: "Company A"})
CREATE (companyB:Company {name: "Company B"})
CREATE (companyC:Company {name: "Company C"})
CREATE (Kreuk)-[:WORKS_FOR]->(companyA)
CREATE (Hordijk)-[:WORKS_FOR]->(companyB)
CREATE (Sepers)-[:WORKS_FOR]->(companyC)
CREATE (Zwijnenberg)-[:WORKS_FOR]->(companyA)
CREATE (Snijder)-[:WORKS_FOR]->(companyB)
CREATE (Hagedoorn)-[:WORKS_FOR]->(companyC)
CREATE (Vrijhof)-[:WORKS_FOR]->(companyA)
CREATE (Berden)-[:WORKS_FOR]->(companyB)
CREATE (Ultee)-[:WORKS_FOR]->(companyC)
CREATE (Dobbelaar)-[:WORKS_FOR]->(companyA)
CREATE (Kreuk)-[:WORKS_FOR]->(companyB)
CREATE (Hordijk)-[:WORKS_FOR]->(companyC)
CREATE (Sepers)-[:WORKS_FOR]->(companyA)
CREATE (Zwijnenberg)-[:WORKS_FOR]->(companyB)
CREATE (Snijder)-[:WORKS_FOR]->(companyC)
CREATE (Hagedoorn)-[:WORKS_FOR]->(companyA)
CREATE (Vrijhof)-[:WORKS_FOR]->(companyB)
CREATE (Berden)-[:WORKS_FOR]->(companyC)
CREATE (Kreuk)-[:WORKS_FOR]->(companyC)
CREATE (Hordijk)-[:WORKS_FOR]->(companyA)
CREATE (Sepers)-[:WORKS_FOR]->(companyB)
CREATE (Zwijnenberg)-[:WORKS_FOR]->(companyC)
CREATE (Snijder)-[:WORKS_FOR]->(companyA)
11-21-2020 12:28 PM
This is my solution for now.
Any suggestions for improvement?
MATCH (c:Company)<-[f1:WORKS_FOR]-(p:Person)
WITH p, apoc.coll.sort(collect(c.name)) AS companies
WHERE SIZE(companies) > 1
WITH {companies: companies, employee: collect(p.name)} as map
WITH COLLECT(map) AS list
UNWIND list AS row
WITH row.companies AS companies,
     REDUCE(l = row.employee,
            i IN range(0, size(list) - 1) |
              CASE WHEN apoc.coll.containsAll(list[i].companies, row.companies) THEN
                l + list[i].employee
              ELSE l
              END) AS emps
RETURN companies, apoc.coll.toSet(apoc.coll.sort(emps)) AS employee
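One possible simplification, offered as an untested sketch rather than a definitive answer: collect one row per person, then for each distinct company set keep every person whose own company list contains that set. The names `person`, `rows`, `r` and `employees` are introduced here for illustration only. It assumes person and company names are unique, and it still relies on `apoc.coll.sort` as in the query above.

```cypher
// One row per person: the sorted list of companies that person works for.
MATCH (p:Person)-[:WORKS_FOR]->(c:Company)
WITH p.name AS person, apoc.coll.sort(collect(c.name)) AS companies
WHERE size(companies) > 1
// Keep all rows in a single list so each company set can be compared against every person.
WITH collect({person: person, companies: companies}) AS rows
UNWIND rows AS row
WITH DISTINCT row.companies AS companies, rows
// A person qualifies when every company in the set is among their own companies.
RETURN companies,
       apoc.coll.sort([r IN rows
                       WHERE all(name IN companies WHERE name IN r.companies)
                       | r.person]) AS employees
ORDER BY companies
```

Like the query above, this only reports company sets that match at least one person's exact list of companies. A combination that nobody holds exactly would not appear as a row.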
12-13-2020 07:48 AM
This is an overlap question, similar to "finding all movies that are in this set of genres".
Do you have any starting point for this?
One way to do it is to project the data onto a person-to-person network or a company-to-company network,
and then use a clustering algorithm from the Graph Data Science library to see who ends up in the same cluster.
For example, project via Cypher.
Nodes:
MATCH (p:Person) RETURN id(p) as id
Relationships:
MATCH (p1:Person)-[:WORKS_FOR]->(c:Company)<-[:WORKS_FOR]-(p2:Person)
RETURN id(p1) as source, id(p2) as target, count(*) as weight
Then run Louvain or WCC on top of that projected graph.
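Separately from the clustering route, the overlap framing can be written as a direct query when the set of companies is known up front. The following is a minimal sketch, assuming company names are unique; `wanted` and `hits` are names made up for this example.

```cypher
// Hypothetical input: the set of companies to test for overlap.
WITH ['Company A', 'Company B'] AS wanted
MATCH (p:Person)-[:WORKS_FOR]->(c:Company)
WHERE c.name IN wanted
// Count how many of the wanted companies each person works for.
WITH wanted, p, count(DISTINCT c) AS hits
// Keep only the people who work for every company in the set.
WHERE hits = size(wanted)
RETURN wanted AS companies, collect(p.name) AS employees
```

On the sample data in this thread, the people working for both Company A and Company B are Hordijk, Kreuk, Sepers, Snijder, Vrijhof and Zwijnenberg, so those six names are what this query is expected to collect (in no particular order).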