Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
08-13-2020 06:54 AM
I have models Region, Industry, Size, which have common property "month" - a date when the specific entry was created.
We know that all the entities might share the same month.
My algorithm should do the following:
Accept a list of possible values for each entity (region, industry, size) and find their intersection based on the month.
Example:
Input: Industry=["Information Technology", "Medicine"], Region=["Europe", "USA"], Size=["Small", "Medium"]
Output: [201910,202005]
How do I create a relationship between these models? What is the most efficient way to request this intersection?
I tried to build the graph as follows, but seems it's not very optimal because requests are processed too slow:
Solved! Go to Solution.
08-14-2020 03:27 AM
Hi @miron4dev,
You may like some indexes like:
CREATE INDEX ON :Industry(industry);
CREATE INDEX ON :Region(region);
CREATE INDEX ON :Size(size);
Then, can you try this quey (sorry if it slower than yours, I hope is not 😄 )
MATCH (i:Industry)-[:OCCURRED]->(m:Month)
WHERE i.industry in ['Medicine', 'Information Technology']
WITH collect(m) as out
MATCH (r:Region)-[:OCCURRED]->(m:Month)
WHERE r.region in ['USA', 'Europe']
WITH apoc.coll.intersection(out,collect(m)) as out
MATCH (s:Size)-[:OCCURRED]->(m:Month)
WHERE s.size in ['Small', 'Medium']
WITH apoc.coll.intersection(out,collect(m)) as out
RETURN out
H
08-13-2020 02:46 PM
Hi Miron!
How is the query you tried with this data model? Have you tried collecting them all (Input nodes) and then :
WITH collect(inputs) as inp
MATCH (d:DateMonth)
WHERE ALL(i in inp WHERE (i)-[:OCCURRED]->(m))
return m
H
08-14-2020 02:09 AM
Hi @Bennu
I tried the following query:
MATCH (i:Industry)-[:OCCURRED]->(m:Month),(r:Region)-[:OCCURRED]->(m:Month),(s:Size)-[:OCCURRED]->(m:Month)
WHERE i.industry in ['Medicine', 'Information Technology'] and r.region in ["USA", "Europe"] and s.size in ["Small", "Medium"]
WITH i,r,s,m
ORDER by m.value desc
return i, r, s, collect(distinct m.value)
and it produces the expected results in 621 ms. Is there any way to optimize it somehow? Change the following structure is also an option.
I already added indexes for Industry, Region, Size, and Month, but I hope I can make it better.
08-14-2020 03:27 AM
Hi @miron4dev,
You may like some indexes like:
CREATE INDEX ON :Industry(industry);
CREATE INDEX ON :Region(region);
CREATE INDEX ON :Size(size);
Then, can you try this quey (sorry if it slower than yours, I hope is not 😄 )
MATCH (i:Industry)-[:OCCURRED]->(m:Month)
WHERE i.industry in ['Medicine', 'Information Technology']
WITH collect(m) as out
MATCH (r:Region)-[:OCCURRED]->(m:Month)
WHERE r.region in ['USA', 'Europe']
WITH apoc.coll.intersection(out,collect(m)) as out
MATCH (s:Size)-[:OCCURRED]->(m:Month)
WHERE s.size in ['Small', 'Medium']
WITH apoc.coll.intersection(out,collect(m)) as out
RETURN out
H
All the sessions of the conference are now available online