Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
06-18-2022 09:27 AM - edited 06-19-2022 10:27 AM
I started to learn Neo4j a few days ago.
I'm using it to find best path and make some analyzes.
The logic is a Person (id, name) can go to a Restaurant (id, name) via some Street (id, name). The connection between them have a cost. PS: All streets have a connection between them. For example:
(Person {id: 1})-[CONNECTION {cost:10}]->(Street {id: 1})
(Person {id: 1})-[CONNECTION {cost:11}]->(Street {id: 2})
(Street {id: 1})-[CONNECTION {cost:4}]->(Street {id: 2})
(Street {id: 2})-[CONNECTION {cost:11}]->(Restaurant {id: 1})
(Street {id: 2})-[CONNECTION {cost:7}]->(Restaurant {id: 2})
I am using Dijkstra to find all best path to all Restaurant for a specific Person. But the problem is that I can't set the maximum depth, and I would like to limit a maximum of 3 streets. How could I do that?
CALL gds.graph.project(
'Person-Street-Restaurant',
['Person', 'Street' 'Restaurant'],
'CONNECTION',
{
relationshipProperties: 'cost'
}
)
MATCH (source:Person{id:1})
CALL gds.allShortestPaths.dijkstra.stream('Person-Street-Restaurant', {
sourceNode: source,
relationshipWeightProperty: 'cost'
})
YIELD sourceNode, targetNode, totalCost, nodeIds
WHERE 'Restaurant' IN LABELS(gds.util.asNode(targetNode))
RETURN
gds.util.asNode(sourceNode).name AS sourceNodeName,
gds.util.asNode(targetNode).name AS targetNodeName,
totalCost,
[nodeId IN nodeIds | gds.util.asNode(nodeId).name] AS nodeNames,
SIZE(nodeIds) AS hops
ORDER BY totalCost
The best path is: P->S#3->S#4->S#5->R, total cost 7. But if I limit to 2 streets, it should be P->S#1->S#2->R, total cost 10. PS: In this example is easy because we have fewer connections, but in real case we have a lot and all streets have connection between them.
I also tried to find all paths by MATCH p1(Person)-[:CONNECTION*3..3]->(Restaurant), but no success.
06-18-2022 03:15 PM
I created this solution, but it's very slow when matching relationships 3..3.
MATCH (c: Person{id: 1}), (r: Restaurant)
MATCH p3 = (c)-[r3:CONNECTION*3..3]->(r)
WITH r, p3,
REDUCE(path = "", n IN NODES(p3) | path + n.name + "->") AS p3Path,
REDUCE(total = 0, r IN RELATIONSHIPS(p3) | total + r.cost) AS p3Total
WITH r, MIN(p3Total) AS p3TotalMin, COLLECT({path: p3Path, total: p3Total}) AS paths
UNWIND paths AS p
WITH r, p, p3TotalMin
WHERE p3TotalMin = p.total
RETURN r.name, p3TotalMin, p.path
06-19-2022 09:38 AM
Hi @lsnd ,
Interesting use case! If you ask me, 2 things should be done:
1. In your case, I may use an specific type of relationship for the link between Persons and Streets (P_CONNECTION) as well for the link between Restaurants and Streets (R_CONNECTION). If you use one type of connection, you may find your self on awful cycles when traversing long paths i.e Going from the person to the street and from that street to a different person. Yes, you can add some conditions related to the cardinality of the Persons and Restaurant nodes on a path, but It will be kinda slow.
2. Based on a scenario where you change this relationships. Can you try this pipeline on your db? (you will need apoc on your db tho)
MATCH (c: Person{id: 2})
CALL apoc.path.expandConfig(c, {
minLevel : 3,
maxLevel : 3,
relationshipFilter : 'P_CONNECTION>|CONNECTION|R_CONNECTION>',
labelFilter : "/Restaurant",
uniqueness : "NODE_PATH"
}) yield path
UNWIND relationships(path) as r
WITH r, startNode(r) as source, endNode(r) as target
WITH gds.alpha.graph.project('test', source, target, {
sourceNodeLabels: labels(source),
targetNodeLabels: labels(target)
},{
properties: r { cost : toInteger(r.cost) }
}) AS g
RETURN
g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels
MATCH (source:Person{id:2})
CALL gds.allShortestPaths.dijkstra.stream('test', {
sourceNode: source,
relationshipWeightProperty: 'cost'
})
YIELD sourceNode, targetNode, totalCost, nodeIds
WHERE 'Restaurant' IN LABELS(gds.util.asNode(targetNode))
RETURN
gds.util.asNode(sourceNode) AS sourceNodeName,
gds.util.asNode(targetNode) AS targetNodeName,
totalCost,
[nodeId IN nodeIds | gds.util.asNode(nodeId)] AS nodeNames,
SIZE(nodeIds) AS hops
ORDER BY totalCost
You can configure the number of hops valid on your run with the min and max level of your apoc expansion before the projection.
Bennu
06-19-2022 10:23 AM
Hi @bennu_neo,
Thank you so much for replying and helping.
I'm following your instructions, but what relationship would you use for Streets and Streets?
Because someone can go to a street, next to another street, then to the restaurant.
06-19-2022 10:52 AM
@bennu_neo , alright! I left connections between Street as CONNECTION, but when I try to execute your first code, to project the graph, it stuck on loading.
Maybe because I have a lot of data? Maybe because I'm testing on Neo4j Desktop?
06-19-2022 11:01 AM
Hi @lsnd!
Yes. You can let the Connection relationship between streets. What's the count of r up to line 9?
Hoe much ram u have configured for the transaction heap?
06-19-2022 11:50 AM
Hi @bennu_neo!
Unfortunately, I can't count rows up to line 9 as well. It stuck on loading.
I am using the default configuration from Neo4j.
dbms.memory.heap.initial_size=512m
dbms.memory.heap.max_size=1G
dbms.memory.pagecache.size=512m
dbms.tx_state.memory_allocation=ON_HEAP
Maybe this is the problem! 😅
06-19-2022 12:50 PM
Hello @lsnd,
I like trying to fit complex problems on small ram, but yeah, if you can give more power to your db instance it can help a lot, let's try with initial and max heap size set on 2gb.
Which direction have P_CONNECTION and R_CONNECTION?
I assume they go from Person to Street and from Street to Restaurant respectively.
06-19-2022 05:39 PM
Hi @bennu_neo!
I set initial and max heap size on 2gb, but the problem persists, and when I execute the query the database seems to stop to work.
Yes, you're right! I imported them from CSV files.
LOAD CSV WITH HEADERS FROM 'file:///streets.csv' AS line
MERGE (s:Street {name: line.name, id: toInteger(line.id)})
LOAD CSV WITH HEADERS FROM 'file:///restaurants.csv' AS line
MERGE (r:Restaurant {name: line.name, id: toInteger(line.id)})
MERGE (Person{id: 1})
LOAD CSV WITH HEADERS FROM 'file:///street_costs.csv' AS line
MATCH (s1: Street {id: toInteger(line.street_src_id)})
MATCH (s2: Street {id: toInteger(line.street_dst_id)})
MERGE (s1)-[:CONNECTION {cost: toInteger(line.cost)}]->(s2)
LOAD CSV WITH HEADERS FROM 'file:///street_restaurant_costs.csv' AS line
MATCH (s: Street {id: toInteger(line.street_id)})
MATCH (r: Restaurant {id: toInteger(line.restaurant_id)})
MERGE (s)-[:R_CONNECTION {cost: toInteger(line.cost)}]->(r)
LOAD CSV WITH HEADERS FROM 'file:///person_street_costs.csv' AS line
MATCH (c: Person {id: 1})
MATCH (s: Street {id: toInteger(line.street_id)})
MERGE (c)-[:P_CONNECTION {cost: toInteger(line.cost)}]->(s)
All the sessions of the conference are now available online