10-22-2020 02:22 PM
I'm running Neo4j v 4.1 and gds v1.4.
I'm trying to utilize ML tools to gain insights about a genetic genealogy graph database. It has CB_Match nodes with chromosome segment data on individuals matching one another; that is, they share segments.
I've created a virtual graph:
CALL gds.graph.create.cypher(
'myGraph',
"match (c:CB_Match) return id(c) as id",
"match (c1:CB_Match)-[r:match_by_segment{phased:'Y'}]-(c2:CB_Match) return id(c1) as source,id(c2) as target,r.cm as weight"
)
From this I can create an embedding variable for CB_Match nodes:
CALL gds.fastRP.stream('myGraph', {embeddingDimension: 4})
YIELD nodeId,embedding
with gds.util.asNode(nodeId).RN as RN,gds.util.asNode(nodeId).fullname as Name, embedding
return RN,Name,embedding order by RN,Name
I have used the write procedure to add this property to the CB_Match nodes.
Now I am trying to use the embedding property as described in the recent GDS announcement, specifically for neighborhood detection and visualization.
Following the documentation for KNN and its default value of {} for the configuration map, I ran the following:
CALL gds.beta.knn.stream(
'myGraph',
{ }
)
YIELD node1, node2, similarity
with gds.util.asNode(node1).fullname as Match1, gds.util.asNode(node2).fullname as Match2, similarity
return Match1, Match2, similarity limit 50
This produced an error saying I had omitted the required nodeWeightProperty from the configuration. So I added it:
CALL gds.beta.knn.stream(
'myGraph',
{nodeWeightProperty:'embedding' }
)
YIELD node1, node2, similarity
with gds.util.asNode(node1).fullname as Match1, gds.util.asNode(node2).fullname as Match2, similarity
return Match1, Match2, similarity limit 50
and received an error that not every node had the embedding property ... which is not true.
Is this a bug or a problem with my logic?
10-23-2020 09:19 AM
In order to feed in the properties computed by FastRP, you will need to use the mutate mode to add them to the in-memory graph (the one you call 'myGraph'). The write mode will only write them to Neo4j. You can reload them from Neo4j as well, but then you will have to project a new in-memory graph where you also declare the properties, and this is less efficient than using mutate.
You can read more about the different execution modes here: https://neo4j.com/docs/graph-data-science/preview/common-usage/running-algos/
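A minimal sketch of the mutate-based flow described above, assuming GDS 1.4 and that the in-memory graph is still named 'myGraph'. The property name 'embedding' and the topK value are illustrative choices, not taken from the original posts.

```cypher
// Add FastRP embeddings to the in-memory graph itself; nothing is written to Neo4j.
CALL gds.fastRP.mutate('myGraph', {
  embeddingDimension: 4,
  mutateProperty: 'embedding'   // assumed property name
})
YIELD nodePropertiesWritten;

// KNN can now read 'embedding' directly from the in-memory graph.
CALL gds.beta.knn.stream('myGraph', {
  nodeWeightProperty: 'embedding',
  topK: 5                       // illustrative value, tune as needed
})
YIELD node1, node2, similarity
RETURN gds.util.asNode(node1).fullname AS Match1,
       gds.util.asNode(node2).fullname AS Match2,
       similarity
ORDER BY similarity DESC
LIMIT 50;
```

With this approach there is no need to write the embedding to the database and project the graph a second time.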
10-23-2020 09:35 PM
Thanks. The in-memory graph I created did have the "embedding" property. It was a two-step process, which was less efficient, as you note. But I did have the property in the second iteration of the in-memory graph, and I still got the error, so I am still puzzled by it not working. Is it a bug or my logic?
10-28-2020 08:50 PM
Hello, the error "that not every node had the embedding property" occurs because the nodeQuery doesn't return the embedding property, so it is absent from the in-memory graph even though it exists in the Neo4j DB. You can check the documentation for how to add the node property: https://neo4j.com/docs/graph-data-science/current/management-ops/cypher-projection/#cypher-projectio....
I hope this helps.
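As a rough sketch of what that could look like for the projection in the original question: the nodeQuery returns the stored property as an extra column, which makes it available in the in-memory graph. This assumes the embedding was already written to the CB_Match nodes under the name 'embedding'; the graph name 'myGraphWithEmbedding' is hypothetical and chosen only to avoid clashing with the existing 'myGraph'.

```cypher
// Project a new in-memory graph whose nodes carry the stored 'embedding' property.
CALL gds.graph.create.cypher(
  'myGraphWithEmbedding',   // hypothetical graph name
  "MATCH (c:CB_Match) RETURN id(c) AS id, c.embedding AS embedding",
  "MATCH (c1:CB_Match)-[r:match_by_segment {phased:'Y'}]-(c2:CB_Match) RETURN id(c1) AS source, id(c2) AS target, r.cm AS weight"
);

// Run KNN against the new projection, using the projected property.
CALL gds.beta.knn.stream('myGraphWithEmbedding', {nodeWeightProperty: 'embedding'})
YIELD node1, node2, similarity
RETURN gds.util.asNode(node1).fullname AS Match1,
       gds.util.asNode(node2).fullname AS Match2,
       similarity
LIMIT 50;
```

Every node returned by the nodeQuery needs a value for the property, otherwise KNN is likely to report the same missing-property error.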
12-05-2020 10:16 AM
Your suggestion solved the initial problem. That is, with the embedding property now in the virtual graph, the KNN algorithm runs. Now I need to optimize the parameters!