05-18-2022 10:02 PM
Hi, I created a graph with Document nodes and lemmatized word nodes (RootWord) for the words each document contains, so Documents are connected through the RootWord nodes they share.
I projected part of the graph like this:
call gds.graph.project.cypher(
'documentRootword',
'
match (n) where (n:Document and date("20220416")<=n.datePublished<=date("20220515")) or n:RootWord
return id(n) as id
',
'
match (d)-[:FROM_DOCUMENT]->(r:RootWord) where date("20220416")<=d.datePublished<=date("20220515")
return id(d) as source, id(r) as target
'
)
Then I ran FastRP like this:
call gds.fastRP.write(
'documentRootword',
{
embeddingDimension: 128,
iterationWeights: [1.0, 0.5, 0.5],
normalizationStrength: 0,
writeProperty: 'embedRootword',
randomSeed: 7
}
)
I know FastRP is designed for homogeneous graphs, but I was able to create embeddings on heterogeneous graphs like this before and got useful similarities from them when I did it with GDS last year.
match (d:Document) where date("20220416")<=d.datePublished<=date("20220515")
return d.embedRootword limit 100
d.embedRootword
[0.049813542515039444, -0.033209025859832764, -0.06641805171966553, 0.03320902958512306, 0.016604511067271233, 3.2323665966060844e-9, 0.09962708503007889, -0.08302256464958191, 0.03320902958512306, 3.2323665966060844e-9, 0.14944063127040863, 3.2323665966060844e-9, 0.033209025859832764, 0.033209025859832764, 0.016604512929916382, -0.14944063127040863, -0.09962708503007889, 0.049813542515039444, -0.049813542515039444, 0.04981353506445885, 0.049813542515039444, -0.08302256464958191, 0.049813542515039444, -0.03320903703570366, -0.11623159050941467, 0.06641805171966553, 0.049813542515039444, 0.09962708503007889, 0.049813542515039444, 3.2323665966060844e-9, 0.11623159050941467, 0.23246318101882935, 0.13283610343933105, 0.06641805171966553, 0.06641805171966553, -0.033209025859832764, -0.016604512929916382, 0.0, -0.08302255719900131, -0.01660451665520668, -0.03320902958512306, -0.19925417006015778, 0.01660451665520668, 0.14944063127040863, -0.09962708503007889, -0.03320902958512306, -0.016604511067271233, -0.16604512929916382, 0.03320903703570366, 0.049813542515039444, -0.03320902958512306, 0.03320902958512306, -0.04981353506445885, 0.0830225721001625, -0.24906770884990692, -0.049813542515039444, 0.016604511067271233, 0.1826496422290802, 0.24906770884990692, 0.0830225721001625, -0.033209025859832764, 0.016604511067271233, -0.016604511067271233, 0.21585866808891296, -0.09962708503007889, -0.049813542515039444, 9.697100011862858e-9, -0.01660451665520668, -0.03320902958512306, -0.08302256464958191, 0.03320902958512306, -0.049813542515039444, 0.03320902958512306, -0.11623158305883408, -0.06641805171966553, 0.11623159050941467, 0.06641805171966553, -0.16604512929916382, 0.09962707757949829, -0.24906770884990692, -0.11623158305883408, -0.11623159050941467, -0.049813542515039444, -0.1826496422290802, 0.049813542515039444, 0.06641805171966553, 0.01660451665520668, 0.11623158305883408, -0.016604511067271233, -0.01660451665520668, 0.03320902958512306, 0.08302255719900131, 0.0, 
-0.049813542515039444, 0.09962708503007889, -0.049813542515039444, 0.049813542515039444, 0.033209025859832764, 0.016604511067271233, -0.03320903703570366, -3.2323665966060844e-9, 0.016604511067271233, -0.049813542515039444, 0.016604512929916382, -0.03320902958512306, 0.049813542515039444, 0.033209025859832764, -0.09962707757949829, 0.13283610343933105, 0.06641805917024612, 0.0830225721001625, 0.08302256464958191, -0.08302256464958191, -0.03320902958512306, 3.2323665966060844e-9, 0.06641805171966553, -0.14944063127040863, -0.06641805917024612, 0.0, -0.03320903703570366, 0.16604512929916382, -0.11623159050941467, -0.06641805171966553, -0.016604511067271233, -3.2323665966060844e-9, -0.033209025859832764, -0.0830225721001625, -0.08302256464958191]
....
But now with GDS 2.0.3 or 2.0.4, I cannot get the same embeddings in memory from the mutate procedure, even though I can get the embeddings with the write procedure:
call gds.fastRP.mutate(
'documentRootword',
{
embeddingDimension: 128,
iterationWeights: [1.0, 0.5, 0.5],
normalizationStrength: 0,
mutateProperty: 'embedRootword',
randomSeed: 7
}
)
call gds.graph.streamNodeProperty(
'documentRootword',
'embedRootword'
)
yield nodeId, propertyValue
return nodeId, propertyValue limit 10
==> It returns all zeros for the values in the embeddings.
nodeId | propertyValue |
---|---|
25 | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] |
27 | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] |
...
Am I doing something wrong?
05-31-2022 03:06 AM
Hello @gigauser ,
At first glance, your queries look correct.
Do I understand correctly that exactly the same workflow works if you use the write mode of FastRP instead of mutate?
Otherwise, my first idea is that your nodes are orphan nodes with no relationships. To check that, you can use gds.degree.stream.
You could also try a non-zero nodeSelfInfluence to avoid zero embeddings for orphan nodes (see Fast Random Projection - Neo4j Graph Data Science).
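The two suggestions above could be tried roughly as follows. This is only a sketch against the 'documentRootword' projection from the question. The property name 'embedRootwordSelf' and the value 1.0 for nodeSelfInfluence are placeholders chosen for illustration, not recommendations. Because the Cypher projection in the question does not project node labels, the first query looks the labels up in the database through gds.util.asNode.

```cypher
// Count, per label, how many projected nodes have no outgoing relationships
// (gds.degree.stream counts outgoing relationships by default).
CALL gds.degree.stream('documentRootword')
YIELD nodeId, score
WITH labels(gds.util.asNode(nodeId)) AS nodeLabels, score
RETURN nodeLabels,
       count(*) AS nodes,
       sum(CASE WHEN score = 0.0 THEN 1 ELSE 0 END) AS nodesWithDegreeZero;

// Re-run FastRP with a non-zero nodeSelfInfluence, writing to a new
// in-memory property so the earlier 'embedRootword' result is left untouched.
CALL gds.fastRP.mutate(
  'documentRootword',
  {
    embeddingDimension: 128,
    iterationWeights: [1.0, 0.5, 0.5],
    normalizationStrength: 0,
    nodeSelfInfluence: 1.0,                // assumed value, tune as needed
    mutateProperty: 'embedRootwordSelf',   // hypothetical property name
    randomSeed: 7
  }
)
YIELD nodePropertiesWritten
RETURN nodePropertiesWritten;
```

If the zero-degree nodes turn out to be mostly RootWord nodes, that would be consistent with the relationships having been projected only in the Document to RootWord direction.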
11-01-2022 02:19 AM
hi @gigauser ,
This problem has been bugging me for the past few days and I think I have found the solution for my case.
I changed the relationship orientation to 'UNDIRECTED' when projecting the graph, and this fixed the issue. I guess FastRP prefers 'UNDIRECTED' graphs, as mentioned in the docs.
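For reference, an undirected projection along these lines might look like the sketch below. It uses the native gds.graph.project procedure, where orientation is a relationship projection option. The graph name 'documentRootwordUndirected' is made up for illustration, and unlike the Cypher projection in the original question, this version does not apply the datePublished filter.

```cypher
// Project Document and RootWord nodes with FROM_DOCUMENT treated as undirected,
// so both node types have neighbours to aggregate over.
CALL gds.graph.project(
  'documentRootwordUndirected',            // hypothetical graph name
  ['Document', 'RootWord'],
  {FROM_DOCUMENT: {orientation: 'UNDIRECTED'}}
)
YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount;

// Same FastRP settings as in the original question, in mutate mode.
CALL gds.fastRP.mutate(
  'documentRootwordUndirected',
  {
    embeddingDimension: 128,
    iterationWeights: [1.0, 0.5, 0.5],
    normalizationStrength: 0,
    mutateProperty: 'embedRootword',
    randomSeed: 7
  }
)
YIELD nodePropertiesWritten
RETURN nodePropertiesWritten;

// Stream a few embeddings together with the labels of the nodes they belong to.
CALL gds.graph.streamNodeProperty('documentRootwordUndirected', 'embedRootword')
YIELD nodeId, propertyValue
RETURN labels(gds.util.asNode(nodeId)) AS nodeLabels, nodeId, propertyValue
LIMIT 10;
```

To keep the date filter with gds.graph.project.cypher instead, one option would be to have the relationship query return each Document to RootWord pair in both directions.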