07-20-2020 09:07 AM
I have been searching and watching YouTube videos on how to enhance ML with features derived from graphs, and I want to try it out. I have a person node with some properties (age, education, rent_own, etc.), and I currently have it connected to one node called 'h1n1_vax_yes' and another node called 'flu_vax_yes'. These connections indicate that the person has taken one or both of those vaccines. What I am attempting to do is use a similarity algorithm to find how similar the person nodes are to each other, across all the nodes that took the h1n1 vaccine, based on their properties, of which I have 35 (similar to what is shown here at 11:00: https://www.youtube.com/watch?v=LWw94LVhfLk&list=WL&index=2&t=651s). The examples I found use a property of the edge as the weight, not the node properties (https://neo4j.com/docs/graph-data-science/current/alpha-algorithms/cosine/). Is there a way to do this with node properties, or do I have to create a node for each of the 35 properties and connect it to the person? What approach would you recommend for adding more features to my dataset so that I can improve my predictions?
Perhaps my Google ninja search skills are not up to par in finding this answer....
-Using the latest GDS libs and Neo4j
-Below is basic schema
07-20-2020 10:57 AM
If you look at the examples for Cosine Similarity in the docs, you'll see an example of using node properties for similarity calculations:
MATCH (c:Cuisine)
WITH {item:id(c), weights: c.embedding} AS userData
WITH collect(userData) AS data
CALL gds.alpha.similarity.cosine.stream({
data: data,
skipValue: null
})
YIELD item1, item2, count1, count2, similarity
RETURN gds.util.asNode(item1).name AS from, gds.util.asNode(item2).name AS to, similarity
ORDER BY similarity DESC
The collect takes the node properties (c.embedding) and uses those to calculate similarities between Cuisine nodes.
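To make it concrete what the procedure receives, here is a minimal Python sketch of cosine similarity over a list shaped like the collected data above (one item id plus one weights list per node). It is only an illustration of the math, not the GDS implementation. The function names and the sample ids and vectors are made up, and returning 0.0 for an all-zero vector is a convention chosen for this sketch.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption for this sketch: zero vectors are similar to nothing
    return dot / (norm_a * norm_b)


def pairwise_similarities(data):
    """Score every pair in a list of {"item": id, "weights": [...]} dicts."""
    results = [
        (left["item"], right["item"],
         cosine_similarity(left["weights"], right["weights"]))
        for left, right in combinations(data, 2)
    ]
    # Highest similarity first, like ORDER BY similarity DESC in the query.
    return sorted(results, key=lambda r: r[2], reverse=True)


# Hypothetical node ids and property vectors, for illustration only.
people = [
    {"item": 1, "weights": [1.0, 0.0, 1.0]},
    {"item": 2, "weights": [1.0, 0.0, 1.0]},
    {"item": 3, "weights": [0.0, 1.0, 0.0]},
]

for item1, item2, similarity in pairwise_similarities(people):
    print(item1, item2, round(similarity, 3))
```

The point is that each node contributes a single list of numbers, so node properties work as long as every node supplies them in the same order and with the same length.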
07-22-2020 06:36 AM
Awesome! Thanks @alicia.frame I must have missed that. Is there a way to have it look at all columns without having to specify each one? In my case, I have 35 columns right now but once I OneHotEncode them, I am going to have a lot more.
Similar to how when you create a node from a csv you can tell it to put all fields as properties? Like below:
CREATE (n:Node)
SET n += row
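One way to picture the "all columns" idea is a rough Python sketch that turns whatever properties each person has into equal-length vectors, without naming the columns by hand. It assumes numeric properties are used as-is and every other value is one-hot encoded; the helper names and the sample values are hypothetical (only age and rent_own come from the post). The column order is fixed once and reused for every row so that the vectors line up.

```python
def is_numeric(value):
    """True for int/float values, excluding booleans."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def build_columns(rows):
    """Collect one column per numeric key and one per (key, category) pair."""
    columns = set()
    for row in rows:
        for key, value in row.items():
            if is_numeric(value):
                columns.add((key, None))
            else:
                columns.add((key, str(value)))
    # Sort once so every vector uses the same column order.
    return sorted(columns, key=lambda c: (c[0], c[1] or ""))


def to_vector(row, columns):
    """Build the numeric vector for one row using the shared column order."""
    vector = []
    for key, category in columns:
        value = row.get(key)
        if category is None:
            # Numeric column: use the value, or 0.0 if missing or non-numeric.
            vector.append(float(value) if is_numeric(value) else 0.0)
        else:
            # One-hot column: 1.0 only when the row has exactly this category.
            matches = value is not None and str(value) == category
            vector.append(1.0 if matches else 0.0)
    return vector


# Hypothetical rows; in practice these would be the property maps of the nodes.
rows = [
    {"age": 34, "rent_own": "rent"},
    {"age": 51, "rent_own": "own"},
]
columns = build_columns(rows)
vectors = [to_vector(row, columns) for row in rows]
print(columns)
print(vectors)
```

Unscaled numeric columns such as age will dominate the one-hot columns in a cosine calculation, so some scaling before comparing is probably worth considering.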