
Embeddings only for selected nodes.

Let me quickly describe my graph.

I have a graph that links every mammalian species to its taxa. See the small example below for Hominoidea:

Kronossos_1-1659616708240.png

There are five organisms (HSA, PPS, PTR, GGO, PON) at the end of this lineage. Only organisms at the end of a lineage have the property kegg=kegg_genome_id. Each of these organism nodes has relationships to nodes of a different type, labelled KO (functional orthologs). See the example below for just two organisms. The same KO node can be linked to many mammalian organisms, such as elephant, human or mouse (or even to all mammals).

Kronossos_0-1659616433822.png

This results in a network with 337 taxa nodes (111 of which are organisms), 12,142 Ko nodes and over 1,200,000 relationships.

 

Now I want to build a model that predicts, based on KO, whether a given species belongs to Euarchontoglires. Every organism node that is linked to Euarchontoglires has the property category=1. The rest of the organisms have the property category=0.

This was just an introduction.

What I want to know is how I can calculate node2vec ONLY for these organism nodes. We do not want to have embeddings for KO nodes.

I have a projected graph:

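graph.run("""
CALL gds.graph.project(
  'graph_info',
  // Node projections: Taxa nodes (with the category property) and Ko nodes
  {
    taxa: {
      label: 'Taxa',
      properties: ['category']
    },
    ko: {
      label: 'Ko'
    }
  },
  // HAS_KO relationships, projected as undirected
  {
    RELS: {
      type: 'HAS_KO',
      orientation: 'UNDIRECTED'
    }
  }
)
""")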

I do not know how to write gds.beta.node2vec.write only for the nodes that I will later use for ML.

 

MATCH (n:Taxa) WHERE n.kegg IS NOT NULL RETURN n.name, n.category, n.n2v_all_nodes

Can you guide me?
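One possible direction, as a rough sketch rather than a confirmed answer: the organisms are only connected to each other through Ko nodes, so the random walks probably still need the Ko nodes in the projection. The embeddings could then be computed for the whole 'graph_info' projection in stream mode and filtered to the organism nodes before anything is written back. The helper name embed_organisms, the constant N2V_ORGANISMS_ONLY and the default dimension of 64 are assumptions made for illustration. The graph object is assumed to expose run(query, **params), as in the graph.run call above.

```python
# Sketch only: stream node2vec embeddings for the whole projection, then keep
# just the Taxa nodes that carry a kegg property (the organisms) and store the
# embedding on those nodes. Ko nodes take part in the walks but get no property.
N2V_ORGANISMS_ONLY = """
CALL gds.beta.node2vec.stream('graph_info', {embeddingDimension: $dim})
YIELD nodeId, embedding
WITH gds.util.asNode(nodeId) AS n, embedding
WHERE n:Taxa AND n.kegg IS NOT NULL
SET n.n2v_all_nodes = embedding
RETURN n.name AS name, n.category AS category, embedding
"""


def embed_organisms(graph, dim=64):
    """Run the query above on an object exposing run(query, **params).

    `dim` is a hypothetical default, not a value taken from the post.
    """
    return graph.run(N2V_ORGANISMS_ONLY, dim=dim)
```

With this shape, the later MATCH (n:Taxa) WHERE n.kegg IS NOT NULL query would find n2v_all_nodes set only on organism nodes, and the returned rows could be fed directly into the ML step.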

 

 
