GDS ML Node Classification giving errors on properties

I am trying to run an ML node classification with the GDS plugin, but I am getting an error and I don't know where it comes from. Explanation:

  • I have created the in-memory graph as follows:
CALL gds.graph.create(
        "graph",
        "*",
        "*",
        {
            nodeProperties: ["bornDate", "gender"]
        }
    )

(There are only 3 node labels in my database: Persons, Cities and Test types, i.e. COVID tests.)

  • After running the estimate for the ML algorithm, I run it as follows:
CALL gds.alpha.ml.nodeClassification.train(
    "graph", {
    nodeLabels: ["Person"],
    modelName: "ncModel",
    featureProperties: ["gender"],
    targetProperty: "mlNodeClassification",
    randomSeed: 2,
    holdoutFraction: 0.4,
    validationFolds: 5,
    metrics: ["F1_WEIGHTED"],
    params: [
        {penalty: 0.0625},
        {penalty: 0.5}
    ]
})

(I only use gender for demonstration purposes.) But I am getting this error:

Failed to invoke procedure `gds.alpha.ml.nodeClassification.train`: Caused by: java.lang.IllegalArgumentException: The feature properties ['gender'] are not present for all requested labels. Requested labels: ['Person']. Properties available on all requested labels: []
  • That would make sense if not all nodes had this property, so I checked, just in case:
MATCH (n:Person)
WHERE n.gender IS NULL
RETURN count(*)

It returns 0, so I don't understand the error shown in the second step.
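
The check above queries the stored database, whereas the error message refers to the labels and properties of the in-memory graph. A minimal sketch for inspecting what the projection actually holds, assuming the graph name "graph" from the first step and that gds.graph.list is available in the installed GDS version:

```cypher
// List the named in-memory graph together with its schema,
// i.e. the node labels it knows about and the properties projected for each.
CALL gds.graph.list("graph")
YIELD graphName, nodeCount, schema
RETURN graphName, nodeCount, schema
```

If the schema does not list a Person label with a gender property, the training call cannot find it either, regardless of what the stored nodes contain.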

2 REPLIES

Hi.

I am wondering if there is a reason that you are including all node types in your in-memory graph? I don't imagine that Cities and Test types have genders. What happens if you repeat the node classification algorithm using just the Person nodes in your in-memory graph?
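
As a rough sketch of that suggestion, the projection could be restricted to Person nodes as shown here. The graph name "personGraph" is hypothetical, and the snippet assumes that both gender and the target property mlNodeClassification are stored as numeric properties on Person nodes, and that the target property has to be projected as well so the training step can read it:

```cypher
// Project only Person nodes. With "*" as the relationship projection,
// relationships that touch a non-projected node should be left out.
// Assumptions: "personGraph" is a made-up name; both properties are numeric.
CALL gds.graph.create(
    "personGraph",
    "Person",
    "*",
    {
        nodeProperties: ["gender", "mlNodeClassification"]
    }
)
```

The training call from the question would then take "personGraph" as its first argument instead of "graph".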

Hi @dvalcarcer!

You'll need to change a few things:

To avoid ending up with orphan (disconnected) nodes that carry no information to learn from, you'll likely need to load all your node labels and relationship types, or use a Cypher projection to create a (:Person)-[]->(:Person) graph.
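
A Cypher projection along those lines might look like the sketch below. It is only an illustration: the graph name "personToPerson" is hypothetical, and it assumes that direct relationships between Person nodes exist in the database and that gender and mlNodeClassification are numeric properties.

```cypher
// Node query: one row per Person, carrying its label and the properties to project.
// Relationship query: only relationships that connect two Person nodes.
// Assumptions: "personToPerson" is a made-up name; both properties are numeric.
CALL gds.graph.create.cypher(
    "personToPerson",
    "MATCH (p:Person) RETURN id(p) AS id, labels(p) AS labels, p.gender AS gender, p.mlNodeClassification AS mlNodeClassification",
    "MATCH (a:Person)-->(b:Person) RETURN id(a) AS source, id(b) AS target"
)
```

Returning labels(p) in the node query keeps the Person label in the in-memory graph, so a nodeLabels: ["Person"] filter in the training call still has something to match.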