05-09-2021 05:48 PM
Hi, I'm new to Neo4j and am trying to use GraphSAGE with a float-list property.
I'm using GDS 1.5, so I thought it should work.
First, I tested with the example from the GraphSAGE documentation (GraphSAGE - Neo4j Graph Data Science).
I created the nodes and relationships using the example Cypher, then created the graph.
The only change I made was to create the Person nodes with a float list, like this:
" ( dan:Person {name: 'Dan', age: 20, heightAndWeight: [185.0, 75.0]}),"
But I got an error when training on this graph.
The error message is:
Failed to invoke procedure gds.beta.graphSage.train
: Caused by: org.ejml.MatrixDimensionException: 4 != 3 The 'A' and 'B' matrices do not have compatible dimensions
Why does the error occur, and what should I do?
05-10-2021 09:20 AM
Hi.
Can you please do me a favor and verify what happens if you run the example as is (without changing to a list of floats)?
Thanks!
05-10-2021 10:00 PM
Hello Clair,
Yes. If I run the example as is,
e.g. ( dan:Person {name: 'Dan', age: 20, heightAndWeight: [185, 75]}),
I can create the nodes and the graph with the example Cypher.
But when I try to train GraphSAGE, the error below occurs:
Neo.ClientError.Procedure.ProcedureCallFailed
It says:
Failed to invoke procedure gds.beta.graphSage.train
: Caused by: java.lang.IllegalStateException: Unknown ValueType LONG_ARRAY
Since the error message says "Unknown ValueType LONG_ARRAY",
I thought I needed to change "heightAndWeight" (which holds a list value) to a float list.
So I changed the Cypher as below and created the graph with the same code as the example.
( dan:Person {name: 'Dan', age: 20, heightAndWeight: [185.0, 75.0]})
When I try to train GraphSAGE with this,
I get a Neo.ClientError.Procedure.ProcedureCallFailed error.
It says:
Failed to invoke procedure gds.beta.graphSage.train
: Caused by: org.ejml.MatrixDimensionException: 4 != 3 The 'A' and 'B' matrices do not have compatible dimensions
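For the "Unknown ValueType LONG_ARRAY" part of this report, existing integer lists could be converted in place instead of recreating the nodes. This is a minimal sketch that assumes the Person label and heightAndWeight property from the example above; the in-memory graph would need to be dropped and created again afterwards, and the conversion on its own does not address the MatrixDimensionException.

```cypher
// Sketch only: rewrite integer lists as float lists on existing nodes.
// Assumes the :Person label and heightAndWeight property from the example.
MATCH (p:Person)
WHERE p.heightAndWeight IS NOT NULL
SET p.heightAndWeight = [x IN p.heightAndWeight | toFloat(x)]
RETURN p.name AS name, p.heightAndWeight AS heightAndWeight
```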
05-11-2021 07:42 AM
Thanks for the follow-up info! Let me ping some people and get back to you...
05-10-2021 10:34 AM
Hi, I'm doing something similar, although replicating the original proteins test done in the GraphSAGE paper (following the directions from this tutorial). I'm able to get the model to train fine when I simply feed in each node property as a separate feature to the featureProperties setting in gds.beta.graphSage.train(). However, when I create a graph projection using only the property that is already a list of floats, it fails with the same MatrixDimensionException error.
If I had to guess, the GraphSAGE training algorithm is likely using the length of the featureProperties list of strings to dictate the dimensionality of the input, instead of inspecting the properties that those strings point to. Note that this graph effectively has two copies of the "embeddings" vector stored as properties on each node: one stored as 'embeddings_all', which is a list of floats, and the other stored with each individual list element in node properties 'embedding_i', where i goes from 0 to 49. Here are the full code sets for what works and what doesn't:
What works, but requires each value of a list to be stored as separate properties (here called 'embedding_0' through 'embedding_49')
// Create a graph projection from the 50 separate embedding properties
UNWIND range(0, 49) AS i
WITH collect('embedding_' + toString(i)) AS embeddings
CALL gds.graph.create(
  'train_noList',
  'Train',
  {INTERACTS: {orientation: 'UNDIRECTED'}},
  {nodeProperties: embeddings}
)
YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount
Then
// Train the model, passing each embedding property as its own feature
UNWIND range(0, 49) AS i
WITH collect('embedding_' + toString(i)) AS embeddings
CALL gds.beta.graphSage.train('train_noList', {
  modelName: 'proteinModel',
  aggregator: 'pool',
  batchSize: 512,
  activationFunction: 'relu',
  epochs: 10,
  sampleSizes: [25, 10],
  learningRate: 0.0000001,
  embeddingDimension: 256,
  featureProperties: embeddings,
  degreeAsProperty: false
})
YIELD modelInfo
RETURN modelInfo
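For anyone who only has the per-element properties, a list property such as 'embeddings_all' could be assembled from them roughly as follows. This is a sketch under the assumption that every Train node carries all 50 properties 'embedding_0' through 'embedding_49'; a missing property would produce a null inside the list, which Neo4j will not store.

```cypher
// Sketch only: collect embedding_0 .. embedding_49 into one float list.
// Assumes all 50 properties exist on every :Train node.
MATCH (n:Train)
SET n.embeddings_all = [i IN range(0, 49) | toFloat(n['embedding_' + toString(i)])]
RETURN count(n) AS nodesUpdated
```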
What I'd like to do, but fails
// Create a graph projection from the single list property
CALL gds.graph.create(
  'train',
  'Train',
  {INTERACTS: {orientation: 'UNDIRECTED'}},
  {nodeProperties: 'embeddings_all'}
)
YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount
Then
// Train the model, passing the list property as the only feature
CALL gds.beta.graphSage.train('train', {
  modelName: 'proteinModel',
  aggregator: 'pool',
  batchSize: 512,
  activationFunction: 'relu',
  epochs: 10,
  sampleSizes: [25, 10],
  learningRate: 0.0000001,
  embeddingDimension: 256,
  featureProperties: ['embeddings_all'],
  degreeAsProperty: false
})
YIELD modelInfo
RETURN modelInfo
Returns
ClientError: [Procedure.ProcedureCallFailed] Failed to invoke procedure `gds.beta.graphSage.train`: Caused by: org.ejml.MatrixDimensionException: 50 != 1 The 'A' and 'B' matrices do not have compatible dimensions
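Before attributing a dimension mismatch like this to the algorithm, it may be worth ruling out inconsistent data. The query below is a small diagnostic sketch, assuming the Train label and 'embeddings_all' property used in this post: it groups nodes by the length of their list, so a single row means every node has the same number of elements.

```cypher
// Diagnostic sketch: how many nodes have each list length?
// A null listLength row would indicate nodes missing the property.
MATCH (n:Train)
RETURN size(n.embeddings_all) AS listLength, count(*) AS nodes
ORDER BY nodes DESC
```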
05-11-2021 10:48 AM
Thanks for posting, @song and @emigre459. You uncovered a bug, and we've just issued a patch release.
You can grab the new version from our download center or on GitHub. You'll want 1.5.2.
05-11-2021 11:54 AM
That's awesome, thanks @alicia.frame1 ! I'm doing dev testing on Neo4j Desktop right now and it doesn't seem to see the new plugin version (just telling me 1.5.1 is the newest available). Does that need to be indexed or something so that it will show up there? What is the likely timeframe for that if so?
05-11-2021 12:17 PM
Hm, there might be a lag before it shows up in Desktop.
What you can do, if you want it right away, is grab the plugin from the download center (download it and unzip the .jar file) and install it manually. In Desktop, on your project (with the DB stopped), select "Manage", then click "Open folder", and navigate to your plugins folder. Remove the 1.5.1 jar, replace it with the 1.5.2 version, and then restart your database.
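After the restart, a quick way to confirm that the database actually picked up the replaced jar is to ask GDS for its version:

```cypher
// Returns the version string of the installed GDS plugin.
RETURN gds.version() AS gdsVersion
```

If this still reports 1.5.1, the old jar is probably still in the plugins folder or the database was not restarted.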
05-11-2021 12:55 PM
Cool, that works. Thanks! A new question arises now that I'm running GDS v1.5.2. I'm getting this error: Failed to invoke procedure gds.beta.graphSage.train: Caused by: java.lang.OutOfMemoryError: Java heap space.
Do you know why that may be? I'm using the same memory config (min of 1 GB heap, max of 4 GB) that I used to successfully train the non-list version of the model, but training the list version seems to use up more memory somehow. I'll test to make sure I can get the previously-working non-list version to work with the new version of GDS, just to be safe, and post here the results.
05-11-2021 01:04 PM
@alicia.frame1, I've confirmed that the old approach without lists, which worked in 1.5.1, also runs out of memory now. Perhaps a memory leak was introduced somehow?
05-11-2021 01:10 PM
I don't think it's a memory leak, but I'll double check.
In general, lists take up quite a bit more space in memory than doubles. The best thing to do is increase your heap, if you can. If you run gds.graphSage.train.estimate (with everything else the same as if you were running .train), it should give you an estimate of how much memory it might consume.
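As a rough sketch of that suggestion, the estimate call would mirror the training call from earlier in the thread. Two assumptions here: that in GDS 1.5.x the estimate procedure sits in the same gds.beta namespace as the train procedure used above (the name is written without "beta" in this reply), and that it yields the usual estimate fields requiredMemory, bytesMin and bytesMax.

```cypher
// Sketch only: same graph and config as the failing train call.
// Assumption: the procedure is gds.beta.graphSage.train.estimate in GDS 1.5.x.
CALL gds.beta.graphSage.train.estimate('train', {
  modelName: 'proteinModel',
  aggregator: 'pool',
  batchSize: 512,
  activationFunction: 'relu',
  epochs: 10,
  sampleSizes: [25, 10],
  learningRate: 0.0000001,
  embeddingDimension: 256,
  featureProperties: ['embeddings_all'],
  degreeAsProperty: false
})
YIELD requiredMemory, bytesMin, bytesMax
RETURN requiredMemory, bytesMin, bytesMax
```

Comparing bytesMax against the configured maximum heap (4 GB in the post above) gives a first indication of whether the heap needs to grow.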
05-12-2021 03:29 AM
Ah, thanks for the tip! Upon restarting Neo4j Desktop and adjusting the heap, I was able to get the list version to run. Weirdly, the memory estimate for the list version has a smaller max memory estimate when compared to the non-list version.