Index -1 out of bounds for length 28088 when calling node2vec in write mode and relationshipWeightProperty set

Hi guys,

I hope this is the right place to ask this question. I am evaluating node2vec for one of the DS problems I am working on.

What I have is a simple bipartite graph: main nodes, plus a set of properties (represented as nodes) that the main nodes are connected to. I want to run node2vec to get node embeddings. For that, I create a native projection in the following way:

call gds.graph.create(
    "er-graph",
    "*",
    {
        DATA: {
            type: "DATA",
            orientation: "UNDIRECTED",
            properties : {
                weight: "weight"
            }
        }
    },
    {
        relationshipProperties:[{DATA:"weight"}]
    }
)
YIELD graphName, nodeCount, relationshipCount;

So it is a simple graph with a single relationship type. Note that the native projection also loads the weight property of each relationship.
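One way to double-check that the projection really carries the weight property is to list the in-memory graph catalog. This is a minimal sketch, assuming a GDS 1.x installation where `gds.graph.list` is available; the exact shape of the returned `relationshipProjection` map may differ between versions.

```cypher
// List all graphs in the catalog together with their relationship projections.
// The "properties" entry of each projection should mention "weight".
CALL gds.graph.list()
YIELD graphName, nodeCount, relationshipCount, relationshipProjection
RETURN graphName, nodeCount, relationshipCount, relationshipProjection;
```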

To compute the embeddings, I run the following:

CALL gds.beta.node2vec.write('er-graph', {
    embeddingDimension: 64,
    writeProperty: "node2vec",
    concurrency: 4,
    iterations: 200,
    inOutFactor: 1,
    returnFactor: 0.5,
    walksPerNode: 20,
    walkLength: 10
})

Note that this call does not use relationship weights. When I run it, the procedure works as expected and generates reasonable embeddings.
However, when I run node2vec with relationshipWeightProperty set, I get an error:

CALL gds.beta.node2vec.write('er-graph', {
    embeddingDimension: 64,
    writeProperty: "node2vec",
    concurrency: 4,
    iterations: 200,
    inOutFactor: 1,
    returnFactor: 0.5,
    walksPerNode: 20,
    walkLength: 10,
    relationshipWeightProperty:"weight"
})

This leads to the following log:

neo4j          | 2021-06-23 13:13:58.296+0000 ERROR Client triggered an unexpected error [Neo.DatabaseError.General.UnknownError]: Lexical error at line 17, column 46.  Encountered: <EOF> after : "", reference b95a99d5-7c8f-4b0a-90fb-0d321abbd89e.
neo4j          | 2021-06-23 13:14:07.870+0000 INFO  [node-store-scan-0] LOADING 16%
neo4j          | 2021-06-23 13:14:07.871+0000 INFO  Node Store Scan (NodeCursorBasedScanner): Imported 28,088 records and 0 properties from 415 KiB (425,880 bytes); took 0.002 s, 17,157,157.68 Nodes/s, 248 MiB/s (260,142,776 bytes/s) (per thread: 4,289,289.42 Nodes/s, 62 MiB/s (65,035,694 bytes/s))
neo4j          | 2021-06-23 13:14:07.905+0000 INFO  [relationship-store-scan-3] LOADING 41%
neo4j          | 2021-06-23 13:14:07.920+0000 INFO  [relationship-store-scan-0] LOADING 70%
neo4j          | 2021-06-23 13:14:07.920+0000 INFO  [relationship-store-scan-1] LOADING 100%
neo4j          | 2021-06-23 13:14:07.927+0000 INFO  Relationship Store Scan (RelationshipScanCursorBasedScanner): Imported 140,368 records and 140,368 properties from 2334 KiB (2,390,880 bytes); took 0.053 s, 2,659,020.13 Relationships/s, 43 MiB/s (45,290,935 bytes/s) (per thread: 664,755.03 Relationships/s, 11057 KiB/s (11,322,733 bytes/s))
neo4j          | 2021-06-23 13:14:07.928+0000 INFO  [neo4j.BoltWorker-3 [bolt] [/172.18.0.1:57292] ] LOADING 
neo4j          | 2021-06-23 13:14:07.936+0000 INFO  [neo4j.BoltWorker-3 [bolt] [/172.18.0.1:57292] ] LOADING Actual memory usage of the loaded graph: 22 MiB
neo4j          | Exception in thread "Thread-17" java.lang.ArrayIndexOutOfBoundsException: Index -1 out of bounds for length 28088
neo4j          |        at org.neo4j.graphalgo.core.utils.paged.HugeIntArray$SingleHugeIntArray.get(HugeIntArray.java:280)
neo4j          |        at org.neo4j.graphalgo.core.huge.TransientAdjacencyList.degree(TransientAdjacencyList.java:154)
neo4j          |        at org.neo4j.graphalgo.core.huge.HugeGraph.degree(HugeGraph.java:316)
neo4j          |        at org.neo4j.gds.embeddings.node2vec.RandomWalk$RandomWalkTask.walkOneStep(RandomWalk.java:275)
neo4j          |        at org.neo4j.gds.embeddings.node2vec.RandomWalk$RandomWalkTask.walk(RandomWalk.java:261)
neo4j          |        at org.neo4j.gds.embeddings.node2vec.RandomWalk$RandomWalkTask.run(RandomWalk.java:244)
neo4j          |        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
neo4j          |        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
neo4j          |        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
neo4j          |        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
neo4j          |        at java.base/java.lang.Thread.run(Thread.java:829)
neo4j          |        at org.neo4j.internal.helpers.NamedThreadFactory$2.run(NamedThreadFactory.java:110)
neo4j          |        Suppressed: java.lang.ArrayIndexOutOfBoundsException: Index -1 out of bounds for length 28088
neo4j          |                ... 12 more
neo4j          |        Suppressed: java.lang.ArrayIndexOutOfBoundsException: Index -1 out of bounds for length 28088
neo4j          |                ... 12 more
neo4j          |        Suppressed: java.lang.ArrayIndexOutOfBoundsException: Index -1 out of bounds for length 28088
neo4j          |                ... 12 more
neo4j          | 2021-06-23 13:15:56.016+0000 INFO  [neo4j.BoltWorker-3 [bolt] [/172.18.0.1:57292] ] Node2Vec :: Training :: Start
neo4j          | 2021-06-23 13:15:56.016+0000 INFO  [neo4j.BoltWorker-3 [bolt] [/172.18.0.1:57292] ] Node2Vec :: Iteration 1 :: Start
neo4j          | 2021-06-23 13:15:56.016+0000 INFO  [neo4j.BoltWorker-3 [bolt] [/172.18.0.1:57292] ] Node2Vec :: Iteration 1 :: Finished
.
.
.
neo4j          | 2021-06-23 13:15:56.239+0000 INFO  [neo4j.BoltWorker-3 [bolt] [/172.18.0.1:57292] ] WriteNodeProperties :: Finished

As you can see, the random-walk threads throw exceptions, and the training iterations start and finish almost immediately. The resulting embeddings are not optimized either: they are initialised with random values and never change.

Am I using weighted transitions the wrong way?
Thanks!
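Before suspecting the algorithm, it can help to rule out unusual weight values in the stored data. The query below is only a sketch: it assumes the relationship type is DATA and the property is weight, as in the projection above, and nothing in this thread establishes that bad weights play any role in the error.

```cypher
// Summarise the stored weights on DATA relationships.
// count(r.weight) skips nulls, so total - withWeight is the number
// of relationships that have no weight at all.
MATCH ()-[r:DATA]->()
RETURN count(r) AS total,
       count(r.weight) AS withWeight,
       min(r.weight) AS minWeight,
       max(r.weight) AS maxWeight;
```

A minWeight that is zero or negative, or a withWeight lower than total, would be worth mentioning in a bug report.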

1 ACCEPTED SOLUTION

Update - realized 1.6.2 was available and that seems to have fixed it 🙂

3 REPLIES

Which version of the library are you using? If you are not yet on 1.6.0, please try upgrading first.
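To answer the version question, the installed library version can be read directly from Cypher. A minimal example, assuming the GDS plugin is loaded on the server:

```cypher
// Returns the installed Graph Data Science library version as a string.
RETURN gds.version() AS gdsVersion;
```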

Otherwise, could you please create a GH issue here:

@michael.hunger I also have an issue with an ArrayIndexOutOfBounds exception. It occurs when I attempt to create a graph (syntax below). At first it was intermittent but now seems consistent. Nearly all the forum posts I've seen for this error recommend updating to the latest version. I am on 1.6.1. Is there anything else to do or should I open a ticket? Thank you for your help.

CALL gds.graph.create.cypher( 
'graphName',
'MATCH (n) WHERE n:Thing1 OR n:Thing2 OR n:Thing3 OR n:Thing4 RETURN id(n) AS id, labels(n) AS labels',
'MATCH (k:Thing1)-[:RelationshipType]->(m:Thing5)-[r]->(n) WHERE datetime(m.time) >= datetime() - duration({hours: 2}) RETURN id(k) AS source, id(n) AS target, count(m) AS weight')

Update - realized 1.6.2 was available and that seems to have fixed it 🙂
