04-14-2021 02:26 AM
Hi,
I'm trying to iterate over all nodes in a named virtual graph (created by gds.graph.create), and I can't seem to find a function that streams/returns all nodeIds back without doing any additional computation.
For example:
CALL gds.pageRank.stream('myGraph')
YIELD nodeId, score
will return what I need, but will perform additional work (computing the score for each node).
I only need to get the nodeIds from the graph.
Am I missing something or do I need to fork gds and write my own algorithm for this?
Thanks!
Naveh.
04-14-2021 04:16 AM
04-14-2021 06:31 AM
04-17-2021 11:19 PM
Thanks for your replies!
Unfortunately I do need the graph for other GDS functions later on.
When I use gds.graph.streamNodeProperty or gds.graph.streamNodeProperties I get this error:
Failed to invoke procedure gds.graph.streamNodeProperty: Caused by: java.lang.IllegalArgumentException: No node projection with property key(s) ['nodeId'] found.
For testing purposes, the graph is created with all nodes and all relationships of a certain type, like this:
CALL gds.graph.create('myGraph', '*', 'RELATIONSHIP_TYPE', {})
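The error message suggests that streamNodeProperty looks up its second argument as a property key stored in the projection, and 'nodeId' is an output column, not a stored property. A minimal sketch of one possible workaround is below, assuming the nodes carry some numeric property that can be included in the projection; 'someProperty' and the helper name node_id_queries are hypothetical and not from this thread. The Cypher is kept in Python strings so it can be sent through any client.

```python
# Sketch only: 'someProperty' is a hypothetical numeric node property that
# is assumed to exist in the database. Projecting it gives
# gds.graph.streamNodeProperty a real property key to stream.

PROJECT_WITH_PROPERTY = (
    "CALL gds.graph.create($graphName, '*', $relType, "
    "{nodeProperties: [$property]})"
)

# Only nodeId is yielded; the property value itself is ignored.
STREAM_NODE_IDS = (
    "CALL gds.graph.streamNodeProperty($graphName, $property) "
    "YIELD nodeId RETURN nodeId"
)


def node_id_queries(graph_name, rel_type, prop):
    """Return (query, parameters) pairs: first the projection, then the stream."""
    project_params = {
        "graphName": graph_name,
        "relType": rel_type,
        "property": prop,
    }
    stream_params = {"graphName": graph_name, "property": prop}
    return [
        (PROJECT_WITH_PROPERTY, project_params),
        (STREAM_NODE_IDS, stream_params),
    ]


if __name__ == "__main__":
    for query, params in node_id_queries(
        "myGraph", "RELATIONSHIP_TYPE", "someProperty"
    ):
        print(query)
        print(params)
```

The projected graph stays in the catalog afterwards, so it should still be usable for other GDS procedures later on.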
05-25-2021 11:04 AM
I'm also looking for an answer to this. I think this rewording might be good for clarification:
How do we create a GDS graph (e.g., using a native projection) and then simply retrieve the nodes and relationships in that graph without applying a specific graph algorithm?
gds.graph.streamNodeProperty only works once you apply an algorithm to the named graph, but I need to instead retrieve the contents of the graph for use in an external graph machine learning framework. gds.graph.export and gds.beta.graph.export.csv are unfortunately not suitable due to the client/server configuration of our Neo4j database.
05-27-2021 01:41 PM
If you're not running any GDS procedures or algorithms, do you need to use the in-memory graph? Or is it simpler just to use the Cypher id() function to retrieve the data?
07-07-2021 09:21 PM
Sorry, I totally missed this reply!
I'm not sure how the id() function would solve my use case. The database I'm working with contains on the order of 1M nodes and 2M relationships, and I need a way to rapidly stream native projections of that database (containing, for example, 700k nodes and 1.2M relationships) into an external machine learning model implemented in Python.
The only way I can see the id() function helping is if I were to tediously build Cypher queries that identified a spanning tree of each of the nodes in my native projection, but that seems both computationally infeasible and inflexible.
To make it a little more concrete, the database I'm working with has nodes corresponding to different types of biological entities (chemicals, genes, metabolic pathways, diseases, etc) and I'm trying to extract the subgraph containing 3 specific node types ("Chemical", "Gene", and "Assay") and all of the relationships linking those nodes.
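One possible reading of the id() suggestion is a single label-filtered edge query, with no per-node traversal. The sketch below assumes the official neo4j Python driver is installed for the streaming step; the helper names, the connection URI, and the credentials are hypothetical, and nothing here has been measured on a graph of this size.

```python
# Sketch only: stream the subgraph induced by a set of node labels as
# (source id, relationship type, target id) rows using plain Cypher.

EDGE_QUERY = (
    "MATCH (a)-[r]->(b) "
    "WHERE any(l IN labels(a) WHERE l IN $labels) "
    "AND any(l IN labels(b) WHERE l IN $labels) "
    "RETURN id(a) AS source, type(r) AS relType, id(b) AS target"
)


def subgraph_edge_query(labels):
    """Return the edge query and its parameters for the given node labels."""
    if not labels:
        raise ValueError("at least one node label is required")
    return EDGE_QUERY, {"labels": list(labels)}


def stream_edges(uri, auth, labels):
    """Yield (source, relType, target) tuples from a running Neo4j server.

    The driver is imported here so the query builder above can be used
    without the neo4j package installed.
    """
    from neo4j import GraphDatabase

    query, params = subgraph_edge_query(labels)
    with GraphDatabase.driver(uri, auth=auth) as driver:
        with driver.session() as session:
            for record in session.run(query, params):
                yield record["source"], record["relType"], record["target"]


if __name__ == "__main__":
    query, params = subgraph_edge_query(["Chemical", "Gene", "Assay"])
    print(query)
    print(params)
    # Hypothetical connection details:
    # for edge in stream_edges("bolt://localhost:7687", ("neo4j", "password"),
    #                          ["Chemical", "Gene", "Assay"]):
    #     print(edge)
```

Nodes with no relationships inside the selected labels do not appear in the edge stream; a separate MATCH (n) query with the same label filter would be needed to list them.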