
Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.

What library to use instead of GDS when the graph DB is too big to project in memory?

Hello,
I am trying to calculate some metrics (PageRank, etc.) on a very big graph database that cannot be projected in memory. So I think using Neo4j GDS (Graph Data Science) is out of the question? Can I use GDS without holding the whole DB in memory? Is there some other library I can use? What approach should I use? Thanks in advance for your time!

3 REPLIES

How large is your database (nodes/relationships), and how large is the instance you're trying to run on? GDS scales to quite large graphs (we benchmark up to the hundreds of billions of nodes/relationships).

We don't offer any spill-over or out-of-core computation right now. One recommendation is to use your graph projection step to project a subset of the data instead of the full graph. You can also try setting sudo: true in the configuration of any algorithm or projection that is blocked due to lack of memory, to see if it will run anyway (the risk is OOMing your database).
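To make the two suggestions above concrete, here is a minimal sketch in Python that only builds the Cypher text and parameters for (1) estimating and projecting a subset and (2) running PageRank with the memory guard bypassed. It assumes GDS 2.x procedure names (gds.graph.project, gds.graph.project.estimate, gds.pageRank.stream). The helper function names, the graph name "people", and the label/type "Person"/"KNOWS" are hypothetical placeholders, not anything from the poster's database. Actually executing the queries needs a driver session, which is left out here.

```python
# Sketch only: these helpers build (cypher, params) pairs and do not contact
# a database. Procedure names assume GDS 2.x; label and relationship type
# names passed in by the caller are hypothetical examples.

def estimate_projection_query(node_label, rel_type):
    """Cypher asking GDS how much memory a projection would need, without creating it."""
    cypher = (
        "CALL gds.graph.project.estimate($node_label, $rel_type, {}) "
        "YIELD requiredMemory, nodeCount, relationshipCount "
        "RETURN requiredMemory, nodeCount, relationshipCount"
    )
    return cypher, {"node_label": node_label, "rel_type": rel_type}


def project_subset_query(graph_name, node_label, rel_type):
    """Cypher projecting only one label and one relationship type into memory."""
    cypher = (
        "CALL gds.graph.project($graph_name, $node_label, $rel_type) "
        "YIELD graphName, nodeCount, relationshipCount "
        "RETURN graphName, nodeCount, relationshipCount"
    )
    params = {
        "graph_name": graph_name,
        "node_label": node_label,
        "rel_type": rel_type,
    }
    return cypher, params


def pagerank_stream_query(graph_name, sudo=False, limit=10):
    """Cypher streaming the top PageRank scores from an existing projection.

    sudo=True asks GDS to skip its memory guard; as noted above, this
    risks running the database out of memory.
    """
    cypher = (
        "CALL gds.pageRank.stream($graph_name, {sudo: $sudo}) "
        "YIELD nodeId, score "
        "RETURN nodeId, score ORDER BY score DESC LIMIT $limit"
    )
    return cypher, {"graph_name": graph_name, "sudo": sudo, "limit": limit}
```

With the official Python driver, each pair could be passed along as session.run(cypher, params). Running the estimate first on a few candidate subsets is a cheap way to see which ones would fit in the available heap before trying sudo.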

Thank you, Alicia, for your quick reply.

Many of my node sets are on the order of many hundreds of millions, and so are some of my relationships. My first thought was to use the Neo4j Graph Data Science (GDS) library. But from what I have read in the Neo4j documentation, I need to load/project into memory each part of the graph that I want to run graph algorithms on.

On a VM with 32 GB of memory and 1 core with 8 threads, I installed a Neo4j Server instance (Community Edition), and on that same VM I have to run a Python client application that loads the graph database with data coming from big streaming JSON files. I think I cannot use the Graph Data Science library because of the projection phase: I think the projected part will not fit in the available 16 GB of heap memory.

Right now I am a bit puzzled. Can you help me, Alicia? Do you have any idea whether there is another library that runs graph algorithms (like PageRank, community detection, similarities, etc.) from disk? Should I implement these algorithms in plain Cypher? Is it possible to use APOC algorithms? I do not know what to do. Any help would be appreciated! Thanks again for your comment!

Alicia,
from what I have understood from the GDS docs, the heap memory available on the machine running the Neo4j instance limits how much of the graph DB I can run graph algorithms on. How can I overcome this? What are my alternatives?