
Can Neo4j use CUDA's nvGraph library to accelerate graph computation?

Computation in Neo4j is not really fast, especially when there are many nodes and edges, so I googled whether other approaches could do better. By accident I found CUDA's nvGraph library (https://developer.nvidia.com/nvgraph and https://developer.nvidia.com/discover/graph-analytics). It looks like using the GPU to accelerate graph computation would be better, so I wonder whether Neo4j can use this approach, or could support it in the future?
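To make the idea concrete, the first half of such a workflow would just be getting an edge list out of Neo4j. Here is a minimal sketch using the official neo4j Python driver; the URI, credentials and Cypher query are placeholders for illustration, not anything tied to nvGraph:

    from neo4j import GraphDatabase  # official Neo4j Python driver

    # Placeholder connection details -- adjust to your own instance.
    driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

    # Export every relationship as a (source, target) pair of internal node ids.
    with driver.session() as session:
        result = session.run("MATCH (a)-[r]->(b) RETURN id(a) AS src, id(b) AS dst")
        edges = [(record["src"], record["dst"]) for record in result]

    print(f"Exported {len(edges)} edges")
    driver.close()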

7 REPLIES

My thoughts on it as a Neo4j enthusiast, but not employed by Neo4j...

It would be nice for those of us with good graphics cards, but it would create a parallel development, support, and testing path for Neo4j staff. Not everyone has a CUDA-capable card.

I'd add this to your URL list

Something I ran across and would like to try out. At a glance it looks like with just a little bit of work we could move graphs back and forth...

I doubt anyone cannot afford a CUDA-capable card... even my old 2014 laptop has a GPU with CUDA: https://www.notebookcheck.net/NVIDIA-GeForce-GT-520M.43104.0.html

Hi Gabriel, Thank you for sharing the paper reference, I appreciate it.

Personally I'm for GPU enhancements; I'd like to have them! However, realistically speaking...

Will everyone already have a reasonably recent, good NVIDIA CUDA card when they first download and try to run Neo4j? I think it is safe to say that the answer (in general) is no.

Also, it needs to be mentioned that not all CUDA-capable cards are equal. If the graph/subgraph doesn't fit in GPU RAM, the utility of the GPU becomes less certain.
Design thought: perhaps Neo4j could treat GPU graphs as just another form of "in-memory" graph: it either fits in GPU RAM or it doesn't, just like the in-memory libraries work now with regular RAM... still a parallel dev/test path, but more narrowly scoped?
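To make the "does it fit in GPU RAM" question concrete, here is a back-of-the-envelope sketch; the 4-byte id/weight assumption and the helper names are mine, purely for illustration:

    # Rough estimate: would a graph in CSR form fit in GPU memory?
    # Assumes 4-byte vertex ids and 4-byte edge weights (adjust as needed).

    def estimated_csr_bytes(num_nodes, num_edges, weighted=True):
        offsets = (num_nodes + 1) * 4           # row offsets
        indices = num_edges * 4                 # column indices
        weights = num_edges * 4 if weighted else 0
        return offsets + indices + weights

    def fits_in_gpu(num_nodes, num_edges, free_gpu_bytes, headroom=0.8):
        # Leave some headroom for the library's working buffers.
        return estimated_csr_bytes(num_nodes, num_edges) < free_gpu_bytes * headroom

    # Example: 10 million nodes, 100 million edges on a 4 GB card.
    print(fits_in_gpu(10_000_000, 100_000_000, 4 * 1024**3))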

References:

who's buying what cards?


I think the main issue is that any foray into this would be messy to support, at best: a new parallel design, development, and test effort.

I would advise you to study a little bit what a GPU is, especially how it processes data, and compare that to how a CPU processes data. There are millions of such videos on YouTube, so you can start with the beginning of https://youtu.be/6stDhEA0wFQ
I'm not saying it's the best video; it just explains the difference.
You will then understand that the way a GPU accesses its memory is not at all how a CPU uses its memory, so "a graph fitting in GPU memory" doesn't really make sense.
I imagine you think of a graph as a 3D space, as if it were a level of a 3D game.
It's not.
A graph is a dynamically linked chain of pointers and has nothing to do with 3D space.
So, in terms of graph-specific calculations, other than using a GPU for specific math calculations, I would find no real use for it.
Visualization of a graph is a different topic, and there, yes, you could use a GPU to render a graph in 3D. But again, do not confuse a visualization of the graph with the graph itself.
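To illustrate the difference in representation, here is a tiny sketch of the same graph stored as pointer-style adjacency versus the flat CSR arrays that GPU graph libraries such as nvGraph actually consume (the helper function is just for illustration):

    # The same tiny graph in two representations.

    # Adjacency as a dict of lists: traversal follows "pointers" node by node.
    adjacency = {0: [1, 2], 1: [2], 2: [0, 3], 3: []}

    # The same edges packed into two flat arrays (CSR):
    # offsets[i]..offsets[i+1] gives the slice of `targets` for node i.
    def to_csr(adj, num_nodes):
        offsets, targets = [0], []
        for node in range(num_nodes):
            targets.extend(adj.get(node, []))
            offsets.append(len(targets))
        return offsets, targets

    offsets, targets = to_csr(adjacency, 4)
    print(offsets)   # [0, 2, 3, 5, 5]
    print(targets)   # [1, 2, 2, 0, 3]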

I thought it would have been clear to you that CUDA cores are a trivial thing nowadays, especially as my 8-year-old laptop GPU had a bunch of them. It seems it is not, but this is just another thing you can read up on.

Thanks for your reply, it's very useful! I also found the RAPIDS (rapidsai) open source project on the CUDA website, and I want to try it. I agree with you that it's hard to get sustained support for a parallel dev/test path. But when we look at the history of computer science, we obviously keep getting more data and more powerful tools (CPU, storage, GPU, etc.).

As a developer who wants to use a graph model in our business, I found a huge gap between storage and computation when we want to apply graph analysis to our business data (more than 10 million nodes and edges). There seem to be two ways to solve the problem: one is the map/reduce approach (Spark GraphX), which splits the data and sends it to many distributed workers to speed up computation; the other is to use a parallel device (a GPU) to speed up computation. Personally, I prefer the latter, because a GPU is a better fit for this kind of computation than a CPU, so it would be much better if Neo4j could utilize the GPU. The next thing we plan to do is load data from Neo4j and compute on the GPU to find out whether it really meets our requirements.
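Roughly, and assuming the RAPIDS cuDF/cuGraph APIs (the column names and the edges list carried over from the Neo4j export are just illustrative), the GPU side of that experiment could look like this:

    import cudf      # RAPIDS GPU dataframe
    import cugraph   # RAPIDS GPU graph analytics

    # `edges` is the (src, dst) list pulled out of Neo4j earlier.
    edge_df = cudf.DataFrame({
        "src": [s for s, _ in edges],
        "dst": [d for _, d in edges],
    })

    # Build a GPU graph and run PageRank on it.
    G = cugraph.Graph(directed=True)
    G.from_cudf_edgelist(edge_df, source="src", destination="dst")

    scores = cugraph.pagerank(G)
    print(scores.sort_values("pagerank", ascending=False).head(10))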

Hi Gabriel,

Thank you for your thoughts. I was running on a "GeForce GTX 560 Ti" for the past 10 years, and it falls into the same compute capability version as your card (2.1); I believe that card is quite capable as well!

However, I stand by my assessment, based on my background and actual hands-on experience. I think a reasonable goal might be, as I stated earlier, to "use the cards to the limited capability that they have" if possible, though each generation of cards supports different CUDA versions and functions. I've had to deal with version support, and it can be a nightmare to support multiple versions of anything.
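For example, a minimal check of what a given machine actually offers; this uses Numba's CUDA bindings purely as an illustration, any toolkit-level query would do:

    from numba import cuda  # Numba's CUDA bindings, used here only to query the device

    if cuda.is_available():
        dev = cuda.get_current_device()
        major, minor = dev.compute_capability
        print(f"{dev.name.decode()} has compute capability {major}.{minor}")
    else:
        print("No CUDA-capable device detected")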

It may surprise you to know that I'd be very happy to be proven wrong with an actual (simple to support?) implementation that could leverage any GPU, including the old ones, to achieve performance gains here. Wouldn't that be great?

In fact, I personally think it would be fun to take on the challenge. I would really enjoy trying to prove myself wrong, and if we could get it resourced, I'd work on that project.

References:

CUDA toolkit
Supported CUDA level of GPU and card