05-22-2020 02:13 AM
Hi,
I am very new to GDS and don't have a database (yet) to test things on.
I am having difficulties in understanding the execution modes stream, write, mutate and stats. When to use one and when the other? Any use cases and reasons the specific mode was used? I understand the concept but an example will really help.
Also, when to use stored graphs and when to use anonymous graphs. Again, are there any example use cases and reasons for choosing one projection over the other?
I know it's a long discussion, that's why I am asking for resources.
Thanks.
05-22-2020 08:15 AM
I am having difficulties in understanding the execution modes stream, write, mutate and stats.
You can read more about the modes here: https://neo4j.com/docs/graph-data-science/current/common-usage/running-algos/#running-algos-stats
Also, when to use stored graphs and when to use anonymous graphs. Again, are there any example use cases and reasons for choosing one projection over the other?
Both of these approaches load an in-memory graph.
When you're getting started I would use the anonymous graphs. The anonymous graph approach loads the Neo4j graph into memory for each algorithm that you run.
The stored graph technique is a more advanced feature. I would use that when you want to run multiple algorithms over the same graph.
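To make the difference concrete, here is a rough sketch of the queries each approach would send when PageRank is run several times. The Cypher text assumes the GDS 1.x procedure names that were current when this thread was written (as far as I know, later releases renamed gds.graph.create and dropped the anonymous form, so check the docs for your version). The label 'User', the relationship type 'FOLLOWS', the graph name 'follows-graph' and the helper queries_for are all made up for illustration.

```python
# Cypher strings assume GDS 1.x syntax; 'User', 'FOLLOWS' and
# 'follows-graph' are hypothetical names, not taken from the thread.

# Anonymous graph: the projection is described inside the algorithm call,
# so the graph is loaded into memory again on every run.
ANONYMOUS_PAGERANK = """
CALL gds.pageRank.stream({
  nodeProjection: 'User',
  relationshipProjection: 'FOLLOWS'
})
YIELD nodeId, score
RETURN nodeId, score
ORDER BY score DESC
"""

# Stored graph: project once under a name...
CREATE_STORED_GRAPH = "CALL gds.graph.create('follows-graph', 'User', 'FOLLOWS')"

# ...then every algorithm run refers to that name.
STORED_PAGERANK = """
CALL gds.pageRank.stream('follows-graph')
YIELD nodeId, score
RETURN nodeId, score
ORDER BY score DESC
"""


def queries_for(runs, stored):
    """Return the list of queries needed to run PageRank `runs` times."""
    if stored:
        return [CREATE_STORED_GRAPH] + [STORED_PAGERANK] * runs
    return [ANONYMOUS_PAGERANK] * runs


def count_projections(queries):
    """Count how many queries trigger loading the graph into memory."""
    return sum(
        "nodeProjection" in q or "gds.graph.create" in q for q in queries
    )
```

With three runs, the anonymous list projects the graph three times and the stored list only once, which is the whole argument for stored graphs when you run more than one algorithm.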
05-22-2020 11:13 AM
Thanks for the extensive reply.
I guess what bothers me is the speed of the results so I won’t know till I try.
I am going to build a recommendation page. Think of it like Instagram, where going to “browse” shows you personalised recommendations for other accounts you may want to follow.
So I’m going to use the personalised PageRank algorithm to provide recommendations to my users (using sourceNodes). For starters I’m thinking of stream mode on an anonymous graph. What bothers me is what happens when you have millions of users, each following 50+ accounts. Won’t the algorithm become slow, especially as you grow?
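For anyone following along without a database, the idea behind personalised PageRank can be sketched in plain Python. This is a minimal power-iteration version, not the GDS implementation, so the scores will not match what GDS returns. The toy follow graph, the function names and the choice to send the rank of accounts that follow nobody back to the source are all assumptions made for illustration.

```python
def personalized_pagerank(follows, source, damping=0.85, iterations=50):
    """Power-iteration personalised PageRank restarting at `source`.

    follows: dict mapping an account to the list of accounts it follows.
    """
    nodes = set(follows)
    for targets in follows.values():
        nodes.update(targets)

    # All rank starts on the source account; total rank stays at 1.0.
    scores = {node: 0.0 for node in nodes}
    scores[source] = 1.0

    for _ in range(iterations):
        nxt = {node: 0.0 for node in nodes}
        for node, score in scores.items():
            targets = follows.get(node, [])
            if targets:
                share = damping * score / len(targets)
                for target in targets:
                    nxt[target] += share
            else:
                # Assumption: accounts that follow nobody hand their
                # rank back to the source instead of losing it.
                nxt[source] += damping * score
        # Restart step: the remaining rank always returns to the source.
        nxt[source] += 1.0 - damping
        scores = nxt
    return scores


def recommend(follows, user, k=3):
    """Top-k accounts the user does not follow yet, by personalised rank."""
    scores = personalized_pagerank(follows, user)
    skip = set(follows.get(user, [])) | {user}
    candidates = [
        (node, score)
        for node, score in scores.items()
        if node not in skip and score > 0.0
    ]
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))
    return [node for node, _ in candidates[:k]]


# Hypothetical toy data: who follows whom.
FOLLOWS = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave", "erin"],
    "dave": ["alice"],
    "erin": [],
    "frank": ["alice"],
}
```

Calling recommend(FOLLOWS, "alice") ranks dave ahead of erin, because both of the accounts alice follows also follow dave. Note that the work is proportional to the number of relationships per iteration, which is why the question about millions of users is a fair one.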
05-22-2020 01:16 PM
For this use case I think you'd want a named graph. With a named graph, the in-memory projected graph is loaded once, up front.
So in the diagram above, steps 1 & 2 are done at the beginning and won't need to be re-run every time you run the PageRank algorithm.
I haven't tried running the PageRank algorithm in a more real-time scenario, so I'd suggest testing it on a sample dataset to make sure you get the performance you need.
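If it helps, a test like that could be as simple as timing the same call many times and looking at the median and the slow tail. Below is a minimal sketch of such a harness. The name measure_latency and the run_query callable are made up; in a real test run_query would wrap whatever executes your PageRank query against the sample dataset.

```python
import statistics
import time


def measure_latency(run_query, repeats=20, warmup=3):
    """Time `run_query` repeatedly and summarise latency in milliseconds.

    run_query: zero-argument callable that executes one query
    (hypothetical; supply your own).
    """
    # Warm-up calls are not measured, so caches and connections settle first.
    for _ in range(warmup):
        run_query()

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append((time.perf_counter() - start) * 1000.0)

    samples.sort()
    # Nearest-rank 95th percentile, computed with integer arithmetic.
    p95_index = max(0, (95 * len(samples) + 99) // 100 - 1)
    return {
        "median_ms": statistics.median(samples),
        "p95_ms": samples[p95_index],
        "max_ms": samples[-1],
    }
```

Running it once against an anonymous-graph query and once against a named-graph query on the same sample data should show how much of each request is spent on loading the projection.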