Neo4j

alexerhardt · ‎03-19-2019

I am building a tool that enables users to recommend sequences of online courses to other users.

Off the back of the data generated from that, I would like to generate insights on what the most recommended sequences of courses are. Here's a slice of the model:

In this graph, the numbers in green are weights that show how many people have recommended one course after the other (ex: 655 people recommend taking Stanford's ML after Intro to Prob)

The recs field in nodes is the absolute number of recommendations that course has (ex: Stanford ML has been featured by 1000 users in sequences)

What I would like to do, is starting from an end goal, finding out the most recommended pre-requisites.

The algorithm might work something like this:

Function fancy_algo (node, graph)
    If (no prereqs OR prereq weight is very low)
        Return graph
    Get all incoming nodes
    For each incoming node subject
        MR = most recommended pre-req
        Append MR to graph
        fancy_algo(MR, graph)

The end state for someone looking to do Stanford's Machine Learning might look something like this:

Note how after "Intro to CS" we haven't included "Algebra", because its prereq weight was very low (20 out of 4567).

Is this something that can be managed with Cypher? How would I get started?

Neo4j

Recursive query with relationship weights