

PageRank results differ from the Wikipedia PageRank example

cnhx27

I don't get the same result as the PageRank example on Wikipedia.

Cypher statement

MERGE (A:Page {name:"A"})
MERGE (B:Page {name:"B"})
MERGE (C:Page {name:"C"})
MERGE (D:Page {name:"D"})
MERGE (E:Page {name:"E"})
MERGE (F:Page {name:"F"})
MERGE (G:Page {name:"G"})
MERGE (H:Page {name:"H"})
MERGE (I:Page {name:"I"})
MERGE (J:Page {name:"J"})
MERGE (K:Page {name:"K"})
MERGE (B)-[:LINKS]->(C)
MERGE (C)-[:LINKS]->(B)
MERGE (D)-[:LINKS]->(A)
MERGE (D)-[:LINKS]->(B)
MERGE (E)-[:LINKS]->(B)
MERGE (E)-[:LINKS]->(D)
MERGE (E)-[:LINKS]->(F)
MERGE (F)-[:LINKS]->(B)
MERGE (F)-[:LINKS]->(E)
MERGE (G)-[:LINKS]->(B)
MERGE (G)-[:LINKS]->(E)
MERGE (H)-[:LINKS]->(B)
MERGE (H)-[:LINKS]->(E)
MERGE (I)-[:LINKS]->(B)
MERGE (I)-[:LINKS]->(E)
MERGE (J)-[:LINKS]->(E)
MERGE (K)-[:LINKS]->(E)

Run the PageRank algorithm

CALL algo.pageRank.stream("Page", "LINKS", {iterations:20})
YIELD nodeId, score
MATCH (node) WHERE id(node) = nodeId
RETURN node.name AS page,score
ORDER BY score DESC

Results

╒══════╤═══════════════════╕
│"page"│"score"            │
╞══════╪═══════════════════╡
│"B"   │3.422628279216587  │
├──────┼───────────────────┤
│"C"   │3.044408493675292  │
├──────┼───────────────────┤
│"E"   │0.7503552452218019 │
├──────┼───────────────────┤
│"D"   │0.36260065069613195│
├──────┼───────────────────┤
│"F"   │0.36260065069613195│
├──────┼───────────────────┤
│"A"   │0.30410527815336796│
├──────┼───────────────────┤
│"J"   │0.15000000000000002│
├──────┼───────────────────┤
│"K"   │0.15000000000000002│
├──────┼───────────────────┤
│"G"   │0.15000000000000002│
├──────┼───────────────────┤
│"H"   │0.15000000000000002│
├──────┼───────────────────┤
│"I"   │0.15000000000000002│
└──────┴───────────────────┘

For B, Wikipedia shows 38.4%, while Neo4j gives 34.2%.
Is that normal?

1 REPLY

Your code snippet for creating the graph is different from the example in our docs, so it makes sense that you're getting a different answer. Our example includes home, about, product, and links pages, and you have 11 nodes versus the 8 in our docs. PageRank gives each node a score based on its edges and propagates those scores across the network, so graphs with different numbers of nodes will produce different results.

If you're just wondering why your numbers differ from the picture on the Wikipedia page: there are several parameters you can set (tolerance, iterations, damping factor) that will give you different values. In the code you ran, you set iterations to 20 and used the default tolerance and damping factor, and there's no way to know what was used to calculate the results in that image.
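
To make the effect of those parameters concrete, here is a minimal Python sketch of PageRank by power iteration on the graph from the question. It is not the library's implementation. The update rule score(v) = (1 - d) + d * sum(score(u) / out_degree(u)), the starting value, the handling of the dangling node A, and the names `pagerank`, `as_percent` and `POSTED` are all assumptions made for illustration. With a damping factor of 0.85 this rule does land on the values posted above for nodes outside the B/C cycle (E at about 0.7504, for example), which suggests, but does not prove, that the procedure uses something similar.

```python
from typing import Dict, List, Tuple

# Nodes and edges copied from the MERGE statements in the question.
NODES: List[str] = list("ABCDEFGHIJK")
EDGES: List[Tuple[str, str]] = [
    ("B", "C"), ("C", "B"), ("D", "A"), ("D", "B"), ("E", "B"), ("E", "D"),
    ("E", "F"), ("F", "B"), ("F", "E"), ("G", "B"), ("G", "E"), ("H", "B"),
    ("H", "E"), ("I", "B"), ("I", "E"), ("J", "E"), ("K", "E"),
]

# Scores as posted in the question, rounded to four decimals.
POSTED: Dict[str, float] = {
    "B": 3.4226, "C": 3.0444, "E": 0.7504, "D": 0.3626, "F": 0.3626,
    "A": 0.3041, "J": 0.15, "K": 0.15, "G": 0.15, "H": 0.15, "I": 0.15,
}


def pagerank(nodes: List[str], edges: List[Tuple[str, str]],
             damping: float = 0.85, iterations: int = 20,
             tolerance: float = 0.0) -> Dict[str, float]:
    """Power iteration with the unnormalized update (an assumption):
    score(v) = (1 - d) + d * sum(score(u) / out_degree(u)).
    Dangling nodes (here: A) simply pass nothing on (also an assumption)."""
    out_degree = {n: 0 for n in nodes}
    for src, _ in edges:
        out_degree[src] += 1

    # Assumed starting point: every node begins at the base score 1 - d.
    scores = {n: 1.0 - damping for n in nodes}
    for _ in range(iterations):
        incoming = {n: 0.0 for n in nodes}
        for src, dst in edges:
            incoming[dst] += scores[src] / out_degree[src]
        updated = {n: (1.0 - damping) + damping * incoming[n] for n in nodes}
        # Largest change of any node in this round.
        delta = max(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if delta < tolerance:  # the default of 0.0 never stops early
            break
    return scores


def as_percent(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores so they sum to 100."""
    total = sum(scores.values())
    return {n: 100.0 * s / total for n, s in scores.items()}


if __name__ == "__main__":
    for label, kwargs in [("20 iterations", {"iterations": 20}),
                          ("100 iterations", {"iterations": 100}),
                          ("damping 0.5", {"damping": 0.5})]:
        result = pagerank(NODES, EDGES, **kwargs)
        print(label, "B =", round(result["B"], 4))
    print("B share of posted scores:", round(as_percent(POSTED)["B"], 2))
```

In this sketch B and C feed each other in a cycle, so their scores are still moving after 20 rounds and change with the iteration count, and a different damping factor shifts every score. The scale matters as well: the scores posted above add up to roughly 9, so reading B's 3.42 as 34.2% is not quite a like-for-like comparison with a figure expressed in percent. Rescaling the posted scores to sum to 100 gives B a share of about 38%, much closer to the Wikipedia figure.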