## Similarity Measurement of Websites

Measuring similarity is a problem that arises in many fields. SimRank is an intuitive, general-purpose similarity measure. It applies to any domain with object-to-object relationships and scores the similarity of two objects based on their relationships with other objects.

The key idea of SimRank is:

Two objects are considered to be similar if they are referenced by similar objects.

We will briefly introduce the algorithm and walk through a Python implementation from scratch.

Feel free to check out the well-commented source code; it can really help in understanding the whole algorithm.

The steps of the algorithm are listed below:

• Initialize the SimRank of every pair of nodes as follows:
`SimRank(node1, node2) = 1` if `node1 == node2`, otherwise `SimRank(node1, node2) = 0`
• For each iteration, update the SimRank of every pair of nodes in the graph:
  • If both nodes are the same, `SimRank(a, b) = 1`
  • If one of the nodes has no in-neighbors, `SimRank(a, b) = 0`
  • Else, the new SimRank follows the equation
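
For reference, the usual form of the update in the standard SimRank formulation is `SimRank(a, b) = C / (|I(a)| * |I(b)|) * Σ SimRank(i, j)`, summed over every in-neighbor `i` of `a` and every in-neighbor `j` of `b`, where `I(x)` is the set of in-neighbors of `x` and `C` is a decay factor between 0 and 1. The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the source code mentioned earlier: the function name `simrank`, the edge-list input format, and the defaults `decay=0.8` and `iterations=10` are assumptions chosen for the example.

```python
from itertools import product


def simrank(edges, decay=0.8, iterations=10):
    """Return {(a, b): score} for a directed graph given as (source, target) edges.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    edges = set(edges)  # ignore duplicate edges
    nodes = sorted({node for edge in edges for node in edge})

    # in_neighbors[n] lists the nodes that point to n.
    in_neighbors = {node: [] for node in nodes}
    for source, target in edges:
        in_neighbors[target].append(source)

    # Initialization: 1 for identical nodes, 0 for every other pair.
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, repeat=2)}

    for _ in range(iterations):
        new_sim = {}
        for a, b in product(nodes, repeat=2):
            if a == b:
                new_sim[(a, b)] = 1.0
            elif not in_neighbors[a] or not in_neighbors[b]:
                # A node with no in-neighbors is not similar to any other node.
                new_sim[(a, b)] = 0.0
            else:
                # Average similarity over all pairs of in-neighbors, scaled by the decay.
                total = sum(
                    sim[(i, j)] for i in in_neighbors[a] for j in in_neighbors[b]
                )
                new_sim[(a, b)] = decay * total / (
                    len(in_neighbors[a]) * len(in_neighbors[b])
                )
        # Every pair is updated from the previous iteration's scores.
        sim = new_sim
    return sim


# Pages B and C are both referenced only by page A, so they come out similar.
scores = simrank([("A", "B"), ("A", "C")])
print(scores[("B", "C")])  # 0.8
```

Building `new_sim` separately keeps each iteration based only on the previous iteration's scores, so the result does not depend on the order in which pairs are visited.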