Rand Index (RI) vs. Adjusted Rand Index (ARI) in K-Means Clustering

Rand Index (RI)

3 min read · Nov 23, 2024

The Rand Index (RI) measures the similarity between two clusterings. It is the fraction of point pairs on which the two clusterings agree, here the predicted clusters and the true labels. Two kinds of pairs count as agreement:

  • True Positives (TP): Pairs of points that are in the same cluster in both the predicted and true labels.
  • True Negatives (TN): Pairs of points that are in different clusters in both the predicted and true labels.

The Rand Index is defined as:

$$\mathrm{RI} = \frac{TP + TN}{TP + TN + FP + FN}$$

Where:

  • FP: Pairs of points that are in the same cluster in the predicted labels but different clusters in the true labels.
  • FN: Pairs of points that are in different clusters in the predicted labels but the same cluster in the true labels.
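
The pair counting described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the original article: the helper names `pair_counts` and `rand_index` and the toy label lists are assumptions made up for the example.

```python
from itertools import combinations

def pair_counts(labels_true, labels_pred):
    """Count TP, TN, FP, FN over all unordered pairs of points."""
    tp = tn = fp = fn = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true and same_pred:
            tp += 1          # together in both labelings
        elif not same_true and not same_pred:
            tn += 1          # apart in both labelings
        elif same_pred:
            fp += 1          # together in prediction, apart in truth
        else:
            fn += 1          # apart in prediction, together in truth
    return tp, tn, fp, fn

def rand_index(labels_true, labels_pred):
    tp, tn, fp, fn = pair_counts(labels_true, labels_pred)
    total = tp + tn + fp + fn
    # Assumption: with fewer than two points there are no pairs; treat as agreement.
    return (tp + tn) / total if total else 1.0

# Toy data (hypothetical): the prediction splits the second true cluster in two.
labels_true = [0, 0, 1, 1]
labels_pred = [0, 0, 1, 2]
counts = pair_counts(labels_true, labels_pred)
ri = rand_index(labels_true, labels_pred)
print(counts, ri)
```

The loop visits every pair, so it runs in O(n²) time. It is meant to mirror the definition, not to be fast.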

The Rand Index ranges from 0 to 1, where:

  • 1 indicates perfect agreement between the two clusterings.
  • 0 indicates that the two clusterings disagree on every pair of points.

Limitation: RI does not correct for random chance. When there are many clusters, most pairs of points fall in different clusters under both labelings, so true negatives dominate and even a random assignment can get a high RI.
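
A small experiment makes this limitation concrete. The sketch below compares two completely random labelings, once with 2 clusters and once with 20. The setup (200 points, uniformly random labels, a fixed seed) and the helper names are assumptions chosen for illustration. With k equally likely clusters, the chance agreement is roughly 1/k² + (1 - 1/k)², which is 0.5 for k = 2 and about 0.905 for k = 20.

```python
import random
from itertools import combinations

def rand_index(a, b):
    """Fraction of pairs on which labelings a and b agree (TP + TN over all pairs)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

def random_labels(n, k, rng):
    """Assign each of n points to one of k clusters uniformly at random."""
    return [rng.randrange(k) for _ in range(n)]

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 200

# Two unrelated random labelings, few clusters vs. many clusters.
ri_few = rand_index(random_labels(n, 2, rng), random_labels(n, 2, rng))
ri_many = rand_index(random_labels(n, 20, rng), random_labels(n, 20, rng))

print(f"RI with 2 random clusters:  {ri_few:.3f}")
print(f"RI with 20 random clusters: {ri_many:.3f}")
```

Neither pair of labelings carries any real information about the other, yet the 20-cluster case scores far higher purely because of true negatives.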

Adjusted Rand Index (ARI)

The Adjusted Rand Index (ARI) is a corrected-for-chance version of the Rand Index. It accounts for the fact that random cluster assignments can lead to…
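
As a rough sketch of what the chance correction looks like in practice, the code below uses the standard contingency-table form of ARI: (index - expected index) / (maximum index - expected index). The formula is not quoted from this article, and `adjusted_rand_index` is a hypothetical helper name chosen for the example.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """ARI from the contingency table of the two labelings."""
    n = len(labels_true)
    if n < 2:
        return 1.0  # assumption: no pairs to compare, treat as agreement

    cells = Counter(zip(labels_true, labels_pred))  # points per (true, pred) cell
    rows = Counter(labels_true)                     # true cluster sizes
    cols = Counter(labels_pred)                     # predicted cluster sizes

    index = sum(comb(c, 2) for c in cells.values())     # pairs together in both
    sum_rows = sum(comb(c, 2) for c in rows.values())   # pairs together in truth
    sum_cols = sum(comb(c, 2) for c in cols.values())   # pairs together in prediction

    expected = sum_rows * sum_cols / comb(n, 2)  # index expected under random labeling
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (index - expected) / (max_index - expected)

# Same partition with the cluster ids swapped.
ari_same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# A prediction that cuts across both true clusters.
ari_crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
print(ari_same, ari_crossed)
```

In this toy case the relabeled but identical partition scores 1.0, while the crossed partition scores -0.5. Unlike RI, ARI can drop below zero when agreement is worse than expected by chance.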
