K-Means Clustering

Interactive visualization of the K-Means clustering algorithm

What is K-Means Clustering?

K-Means is an unsupervised learning algorithm that partitions data into K distinct clusters based on feature similarity. The algorithm works by iteratively assigning data points to the nearest cluster center (centroid) and then updating the centroids based on the mean of all points in each cluster.

Algorithm Steps:

  1. Initialize K cluster centers randomly
  2. Assign each data point to the nearest center
  3. Recalculate centers as the mean of assigned points
  4. Repeat steps 2-3 until convergence (assignments stop changing, or the centers move less than a small tolerance)
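
The four steps above can be sketched in plain Python. This is a minimal illustration, not a production implementation: the function names `kmeans` and `squared_distance`, the `max_iters` and `seed` parameters, and the sample points are all assumptions made for this example. It initializes by picking K of the data points at random, which is one common choice among several.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iters=100, seed=0):
    """Basic K-Means on a list of tuples; returns (centers, labels)."""
    rng = random.Random(seed)
    # Step 1: pick K data points at random as the initial centers
    # (an assumed initialization strategy for this sketch).
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iters):
        # Step 2: assign each point to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Step 4: stop once no assignment changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: move each center to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels


# Hypothetical sample data: two well-separated groups of three points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centers, labels = kmeans(points, k=2)
print(centers)
print(labels)
```

With data this well separated, the first three points should end up in one cluster and the last three in the other. Which cluster receives index 0 depends on the random initialization.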

Understanding K-Means

Advantages

  • Simple and easy to understand
  • Computationally efficient: O(n × k × t × d), linear in the number of points
  • Works well when clusters are compact and roughly spherical
  • Scalable to large datasets

Limitations

  • Requires K to be specified in advance
  • Sensitive to initial center placement
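  • Assumes roughly spherical clusters, so elongated or irregular shapes are split poorly
  • Sensitive to outliers, because each center is a mean and is pulled toward extreme points
  • May converge to a local minimum rather than the best possible clustering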
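
The outlier sensitivity listed above follows directly from using the mean as the cluster center. The small sketch below shows how much one extreme point can shift a center. The helper `mean_point` and the data values are hypothetical, chosen only for illustration.

```python
def mean_point(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


cluster = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
outlier = (100.0, 100.0)  # hypothetical extreme point

center_without = mean_point(cluster)
center_with = mean_point(cluster + [outlier])

print(center_without)  # (2.0, 2.0)
print(center_with)     # (26.5, 26.5)
```

A single outlier moves the center from the middle of the group to a position far from every original point.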

Time Complexity

O(n × k × t × d)

Where n = number of points, k = number of clusters, t = iterations, d = dimensions
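
As a rough worked example of the formula, the sketch below multiplies the four factors for one assumed configuration. The function name `distance_term_ops` and the input sizes are hypothetical. The result is an order-of-magnitude count of per-coordinate operations, not a measured runtime.

```python
def distance_term_ops(n, k, t, d):
    """Approximate operation count: n points x k clusters x t iterations x d dimensions."""
    return n * k * t * d


ops = distance_term_ops(n=10_000, k=5, t=20, d=2)
print(ops)  # 2000000

# Doubling the number of points doubles the estimated cost.
doubled = distance_term_ops(n=20_000, k=5, t=20, d=2)
print(doubled // ops)  # 2
```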