K-Means Clustering

Interactive visualization of the K-Means clustering algorithm

What is K-Means Clustering?

K-Means is an unsupervised learning algorithm that partitions data into K distinct clusters based on feature similarity. The algorithm works by iteratively assigning data points to the nearest cluster center (centroid) and then updating the centroids based on the mean of all points in each cluster.

Algorithm Steps:

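1. Initialize K cluster centers randomly
2. Assign each data point to the nearest center
3. Recalculate each center as the mean of the points assigned to it
4. Repeat steps 2-3 until convergence, that is, until the assignments stop changing or a maximum number of iterations is reached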
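
The steps above can be sketched in plain Python. This is a minimal illustration, not the code behind this page. The function names (kmeans, squared_distance, mean_point), the seed and max_iter parameters, and the choice to pick K random data points as the initial centers are all assumptions made for the example.

```python
import random

def squared_distance(p, q):
    # Squared Euclidean distance between two points of equal dimension.
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    # Coordinate-wise mean of a non-empty list of points.
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

def kmeans(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Step 1 (assumed strategy): use K randomly chosen data points as centers.
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each point to the index of its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Step 4: stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: move each center to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = mean_point(members)
    return centers, labels

# Two well-separated groups of three points each (made-up data).
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.0),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centers, labels = kmeans(points, k=2)
```

Which group receives label 0 and which receives label 1 depends on the random initialization, so only the grouping itself is meaningful.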

Parameters

Dataset Type

Number of Clusters (K): 3

Number of Samples: 300

Noise Level: 0.10

Animation Speed: 500 ms

Understanding K-Means

Advantages

Simple and easy to understand
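Computationally efficient: the cost grows linearly with the number of points (O(nkt) for a fixed number of dimensions)
Works well when clusters are compact, roughly spherical, and similar in size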
Scalable to large datasets

Limitations

Requires K to be specified in advance
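Sensitive to initial center placement: different starting centers can produce different final clusters
Assumes roughly spherical clusters, so elongated or non-convex shapes are often split incorrectly
Sensitive to outliers, because a mean is pulled toward extreme points
May converge to a local minimum of the within-cluster sum of squared distances rather than the global minimum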
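
The sensitivity to initial centers can be seen on a tiny hand-made example. The sketch below is an illustration under stated assumptions: the helper run_from, the four points, and both sets of starting centers are invented for this example. The points sit at the corners of a wide rectangle. Starting the centers on the left and right sides gives the tight left/right split, while starting them at the middle of the bottom and top edges leaves the algorithm stuck in a much worse top/bottom split.

```python
def squared_distance(p, q):
    # Squared Euclidean distance between two points of equal dimension.
    return sum((a - b) ** 2 for a, b in zip(p, q))

def run_from(points, centers, max_iter=100):
    # Run the assign/update loop from explicitly given starting centers.
    centers = list(centers)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)),
                key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    # Within-cluster sum of squared distances for the final clustering.
    sse = sum(squared_distance(p, centers[lab]) for p, lab in zip(points, labels))
    return centers, labels, sse

# Corners of a 4 x 1 rectangle (made-up data).
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

_, good_labels, good_sse = run_from(points, [(0.0, 0.5), (4.0, 0.5)])
_, bad_labels, bad_sse = run_from(points, [(2.0, 0.0), (2.0, 1.0)])

print(good_labels, good_sse)  # [0, 0, 1, 1] 1.0
print(bad_labels, bad_sse)    # [0, 1, 0, 1] 16.0
```

Both runs have converged in the sense that no assignment changes, yet the second is sixteen times worse by this measure. A common mitigation is to run the algorithm from several different starting points and keep the result with the lowest sum of squared distances.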

Time Complexity

O(n × k × t × d)

Where n = number of points, k = number of clusters, t = iterations, d = dimensions
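
As a rough worked example of the formula, the snippet below multiplies out the four factors. The values n = 300 and k = 3 match the default parameters listed on this page. The values d = 2 and t = 10 are assumptions chosen only for illustration, and the function name coordinate_operations is hypothetical. The result is an order-of-magnitude count of coordinate-level operations in the assignment step, not a measured running time.

```python
def coordinate_operations(n, k, t, d):
    # Each of t iterations compares every one of n points with every one of
    # k centers, and each comparison touches d coordinates.
    return n * k * t * d

total = coordinate_operations(n=300, k=3, t=10, d=2)
print(total)  # 18000
```

Doubling any single factor doubles the total, which is what it means for the cost to be linear in each of n, k, t, and d.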