K-Means Clustering
Interactive visualization of the K-Means clustering algorithm
What is K-Means Clustering?
K-Means is an unsupervised learning algorithm that partitions data into K distinct clusters based on feature similarity, usually measured by Euclidean distance. It alternates between two steps: assigning each data point to its nearest cluster center (centroid), and moving each centroid to the mean of the points assigned to it.
Algorithm Steps:
1. Initialize K cluster centers randomly
2. Assign each data point to the nearest center
3. Recalculate each center as the mean of its assigned points
4. Repeat steps 2-3 until convergence (the assignments no longer change)
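The steps above can be sketched in plain Python. This is a minimal illustration, not the code behind this page's visualization: the function name `kmeans`, its parameters, the choice to pick K data points as the initial centers, and the rule of keeping an empty cluster's old center are all assumptions made for the example.

```python
import math
import random

def kmeans(points, k, max_iters=100, seed=0):
    """Illustrative K-Means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Step 1: pick K data points at random as the initial centers (an assumption;
    # other initialization schemes exist).
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iters):
        # Step 2: assign each point to its nearest center (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Step 4: stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: move each center to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels

# Two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels = kmeans(points, k=2)
print(centers, labels)
```

On this toy data the first three points end up in one cluster and the last three in the other; which cluster receives label 0 depends on the random initialization.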
Understanding K-Means
Advantages
- Simple and easy to understand
- Computationally efficient: O(n × k × t × d), linear in the number of points
- Works well when clusters are roughly spherical and similar in size
- Scalable to large datasets
Limitations
- Requires K to be specified in advance
- Sensitive to initial center placement
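- Assumes roughly spherical clusters, so elongated or irregularly shaped groups are split poorly
- Sensitive to outliers, because each center is a mean and is pulled toward extreme values
- May converge to a local minimum of the within-cluster sum of squared distances rather than the global one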
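The sensitivity to initial centers and the local-minimum issue can be seen on four points at the corners of a wide rectangle. The sketch below is an assumption-laden illustration: the helper `lloyd` is a hypothetical name, it takes explicit starting centers instead of random ones, and it reports the within-cluster sum of squared distances (WCSS) at convergence.

```python
import math

def lloyd(points, centers, max_iters=100):
    """Run the assign/update loop from the given initial centers (illustrative)."""
    centers = list(centers)
    k = len(centers)
    labels = None
    for _ in range(max_iters):
        # Assign each point to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Move each center to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    # Within-cluster sum of squared distances for the final clustering.
    wcss = sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
    return centers, wcss

# Corners of a 4-by-1 rectangle.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

_, good = lloyd(points, [(0.0, 0.5), (4.0, 0.5)])  # start near a left/right split
_, bad = lloyd(points, [(2.0, 0.0), (2.0, 1.0)])   # start near a top/bottom split
print(good, bad)  # 1.0 16.0
```

Both runs converge, but the second one settles on the top/bottom split with a WCSS sixteen times larger. Running the algorithm from several different initializations and keeping the result with the lowest WCSS is a common way to reduce this risk.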
Time Complexity
O(n × k × t × d)
Where n = number of points, k = number of clusters, t = iterations, d = dimensions
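As a rough worked example of the formula, the sketch below multiplies the four factors to estimate the number of coordinate-level operations spent on distance calculations. The function name `kmeans_cost` and the sample sizes are made up for illustration, and constant factors are ignored.

```python
def kmeans_cost(n, k, t, d):
    """Rough operation count for n points, k clusters, t iterations, d dimensions."""
    return n * k * t * d

# 10,000 two-dimensional points, 5 clusters, 20 iterations.
base = kmeans_cost(10_000, 5, 20, 2)
print(base)  # 2000000

# Doubling the number of points doubles the estimate.
print(kmeans_cost(20_000, 5, 20, 2) == 2 * base)  # True
```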