# Smoothed analysis of the k-means method

Arthur, David
and
Manthey, Bodo
and
Röglin, Heiko
(2011)
*Smoothed analysis of the k-means method.*
Journal of the ACM, 58
(5).
p. 19.
ISSN 0004-5411

PDF
Restricted to UT campus only : Request a copy 433kB |

Abstract: | The k-means method is one of the most widely used clustering algorithms, drawing its popularity from its speed in practice. Recently, however, it was shown to have exponential worst-case running time. In order to close the gap between practical performance and theoretical analysis, the k-means method has been studied in the model of smoothed analysis. But even the smoothed analyses so far are unsatisfactory as the bounds are still super-polynomial in the number n of data points. In this article, we settle the smoothed running time of the k-means method. We show that the smoothed number of iterations is bounded by a polynomial in n and 1/sigma, where sigma is the standard deviation of the Gaussian perturbations. This means that if an arbitrary input data set is randomly perturbed, then the k-means method will run in expected polynomial time on that input set. |

Item Type: | Article |

Copyright: | © 2011 IEEE |

Faculty: | Electrical Engineering, Mathematics and Computer Science (EEMCS) |

Research Group: | |

Link to this item: | http://purl.utwente.nl/publications/78342 |

Official URL: | http://dx.doi.org/10.1145/2027216.2027217 |

Export this item as: | BibTeX EndNote HTML Citation Reference Manager |

Repository Staff Only: item control page