A Google TechTalk, presented by Václav Rozhoň, 2023-04-13
Abstract: The famous k-means++ algorithm of Arthur and Vassilvitskii is the most popular practical algorithm for solving the k-means problem. The algorithm is very simple and computes the k output centers as follows: it samples the first center uniformly at random from the dataset, and each of the following k−1 centers is then sampled with probability proportional to its squared distance to the currently closest center. Amazingly, k-means++ is known to return a Θ(log k)-approximate solution in expectation.
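The sampling rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the speaker's code: the names `sq_dist` and `kmeanspp_seed`, the representation of points as tuples of floats, and the uniform fallback when every point already coincides with a center are all assumptions made for the example.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seed(points, k, rng=None):
    """Pick k centers from `points` by squared-distance sampling (hypothetical helper)."""
    if rng is None:
        rng = random.Random()
    n = len(points)
    # First center: a uniformly random point of the dataset.
    centers = [rng.choice(points)]
    # d2[i] = squared distance from points[i] to its closest chosen center.
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) > 0:
            # Sample an index with probability proportional to d2.
            idx = rng.choices(range(n), weights=d2, k=1)[0]
        else:
            # Assumption: if all points coincide with a center, sample uniformly.
            idx = rng.randrange(n)
        centers.append(points[idx])
        # Update each point's distance to its closest center.
        d2 = [min(d, sq_dist(p, points[idx])) for d, p in zip(d2, points)]
    return centers


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    print(kmeanspp_seed(data, 2, random.Random(0)))
```

Points that are already centers have weight zero, so they are not picked again as long as some point still has positive distance to its closest center.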
In their seminal work, Arthur and Vassilvitskii asked about the guarantees of the following greedy variant: in every step, we sample ℓ candidate centers instead of one and then pick the one that minimizes the new cost. This is also how k-means++ is implemented in, e.g., the popular Scikit-learn library. We analyze greedy k-means++: we prove that it is an O(ℓ^3 log^3 k)-approximation algorithm and provide a near-matching lower bound.
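The greedy variant can be sketched the same way. Again this is a hedged illustration rather than the analyzed implementation or Scikit-learn's code: the function name `greedy_kmeanspp_seed`, the parameter name `ell` (standing in for ℓ), the choice to draw the ℓ candidates independently with replacement, and the uniform fallback are assumptions of the example.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def greedy_kmeanspp_seed(points, k, ell, rng=None):
    """Greedy seeding sketch: per step, draw `ell` candidates and keep the best one.

    Returns the chosen centers and the final cost (sum of squared distances
    from each point to its closest center).
    """
    if rng is None:
        rng = random.Random()
    n = len(points)
    first = rng.choice(points)  # first center: uniformly random point
    centers = [first]
    d2 = [sq_dist(p, first) for p in points]
    while len(centers) < k:
        if sum(d2) > 0:
            # Draw ell candidate indices with probability proportional to d2.
            candidates = rng.choices(range(n), weights=d2, k=ell)
        else:
            # Assumption: uniform fallback when the current cost is already zero.
            candidates = [rng.randrange(n) for _ in range(ell)]
        best_idx, best_d2, best_cost = None, None, None
        for idx in candidates:
            # Distances that would result from adding this candidate as a center.
            new_d2 = [min(d, sq_dist(p, points[idx])) for d, p in zip(d2, points)]
            new_cost = sum(new_d2)
            if best_cost is None or new_cost < best_cost:
                best_idx, best_d2, best_cost = idx, new_d2, new_cost
        centers.append(points[best_idx])
        d2 = best_d2
    return centers, sum(d2)


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    print(greedy_kmeanspp_seed(data, 2, ell=3, rng=random.Random(0)))
```

With `ell=1` the sketch reduces to plain squared-distance sampling; larger `ell` spends ℓ cost evaluations per step to pick the locally best candidate.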
Joint work with Christoph Grunau, Ahmet Alper Özüdoğru, Jakub Tětek
arxiv:
Bio: Václav Rozhoň is a PhD student at ETH Zurich advised by Mohsen Ghaffari. He works mostly on distributed and parallel algorithms; he also creates YouTube videos about algorithms (channel name: polylog). He has a young child and thus no hobbies.
A Google Talk Series on Algorithms, Theory, and Optimization