--- layout: post title: K-Means categories: [Hacking] tags: [Software, _Phase4] ---
Live demo: Click to Pause/Resume
_This is the first post in a brief series on algorithms (simple methods) that I like and that I think more people should know about: either because of their simplicity, their novelty, their usefulness, or all three. Each comes with a live demo. To begin, let's introduce: **k-Means Clustering** ._ {: .notice--info} ### The Idea This sample shows _k-Means_ in action. k-Means used to also be known as _Lloyd's method._ It's so simple that I'm surprised it's not used more often. The basic idea is: for some pile of info, whether its sales records for varying shoe sizes, the location of weeds in your yard, whatever, find a set of "best middles" — or "means" — for any given set of information. In the sample, we have a bunch of random, possibly-lumpy collections of little x's on the canvas. The method tries to find some smaller number of categories (the "means," marked by circles) that could best-represent that info. The "k" in "k-Means" is the number of categories we want to create for dividing-up the samples. The k-Means method doesn't directly solve for "best." It just keeps trying and adjusting, starting from completely random guesses as to where to put the means: * Look at every point, and mark it "owned" by the closest "mean." * Move each mean to the middle of its "owned" points. * _Just keep doing those two steps,_ until either nothing changes each time or you run out of patience. If a proposed mean has _no_ points, just discard that mean. You didn't need it. --- ### The Example Code In this sample, we let the means move around until they're stable, then randomly take one away, which forces all the rest to re-examine their situatons. You can pause/start the animation by clicking on the canvas. You can see the [source here on GitHub.](https://raw.githubusercontent.com/joker-b/botzo/master/_posts/2018-12-05-K-Means.md) The entire example is contained in a single Javascript module that's imaginatively named `kmSample1` You can decide for yourself what these results look like: Popping soap bubbles? Growing algae mats in time-lapse? Voting districts? That dream about insects eating your car? I suggest turning on some [Cinematic Orchestra](https://www.cinematicorchestra.com/) for your headphones and stare at this page for a long while to decide. In future examples, we'll try using k-Means (and other tricks) in different ways. Some obvious questions might be: * Are there better or [different ways to decide which point belongs to which mean?]({{ site.baseurl}}{% post_url 2018-12-07-Special-K %}) What do we mean by "closeness"? * What might this have to do with [photography]({{ site.baseurl}}{% post_url 2018-12-10-K-Pix %})?