---
layout: post
title: K-Means
categories: [Hacking]
tags: [Software, _Phase4]
---
_This is the first post in a brief series on algorithms (simple methods) that I like and that I think more people should know about: either because of their simplicity, their novelty, their usefulness, or all three. Each comes with a live demo. To begin, let's introduce: **k-Means Clustering** ._
{: .notice--info}
### The Idea
This sample shows _k-Means_ in action. k-Means used to also be known as _Lloyd's method._ It's so simple that I'm surprised it's not used more often.
The basic idea is: for some pile of info, whether its sales records for varying shoe sizes, the location of weeds in your yard, whatever, find a set of "best middles" — or "means" — for any given set of information. In the sample, we have a bunch of random, possibly-lumpy collections of little x's on the canvas. The method tries to find some smaller number of categories (the "means," marked by circles) that could best-represent that info.
The "k" in "k-Means" is the number of categories we want to create for dividing-up the samples.
The k-Means method doesn't directly solve for "best." It just keeps trying and adjusting, starting from completely random guesses as to where to put the means:
* Look at every point, and mark it "owned" by the closest "mean."
* Move each mean to the middle of its "owned" points.
* _Just keep doing those two steps,_ until either nothing changes each time or you run out of patience. If a proposed mean has _no_ points, just discard that mean. You didn't need it.
---
### The Example Code
In this sample, we let the means move around until they're stable, then randomly take one away, which forces all the rest to re-examine their situatons.
You can pause/start the animation by clicking on the canvas.
You can see the [source here on GitHub.](https://raw.githubusercontent.com/joker-b/botzo/master/_posts/2018-12-05-K-Means.md) The entire example is contained in a single Javascript module that's imaginatively named `kmSample1`
You can decide for yourself what these results look like: Popping soap bubbles? Growing algae mats in time-lapse? Voting districts? That dream about insects eating your car? I suggest turning on some [Cinematic Orchestra](https://www.cinematicorchestra.com/) for your headphones and stare at this page for a long while to decide.
In future examples, we'll try using k-Means (and other tricks) in different ways. Some obvious questions might be:
* Are there better or [different ways to decide which point belongs to which mean?]({{ site.baseurl}}{% post_url 2018-12-07-Special-K %}) What do we mean by "closeness"?
* What might this have to do with [photography]({{ site.baseurl}}{% post_url 2018-12-10-K-Pix %})?