K Means Clustering
- Clustering is the task of grouping data points so that points in
the same cluster are more similar to each other than to points in
other clusters.
- It's an unsupervised task: the data comes with no labels.
- K Means (pg. 753) is a simple clustering algorithm.
- Given a set of data points, pick K cluster centers randomly.
- Go through all of the data points and assign each one to the
cluster whose center it is nearest to. (How do you calculate nearness?)
- Move each of the K cluster centers to the center of the points
assigned to it. (How do you calculate the center of the cluster?)
- Repeat the previous two steps until no data points move clusters.
- Here you need to pick the number K, but otherwise the
algorithm does all the work.
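
The steps above can be sketched in Python roughly as follows. This is a
minimal sketch, not the textbook's implementation. It assumes one common
answer to each of the two questions: nearness is Euclidean distance, and
the center of a cluster is the mean of its points. The function name
k_means and the parameters seed and max_iters are made up for this
example.

```python
import math
import random


def k_means(points, k, seed=0, max_iters=100):
    """Cluster `points` (a list of equal-length tuples) into k groups.

    Returns (centers, assignment), where assignment[i] is the index of
    the cluster that points[i] ended up in.
    """
    rng = random.Random(seed)
    # Pick K cluster centers randomly (here: K of the data points).
    centers = rng.sample(points, k)
    assignment = None
    for _ in range(max_iters):
        # Put each point in the cluster whose center is nearest.
        # Assumption: "nearest" means smallest Euclidean distance.
        new_assignment = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Stop when no data point moved clusters.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Move each center to the center of its cluster.
        # Assumption: "center" means the coordinate-wise mean.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its old center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, assignment


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, assignment = k_means(data, k=2)
    print(centers)
    print(assignment)
```

The only input besides the data is K. The seed is there so that the
random choice of starting centers is repeatable. Different starting
centers can lead to different final clusters.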