Table of Contents

XIII. Clustering

13.1 - Unsupervised Learning: Introduction

13.2 - K-Means Algorithm

Algorithm

  1. Randomly initialize K cluster centroids \(\mu_1,\mu_2, \dots, \mu_K \in \mathbb{R}^n\).
  2. Repeat:
    1. for i=1 to m: \(c^{(i)}\) := index (from 1 to K) of the cluster centroid closest to \(x^{(i)}\) (closeness is measured by the distance \(\|x^{(i)}-\mu_k\|\)).
    2. for k=1 to K: \(\mu_k\) := average (mean) of points assigned to cluster k.
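
The steps above can be sketched in plain Python as follows. This is a minimal illustration rather than a reference implementation. The names `k_means` and `squared_distance`, the `max_iters` and `seed` parameters, the choice of K distinct training examples as initial centroids, the stopping rule, and the handling of empty clusters are all assumptions made for the example. Clusters are indexed from 0 to K-1 here instead of 1 to K.

```python
import random


def squared_distance(x, mu):
    """Squared Euclidean distance ||x - mu||^2 between two points."""
    return sum((xj - mj) ** 2 for xj, mj in zip(x, mu))


def k_means(points, K, max_iters=100, seed=0):
    """Return (assignments, centroids) for a list of equal-length points."""
    rng = random.Random(seed)
    # Step 1: initialize centroids by picking K distinct training examples
    # (one possible random initialization; an assumption of this sketch).
    centroids = [list(p) for p in rng.sample(points, K)]
    assignments = None

    for _ in range(max_iters):
        # Step 2.1: assign each x^(i) to the index of its closest centroid.
        # Minimizing the squared distance gives the same index as
        # minimizing the distance itself.
        new_assignments = [
            min(range(K), key=lambda k: squared_distance(x, centroids[k]))
            for x in points
        ]
        # Stop once the assignments no longer change (assumed stopping rule).
        if new_assignments == assignments:
            break
        assignments = new_assignments

        # Step 2.2: move each centroid to the mean of its assigned points.
        for k in range(K):
            members = [x for x, c in zip(points, assignments) if c == k]
            if members:  # an empty cluster keeps its old centroid (assumption)
                n = len(members[0])
                centroids[k] = [
                    sum(x[j] for x in members) / len(members) for j in range(n)
                ]

    return assignments, centroids
```

For example, `k_means([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)], K=2)` groups the first three points into one cluster and the last three into the other.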

K-means for non-separated clusters

13.3 - Optimization Objective

\[J(c^{(1)},\dots,c^{(m)},\mu_1,\dots,\mu_K) = \frac{1}{m} \sum\limits_{i=1}^m \|x^{(i)} - \mu_{c^{(i)}}\|^2 \]
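
As a rough illustration, the cost \(J\) can be evaluated directly from the formula. The function name `distortion` and its argument layout are assumptions for this sketch: `assignments[i]` holds \(c^{(i)}\) as a 0-based index into `centroids`.

```python
def distortion(points, assignments, centroids):
    """Average squared distance from each point to its assigned centroid."""
    m = len(points)
    total = 0.0
    for x, c in zip(points, assignments):
        mu = centroids[c]  # mu_{c^(i)}: centroid of the cluster x^(i) belongs to
        total += sum((xj - mj) ** 2 for xj, mj in zip(x, mu))
    return total / m


# Two points assigned to a single centroid halfway between them:
# each squared distance is 1, so J = (1 + 1) / 2.
print(distortion([(0.0, 0.0), (2.0, 0.0)], [0, 0], [(1.0, 0.0)]))  # 1.0
```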

13.4 - Random Initialization

13.5 - Choosing the Number of Clusters