# Solution 3 Computational Intelligence Lab 2012

Adapted from CIL 2012 Master solution.

## 1. Convergence of the K-Means algorithm

(Convergence of the K-Means Algorithm) The K-means algorithm converges since at each iteration it either reduces or keeps the same the value of the objective function J, where

$J=\sum _{n=1}^{N}\sum _{k=1}^{K}z_{k,n}||x_{n}-u_{k}||_{2}^{2}$

with the constraint

$J=\sum _{k=1}^{K}z_{k,n}=1\quad {\text{and}}\quad z_{k,n}\in \{0,1\}.$

### Step 2

When initialising the algorithm we set

$z_{k^{*}(x_{n}),n}=1\quad {\text{and}}\quad z_{k',n}=0$

where

$k^{*}(x_{n})={\text{argmin}}_{k}\left\{||x_{n}-u_{1}||_{2}^{2},\ldots ||x_{n}-u_{k}||_{2}^{2},\ldots ,||x_{n}-u_{K}||_{2}^{2}\right\}.$

This makes the value of $J$ minimal considering that we have to assign the value $1$ to one and only one $z_{k,n}$ and $0$ to all others.

### Step 3

We note that the centroids are updated with this equation:

$u_{k}={\sum _{n=1}^{N}z_{k,n}x_{n} \over \sum _{n=1}^{N}z_{k,n}}\qquad \forall k\in [1,K]$

The derivative of the constraint with respect to $u_{k}$ yields:

${\delta J \over \delta u_{k}}={\delta \sum _{n=1}^{N}z_{k,n}||x_{n}-u_{k}||_{2}^{2} \over \delta u_{k}}=\sum _{n=1}^{N}z_{k,n}{\begin{pmatrix}{\delta (x_{1,n}-u_{1,k})^{2} \over \delta u_{1,k}}\\\vdots \\{\delta (x_{d,n}-u_{d,k})^{2} \over \delta u_{d,k}}\end{pmatrix}}=-2\sum _{n=1}^{N}z_{k,n}(x_{n}-u_{k})$

which can be reformulated to:

$0=\sum _{n=1}^{N}z_{k,n}(x_{n}-u_{k})$

which then yields the equation for the centroid update step.

Computing the second derivative of $J$ results in ${\delta ^{2}J \over \delta u_{k}^{2}}\geq 0$.

Considering the argument in step 1 and 2 convergence guaranteed for the cost function:

$J=\sum _{n=1}^{N}\sum _{k=1}^{K}z_{k,n}||x_{n}-u_{k}||_{2}^{2}$

## 2. Recasting K-Means into a Matrix Factorisation Problem

### Part 2

We note that step 2 of K-means minimises

$k^{*}(x_{n})={\text{argmin}}_{k}\left\{||x_{n}-u_{1}||_{2}^{2},\ldots ||x_{n}-u_{k}||_{2}^{2},\ldots ,||x_{n}-u_{K}||_{2}^{2}\right\}.$

with the constraint $\sum _{k=1}^{K}z_{k,n}=1$ and the values being either $z_{k,n}=1$ or $z_{k,n}=0$. Which yields:

${\text{min}}_{Z}\sum _{n=1}^{N}\sum _{k=1}^{K}||x_{n}-z_{n,k}u_{k}||_{2}^{2}=\sum _{n=1}^{N}((K-1)||x_{n}||_{2}^{2}+{\text{min}}\{||x_{n}-u_{1}||_{2}^{2},\ldots ,||x_{n}-u_{K}||_{2}^{2}\},$

thus $Z$ is minimised with $k^{*}$.

### Part 3

Computing the derivative of

${\text{min}}_{u}\sum _{n=1}^{N}\sum _{k=1}^{K}||x_{n}-z_{k,n}||_{2}^{2}$

yields:

${\delta \over \delta u_{k}}\sum _{n=1}||x_{n}-z_{k,n}u_{k}||_{2}^{2}=-2\sum _{n=1}^{N}z_{k,n}^{2}(x_{n}-z_{k,n}u_{k})$

since $z_{k,n}$ is either 0 or 1 we can conclude that the 3rd step solves

${\text{min}}_{u}\sum _{n=1}^{N}\sum _{k=1}^{K}||x_{n}-z_{k,n}||_{2}^{2}.$

It follows that K-means minimises the objective matrix factorisation problem given by:

$J=||X-UZ||_{2}^{2}.$