# Solution 3 Computational Intelligence Lab 2012

Adapted from CIL 2012 Master solution.

## 1. Convergence of the K-Means algorithm

(Convergence of the K-Means Algorithm) The K-means algorithm converges since at each iteration it either reduces or keeps the same the value of the objective function J, where

${\displaystyle J=\sum _{n=1}^{N}\sum _{k=1}^{K}z_{k,n}||x_{n}-u_{k}||_{2}^{2}}$

with the constraint

${\displaystyle J=\sum _{k=1}^{K}z_{k,n}=1\quad {\text{and}}\quad z_{k,n}\in \{0,1\}.}$

### Step 2

When initialising the algorithm we set

${\displaystyle z_{k^{*}(x_{n}),n}=1\quad {\text{and}}\quad z_{k',n}=0}$

where

${\displaystyle k^{*}(x_{n})={\text{argmin}}_{k}\left\{||x_{n}-u_{1}||_{2}^{2},\ldots ||x_{n}-u_{k}||_{2}^{2},\ldots ,||x_{n}-u_{K}||_{2}^{2}\right\}.}$

This makes the value of ${\displaystyle J}$ minimal considering that we have to assign the value ${\displaystyle 1}$ to one and only one ${\displaystyle z_{k,n}}$ and ${\displaystyle 0}$ to all others.

### Step 3

We note that the centroids are updated with this equation:

${\displaystyle u_{k}={\sum _{n=1}^{N}z_{k,n}x_{n} \over \sum _{n=1}^{N}z_{k,n}}\qquad \forall k\in [1,K]}$

The derivative of the constraint with respect to ${\displaystyle u_{k}}$ yields:

${\displaystyle {\delta J \over \delta u_{k}}={\delta \sum _{n=1}^{N}z_{k,n}||x_{n}-u_{k}||_{2}^{2} \over \delta u_{k}}=\sum _{n=1}^{N}z_{k,n}{\begin{pmatrix}{\delta (x_{1,n}-u_{1,k})^{2} \over \delta u_{1,k}}\\\vdots \\{\delta (x_{d,n}-u_{d,k})^{2} \over \delta u_{d,k}}\end{pmatrix}}=-2\sum _{n=1}^{N}z_{k,n}(x_{n}-u_{k})}$

which can be reformulated to:

${\displaystyle 0=\sum _{n=1}^{N}z_{k,n}(x_{n}-u_{k})}$

which then yields the equation for the centroid update step.

Computing the second derivative of ${\displaystyle J}$ results in ${\displaystyle {\delta ^{2}J \over \delta u_{k}^{2}}\geq 0}$.

Considering the argument in step 1 and 2 convergence guaranteed for the cost function:

${\displaystyle J=\sum _{n=1}^{N}\sum _{k=1}^{K}z_{k,n}||x_{n}-u_{k}||_{2}^{2}}$

## 2. Recasting K-Means into a Matrix Factorisation Problem

### Part 2

We note that step 2 of K-means minimises

${\displaystyle k^{*}(x_{n})={\text{argmin}}_{k}\left\{||x_{n}-u_{1}||_{2}^{2},\ldots ||x_{n}-u_{k}||_{2}^{2},\ldots ,||x_{n}-u_{K}||_{2}^{2}\right\}.}$

with the constraint ${\displaystyle \sum _{k=1}^{K}z_{k,n}=1}$ and the values being either ${\displaystyle z_{k,n}=1}$ or ${\displaystyle z_{k,n}=0}$. Which yields:

${\displaystyle {\text{min}}_{Z}\sum _{n=1}^{N}\sum _{k=1}^{K}||x_{n}-z_{n,k}u_{k}||_{2}^{2}=\sum _{n=1}^{N}((K-1)||x_{n}||_{2}^{2}+{\text{min}}\{||x_{n}-u_{1}||_{2}^{2},\ldots ,||x_{n}-u_{K}||_{2}^{2}\},}$

thus ${\displaystyle Z}$ is minimised with ${\displaystyle k^{*}}$.

### Part 3

Computing the derivative of

${\displaystyle {\text{min}}_{u}\sum _{n=1}^{N}\sum _{k=1}^{K}||x_{n}-z_{k,n}||_{2}^{2}}$

yields:

${\displaystyle {\delta \over \delta u_{k}}\sum _{n=1}||x_{n}-z_{k,n}u_{k}||_{2}^{2}=-2\sum _{n=1}^{N}z_{k,n}^{2}(x_{n}-z_{k,n}u_{k})}$

since ${\displaystyle z_{k,n}}$ is either 0 or 1 we can conclude that the 3rd step solves

${\displaystyle {\text{min}}_{u}\sum _{n=1}^{N}\sum _{k=1}^{K}||x_{n}-z_{k,n}||_{2}^{2}.}$

It follows that K-means minimises the objective matrix factorisation problem given by:

${\displaystyle J=||X-UZ||_{2}^{2}.}$