Suggested Solutions Computational Intelligence Lab FS16

Dimensionality reduction

PCA and SVD

live together in peace and harmony :) scnr

  • False: Eigenvalue decomposition is performed on the centered variance-covariance matrix.
  • True
  • True: Rotating only changes the eigenvectors. The error is only influenced by the eigenvalues.
  • False: The rating does depend on the initialization of the unknown values, so the performance is also affected.
  • False: It depends on whether the unknown entries are also filled in the same way; if they are, the statement is true.
  • False: No assumptions about the meaning of the feature vectors can be made.

PCA

Part 1

a)

Let be the SVD of . We then have to diagonalise :
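Writing the centered data matrix as $\mathbf{X} \in \mathbb{R}^{D \times N}$ with SVD $\mathbf{X} = \mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^\top$ (assumed notation, the original symbols may differ), the covariance matrix factorises as

    $\frac{1}{N} \mathbf{X}\mathbf{X}^\top = \frac{1}{N} \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^\top \mathbf{V}\boldsymbol{\Sigma}^\top\mathbf{U}^\top = \mathbf{U} \Big( \frac{1}{N} \boldsymbol{\Sigma}\boldsymbol{\Sigma}^\top \Big) \mathbf{U}^\top,$

so the left singular vectors $\mathbf{U}$ are the principal directions and the eigenvalues are $\lambda_i = \sigma_i^2 / N$.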

b)

PCA has to find the eigenvalue decomposition of the covariance matrix. This is usually done via the power method. Since the matrix is assumed to be large, this requires many multiplications with a huge matrix, which is slow. With the SVD we can instead first compute the much smaller Gram matrix and do the eigenvalue decomposition on that.

The solution above is incorrect, since the problem statement explicitly says otherwise.
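For reference, the power method mentioned above amounts to repeated matrix-vector products; a minimal Python sketch (hypothetical names and data, not taken from the exam):

    import numpy as np

    def power_iteration(M, num_iters=1000, tol=1e-10):
        # Dominant eigenpair of a symmetric PSD matrix M via repeated products M @ v.
        v = np.random.default_rng(0).normal(size=M.shape[0])
        v /= np.linalg.norm(v)
        for _ in range(num_iters):
            w = M @ v                        # one matrix-vector product per step
            w /= np.linalg.norm(w)
            if np.linalg.norm(w - v) < tol:  # converged
                v = w
                break
            v = w
        return v @ M @ v, v                  # Rayleigh quotient = eigenvalue estimate

    # Hypothetical example: leading principal component of a centered data matrix X (D x N).
    X = np.random.default_rng(1).normal(size=(5, 100))
    X = X - X.mean(axis=1, keepdims=True)
    eigval, u1 = power_iteration(X @ X.T / X.shape[1])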

Part 2

a)

b)

We project the points onto the principal direction

And get the new coordinates
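In general (assumed notation, the exam's concrete numbers are not reproduced here): with first principal direction $\mathbf{u}_1$ and sample mean $\bar{\mathbf{x}}$, the new one-dimensional coordinate of a point $\mathbf{x}_n$ is

    $z_n = \mathbf{u}_1^\top (\mathbf{x}_n - \bar{\mathbf{x}}).$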

c)

The solution originally given here is wrong: the task is to calculate the variance of the projected data.
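In general, the variance of the data projected onto the $k$-th principal direction equals the corresponding eigenvalue of the covariance matrix (assumed notation as above):

    $\operatorname{Var}(z) = \mathbf{u}_k^\top \Big( \frac{1}{N} \mathbf{X}\mathbf{X}^\top \Big) \mathbf{u}_k = \lambda_k.$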

d)

0, there is no error, as the first principal component explains all the data.

SVD

a)

b)

c)

(symmetric matrices have orthogonal eigenvectors)

Clustering Mixture Models, NMF

Mixture Model and Expectation Maximization

a)

K-Means assigns each data point to exactly one centroid (hard assignment).

A GMM softly assigns a point to several clusters, weighted by the responsibilities (see the formula below).

K-Means assumes sphere-shaped clusters of the same size; GMM clusters can fit any ellipsoid shape (by fitting the covariance matrices).
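For reference (standard notation, not necessarily the exam's): the soft assignment of point $\mathbf{x}_n$ to cluster $k$ is the responsibility

    $\gamma_{nk} = \frac{\pi_k \, \mathcal{N}(\mathbf{x}_n \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(\mathbf{x}_n \mid \boldsymbol{\mu}_j, \boldsymbol{\Sigma}_j)},$

whereas k-means uses the hard assignment $\arg\min_k \lVert \mathbf{x}_n - \boldsymbol{\mu}_k \rVert^2$.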

b)

c)

d)

e)

Well, just take the chain rule and differentiate the lower bound. I have no clue about the derivative of a determinant nor of an inverse, and it's only 3 points :P (The needed identities are sketched below.)

See Problem 1 question 7 (Exercise 9 2019) http://www.da.inf.ethz.ch/teaching/2019/CIL/exercises/solution09.pdf
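For completeness, the two standard matrix-calculus identities needed here (for symmetric $\boldsymbol{\Sigma}$, treating its entries as unconstrained) are

    $\frac{\partial}{\partial \boldsymbol{\Sigma}} \log \lvert \boldsymbol{\Sigma} \rvert = \boldsymbol{\Sigma}^{-1} \qquad \text{and} \qquad \frac{\partial}{\partial \boldsymbol{\Sigma}} \, \mathbf{x}^\top \boldsymbol{\Sigma}^{-1} \mathbf{x} = -\boldsymbol{\Sigma}^{-1} \mathbf{x}\mathbf{x}^\top \boldsymbol{\Sigma}^{-1}.$

Assuming the question asks for the covariance update of the M-step (which we cannot verify here), setting the derivative of the lower bound to zero yields the usual

    $\boldsymbol{\Sigma}_k = \frac{\sum_n \gamma_{nk} (\mathbf{x}_n - \boldsymbol{\mu}_k)(\mathbf{x}_n - \boldsymbol{\mu}_k)^\top}{\sum_n \gamma_{nk}}.$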

Topic Model and Nonnegative Matrix Factorisation

Word embeddings, Block Chain and Neural Networks

a)

b)

Rectified Linear Unit:

Sigmoid:
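The standard definitions (and their derivatives, in case those were asked as well) are

    $\operatorname{ReLU}(x) = \max(0, x), \qquad \operatorname{ReLU}'(x) = \mathbf{1}[x > 0],$
    $\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad \sigma'(x) = \sigma(x)\,(1 - \sigma(x)).$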

c)

  • localized features
  • translation invariance
  • weight sharing -> fewer parameters (see the example below)
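A hypothetical back-of-the-envelope example for the last point: mapping a $32 \times 32$ input to a $32 \times 32$ feature map with a fully connected layer takes $1024 \cdot 1024 \approx 10^6$ weights, while a shared $3 \times 3$ convolution filter needs only $9$ weights (plus a bias), independent of the input size.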

Sparse Coding and Robust PCA

Fourier and wavelet transform

Part 1

a)

  • Easily invertible by transposing
  • Length preservation of signals (see the identities below)
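For an orthonormal transform matrix $\mathbf{U}$ (assumed notation) these two properties read

    $\mathbf{U}^{-1} = \mathbf{U}^\top \qquad \text{and} \qquad \lVert \mathbf{U}\mathbf{x} \rVert_2 = \lVert \mathbf{x} \rVert_2 \quad \text{(Parseval)};$

for the complex Fourier matrix the transpose becomes the conjugate transpose $\mathbf{U}^H$.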

b)

  • Keep only the low-frequency coefficients
  • Invert the Fourier transform (see the sketch below)
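A minimal Python sketch of this procedure (our own illustration with a made-up signal, not the exam's data):

    import numpy as np

    def lowpass_compress(x, keep=10):
        # Keep only the `keep` lowest-frequency Fourier coefficients of a real 1-D signal.
        X = np.fft.rfft(x)                  # forward transform
        X[keep:] = 0                        # discard high frequencies
        return np.fft.irfft(X, n=len(x))    # invert the transform

    t = np.linspace(0, 1, 256, endpoint=False)
    signal = np.sin(2 * np.pi * 3 * t) + 0.1 * np.sin(2 * np.pi * 60 * t)
    smooth = lowpass_compress(signal, keep=10)   # high-frequency component removed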

c)

High frequency signals in the scarf and blanket patterns

d)

One can capture local extrema without causing a periodic repetition of them, so it is better suited for non-periodic signals.

e)

  • True
  • False: if the rank is full, then the determinant is either 1 or -1
  • True: Length preservation
  • False: Why shouldn't it?

Part 2

a)

b)

We also demand that the solution be sparse, i.e. that it contains as many zero entries as possible.
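Formally (a standard formulation, notation assumed): for a dictionary $\mathbf{U}$ one seeks

    $\mathbf{z}^\star \in \arg\min_{\mathbf{z}} \lVert \mathbf{z} \rVert_0 \quad \text{s.t.} \quad \mathbf{x} = \mathbf{U}\mathbf{z},$

or, allowing reconstruction error, $\min_{\mathbf{z}} \lVert \mathbf{x} - \mathbf{U}\mathbf{z} \rVert_2^2 + \lambda \lVert \mathbf{z} \rVert_0$.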

c)

  • Overcomplete dictionaries
  • Dictionary learning

Alternative solution:

  • Matching pursuit (see the sketch below)
  • Convex relaxation (use the 1-norm instead of the 0-norm)
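A minimal matching-pursuit sketch in Python (hypothetical random dictionary, our own illustration rather than the exam's answer):

    import numpy as np

    def matching_pursuit(x, U, n_atoms=5):
        # Greedy sparse approximation of x over dictionary U (columns = unit-norm atoms).
        residual = x.copy()
        z = np.zeros(U.shape[1])
        for _ in range(n_atoms):
            correlations = U.T @ residual
            j = np.argmax(np.abs(correlations))      # atom most correlated with residual
            z[j] += correlations[j]                  # update its coefficient
            residual = residual - correlations[j] * U[:, j]
        return z

    rng = np.random.default_rng(0)
    U = rng.normal(size=(64, 256))
    U /= np.linalg.norm(U, axis=0)                   # unit-norm atoms
    x = 2.0 * U[:, 3] - 1.5 * U[:, 100]              # sparse ground truth
    z = matching_pursuit(x, U, n_atoms=5)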

d)

  • Signal compression
  • Denoising of signals

Robust PCA

Because he is not moving.

Note: not covered in 2018

Optimisation

Convex Optimisation

  • True
  • False: (1,0) and (0,1) are in set but (0.5, 0.5) is not
  • False: and
  • False: (1,0) and (0,1)

Lagrange Duality

Dual function not covered in 2018

Optimisation methods

Part 1

a)

(0,0)

b)

c)

1 step is needed. 1 step is great

d)

It oscillates between two points and takes longer until convergence (actually, it takes infinitely many steps).
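A hypothetical one-dimensional illustration (not the exam's function): on $f(x) = x^2/2$, gradient descent with step size 1 converges in one step, while step size 2 oscillates forever.

    def gradient_descent(x0, step, n_steps):
        # Gradient descent on f(x) = x^2 / 2, whose gradient is f'(x) = x.
        xs = [x0]
        for _ in range(n_steps):
            xs.append(xs[-1] - step * xs[-1])
        return xs

    print(gradient_descent(1.0, 1.0, 3))   # [1.0, 0.0, 0.0, 0.0]   -> converges in one step
    print(gradient_descent(1.0, 2.0, 3))   # [1.0, -1.0, 1.0, -1.0] -> oscillates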

e)

Not too large, not too small (very math-y I know)

(Maybe they meant the step-size conditions? Those weren't covered this year, though.)
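If the intended conditions are the classical diminishing step-size (Robbins-Monro) conditions, which is an assumption on our part, they are

    $\sum_{t=1}^{\infty} \eta_t = \infty \qquad \text{and} \qquad \sum_{t=1}^{\infty} \eta_t^2 < \infty,$

satisfied e.g. by $\eta_t = 1/t$.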

Part 2

Coordinate Descent?

AFAIK not covered in 2018