Proposed Solutions Computational Intelligence Lab FS16
Table of Contents
Dimensionality reduction
PCA and SVD
live together in peace and harmony :) scnr
- False: Eigenvalue decomposition is performed on the centered variance-covariance matrix (see the sketch after this list).
- True
- True: Rotating only changes the eigenvectors. The error is only influenced by the eigenvalues.
- False: The rating does depend on the initialization of the unknown values, therefore the performance is also affected.
- False: Depends on whether the unknown entries are also filled the same way. If they are the same, then it's True.
- False: No assumptions about the meaning of the feature vectors can be made.
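To make the first statement concrete, here is a minimal NumPy sketch (not from the exam; the data are made up and rows are data points) comparing the eigendecomposition of the centered covariance matrix with the SVD of the centered data:

```python
import numpy as np

# Hypothetical toy data: N = 100 points in D = 3 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3)) @ np.diag([3.0, 1.0, 0.3])

# Center the data first (this is what the first statement refers to).
Xc = X - X.mean(axis=0)

# PCA via eigendecomposition of the (centered) covariance matrix.
cov = Xc.T @ Xc / len(Xc)
eigvals, eigvecs = np.linalg.eigh(cov)          # ascending order

# PCA via SVD of the centered data matrix.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# The squared singular values (scaled by 1/N) match the eigenvalues.
print(np.allclose(np.sort(s**2 / len(Xc)), eigvals))
```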
PCA
Part 1
a)
Let $X = U \Sigma V^\top$ be the SVD of $X$. We then have to diagonalise $X X^\top$: $X X^\top = U \Sigma V^\top V \Sigma^\top U^\top = U \Sigma \Sigma^\top U^\top$, so the eigenvectors of $X X^\top$ are the columns of $U$ and its eigenvalues are the squared singular values.
b)
PCA has to find the eigenvalue decomposition of the covariance matrix $X X^\top \in \mathbb{R}^{D \times D}$. This is usually done via the power method. Since the dimension $D$ is assumed to be large, we perform many matrix multiplications with a huge matrix, which is slow. With the SVD we can instead first compute the much smaller matrix $X^\top X \in \mathbb{R}^{N \times N}$ and do the eigenvalue decomposition on that (see the sketch below).
The solution above is incorrect, since the problem explicitly states that .
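Independently of whether the dimension assumption matches the exam's statement, the "decompose the smaller matrix" trick itself can be sketched as follows (sizes and notation made up):

```python
import numpy as np

# Hypothetical sizes with D much larger than N.
rng = np.random.default_rng(1)
D, N = 500, 20
X = rng.normal(size=(D, N))

# Eigendecomposition of the small N x N Gram matrix X^T X.
lam, V = np.linalg.eigh(X.T @ X)

# If (X^T X) v = lam * v, then (X X^T)(X v) = lam * (X v),
# so the normalised vectors X v are eigenvectors of the big D x D matrix.
U = X @ V
U /= np.linalg.norm(U, axis=0)

# Check the eigen-equation of the big matrix directly.
print(np.allclose((X @ X.T) @ U, U * lam))
```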
Part 2
a)
b)
We project the points onto the principal direction
And get the new coordinates
c)
.
The above-mentioned solution is wrong. The task is to calculate the variance of the projected data. Therefore, .
d)
0, there is no error, as the first principal component explains all the data.
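Since the exam's data points are not reproduced here, the following NumPy sketch uses made-up 2-D points that lie exactly on a line, so the first principal component explains everything and the reconstruction error from d) comes out as 0:

```python
import numpy as np

# Made-up 2-D points lying exactly on a line (not the exam's data).
X = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]])
Xc = X - X.mean(axis=0)

# First principal direction = first right singular vector.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
d = Vt[0]

# Project onto the principal direction and reconstruct.
z = Xc @ d                          # new 1-D coordinates
X_rec = np.outer(z, d) + X.mean(axis=0)

print("variance of projected data:", z.var())
print("reconstruction error:", np.linalg.norm(X - X_rec))
```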
SVD
a)
b)
c)
(symmetric matrices have orthogonal eigenvectors)
Clustering Mixture Models, NMF
Mixture Model and Expectation Maximization
a)
K-Means assigns each data point to exactly one centroid.
A GMM softly assigns a point to several clusters, weighted by the responsibilities.
K-Means assumes spherical clusters of the same size; GMM clusters can fit any ellipsoidal shape (by fitting the covariance matrices). See the sketch below.
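A minimal sketch of the hard-vs-soft assignment difference, with made-up 1-D data and fixed Gaussian parameters (not an exam solution):

```python
import numpy as np

# Made-up 1-D data and two clusters with shared variance.
x = np.array([-2.0, -0.1, 0.1, 2.0])
mu = np.array([-1.0, 1.0])               # cluster centres
sigma2, pi = 1.0, np.array([0.5, 0.5])   # variance, mixing weights

# K-Means: each point goes to exactly one centre.
hard = np.argmin((x[:, None] - mu[None, :]) ** 2, axis=1)

# GMM E-step: responsibilities gamma_{nk} ~ pi_k * N(x_n | mu_k, sigma2).
dens = pi * np.exp(-(x[:, None] - mu[None, :]) ** 2 / (2 * sigma2))
gamma = dens / dens.sum(axis=1, keepdims=True)

print(hard)    # [0 0 1 1]: hard assignments
print(gamma)   # points near 0 get split almost evenly between the clusters
```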
b)
c)
d)
e)
Well, just take the chain rule and differentiate the lower bound. I have no clue about the derivative of a determinant nor of an inverse, and it's only 3 points :P
See Problem 1 question 7 (Exercise 9 2019) http://www.da.inf.ethz.ch/teaching/2019/CIL/exercises/solution09.pdf
Topic Model and Nonnegative Matrix Factorisation
- True: Summary sheet formula matches
- True: Summary sheet formula matches (see the sketch after this list)
- Semi-what?
- False: See Non-Negative Matrix Factorization Revisited: Uniqueness and Algorithm for Symmetric Decomposition. The introduction says the smallest inner dimension for which an exact non-negative factorisation exists is the non-negative rank of the matrix, which can be larger than its rank.
- False: The expression on the right-hand side looks strange
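The following is a hedged sketch of the standard Lee-Seung multiplicative updates for NMF with the Frobenius objective; whether these match the summary-sheet formulas the first two statements refer to is not verified here:

```python
import numpy as np

# Made-up non-negative data and a rank-K factorisation X ~ W H.
rng = np.random.default_rng(2)
X = rng.random((6, 8))
K = 3
W, H = rng.random((6, K)), rng.random((K, 8))

eps = 1e-12                                 # avoid division by zero
for _ in range(200):
    # Multiplicative updates keep W and H non-negative.
    H *= (W.T @ X) / (W.T @ W @ H + eps)
    W *= (X @ H.T) / (W @ H @ H.T + eps)

print(np.linalg.norm(X - W @ H))            # should have decreased
```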
Word embeddings, Block Chain and Neural Networks
a)
b)
Rectified Linear Unit: $\mathrm{ReLU}(x) = \max(0, x)$
Sigmoid: $\sigma(x) = \frac{1}{1 + e^{-x}}$
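A small sketch of the two activation functions (their derivatives are noted in the comments):

```python
import numpy as np

def relu(x):
    # Rectified Linear Unit: max(0, x); derivative is 1 for x > 0, else 0.
    return np.maximum(0.0, x)

def sigmoid(x):
    # Logistic sigmoid: 1 / (1 + exp(-x)); derivative is s * (1 - s).
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-3, 3, 7)
print(relu(x))
print(sigmoid(x))
```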
c)
- localized features
- translation invariance
- weight sharing -> fewer parameters (see the sketch after this list)
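A rough back-of-the-envelope comparison (layer sizes made up) of how many parameters a 3x3 convolutional layer needs versus a fully connected layer on the same input, illustrating the weight-sharing point:

```python
# Made-up layer shape: 32x32 input with 3 channels, 16 output channels.
H, W, C_in, C_out, k = 32, 32, 3, 16, 3

conv_params = C_out * (k * k * C_in + 1)                    # 3x3 kernels + bias
dense_params = (H * W * C_in) * (H * W * C_out) + H * W * C_out

print(conv_params)    # 448
print(dense_params)   # ~50 million
```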
Sparse Coding and Robust PCA
Fourier and wavelet transform
Part 1
a)
- Easily invertible by transposing
- Length preservation of signals
b)
- Keep only the low-frequency Fourier coefficients
- Apply the inverse Fourier transform to obtain the compressed signal (see the sketch after this list)
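A minimal NumPy sketch of this compression pipeline on a made-up signal (the number of kept frequencies is arbitrary):

```python
import numpy as np

# Made-up signal: a low-frequency sine plus a little noise.
n = 256
t = np.arange(n)
x = np.sin(2 * np.pi * 3 * t / n) + 0.1 * np.random.default_rng(3).normal(size=n)

F = np.fft.fft(x)
keep = 8                                  # number of low frequencies to keep
F_compressed = np.zeros_like(F)
F_compressed[:keep] = F[:keep]
F_compressed[-keep + 1:] = F[-keep + 1:]  # keep the symmetric negative frequencies

x_rec = np.fft.ifft(F_compressed).real
print(np.linalg.norm(x - x_rec) / np.linalg.norm(x))   # small relative error
```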
c)
High frequency signals in the scarf and blanket patterns
d)
One can capture local extrema without causing a periodic repetition of them, so it is better suited for non-periodic signals.
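A one-level Haar transform sketch (made-up signal, not the exam's notation) showing that a single spike only touches one wavelet coefficient, while its Fourier transform spreads over all frequencies:

```python
import numpy as np

# A local spike in an otherwise zero signal.
x = np.zeros(16)
x[5] = 1.0

# One level of the Haar wavelet transform: pairwise averages and differences.
approx = (x[0::2] + x[1::2]) / np.sqrt(2)
detail = (x[0::2] - x[1::2]) / np.sqrt(2)

print(np.flatnonzero(detail))                            # only one non-zero detail coefficient
print(np.count_nonzero(np.abs(np.fft.fft(x)) > 1e-12))   # all 16 Fourier coefficients are non-zero
```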
e)
- True
- False: if the rank is full, then the determinant is either 1 or -1
- True: Length preservation
- False: Why shouldn't it?
Part 2
a)
b)
We also demand that the solution be sparse, i.e. that it contains as many zero entries as possible.
c)
- Overcomplete dictionaries
- Dictionary learning
Alternative solution:
- Matching pursuit (see the sketch after this list)
- Convex relaxation (use 1-norm instead of 0-norm)
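A minimal matching-pursuit sketch on a made-up overcomplete dictionary (dictionary size and signal are illustrative, not from the exam):

```python
import numpy as np

# Made-up overcomplete dictionary with unit-norm atoms.
rng = np.random.default_rng(4)
D = rng.normal(size=(32, 64))
D /= np.linalg.norm(D, axis=0)

# Synthetic signal built from 3 atoms.
true_idx = [3, 17, 40]
x = D[:, true_idx] @ np.array([1.5, -2.0, 0.7])

z = np.zeros(D.shape[1])                   # sparse code
r = x.copy()                               # residual
for _ in range(10):
    corr = D.T @ r
    i = np.argmax(np.abs(corr))            # pick the best-matching atom
    z[i] += corr[i]
    r -= corr[i] * D[:, i]

print(np.flatnonzero(np.abs(z) > 1e-3))    # mostly the three true atoms
print(np.linalg.norm(r))                   # small residual
```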
d)
- Signal compression
- Denoising of signals
Robust PCA
Because he is not moving.
Note: not covered in 2018
Optimisation
Convex Optimisation
- True
- False: (1,0) and (0,1) are in the set but (0.5, 0.5) is not
- False: and
- False: (1,0) and (0,1)
Lagrange Duality
Dual function not covered in 2018
Optimisation methods
Part 1
a)
(0,0)
b)
c)
1 step is needed. 1 step is great
d)
The iterate oscillates between two points and takes longer until convergence (actually infinitely many steps).
e)
Not too large, not too small (very math-y, I know).
(Maybe they meant the conditions ? Wasn't covered this year, though.) The effect of the step size is sketched below.
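To illustrate c)-e), here is a sketch of gradient descent on the made-up quadratic f(x) = x^2/2 (not the exam's function): step size 1 converges in one step, step size 2 oscillates forever, and a small step size converges slowly.

```python
# Gradient descent on f(x) = 0.5 * x^2, whose gradient is x.
def run(step, x0=4.0, iters=10):
    x = x0
    path = [x]
    for _ in range(iters):
        x = x - step * x        # x_{t+1} = x_t - eta * f'(x_t)
        path.append(x)
    return path

print(run(1.0, iters=3))   # [4.0, 0.0, 0.0, 0.0]  -> converges in one step
print(run(2.0, iters=5))   # [4.0, -4.0, 4.0, ...] -> oscillates between two points
print(run(0.1, iters=5))   # shrinks slowly towards 0
```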
Part 2
Coordinate Descent?
AFAIK not covered in 2018