Solution 2 Computational Intelligence Lab 2011

Problem 1

1. 6 x 6, correlation between movies
2. 5 x 5, correlation between users
3. U = 6 x 6, D = 6 x 5, V = 5 x 5
4. Identify concepts and find hidden structures. By inserting the average rating we introduce some noise. After the decomposition we hope to recover the "true" values, assuming that the data is explained by fewer dimensions (${\displaystyle k}$).
5. The 3 or 4 highest ones
6. Affinity of the movies to the concepts
7. Affinity of the users to the concepts
8. Plot. Interpretation: "Natural" clustering around the mean of the values of the first principal component (${\displaystyle {\bar {u_{1}}}=-0.4}$)
9. See above.
10. Strengths/expressiveness of each concept
11. ${\displaystyle A_{3}=U_{3}*D_{3}*V_{3}^{T}=\sum _{i=1}^{3}d_{i}*{\vec {u}}_{i}*{\vec {v}}_{i}^{T}}$
12. ${\displaystyle \|A-A_{3}\|_{2}=\sigma _{4}=2.75}$, ${\displaystyle \|A-A_{3}\|_{F}={\sqrt {\sigma _{4}^{2}+\sigma _{5}^{2}}}={\sqrt {(2.75)^{2}+(0.67)^{2}}}\approx 2.83}$
13. ${\displaystyle Bob=[1,*,*,6,*,10]^{T}}$
    (a) Transform into the 2-dimensional representation without doing an SVD again. Since the columns of ${\displaystyle U_{6,2}}$ are orthonormal: ${\displaystyle A_{2}=U_{6,2}*D_{2,2}*V_{2,5}^{T}\leftrightarrow V_{2,5}^{T}=D_{2,2}^{-1}*U_{6,2}^{T}*A_{2}}$
        Fill in the missing values with the average: ${\displaystyle Bob=[1,5.5,5.5,6,5.5,10]^{T}}$
        Representation in the "new coordinates": ${\displaystyle Bob_{2,1}=D_{2,2}^{-1}*U_{6,2}^{T}*Bob_{6,1}=D_{2,2}^{-1}*U_{6,2}^{T}*[1,5.5,5.5,6,5.5,10]^{T}=[-0.46,0.36]^{T}}$
    (b) Find the nearest neighbour (scaled with ${\displaystyle D}$)
14. As long as we do not take the new user data into account, it does not affect our prediction system. We just keep using our existing decomposition.
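
The rank-3 approximation and the error norms from items 11 and 12 can be checked numerically. The following is a minimal sketch using NumPy; the matrix `A` is a hypothetical random 6 x 5 rating matrix (movies x users), not the data from the exercise, and the helper name `rank_k_approximation` is only illustrative.

```python
import numpy as np

def rank_k_approximation(A, k):
    """Return the truncated SVD reconstruction A_k = U_k D_k V_k^T."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
    return A_k, U, s, Vt

# Hypothetical 6 x 5 rating matrix (assumption, NOT the exercise data).
rng = np.random.default_rng(0)
A = rng.integers(1, 11, size=(6, 5)).astype(float)

A_3, U, s, Vt = rank_k_approximation(A, 3)

# Equivalent form: sum of the first three rank-1 terms d_i * u_i * v_i^T.
A_3_sum = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(3))

# Error norms of the approximation.
err_2 = np.linalg.norm(A - A_3, 2)      # spectral norm, should equal sigma_4
err_F = np.linalg.norm(A - A_3, "fro")  # Frobenius norm, sqrt(sigma_4^2 + sigma_5^2)

print("sigma:", np.round(s, 2))
print("||A - A_3||_2 =", round(err_2, 4))
print("||A - A_3||_F =", round(err_F, 4))
```

Because NumPy indexes from zero, `s[3]` corresponds to ${\displaystyle \sigma _{4}}$ and `s[4]` to ${\displaystyle \sigma _{5}}$.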
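
The fold-in of a new user from item 13 can be sketched in the same way. This is a sketch under stated assumptions: `A` is again a hypothetical random 6 x 5 matrix, so the resulting coordinates will not match the values of the exercise; the function names `fold_in` and `nearest_user` are illustrative; and "scaled with ${\displaystyle D}$" is read here as multiplying each concept coordinate by its singular value before measuring Euclidean distance.

```python
import numpy as np

def fold_in(U, s, x, k):
    """Project a new rating column x into concept space: D_k^{-1} U_k^T x."""
    return (U[:, :k].T @ x) / s[:k]

def nearest_user(Vt, s, coords, k):
    """Index of the existing user closest to coords, using D-scaled coordinates."""
    users = (Vt[:k, :] * s[:k, None]).T  # one row per existing user
    target = coords * s[:k]
    return int(np.argmin(np.linalg.norm(users - target, axis=1)))

# Hypothetical 6 x 5 rating matrix (assumption, NOT the exercise data).
rng = np.random.default_rng(0)
A = rng.integers(1, 11, size=(6, 5)).astype(float)
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Bob's ratings, with NaN marking the missing entries.
bob = np.array([1, np.nan, np.nan, 6, np.nan, 10])
bob_filled = np.where(np.isnan(bob), 5.5, bob)  # fill with the average 5.5

bob_2 = fold_in(U, s, bob_filled, k=2)          # 2-dimensional representation
neighbour = nearest_user(Vt, s, bob_2, k=2)     # most similar existing user

print("Bob in concept space:", np.round(bob_2, 2))
print("nearest existing user (column index):", neighbour)
```

Folding in a column that is already part of `A` reproduces the corresponding column of ${\displaystyle V^{T}}$, which is a quick sanity check for the formula. No new SVD is computed, in line with item 14.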