Homework for Course MSBD5004 Mathematical Methods for Data Analysis, BDT, HKUST
Question 1
Consider the vector space $\mathbb{R}^{n}$.
(a)
Check that $\|x\|_{\infty}=\max_{1 \leq i \leq n} |x_{i}|$ is indeed a norm on $\mathbb{R}^n$.
According to the definition of a norm, we check:
(1)
$\begin{aligned} \| x\|_{\infty} =\max _{1 \leq i \leq n}\left|x_{i}\right| \geq 0, \forall x \in \mathbb{R}^{n}\end{aligned} $
and $\| x\|_{\infty} = 0 \Leftrightarrow |x_i|=0 \text{ for all } i \Leftrightarrow x=\overrightarrow{0}$
(2)
$\begin{aligned} \|\alpha x\|_{\infty}=\max _{1 \leq i \leq n}|\alpha x_{i}|=|\alpha| \cdot \max _{1 \leq i \leq n}\left|x_{i}\right|=|\alpha| \cdot \| x\|_{\infty}, \forall x \in \mathbb{R}^{n}, \alpha \in \mathbb{R}\end{aligned}$
(3)
$\begin{aligned}\|x+y\|_{\infty}=\max _{1 \leq i \leq n}\left|x_{i}+y_{i}\right| \leq \max_{1 \leq i \leq n}\left(\left|x_{i}\right|+\left|y_{i}\right|\right) \leq \max_{1 \leq i \leq n} \left| x_i \right| +\max_{1 \leq i \leq n}\left| y_{i}\right|=\|x\|_{\infty}+\|y\|_{\infty}, \quad \forall x, y \in \mathbb{R}^{n}\end{aligned}$
Therefore, $\|x\|_{\infty}=\max _{1 \leq i \leq n}\left|x_{i}\right|$ is a norm on $\mathbb{R}^{n}$
(b)
Prove that for any $x \in \mathbb{R}^{n}$, $\|x\|_{\infty}= \lim_{p \rightarrow \infty} \|x\|_p$
Prove:
$$ \begin{aligned} &\|x\|_{p}=\left(\sum_{i=1}^{n}\left|x_{i}\right|^{p}\right)^{1 / p} \\ &=\left(\left|x_{1}\right|^{p}+\left|x_{2}\right|^{p}+\cdots+\left|x_{n}\right|^{p}\right)^{1 / p} \end{aligned} $$
Let
$$ \left|x_{m}\right|=\max \left\{\left|x_{1}\right|,\left|x_{2}\right|, \cdots,\left|x_{n}\right|\right\} $$
Because (assume $x \neq \overrightarrow{0}$; the case $x=\overrightarrow{0}$ is trivial)
$$ \frac{\left|x_{1} \right|}{\left|x_{m} \right|}, \frac{\left|x_{2} \right|}{\left|x_{m}\right|}, \ldots, \frac{\left|x_{n}\right|}{\left|x_{m}\right|} \leq 1 $$
Then, since the term with index $m$ inside the bracket equals $1$, we get the inequality
$$ 1 \cdot |x_m| \leq \|x\|_{p}=\left[\left|\frac{x_1}{x_{m}}\right|^{p}+\left|\frac{x_2}{x_{m}}\right|^{p}+ \cdots + \left|\frac{x_n}{x_{m}}\right|^{p} \right]^{\frac{1}{p}} \cdot |x_m| \leq n^{\frac{1}{p}} \cdot |x_m| $$
When $p \rightarrow \infty$, $n^{\frac{1}{p}} \rightarrow 1$, so by the squeeze theorem $\| x \|_p \rightarrow |x_m|$
Finally
$\lim _{p \rightarrow \infty}\|x\|_{p}=\left|x_{m}\right|=\max _{1 \leq i \leq n}\left|x_{i}\right| = \|x\|_\infty$. Hence proved.
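As a quick numerical illustration of part (b) (not part of the proof; assumes NumPy, and the example vector is an arbitrary choice), the $p$-norm of a fixed vector approaches its $\infty$-norm as $p$ grows:

```python
import numpy as np

# Fixed example vector; its infinity norm is max(|3|, |-1|, |2|, |-4|) = 4.
x = np.array([3.0, -1.0, 2.0, -4.0])
inf_norm = np.max(np.abs(x))
for p in [1, 2, 8, 32, 128]:
    # ||x||_p computed directly from the definition
    p_norm = np.sum(np.abs(x) ** p) ** (1.0 / p)
    print(f"p = {p:3d}: ||x||_p = {p_norm:.6f}")
print(f"||x||_inf = {inf_norm}")
```

The printed values decrease toward $4 = \|x\|_\infty$, consistent with the squeeze $|x_m| \leq \|x\|_p \leq n^{1/p}|x_m|$.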
(c)
Prove the equivalence
$$\|x\|_{\infty} \leqslant \|x\|_{1} \leqslant n\|x\|_{\infty}, \quad \forall x \in \mathbb{R}^{n}$$
Prove: According to the definition
Obviously, $\max_{1 \leq i \leq n} \left|x_{i}\right|$ is one of $\left\{\left|x_{1}\right|,\left|x_{2}\right|, \cdots,\left|x_{n}\right|\right\}$
So, $\| x \|_{1}=\sum_{i=1}^{n}\left|x_{i}\right| \geqslant \max _{1 \leq i \leq n}\left|x_{i}\right|= \|x\|_\infty$
Then
$$ \left|x_{1}\right| \leq \max_{1 \leq i \leq n} |x_i|\\ \left|x_{2}\right| \leq \max_{1 \leq i \leq n} |x_i|\\ \vdots \\ \left|x_{n}\right| \leq \max_{1 \leq i \leq n}|x_i| $$
Therefore $\|x\|_{1}=\sum_{i=1}^{n}\left|x_{i}\right| \leq n \cdot \max_{1 \leq i \leq n} |x_i|=n \cdot\|x\|_{\infty}$
Altogether, $\|x\|_{\infty} \leqslant\|x\|_{1} \leqslant n \cdot\|x\|_{\infty}$
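A small numerical check of this inequality chain on random vectors (illustration only; assumes NumPy, and the dimension $n = 5$ is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5  # arbitrary dimension
for _ in range(100):
    x = rng.standard_normal(n)
    norm_inf = np.max(np.abs(x))
    norm_1 = np.sum(np.abs(x))
    # ||x||_inf <= ||x||_1 <= n * ||x||_inf
    assert norm_inf <= norm_1 <= n * norm_inf
print("inequality holds on all 100 samples")
```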
Question 2
For any $A \in \mathbb{R}^{m \times n}$, we have defined
$$\|A\|_{2}=\sup _{x \in \mathbb{R}^{n}, x \neq 0} \quad \frac{\|A x\|_{2}}{\|x\|_{2}}$$
(a)
Prove that
$$ \|A\|_{2}=\max _{x \in \mathbb{R}^n, \|x\|_2=1}\|A x\|_{2} $$
Prove :
$$ \begin{aligned} \|A\|_2 &= \sup _{x \in \mathbb{R}^{n}, x \neq 0}\frac{\|A x\|_{2}}{\|x\|_{2}}\\ &= \sup _{x \in \mathbb{R}^{n}, x \neq 0}\left\|\frac{A x}{\|x\|_{2}}\right\|_{2} \\ &= \sup _{x \in \mathbb{R}^{n}, x \neq 0}\left\|A \cdot \frac{x}{\|x\|_{2}}\right\|_{2} \end{aligned} $$
Note that $\frac{x}{\|x\|_2}$ has unit length. So we can express
$$\|A\|_2=\sup _{x \in \mathbb{R}^{n},\|x\|_{2}=1} \|A x \|_{2}$$
However, $\|Ax\|_2$ is a continuous function of $x$, and the unit sphere $\{x : \|x\|_{2}=1\}$ is closed and bounded. On a closed and bounded set, a continuous function always attains its maximum and minimum values.
Hence, the 'sup' can be replaced by 'max'. i.e.
$$ \|A\|_{2}=\max _{x \in R^{n},\|x\|_{2}=1}\|A x\|_{2} $$
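As an illustration of this characterization (not part of the proof; assumes NumPy, and the $3 \times 2$ matrix is an arbitrary example), sampling random unit vectors gives lower estimates of $\max_{\|x\|_2=1}\|Ax\|_2$ that approach `np.linalg.norm(A, 2)`, the largest singular value of $A$:

```python
import numpy as np

rng = np.random.default_rng(1)
A = np.array([[1.0, 2.0], [0.0, 3.0], [1.0, -1.0]])  # arbitrary 3x2 example
exact = np.linalg.norm(A, 2)      # spectral norm = largest singular value
best = 0.0
for _ in range(20000):
    x = rng.standard_normal(2)
    x /= np.linalg.norm(x)        # a random point on the unit circle
    best = max(best, np.linalg.norm(A @ x))
print(f"exact ||A||_2 = {exact:.6f}, sampled max ||Ax||_2 = {best:.6f}")
```

Every sampled value is at most `exact`, and the running maximum gets very close to it, matching the `sup = max` claim.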
(b)
Prove that $\|\cdot\|_2$ is a norm on $\mathbb{R}^{m \times n}$
Prove: According to the definition of norm we check
(1) $\forall A \in \mathbb{R}^{m \times n}$
$$ \begin{aligned} \|A\|_{2}=\max _{x \in \mathbb{R}^n, \|x\|_2=1}\|A x\|_{2} \geqslant 0 \end{aligned} $$
When $\begin{aligned}\max _{x \in \mathbb{R}^n, \|x\|_2=1}\|A x\|_{2} = 0\end{aligned}$, since $\|A x\|_{2} \geq 0$, we have
$$ \|A x\|_{2} = 0 \quad \text{for all } x \in \mathbb{R}^n \text{ with } \|x\|_2=1 $$
Taking $x = e_i$, the $i$-th standard basis vector, shows that every column $A e_i$ of $A$ is zero, hence $A=O$. The converse is clear.
Thus, $\|A\|_{2} = 0 \Leftrightarrow A=O$
(2) $\forall \alpha \in \mathbb{R}, A \in \mathbb{R}^{m \times n}$
$$ \begin{aligned} \|\alpha A\|_{2} &= \max _{x \in \mathbb{R}^n, \|x\|_2=1}\|\alpha A x\|_{2} \\ &= |\alpha| \cdot \max _{x \in \mathbb{R}^n, \|x\|_2=1} \|A x\|_{2}\\ &= |\alpha| \cdot \|A\|_2 \end{aligned} $$
(3) $\forall A,B \in \mathbb{R}^{m \times n}$
$$ \begin{aligned} \|A + B\|_{2} &= \max _{x \in \mathbb{R}^n, \|x\|_2=1}\|(A+B)x\|_{2} \\ &= \max _{x \in \mathbb{R}^n, \|x\|_2=1}\|Ax+Bx\|_{2} \\ &\leq \max _{x \in \mathbb{R}^n, \|x\|_2=1} \left(\|Ax\|_{2} + \|Bx\|_{2}\right) \\ &= \max _{x \in \mathbb{R}^n, \|x\|_2=1}\|Ax\|_{2} + \max _{x \in \mathbb{R}^n, \|x\|_2=1}\|Bx\|_{2}\\ &= \|A\|_2 + \|B\|_2 \end{aligned} $$
Altogether, $\|\cdot\|_{2}$ is a norm on $\mathbb{R}^{m \times n}$
(c)
Prove that $\|A x\|_{2} \leqslant\|A\|_{2}\|x\|_{2}$ for any $A \in \mathbb{R}^{m \times n}$ and $x \in \mathbb{R}^{n}$
Prove: Consider the definition of matrix 2-norm, $\forall A \in \mathbb{R}^{m \times n}$
$$\|A\|_{2}=\sup _{x \in \mathbb{R}^{n}, x \neq 0} \quad \frac{\|A x\|_{2}}{\|x\|_{2}}$$
For $x \neq 0$, the supremum bounds the ratio, i.e. $\frac{\|A x\|_{2}}{\|x\|_{2}} \leq \|A\|_2$, so
$$ \|A x\|_{2} \leqslant \|A\|_2 \cdot \|x\|_{2} $$
and for $x = 0$ both sides equal zero.
Hence proved
(d)
Prove that $\|A B\|_{2} \leqslant\|A\|_{2} \cdot\|B\|_{2}$ for all $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{n \times p}$
Prove: In part (c) we proved $\|A x\|_{2} \leqslant\|A\|_{2}\|x\|_{2}$ for $A \in \mathbb{R}^{m \times n}$ and $x \in \mathbb{R}^{n}$.
We use this inequality twice. Let $B \in \mathbb{R}^{n \times p}$ and $y \in \mathbb{R}^p$:
$$ \begin{aligned} \|A B y\|_{2} &=\|A(B y)\|_{2} \\ &\leqslant\|A\|_{2} \cdot\|B y\|_{2} \\ &\leqslant\|A\|_{2} \cdot \|B\|_{2} \cdot\|y\|_{2} \end{aligned} $$
Then for $y \neq 0$
$$ \begin{array}{l} \frac{\|A B y\|_{2}}{\|y\|_{2}} \leqslant\|A\|_{2} \cdot \|B\|_2 \\ \left\|A B \cdot \frac{y}{\|y\|_{2}}\right\|_{2} \leqslant\|A\|_{2} \cdot\|B\|_{2} \end{array} $$
Taking the supremum over all $y \neq 0$ gives $\|A B\|_{2} \leqslant\|A\|_{2} \cdot \| B \|_{2}$. Hence proved.
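A numerical sanity check of parts (c) and (d) on random matrices (illustration only; assumes NumPy, and the shapes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3))   # arbitrary shapes
B = rng.standard_normal((3, 5))
x = rng.standard_normal(3)
norm = lambda M: np.linalg.norm(M, 2)   # matrix 2-norm (spectral norm)
# part (c): ||Ax||_2 <= ||A||_2 ||x||_2
assert np.linalg.norm(A @ x) <= norm(A) * np.linalg.norm(x) + 1e-9
# part (d): ||AB||_2 <= ||A||_2 ||B||_2
assert norm(A @ B) <= norm(A) * norm(B) + 1e-9
print("both inequalities hold")
```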
Question 3
Let $a_{1}, a_{2}, \cdots, a_{m}$ be $m$ given real numbers. Prove that a median of $a_{1}, a_{2}, \dots, a_{m}$ minimizes
$$|a_{1} - b|+\left|a_{2}-b\right|+\cdots+\left|a_{m}-b\right|$$
over all $b \in \mathbb{R}$
Prove: Without loss of generality, assume $a_{1} \leqslant a_{2} \leqslant \cdots \leqslant a_{m}$, and let
$$ f(b)=\left|a_{1}-b\right|+\left|a_{2}-b\right|+\cdots+\left|a_{m}-b\right| $$
Pair the terms from the outside in. For each $i$ with $1 \leq i \leq m/2$, the triangle inequality gives
$$ \left|a_{i}-b\right|+\left|a_{m+1-i}-b\right| \geqslant a_{m+1-i}-a_{i} $$
with equality if and only if $a_{i} \leqslant b \leqslant a_{m+1-i}$. Summing over all such pairs (when $m$ is odd, the leftover middle term satisfies $\left|a_{(m+1)/2}-b\right| \geqslant 0$), we get
$$ f(b) \geqslant \sum_{1 \leq i \leq m/2}\left(a_{m+1-i}-a_{i}\right) $$
A median $b^{*}$ lies in every interval $\left[a_{i}, a_{m+1-i}\right]$, and when $m$ is odd it equals the middle value $a_{(m+1)/2}$, so $f(b^{*})$ attains this lower bound.
i.e. a median minimizes $\left|a_{1}-b\right|+\left|a_{2}-b\right|+\dots+\left|a_{m}-b\right|$ over all $b \in \mathbb{R}$
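A brute-force illustration (assumes NumPy; the data set is an arbitrary example): over a fine grid of candidate values of $b$, the sum of absolute deviations is smallest at the median:

```python
import numpy as np

a = np.array([1.0, 2.0, 2.0, 5.0, 9.0])   # arbitrary example data
median = np.median(a)
f = lambda b: np.sum(np.abs(a - b))       # sum of absolute deviations
candidates = np.linspace(a.min(), a.max(), 2001)
best_b = candidates[np.argmin([f(b) for b in candidates])]
print(f"median = {median}, best grid point = {best_b}, f(median) = {f(median)}")
```

For this data the minimizer on the grid coincides with the median $2.0$, where $f(2.0) = 1 + 0 + 0 + 3 + 7 = 11$.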
Question 4
Suppose that the vectors $x_{1}, \ldots, x_{N}$ in $\mathbb{R}^{n}$ are clustered using the K-means algorithm, with group representatives $z_1, \cdots, z_{k}$
(a) Suppose the original vectors $x_i$ are nonnegative, i.e. their entries are nonnegative. Explain why the representatives $z_j$ output by the K-means algorithm are also nonnegative.
Answer: Consider Step 2 of the algorithm:
Fix groups $G_{1}, G_{2}, \cdots, G_{k}$ and solve
$$ \min _{z_1, \cdots, z_k} \sum_{j=1}^{k}\left(\sum_{i \in G_j}\left\|x_{i}-z_{j}\right\|_{2}^{2}\right) $$
whose minimizer is the mean of each group:
$$ z_{j}=\frac{1}{\left|G_{j}\right|}\left(\sum_{i \in G_j} x_{i}\right) $$
Because the $x_i$ are nonnegative, this equation shows that each entry of $z_j$ is an average of nonnegative numbers, so $z_j$ is also nonnegative.
(b) Suppose the original vectors $x_i$ represent proportions, i.e. their entries are nonnegative and sum to one. Explain why the representatives $z_j$ output by the K-means algorithm also represent proportions.
Answer: Still consider the equation above; write it out entrywise:
$$ z_{j}=\frac{1}{\left|G_{j}\right|} \sum_{i \in G_{j}}\left(\begin{array}{l} x_{i1} \\ x_{i2} \\ \vdots \\ x_{in} \end{array}\right) =\frac{1}{\left|G_{j}\right|} \left(\begin{array}{c} \sum_{i \in G_{j}} x_{i1} \\ \sum_{i \in G_{j}} x_{i2} \\ \vdots\\ \sum_{i \in G_{j}} x_{in} \end{array}\right) $$
Since $x_{i1}+x_{i 2}+\cdots+x_{i n}=1$ for every $i$, we have $\sum_{i \in G_{j}} x_{i1}+\sum_{i \in G_{j}} x_{i2}+\cdots+ \sum_{i \in G_{j}} x_{in} = |G_j|$
We can get
$$ \frac{1}{\left|G_{j}\right|} \left(\sum_{i \in G_{j}} x_{i1}+\sum_{i \in G_{j}} x_{i2}+\cdots+ \sum_{i \in G_{j}} x_{in}\right)=1 $$
and the entries of $z_j$ are nonnegative by part (a). So the entries of $z_j$ are nonnegative and sum to one. Hence proved.
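A small illustration of parts (a) and (b) (assumes NumPy; the three vectors form a toy group $G_j$): the group mean of proportion vectors is again a proportion vector:

```python
import numpy as np

# Toy group G_j: each row is a proportion vector (nonnegative, sums to 1).
G = np.array([[0.2, 0.5, 0.3],
              [0.1, 0.1, 0.8],
              [0.6, 0.2, 0.2]])
z = G.mean(axis=0)    # the representative z_j is the group mean
print(z, z.sum())
assert np.all(z >= 0) and abs(z.sum() - 1.0) < 1e-9
```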
(c) Suppose the original vectors $x_i$ are Boolean, i.e. their entries are either 0 or 1. Give an interpretation of $(z_j)_i$, the $i$-th entry of the group representative $z_j$.
Answer: The representative is the group mean, so
$$ (z_j)_i = \frac{1}{|G_j|} \sum_{k \in G_j} (x_k)_i $$
Since each $(x_k)_i$ is either 0 or 1, $(z_j)_i$ is the fraction of vectors in group $j$ whose $i$-th entry equals 1. In particular:
- $(z_j)_i = 0$ means no vector in group $j$ has a 1 in entry $i$
- $(z_j)_i = 1$ means every vector in group $j$ has a 1 in entry $i$
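A small illustration of this interpretation (assumes NumPy; the four Boolean vectors form a toy group): the $i$-th entry of the group mean is the fraction of vectors with a 1 in position $i$:

```python
import numpy as np

# Toy group: each row is a Boolean vector.
G = np.array([[1, 0, 1],
              [1, 1, 0],
              [0, 1, 0],
              [1, 0, 0]], dtype=float)
z = G.mean(axis=0)    # entry i = fraction of rows with a 1 in position i
print(z)              # the fractions are 0.75, 0.5, 0.25
```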