Multivariate Normal Distribution

Abstract

This article derives the probability density function of the multivariate normal distribution.

Independent Multivariate Normal Distribution

To derive the general form of the multivariate normal distribution, we start from the simplest case. Suppose we have a set of mutually independent random variables
\[ X_i \sim \mathcal{N}(0, \sigma_i^2) \quad \forall i = 1, 2, \dots, n \]
with density functions
\[ f_{X_i}(x) = \frac{1}{\sqrt{2\pi}\sigma_i}\exp\left(-\frac{x^2}{2\sigma_i^2}\right) \]
Because the variables are independent, the joint probability density function is the product of the individual densities:
\[ \begin{aligned} f_{\mathbf X}(\mathbf{x}) &= \prod_{i=1}^n f_{X_i}(x_i)\\ &= \prod_{i=1}^n \frac{1}{\sqrt{2\pi}\sigma_i}\exp\left(-\frac{x_i^2}{2\sigma_i^2}\right)\\ &= \frac{1}{(\sqrt{2\pi})^n} \times \frac{1}{\prod_{i=1}^n\sqrt{\sigma_i^2}} \times \exp\left(-\sum_{i=1}^n\frac{x_i^2}{2\sigma_i^2}\right)\\ &= (2\pi)^{-\frac{n}{2}} \times \frac{1}{\sqrt{\det(\Sigma)}} \times \exp\left(-\frac{1}{2}\mathbf{x}^T \text{diag}\left(\frac{1}{\sigma_1^2}, \frac{1}{\sigma_2^2}, \dots, \frac{1}{\sigma_n^2}\right)\mathbf{x}\right)\\ &= (2\pi)^{-\frac{n}{2}}\det(\Sigma)^{-\frac{1}{2}}\exp\left(-\frac{1}{2}\mathbf{x}^T\Sigma^{-1}\mathbf{x}\right) \end{aligned} \]
where
\[ \begin{aligned} \Sigma &= \text{diag}(\sigma_1^2, \sigma_2^2, \dots, \sigma_n^2)\\ \det(\Sigma) &= \sigma_1^2\sigma_2^2 \cdots \sigma_n^2\\ \Sigma^{-1} &= \text{diag}\left(\frac{1}{\sigma_1^2}, \frac{1}{\sigma_2^2}, \dots, \frac{1}{\sigma_n^2}\right) \end{aligned} \]
We have therefore obtained the multivariate normal distribution for the case where the components are mutually independent and have mean 0:
\[ \mathbf{X} \sim \mathcal{N}(\mathbf{0}, \Sigma) \]
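The equality between the product of marginal densities and the matrix form can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names and the sample values are illustrative assumptions, not part of the original derivation.

```python
import math

def univariate_pdf(x, sigma):
    """Density of N(0, sigma^2) evaluated at x."""
    return math.exp(-x * x / (2 * sigma * sigma)) / (math.sqrt(2 * math.pi) * sigma)

def product_pdf(xs, sigmas):
    """Joint density as the product of the marginals (valid by independence)."""
    p = 1.0
    for x, s in zip(xs, sigmas):
        p *= univariate_pdf(x, s)
    return p

def diagonal_matrix_pdf(xs, sigmas):
    """Same density in matrix form, with Sigma = diag(sigma_1^2, ..., sigma_n^2)."""
    n = len(xs)
    det_sigma = math.prod(s * s for s in sigmas)             # det of a diagonal matrix
    quad = sum(x * x / (s * s) for x, s in zip(xs, sigmas))  # x^T Sigma^{-1} x
    return (2 * math.pi) ** (-n / 2) * det_sigma ** -0.5 * math.exp(-0.5 * quad)

# Illustrative sample point and standard deviations (assumed values)
xs = [0.3, -1.2, 2.0]
sigmas = [1.0, 0.5, 2.0]
print(product_pdf(xs, sigmas), diagonal_matrix_pdf(xs, sigmas))
```

Both printed values should agree up to floating-point rounding, since the two functions evaluate the first and last lines of the derivation respectively.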

Affine Transformations of Random Variables

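Consider the affine transformation \[ L(\mathbf{x}) = A\mathbf{x} + \mathbf{b} \] where \(A\) is an invertible \(n \times n\) matrix and \(\mathbf{b}\) is a constant vector of length \(n\).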

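Denote the transformed random vector by \(\mathbf{Y}\). Its expectation is
\[ \begin{aligned} \mathbf{Y} &= L(\mathbf{X})\\ E\mathbf{Y} &= E(A\mathbf{X} + \mathbf{b})\\ &= A(E\mathbf{X}) + \mathbf{b}\\ &= L(E\mathbf{X}) \end{aligned} \]
and its covariance is
\[ \begin{aligned} \Sigma_{\mathbf Y} &= E[(\mathbf{Y} - E\mathbf{Y})(\mathbf{Y} - E\mathbf{Y})^T]\\ &= E[\big(A\mathbf{X} + \mathbf{b} - (AE\mathbf{X} + \mathbf{b})\big)\big(A\mathbf{X} + \mathbf{b} - (AE\mathbf{X} + \mathbf{b})\big)^T]\\ &= E[A(\mathbf{X} - E\mathbf{X})\big(A(\mathbf{X} - E\mathbf{X})\big)^T]\\ &= E[A(\mathbf{X} - E\mathbf{X})(\mathbf{X} - E\mathbf{X})^TA^T]\\ &= A\Sigma_{\mathbf X}A^T \end{aligned} \]

With these results in hand, suppose we have a random vector \(\mathbf{Z}\) that follows a multivariate normal distribution with mutually independent components, each with mean \(0\) and variance \(1\),
\[ \mathbf{Z} \sim \mathcal{N}(\mathbf{0}, I) \]
and a random vector \(\mathbf{Y}\) that follows a general multivariate normal distribution:
\[ \mathbf{Y} \sim \mathcal{N}(\mu_{\mathbf Y}, \Sigma_{\mathbf Y}) \]
We look for an affine transformation \(L(\mathbf{x}) = A\mathbf{x} + \mathbf{b}\) such that
\[ \begin{aligned} \mu_{\mathbf Y} &= \mathbf{b}\\ \Sigma_{\mathbf Y} &= AIA^T \end{aligned} \]
\(\Sigma_{\mathbf Y}\) is a real symmetric matrix, so it has the eigendecomposition
\[ \Sigma_{\mathbf Y} = Q \Lambda Q^{-1} = Q \Lambda Q^{T} \]
where \(Q\) is orthogonal and \(\Lambda = \text{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)\) holds the eigenvalues. We assume \(\Sigma_{\mathbf Y}\) is positive definite, so every \(\lambda_i > 0\). Define the \(n \times n\) matrix \(U\):
\[ \begin{aligned} U &= Q\,\text{diag}(\sqrt{\lambda_1}, \sqrt{\lambda_2}, \dots, \sqrt{\lambda_n})\\ UU^T &= Q\,\text{diag}(\sqrt{\lambda_1}, \sqrt{\lambda_2}, \dots, \sqrt{\lambda_n})\,\text{diag}(\sqrt{\lambda_1}, \sqrt{\lambda_2}, \dots, \sqrt{\lambda_n})^TQ^T\\ &= Q\Lambda Q^T = \Sigma_{\mathbf Y} \end{aligned} \]
Taking \(A = U\) and \(\mathbf{b} = \mu_{\mathbf Y}\) gives the required affine transformation:
\[ \mathbf{y} = U\mathbf{z} + \mu_{\mathbf Y} \]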
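The construction of \(U\) can be illustrated for \(n = 2\), where the eigendecomposition has a closed form via a single rotation angle. This is a sketch under stated assumptions: the helper names `factor_2x2` and `times_transpose` and the sample covariance matrix are hypothetical, and the closed-form rotation is one convenient way to obtain \(Q\) in two dimensions, not a general-purpose eigensolver.

```python
import math

def factor_2x2(sigma):
    """Return U with U U^T = sigma for a symmetric positive-definite 2x2 matrix,
    built as U = Q diag(sqrt(lambda_1), sqrt(lambda_2))."""
    a, b = sigma[0]
    c = sigma[1][1]
    theta = 0.5 * math.atan2(2 * b, a - c)   # rotation angle that diagonalizes sigma
    cs, sn = math.cos(theta), math.sin(theta)
    # Eigenvalues computed as q_i^T sigma q_i for the columns q_i of Q
    lam1 = a * cs * cs + 2 * b * cs * sn + c * sn * sn
    lam2 = a * sn * sn - 2 * b * cs * sn + c * cs * cs
    q = [[cs, -sn], [sn, cs]]                # orthogonal matrix of eigenvectors
    r1, r2 = math.sqrt(lam1), math.sqrt(lam2)
    # Scale column i of Q by sqrt(lambda_i)
    return [[q[0][0] * r1, q[0][1] * r2],
            [q[1][0] * r1, q[1][1] * r2]]

def times_transpose(m):
    """Compute M M^T for a 2x2 matrix."""
    return [[sum(m[i][k] * m[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Illustrative covariance matrix (assumed values, positive definite)
sigma_y = [[2.0, 0.6], [0.6, 1.0]]
u = factor_2x2(sigma_y)
reconstructed = times_transpose(u)
print(reconstructed)
```

The printed matrix should reproduce `sigma_y` up to rounding, which is the identity \(UU^T = \Sigma_{\mathbf Y}\) used above.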

Affine Transformations and Probability Density Functions

We now have the multivariate normal distribution in the restricted case, \(\mathbf{Z} \sim \mathcal{N}(\mathbf{0}, I)\), and the required affine transformation \(\mathbf{Y} = L(\mathbf{Z})\).

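We therefore use the change-of-variables formula for densities under an affine transformation to extend the density from the restricted case to the general case:
\[ \begin{aligned} f_{\mathbf Y}(\mathbf{y}) &= f_{\mathbf Z}(\mathbf{z})\left|\det\left(\frac{d\mathbf{z}}{d\mathbf{y}}\right)\right|\\ &= f_{\mathbf Z}\big(L^{-1}(\mathbf{y})\big)\left|\det\left(\frac{d\mathbf{z}}{d\mathbf{y}}\right)\right| \end{aligned} \]
where \(L^{-1}(\mathbf{y})\) is the inverse of the transformation \(L(\mathbf{x})\):
\[ \begin{aligned} \mathbf{z} &= U^{-1}(\mathbf{y} - \mu_{\mathbf Y})\\ \frac{d\mathbf{z}}{d\mathbf{y}} &= U^{-1}\\ \det(U^{-1}) &= \frac{1}{\det(U)}\\ \det(\Sigma_{\mathbf Y}) &= \det(UU^T) = \det(U)\det(U^{T}) = \det(U)^2 > 0\\ |\det(U^{-1})| &= \frac{1}{\sqrt{\det(\Sigma_{\mathbf Y})}} \end{aligned} \]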
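The determinant identity \(|\det(U^{-1})| = 1/\sqrt{\det(\Sigma_{\mathbf Y})}\) can be verified on a small example. The sketch below assumes an arbitrary invertible \(2 \times 2\) matrix `u` chosen purely for illustration; the helper names are hypothetical.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inverse2(m):
    """Inverse of an invertible 2x2 matrix."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]

def times_transpose(m):
    """Compute M M^T for a 2x2 matrix."""
    return [[sum(m[i][k] * m[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]

u = [[1.5, -0.4], [0.3, 0.8]]        # arbitrary invertible example matrix (assumed)
sigma_y = times_transpose(u)          # Sigma_Y = U U^T
jacobian = abs(det2(inverse2(u)))     # |det(dz/dy)| = |det(U^{-1})|
expected = 1 / math.sqrt(det2(sigma_y))
print(jacobian, expected)
```

The two printed numbers should match up to rounding, confirming that the Jacobian factor depends on \(U\) only through \(\det(\Sigma_{\mathbf Y})\).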

Probability Density Function of the Multivariate Normal Distribution

The probability density function of \(\mathbf{Z} \sim \mathcal{N}(\mathbf{0}, I)\) is
\[ \begin{aligned} f_{\mathbf Z}(\mathbf{z}) &= (2\pi)^{-\frac{n}{2}}\det(\Sigma_{\mathbf Z})^{-\frac{1}{2}}\exp\left(-\frac{1}{2}\mathbf{z}^T\Sigma_{\mathbf Z}^{-1}\mathbf{z}\right)\\ &= (2\pi)^{-\frac{n}{2}}\det(I)^{-\frac{1}{2}}\exp\left(-\frac{1}{2}\mathbf{z}^TI^{-1}\mathbf{z}\right)\\ &= (2\pi)^{-\frac{n}{2}}\exp\left(-\frac{1}{2}\mathbf{z}^T\mathbf{z}\right) \end{aligned} \]
Therefore the probability density function of \(\mathbf{Y} \sim \mathcal{N}(\mu_{\mathbf Y}, \Sigma_{\mathbf Y})\) is
\[ \begin{aligned} f_{\mathbf Y}(\mathbf{y}) &= f_{\mathbf Z}\big(U^{-1}(\mathbf{y} - \mu_{\mathbf Y})\big)|\det(U^{-1})|\\ &= (2\pi)^{-\frac{n}{2}}\exp\Big(-\frac{1}{2}\big(U^{-1}(\mathbf{y} - \mu_{\mathbf Y})\big)^T\big(U^{-1}(\mathbf{y} - \mu_{\mathbf Y})\big)\Big)\frac{1}{\sqrt{\det(\Sigma_{\mathbf Y})}}\\ &= (2\pi)^{-\frac{n}{2}}\det(\Sigma_{\mathbf Y})^{-\frac{1}{2}}\exp\big(-\frac{1}{2}(\mathbf{y} - \mu_{\mathbf Y})^T(U^{-1})^TU^{-1}(\mathbf{y} - \mu_{\mathbf Y})\big)\\ &= (2\pi)^{-\frac{n}{2}}\det(\Sigma_{\mathbf Y})^{-\frac{1}{2}}\exp\big(-\frac{1}{2}(\mathbf{y} - \mu_{\mathbf Y})^T\Sigma_{\mathbf Y}^{-1}(\mathbf{y} - \mu_{\mathbf Y})\big) \end{aligned} \]
The last step uses \((U^{-1})^TU^{-1} = (U^T)^{-1}U^{-1} = (UU^T)^{-1} = \Sigma_{\mathbf Y}^{-1}\).
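As a final sanity check, the density obtained through the change of variables can be compared with the closed-form expression for \(n = 2\). This is a sketch, not the article's own construction: it uses a Cholesky factor for \(U\) instead of the eigendecomposition, which is valid here because the derivation only requires \(UU^T = \Sigma_{\mathbf Y}\) with \(U\) invertible. All names and sample values are assumptions for illustration.

```python
import math

def cholesky2(s):
    """Lower-triangular U with U U^T = s for a symmetric positive-definite 2x2 s."""
    u11 = math.sqrt(s[0][0])
    u21 = s[1][0] / u11
    u22 = math.sqrt(s[1][1] - u21 * u21)
    return [[u11, 0.0], [u21, u22]]

def pdf_via_transform(y, mu, s):
    """f_Y(y) = f_Z(U^{-1}(y - mu)) * |det(U^{-1})| with Z ~ N(0, I)."""
    u = cholesky2(s)
    d0, d1 = y[0] - mu[0], y[1] - mu[1]
    z0 = d0 / u[0][0]                       # forward substitution solves U z = y - mu
    z1 = (d1 - u[1][0] * z0) / u[1][1]
    f_z = math.exp(-0.5 * (z0 * z0 + z1 * z1)) / (2 * math.pi)
    return f_z / abs(u[0][0] * u[1][1])     # det of a triangular matrix is the diagonal product

def pdf_closed_form(y, mu, s):
    """(2 pi)^(-n/2) det(S)^(-1/2) exp(-(y - mu)^T S^{-1} (y - mu) / 2) for n = 2."""
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    d0, d1 = y[0] - mu[0], y[1] - mu[1]
    # Quadratic form written out with the explicit 2x2 inverse of S
    quad = (s[1][1] * d0 * d0 - 2 * s[0][1] * d0 * d1 + s[0][0] * d1 * d1) / det
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))

# Illustrative parameters and evaluation point (assumed values)
mu_y = [1.0, -2.0]
sigma_y = [[2.0, 0.6], [0.6, 1.0]]
y = [0.5, -1.0]
print(pdf_via_transform(y, mu_y, sigma_y), pdf_closed_form(y, mu_y, sigma_y))
```

The two values should agree up to rounding. That the Cholesky factor gives the same density as the eigendecomposition-based \(U\) reflects the fact that the final formula depends only on \(\Sigma_{\mathbf Y}\), not on the particular factor chosen.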