Feature representation

A face is modeled as the sum of two Gaussian variables:

$$x=\mu+\epsilon$$

Here $x$ is the face, $\mu$ is the intrinsic identity, and $\epsilon$ is the within-person variation (lighting, pose, expression, etc.). The latent variables $\mu$ and $\epsilon$ follow the Gaussian distributions $N(0, S_{\mu})$ and $N(0, S_{\epsilon})$.

Take two photos $x_{i}$ and $x_{j}$ as an example:

$$x_{i}=\mu_{i}+\epsilon_{i}, \quad x_{j}=\mu_{j}+\epsilon_{j}, \quad i, j \in\{1,2\}$$

$$\begin{aligned} \operatorname{cov}\left(x_{i}, x_{j}\right) &=\operatorname{cov}\left(\mu_{i}, \mu_{j}\right)+\operatorname{cov}\left(\mu_{i}, \epsilon_{j}\right)+\operatorname{cov}\left(\epsilon_{i}, \mu_{j}\right)+\operatorname{cov}\left(\epsilon_{i}, \epsilon_{j}\right) \\ &=\operatorname{cov}\left(\mu_{i}, \mu_{j}\right)+0+0+\operatorname{cov}\left(\epsilon_{i}, \epsilon_{j}\right) \quad(\text{because } \epsilon \text{ and } \mu \text{ are independent}) \\ &=\operatorname{cov}\left(\mu_{i}, \mu_{j}\right)+\operatorname{cov}\left(\epsilon_{i}, \epsilon_{j}\right) \end{aligned}$$

Deriving the decision criterion

$$p\left(x_{1}, x_{2} \mid H_{I}\right)=\mathcal{N}\left(0, \Sigma_{I}\right), \quad p\left(x_{1}, x_{2} \mid H_{E}\right)=\mathcal{N}\left(0, \Sigma_{E}\right)$$

Under the hypothesis $H_{I}$ (the two faces belong to the same person), the identity variables $\mu_{1}, \mu_{2}$ are identical and $\epsilon_{1}, \epsilon_{2}$ are independent of each other:

$$\Sigma_{I}=\begin{bmatrix} S_{\mu}+S_{\epsilon} & S_{\mu} \\ S_{\mu} & S_{\mu}+S_{\epsilon} \end{bmatrix}$$

Under the hypothesis $H_{E}$ (the two faces belong to different people), both $\mu$ and $\epsilon$ are independent:

$$\Sigma_{E}=\begin{bmatrix} S_{\mu}+S_{\epsilon} & 0 \\ 0 & S_{\mu}+S_{\epsilon} \end{bmatrix}$$

Write the inverses as follows (for $\Sigma_{I}^{-1}$ this is only notation, defining $F$ and $G$):

$$\Sigma_{I}^{-1}=\begin{pmatrix} F+G & G \\ G & F+G \end{pmatrix}, \quad \Sigma_{E}^{-1}=\begin{pmatrix} \left(S_{\mu}+S_{\epsilon}\right)^{-1} & 0 \\ 0 & \left(S_{\mu}+S_{\epsilon}\right)^{-1} \end{pmatrix}$$

Substituting $\Sigma_{E}$ and $\Sigma_{I}$ into the Gaussian density gives (here $m$ is the dimension of the stacked vector $(x_{1}; x_{2})$; the factor cancels in the ratio):

$$\begin{aligned} p\left(x_{1}, x_{2} \mid H_{I}\right)&=\frac{1}{(2 \pi)^{m / 2}\left|\Sigma_{I}\right|^{\frac{1}{2}}} \exp \left(-\frac{1}{2}\begin{pmatrix} x_{1}^{T} & x_{2}^{T} \end{pmatrix} \Sigma_{I}^{-1}\begin{pmatrix} x_{1} \\ x_{2} \end{pmatrix}\right) \\ p\left(x_{1}, x_{2} \mid H_{E}\right)&=\frac{1}{(2 \pi)^{m / 2}\left|\Sigma_{E}\right|^{\frac{1}{2}}} \exp \left(-\frac{1}{2}\begin{pmatrix} x_{1}^{T} & x_{2}^{T} \end{pmatrix} \Sigma_{E}^{-1}\begin{pmatrix} x_{1} \\ x_{2} \end{pmatrix}\right) \end{aligned}$$

The log-likelihood ratio is therefore:

$$\begin{aligned} r\left(x_{1}, x_{2}\right) &=\log \frac{p\left(x_{1}, x_{2} \mid H_{I}\right)}{p\left(x_{1}, x_{2} \mid H_{E}\right)} \\ &=\log \frac{\left|\Sigma_{I}\right|^{-\frac{1}{2}} \exp \left(-\frac{1}{2}\begin{pmatrix} x_{1}^{T} & x_{2}^{T} \end{pmatrix} \Sigma_{I}^{-1}\begin{pmatrix} x_{1} \\ x_{2} \end{pmatrix}\right)}{\left|\Sigma_{E}\right|^{-\frac{1}{2}} \exp \left(-\frac{1}{2}\begin{pmatrix} x_{1}^{T} & x_{2}^{T} \end{pmatrix} \Sigma_{E}^{-1}\begin{pmatrix} x_{1} \\ x_{2} \end{pmatrix}\right)} \\ &=\log \left[\exp \left(\frac{1}{2}\begin{pmatrix} x_{1}^{T} & x_{2}^{T} \end{pmatrix}\left(\Sigma_{E}^{-1}-\Sigma_{I}^{-1}\right)\begin{pmatrix} x_{1} \\ x_{2} \end{pmatrix}\right) \cdot \frac{\left|\Sigma_{I}\right|^{-\frac{1}{2}}}{\left|\Sigma_{E}\right|^{-\frac{1}{2}}}\right] \\ &=\frac{1}{2}\begin{pmatrix} x_{1}^{T} & x_{2}^{T} \end{pmatrix}\left(\Sigma_{E}^{-1}-\Sigma_{I}^{-1}\right)\begin{pmatrix} x_{1} \\ x_{2} \end{pmatrix}+C_{1} \quad\left(\text{here } C_{1}=\log \frac{\left|\Sigma_{I}\right|^{-\frac{1}{2}}}{\left|\Sigma_{E}\right|^{-\frac{1}{2}}}\right) \\ &=\frac{1}{2}\begin{pmatrix} x_{1}^{T} & x_{2}^{T} \end{pmatrix}\left(\begin{pmatrix} \left(S_{\mu}+S_{\epsilon}\right)^{-1} & 0 \\ 0 & \left(S_{\mu}+S_{\epsilon}\right)^{-1} \end{pmatrix}-\begin{pmatrix} F+G & G \\ G & F+G \end{pmatrix}\right)\begin{pmatrix} x_{1} \\ x_{2} \end{pmatrix}+C_{1} \\ &=\frac{1}{2}\begin{pmatrix} x_{1}^{T} & x_{2}^{T} \end{pmatrix}\begin{pmatrix} A & -G \\ -G & A \end{pmatrix}\begin{pmatrix} x_{1} \\ x_{2} \end{pmatrix}+C_{1} \\ &=\frac{1}{2}\left(x_{1}^{T} A x_{1}+x_{2}^{T} A x_{2}-2 x_{1}^{T} G x_{2}\right)+C_{1} \end{aligned}$$

where

$$A=\left(S_{\mu}+S_{\epsilon}\right)^{-1}-(F+G)$$

Deriving $\Sigma_{x}$

Suppose one person has $m$ images:

$$x_{1}=\mu+\epsilon_{1}, \quad x_{2}=\mu+\epsilon_{2}, \quad \ldots, \quad x_{m}=\mu+\epsilon_{m}$$

Stack them as follows, where each entry 1 or 0 of $P$ stands for an identity or zero block of the feature dimension:

$$x=\begin{pmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_{m} \end{pmatrix}_{m \times 1}, \quad P=\begin{pmatrix} 1 & 1 & 0 & \dots & 0 \\ 1 & 0 & 1 & \dots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & 0 & 0 & \dots & 1 \end{pmatrix}_{m \times(m+1)}, \quad h=\begin{pmatrix} \mu \\ \epsilon_{1} \\ \vdots \\ \epsilon_{m} \end{pmatrix}_{(m+1) \times 1}$$

$$x=P h$$

$$h \sim \mathcal{N}\left(0, \Sigma_{h}\right), \quad \Sigma_{h}=\begin{pmatrix} S_{\mu} & 0 & \cdots & 0 \\ 0 & S_{\epsilon} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & S_{\epsilon} \end{pmatrix}$$

Using $\operatorname{Cov}[\mathbf{A} \mathbf{x}, \mathbf{B} \mathbf{y}]=\mathbf{A} \operatorname{Cov}[\mathbf{x}, \mathbf{y}] \mathbf{B}^{T}$:

$$x \sim \mathcal{N}\left(0, \Sigma_{x}\right), \quad \Sigma_{x}=P \Sigma_{h} P^{T}=\begin{pmatrix} S_{\mu}+S_{\epsilon} & S_{\mu} & \dots & S_{\mu} \\ S_{\mu} & S_{\mu}+S_{\epsilon} & \dots & S_{\mu} \\ \vdots & \vdots & \ddots & \vdots \\ S_{\mu} & S_{\mu} & \dots & S_{\mu}+S_{\epsilon} \end{pmatrix}$$

About the derivation
Basis: assume $\mathbf{x} \sim \mathcal{N}_{\mathbf{x}}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, where

$$\mathbf{x}=\begin{bmatrix} \mathbf{x}_{a} \\ \mathbf{x}_{b} \end{bmatrix}, \quad \boldsymbol{\mu}=\begin{bmatrix} \boldsymbol{\mu}_{a} \\ \boldsymbol{\mu}_{b} \end{bmatrix}, \quad \boldsymbol{\Sigma}=\begin{bmatrix} \boldsymbol{\Sigma}_{a} & \boldsymbol{\Sigma}_{c} \\ \boldsymbol{\Sigma}_{c}^{T} & \boldsymbol{\Sigma}_{b} \end{bmatrix}$$

Then the conditional distribution is

$$p\left(\mathbf{x}_{a} \mid \mathbf{x}_{b}\right)=\mathcal{N}_{\mathbf{x}_{a}}\left(\hat{\boldsymbol{\mu}}_{a}, \hat{\boldsymbol{\Sigma}}_{a}\right), \quad \begin{cases} \hat{\boldsymbol{\mu}}_{a}=\boldsymbol{\mu}_{a}+\boldsymbol{\Sigma}_{c} \boldsymbol{\Sigma}_{b}^{-1}\left(\mathbf{x}_{b}-\boldsymbol{\mu}_{b}\right) \\ \hat{\boldsymbol{\Sigma}}_{a}=\boldsymbol{\Sigma}_{a}-\boldsymbol{\Sigma}_{c} \boldsymbol{\Sigma}_{b}^{-1} \boldsymbol{\Sigma}_{c}^{T} \end{cases}$$

Therefore, taking $\mathbf{x}_{a}=h$ and $\mathbf{x}_{b}=x$ (both zero-mean, with $\operatorname{Cov}[h, x]=\Sigma_{h} P^{T}$):

$$E[h \mid x]=\Sigma_{h} P^{T} \Sigma_{x}^{-1} x$$

Solving for F and G

Multiplying out $\Sigma_{I} \Sigma_{I}^{-1}$ and requiring identity blocks on the diagonal and zero blocks off the diagonal gives

$$\left(S_{\mu}+S_{\epsilon}\right)(F+G)+S_{\mu} G=I, \quad S_{\mu}(F+G)+\left(S_{\mu}+S_{\epsilon}\right) G=0$$

Subtracting the second equation from the first yields $S_{\epsilon} F=I$, so

$$F=S_{\epsilon}^{-1}, \quad G=-\left(2 S_{\mu}+S_{\epsilon}\right)^{-1} S_{\mu} S_{\epsilon}^{-1}$$
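The block-inverse notation can be checked numerically. The following is a minimal NumPy sketch, not part of the original post: it uses the closed forms $F=S_{\epsilon}^{-1}$ and $G=-(2S_{\mu}+S_{\epsilon})^{-1}S_{\mu}S_{\epsilon}^{-1}$, builds $A$, and compares the quadratic part of $r(x_1, x_2)$ with a direct evaluation from $\Sigma_{E}^{-1}-\Sigma_{I}^{-1}$. The function names, the random test covariances, and the dimension `d` are illustrative assumptions.

```python
import numpy as np

def joint_bayesian_matrices(S_mu, S_eps):
    """Return (A, G) built from the two covariance matrices."""
    F = np.linalg.inv(S_eps)
    G = -np.linalg.inv(2.0 * S_mu + S_eps) @ S_mu @ F
    A = np.linalg.inv(S_mu + S_eps) - (F + G)
    return A, G

def log_likelihood_ratio(x1, x2, A, G):
    """Quadratic part of r(x1, x2); the constant C_1 is left out."""
    return 0.5 * (x1 @ A @ x1 + x2 @ A @ x2 - 2.0 * x1 @ G @ x2)

def random_spd(d, rng):
    """Hypothetical helper: a random symmetric positive-definite matrix."""
    M = rng.standard_normal((d, d))
    return M @ M.T + d * np.eye(d)

rng = np.random.default_rng(0)
d = 4  # feature dimension (arbitrary choice for the check)
S_mu, S_eps = random_spd(d, rng), random_spd(d, rng)
A, G = joint_bayesian_matrices(S_mu, S_eps)

# Build the full 2d x 2d covariances for both hypotheses.
Z = np.zeros((d, d))
Sigma_I = np.block([[S_mu + S_eps, S_mu], [S_mu, S_mu + S_eps]])
Sigma_E = np.block([[S_mu + S_eps, Z], [Z, S_mu + S_eps]])

x1, x2 = rng.standard_normal(d), rng.standard_normal(d)
x = np.concatenate([x1, x2])

# Direct evaluation of 1/2 * x^T (Sigma_E^{-1} - Sigma_I^{-1}) x.
direct = 0.5 * x @ (np.linalg.inv(Sigma_E) - np.linalg.inv(Sigma_I)) @ x
closed_form = log_likelihood_ratio(x1, x2, A, G)
print(direct, closed_form)
```

The two printed values should agree up to floating-point error. In practice only $A$ and $G$ need to be stored, so scoring a pair never requires inverting the $2d \times 2d$ matrices.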
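Workflow:
1. Initialization
2. EM-like algorithm
   E-step: estimate the latent variables $\mu$ and $\epsilon$ of each identity from the current parameters
   M-step: update the parameters $\{S_{\mu}, S_{\epsilon}\}$
3. Joint Bayesian criterion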
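The workflow can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the original author's implementation: the E-step uses the posterior mean $E[h \mid x]=\Sigma_{h} P^{T} \Sigma_{x}^{-1} x$ with $P$ written out in identity blocks, the M-step takes sample covariances of the estimated $\mu$ and $\epsilon$, and the initialization (covariance of per-identity means and of within-identity residuals) is an assumption, since the post does not specify one. The data are synthetic, assumed zero-mean, and all function names are hypothetical.

```python
import numpy as np

def e_step(X, S_mu, S_eps):
    """Posterior mean of h = (mu, eps_1..eps_m) for one identity.

    X has shape (m, d): the m images of one person.
    Returns mu with shape (d,) and eps with shape (m, d).
    """
    m, d = X.shape
    # P in block form: a column of identity blocks, then a block-diagonal identity.
    P = np.hstack([np.tile(np.eye(d), (m, 1)), np.eye(m * d)])
    Sigma_h = np.zeros(((m + 1) * d, (m + 1) * d))
    Sigma_h[:d, :d] = S_mu
    for i in range(1, m + 1):
        Sigma_h[i * d:(i + 1) * d, i * d:(i + 1) * d] = S_eps
    Sigma_x = P @ Sigma_h @ P.T
    h = Sigma_h @ P.T @ np.linalg.solve(Sigma_x, X.reshape(-1))
    return h[:d], h[d:].reshape(m, d)

def m_step(mus, epss):
    """Update S_mu and S_eps as sample covariances of the estimates."""
    S_mu = np.cov(np.asarray(mus), rowvar=False)
    S_eps = np.cov(np.vstack(epss), rowvar=False)
    return S_mu, S_eps

def train(groups, n_iter=20):
    """groups: list of (m_k, d) arrays, one array per identity."""
    # Assumed initialization: between-identity and within-identity scatter.
    means = np.array([g.mean(axis=0) for g in groups])
    resid = np.vstack([g - g.mean(axis=0) for g in groups])
    S_mu = np.cov(means, rowvar=False)
    S_eps = np.cov(resid, rowvar=False)
    for _ in range(n_iter):
        mus, epss = [], []
        for X in groups:
            mu, eps = e_step(X, S_mu, S_eps)
            mus.append(mu)
            epss.append(eps)
        S_mu, S_eps = m_step(mus, epss)
    return S_mu, S_eps

# Synthetic data: 50 identities, 5 images each, 3-dimensional features.
rng = np.random.default_rng(0)
d, m, n_ids = 3, 5, 50
S_mu_true = np.diag([3.0, 2.0, 1.0])
S_eps_true = 0.5 * np.eye(d)
groups = []
for _ in range(n_ids):
    mu = rng.multivariate_normal(np.zeros(d), S_mu_true)
    eps = rng.multivariate_normal(np.zeros(d), S_eps_true, size=m)
    groups.append(mu + eps)

S_mu, S_eps = train(groups)
print(S_mu)
print(S_eps)
```

Building $\Sigma_{x}$ explicitly costs $O((md)^3)$ per identity, which is fine for a small check but would need a structured inverse for realistic feature sizes. The learned $S_{\mu}$ and $S_{\epsilon}$ are what step 3 then turns into the matrices $A$ and $G$ of the decision criterion.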