下面讨论一个组合问题,假设我们要把 n n n个物体分成 r r r组,每一组有 n 1 , n 2 , ⋯ , n r n_1,n_2,\cdots,n_r n1,n2,⋯,nr个物体,请问一共有多少种分组方式?
首先我们分第一组,从 n n n个物体中取出 n 1 n_1 n1个作为第一组,取法有 C n n 1 C_n^{n_1} Cnn1种;接下来分第二组,从剩余的 n − n 1 n-n_1 n−n1个物体中取出 n 2 n_2 n2个作为第二组,取法有 C n − n 1 n 2 C_{n-n_1}^{n_2} Cn−n1n2种,按这个流程做下去,直到最后剩下 n − n 1 − ⋯ − n r − 1 n-n_1-\cdots-n_{r-1} n−n1−⋯−nr−1个物体,因为 n 1 + n 2 + ⋯ + n r = n n_1+n_2+\cdots+n_r = n n1+n2+⋯+nr=n
因此最后剩下 n r n_r nr个物体,就作为第 r r r组。下面计算一下总的分组方式数目: C n n 1 C n − n 1 n 2 ⋯ C n − n 1 − ⋯ − n r − 1 n r = n ! n 1 ! ( n − n 1 ) ! ( n − n 1 ) ! n 2 ! ( n − n 1 − n 2 ) ! ⋯ ( n − n 1 − ⋯ − n r − 1 ) ! n r ! ( n − n 1 − ⋯ − n r − 1 − n r ) ! = n ! n 1 ! n 2 ! ⋯ n r ! C_{n}^{n_1}C_{n-n_1}^{n_2} \cdots C_{n-n_1-\cdots-n_{r-1}}^{n_r} \\= \frac{n!}{n_1!(n-n_1)!}\frac{(n-n_1)!}{n_2!(n - n_1 - n_2)!} \cdots \frac{(n-n_1-\cdots-n_{r-1})!}{n_r!(n-n_1-\cdots-n_{r-1}-n_r)!} \\ = \frac{n!}{n_1!n_2!\cdots n_r!} Cnn1Cn−n1n2⋯Cn−n1−⋯−nr−1nr=n1!(n−n1)!n!n2!(n−n1−n2)!(n−n1)!⋯nr!(n−n1−⋯−nr−1−nr)!(n−n1−⋯−nr−1)!=n1!n2!⋯nr!n!
称最后这个值为多项式系数,它是对组合数的一种推广,组合数可以看成是多项式系数 r = 2 r=2 r=2的情形。
下面我们推导多项式定理,考虑 ( x 1 + x 2 + ⋯ + x r ) n (x_1+x_2+\cdots + x_r)^n (x1+x2+⋯+xr)n的展开式,因为是 n n n个 ( x 1 + x 2 + ⋯ + x r ) (x_1+x_2+\cdots + x_r) (x1+x2+⋯+xr)相乘,所以相当于要从 n n n个 ( x 1 , x 2 , ⋯ , x r ) (x_1,x_2,\cdots ,x_r) (x1,x2,⋯,xr)中各选一个元素出来相乘,并把所有相乘的结果相加。假设选出来的 x 1 , x 2 , ⋯ , x r x_1,x_2,\cdots ,x_r x1,x2,⋯,xr个数分别为 n 1 , ⋯ , n r n_1,\cdots,n_r n1,⋯,nr个,则 n 1 + n 2 + ⋯ + n r = n n_1+n_2+\cdots + n_r = n n1+n2+⋯+nr=n
根据多项式系数,这种选法一共有 n ! n 1 ! n 2 ! ⋯ n r ! \frac{n!}{n_1!n_2!\cdots n_r!} n1!n2!⋯nr!n!种,因此展开式为 ( x 1 + x 2 + ⋯ + x r ) n = ∑ n 1 + n 2 + ⋯ + n r = n n ! n 1 ! n 2 ! ⋯ n r ! x 1 n 1 x 2 n 2 ⋯ x r n r (x_1+x_2+\cdots + x_r)^n = \sum_{n_1+n_2+\cdots + n_r = n}\frac{n!}{n_1!n_2!\cdots n_r!}x_1^{n_1}x_2^{n_2}\cdots x_r^{n_r} (x1+x2+⋯+xr)n=n1+n2+⋯+nr=n∑n1!n2!⋯nr!n!x1n1x2n2⋯xrnr
这就是著名的多项式定理。
在古典概型的含义下,假设 N = x 1 + x 2 + ⋯ + x r N = x_1+x_2+\cdots + x_r N=x1+x2+⋯+xr,记 p i = x i / N , i = 1 , 2 , ⋯ , r p_i = x_i/N,i=1,2,\cdots,r pi=xi/N,i=1,2,⋯,r,则多项式定理可以写成 ( p 1 + p 2 + ⋯ + p r ) n = ∑ n 1 + n 2 + ⋯ + n r = n n ! n 1 ! n 2 ! ⋯ n r ! p 1 n 1 p 2 n 2 ⋯ p r n r (p_1+p_2+\cdots + p_r)^n = \sum_{n_1+n_2+\cdots + n_r = n}\frac{n!}{n_1!n_2!\cdots n_r!}p_1^{n_1}p_2^{n_2}\cdots p_r^{n_r} (p1+p2+⋯+pr)n=n1+n2+⋯+nr=n∑n1!n2!⋯nr!n!p1n1p2n2⋯prnr
记 X 1 , ⋯ , X r X_1,\cdots,X_r X1,⋯,Xr是一组随机变量,如果 P ( X 1 = n 1 , ⋯ , X r = n r ) = n ! n 1 ! n 2 ! ⋯ n r ! p 1 n 1 p 2 n 2 ⋯ p r n r , n 1 + n 2 + ⋯ + n r = n P(X_1 = n_1,\cdots,X_r = n_r) = \frac{n!}{n_1!n_2!\cdots n_r!}p_1^{n_1}p_2^{n_2}\cdots p_r^{n_r},n_1+n_2+\cdots + n_r = n P(X1=n1,⋯,Xr=nr)=n1!n2!⋯nr!n!p1n1p2n2⋯prnr,n1+n2+⋯+nr=n
就称 X 1 , ⋯ , X r X_1,\cdots,X_r X1,⋯,Xr是服从多项分布 M u l t i N o m ( n , p 1 , ⋯ , p r ) MultiNom(n,p_1,\cdots,p_r) MultiNom(n,p1,⋯,pr)的随机变量,如果 p 1 + p 2 + ⋯ + p r = 1 p_1+p_2+\cdots + p_r = 1 p1+p2+⋯+pr=1
这两个性质是多项分布比较独特的地方,第一个性质说明将多项分布的某几个变量合并过后剩下的变量还是服从多项分布;第二个性质说明多项分布的条件分布仍然是多项分布。下面证明这两个性质。
证明 先看性质一,首先 p 1 + ⋯ + p i + p i + 1 ∗ = 1 p_1+\cdots + p_i + p_{i+1}^* = 1 p1+⋯+pi+pi+1∗=1,接下来计算概率 P ( X 1 = n 1 , ⋯ , X i = n i , Y i + 1 = n i + 1 ∗ ) = P ( X 1 = n 1 , ⋯ , X i = n i , X i + 1 + ⋯ X r = n i + 1 ∗ ) P(X_1 = n_1,\cdots,X_i = n_i,Y_{i+1} = n_{i+1}^*) \\= P(X_1 = n_1,\cdots,X_i = n_i,X_{i+1} + \cdots X_r = n_{i+1}^*) P(X1=n1,⋯,Xi=ni,Yi+1=ni+1∗)=P(X1=n1,⋯,Xi=ni,Xi+1+⋯Xr=ni+1∗)
X 1 = n 1 , ⋯ , X i = n i , X i + 1 + ⋯ X r = n i + 1 ∗ X_1 = n_1,\cdots,X_i = n_i,X_{i+1} + \cdots X_r = n_{i+1}^* X1=n1,⋯,Xi=ni,Xi+1+⋯Xr=ni+1∗的组合方式一共有 n ! n 1 ! ⋯ n i ! n i + 1 ∗ ! \frac{n!}{n_1!\cdots n_i! n_{i+1}^*!} n1!⋯ni!ni+1∗!n!
X 1 = n 1 , ⋯ , X i = n i X_1 = n_1,\cdots,X_i = n_i X1=n1,⋯,Xi=ni对应的概率是 p 1 n 1 ⋯ p i n i p_1^{n_1}\cdots p_i^{n_i} p1n1⋯pini,按上面多项分布的推导, X i + 1 + ⋯ X r = n i + 1 ∗ X_{i+1} + \cdots X_r = n_{i+1}^* Xi+1+⋯Xr=ni+1∗对应的概率是 ∑ l i + 1 + ⋯ + l r = n i + 1 ∗ n i + 1 ∗ ! l i + 1 ! ⋯ l r ! p i + 1 l i + 1 ⋯ p r l r = ( p i + 1 + ⋯ + p r ) n i + 1 ∗ = ( p i + 1 ∗ ) n i + 1 ∗ \sum_{l_{i+1} + \cdots + l_r = n_{i+1}^*} \frac{n_{i+1}^*!}{l_{i+1}! \cdots l_r!} p_{i+1}^{l_{i+1}} \cdots p_{r}^{l_r} = (p_{i+1} + \cdots + p_r)^{n_{i+1}^*} = (p_{i+1}^*)^{n_{i+1}^*} li+1+⋯+lr=ni+1∗∑li+1!⋯lr!ni+1∗!pi+1li+1⋯prlr=(pi+1+⋯+pr)ni+1∗=(pi+1∗)ni+1∗
所以 P ( X 1 = n 1 , ⋯ , X i = n i , Y i + 1 = n i + 1 ∗ ) = n ! n 1 ! ⋯ n i ! n i + 1 ∗ ! p 1 n 1 ⋯ p i n i ( p i + 1 ∗ ) n i + 1 ∗ P(X_1 = n_1,\cdots,X_i = n_i,Y_{i+1} = n_{i+1}^*) = \frac{n!}{n_1!\cdots n_i! n_{i+1}^*!}p_1^{n_1}\cdots p_i^{n_i}(p_{i+1}^*)^{n_{i+1}^*} P(X1=n1,⋯,Xi=ni,Yi+1=ni+1∗)=n1!⋯ni!ni+1∗!n!p1n1⋯pini(pi+1∗)ni+1∗ 下面讨论性质二的证明。先计算联合概率, P ( X 1 = l 1 , ⋯ , X t = l t , X t + 1 = m t + 1 , ⋯ , X r = m r ) = n ! l 1 ! ⋯ l t ! m t + 1 ! ⋯ m r ! p 1 l 1 ⋯ p t l t p t + 1 m t + 1 ⋯ p r m r P(X_1 = l_1,\cdots,X_t = l_t,X_{t+1} = m_{t+1},\cdots,X_r = m_r) \\ = \frac{n!}{l_1!\cdots l_t!m_{t+1}!\cdots m_r!}p_1^{l_1}\cdots p_t^{l_t}p_{t+1}^{m_{t+1}}\cdots p_r^{m_r} P(X1=l1,⋯,Xt=lt,Xt+1=mt+1,⋯,Xr=mr)=l1!⋯lt!mt+1!⋯mr!n!p1l1⋯ptltpt+1mt+1⋯prmr
以及边缘概率 P ( X t + 1 = m t + 1 , ⋯ , X r = m r ) = ∑ l 1 + ⋯ l t = n − m n ! l 1 ! ⋯ l t ! m t + 1 ! ⋯ m r ! p 1 l 1 ⋯ p t l t p t + 1 m t + 1 ⋯ p r m r = n ! ( n − m ) ! m t + 1 ! ⋯ m r ! p t + 1 m t + 1 ⋯ p r m r ∑ l 1 + ⋯ l t = n − m ( n − m ) ! l 1 ! ⋯ l t ! p 1 l 1 ⋯ p t l t = n ! ( n − m ) ! m t + 1 ! ⋯ m r ! p t + 1 m t + 1 ⋯ p r m r ( p 1 + ⋯ p t ) n − m P(X_{t+1} = m_{t+1},\cdots,X_r = m_r) = \sum_{l_1+\cdots l_t = n-m}\frac{n!}{l_1!\cdots l_t!m_{t+1}!\cdots m_r!}p_1^{l_1}\cdots p_t^{l_t}p_{t+1}^{m_{t+1}}\cdots p_r^{m_r} \\ = \frac{n!}{(n-m)!m_{t+1}!\cdots m_r!}p_{t+1}^{m_{t+1}}\cdots p_r^{m_r} \sum_{l_1+\cdots l_t = n-m} \frac{(n-m)!}{l_1!\cdots l_t!}p_1^{l_1}\cdots p_t^{l_t} \\ = \frac{n!}{(n-m)!m_{t+1}!\cdots m_r!}p_{t+1}^{m_{t+1}}\cdots p_r^{m_r}(p_1+\cdots p_t)^{n-m} P(Xt+1=mt+1,⋯,Xr=mr)=l1+⋯lt=n−m∑l1!⋯lt!mt+1!⋯mr!n!p1l1⋯ptltpt+1mt+1⋯prmr=(n−m)!mt+1!⋯mr!n!pt+1mt+1⋯prmrl1+⋯lt=n−m∑l1!⋯lt!(n−m)!p1l1⋯ptlt=(n−m)!mt+1!⋯mr!n!pt+1mt+1⋯prmr(p1+⋯pt)n−m
因此条件概率为 P ( X 1 = l 1 , ⋯ , X t = l t ∣ X t + 1 = m t + 1 , ⋯ , X r = m r ) = P ( X 1 = n 1 , ⋯ , X i = n i , Y i + 1 = n i + 1 ∗ ) P ( X t + 1 = m t + 1 , ⋯ , X r = m r ) = n ! l 1 ! ⋯ l t ! m t + 1 ! ⋯ m r ! p 1 l 1 ⋯ p t l t p t + 1 m t + 1 ⋯ p r m r n ! ( n − m ) ! m t + 1 ! ⋯ m r ! p t + 1 m t + 1 ⋯ p r m r ( p 1 + ⋯ p t ) n − m = ( n − m ) ! l 1 ! ⋯ l t ! ( p 1 ∗ ) l 1 ⋯ ( p t ∗ ) l t P(X_1 = l_1,\cdots,X_t = l_t|X_{t+1} = m_{t+1},\cdots,X_r = m_r) \\ = \frac{P(X_1 = n_1,\cdots,X_i = n_i,Y_{i+1} = n_{i+1}^*) }{P(X_{t+1} = m_{t+1},\cdots,X_r = m_r) } \\ = \frac{\frac{n!}{l_1!\cdots l_t!m_{t+1}!\cdots m_r!}p_1^{l_1}\cdots p_t^{l_t}p_{t+1}^{m_{t+1}}\cdots p_r^{m_r}}{\frac{n!}{(n-m)!m_{t+1}!\cdots m_r!}p_{t+1}^{m_{t+1}}\cdots p_r^{m_r}(p_1+\cdots p_t)^{n-m}} \\ = \frac{(n-m)!}{l_1!\cdots l_t!}(p_1^*)^{l_1}\cdots (p_t^*)^{l_t} P(X1=l1,⋯,Xt=lt∣Xt+1=mt+1,⋯,Xr=mr)=P(Xt+1=mt+1,⋯,Xr=mr)P(X1=n1,⋯,Xi=ni,Yi+1=ni+1∗)=(n−m)!mt+1!⋯mr!n!pt+1mt+1⋯prmr(p1+⋯pt)n−ml1!⋯lt!mt+1!⋯mr!n!p1l1⋯ptltpt+1mt+1⋯prmr=l1!⋯lt!(n−m)!(p1∗)l1⋯(pt∗)lt
证毕