Let the samples $(X_1, \dots, X_{n_1})$ and $(Y_1, \dots, Y_{n_2})$ be drawn from the populations $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$ respectively, and let the two samples be independent of each other. The sample means are $\overline X$ and $\overline Y$; the sample variances are $S_1^2$ and $S_2^2$. The confidence level is $1-\alpha$.
When $\sigma_1^2$ and $\sigma_2^2$ are known, the distribution of $\overline X - \overline Y$, the estimator of $\mu_1 - \mu_2$, gives the pivot:

$$\frac{(\overline X - \overline Y)-(\mu_1-\mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}}\sim N(0,1)$$

The confidence interval is therefore:

$$(\overline X - \overline Y) \pm Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}$$

where $Z_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard normal distribution.
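The known-variance interval above can be sketched in a few lines of standard-library Python. The function name `ci_mean_diff_known_var` and its parameter names are hypothetical, chosen for this illustration only; the data and the values $\sigma_1 = 0.18$, $\sigma_2 = 0.24$, $\alpha = 0.1$ come from the ball-diameter example in this section.

```python
from math import sqrt
from statistics import NormalDist, mean


def ci_mean_diff_known_var(x, y, sigma1, sigma2, alpha):
    """Two-sided (1 - alpha) CI for mu1 - mu2 when sigma1 and sigma2 are known.

    Hypothetical helper written for illustration.
    """
    diff = mean(x) - mean(y)
    # Z_{alpha/2}: upper alpha/2 quantile of N(0, 1).
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sqrt(sigma1**2 / len(x) + sigma2**2 / len(y))
    return diff - half_width, diff + half_width


x = [15.0, 14.8, 15.2, 15.4, 14.9, 15.1, 15.2, 14.8]
y = [15.2, 15.0, 14.8, 15.1, 14.6, 14.8, 15.1, 14.5, 15.0]
print(ci_mean_diff_known_var(x, y, 0.18, 0.24, 0.1))
# approximately (-0.0181, 0.3181)
```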
When $\sigma_1^2 = \sigma_2^2 = \sigma^2$ is unknown, replace $\sigma^2$ with the pooled sample variance

$$S_w^2=\frac{(n_1-1)S_1^2 +(n_2-1)S_2^2}{n_1+n_2-2}$$

to obtain the pivot:

$$\frac{(\overline X - \overline Y)-(\mu_1 - \mu_2)}{S_w\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}}\sim t(n_1+n_2 -2)$$

The confidence interval is therefore:

$$(\overline X - \overline Y)\pm t_{\alpha/2}(n_1+n_2 -2)\,S_w\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}$$

where $t_{\alpha/2}(n_1+n_2-2)$ is the upper $\alpha/2$ quantile of the $t$ distribution with $n_1+n_2-2$ degrees of freedom.
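A minimal sketch of the pooled-variance interval, assuming NumPy and SciPy are available. The name `ci_mean_diff_pooled` is hypothetical and chosen for this illustration; the data are the ball diameters from the example in this section, with $\alpha = 0.1$.

```python
import numpy as np
from scipy import stats


def ci_mean_diff_pooled(x, y, alpha):
    """Two-sided (1 - alpha) CI for mu1 - mu2, equal but unknown variances.

    Hypothetical helper written for illustration.
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n1, n2 = x.size, y.size
    # Pooled variance S_w^2 built from the unbiased sample variances (ddof=1).
    sw2 = ((n1 - 1) * x.var(ddof=1) + (n2 - 1) * y.var(ddof=1)) / (n1 + n2 - 2)
    # t_{alpha/2}(n1 + n2 - 2): upper alpha/2 quantile of the t distribution.
    t = stats.t.ppf(1 - alpha / 2, df=n1 + n2 - 2)
    half_width = t * np.sqrt(sw2 * (1 / n1 + 1 / n2))
    diff = x.mean() - y.mean()
    return diff - half_width, diff + half_width


x = [15.0, 14.8, 15.2, 15.4, 14.9, 15.1, 15.2, 14.8]
y = [15.2, 15.0, 14.8, 15.1, 14.6, 14.8, 15.1, 14.5, 15.0]
print(ci_mean_diff_pooled(x, y, 0.1))
# approximately (-0.0442, 0.3442)
```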
When $\sigma_1^2 \neq \sigma_2^2$ and both are unknown, estimate $\sigma_1^2$ by $S_1^2$ and $\sigma_2^2$ by $S_2^2$.

When the sample sizes $n_1$ and $n_2$ are both sufficiently large (typically greater than 30), approximately

$$\frac{(\overline X - \overline Y)-(\mu_1 - \mu_2)}{\sqrt{\frac{S_1^2}{n_1}+\frac{S_2^2}{n_2}}}\sim N(0,1)$$

which gives the approximate confidence interval:

$$(\overline X - \overline Y)\pm Z_{\alpha/2}\sqrt{\frac{S_1^2}{n_1}+\frac{S_2^2}{n_2}}$$

When the sample sizes are small, approximately

$$\frac{(\overline X - \overline Y)-(\mu_1-\mu_2)}{\sqrt{\frac{S_1^2}{n_1}+\frac{S_2^2}{n_2}}}\sim t(k)$$

where $k \approx \min(n_1-1,\, n_2-1)$. The approximate confidence interval is then:

$$(\overline X - \overline Y) \pm t_{\alpha/2}(k)\sqrt{\frac{S_1^2}{n_1}+\frac{S_2^2}{n_2}}$$
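Both approximate intervals for unequal, unknown variances can be sketched in one function, assuming NumPy and SciPy are available. The name `ci_mean_diff_unequal_var` and the `large_sample` flag are hypothetical, introduced only for this illustration; the small-sample branch uses the degrees of freedom $k = \min(n_1-1, n_2-1)$ stated above.

```python
import numpy as np
from scipy import stats


def ci_mean_diff_unequal_var(x, y, alpha, large_sample=False):
    """Approximate two-sided (1 - alpha) CI for mu1 - mu2, unequal unknown variances.

    Hypothetical helper written for illustration.
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n1, n2 = x.size, y.size
    # Standard error built from the two separate sample variances.
    se = np.sqrt(x.var(ddof=1) / n1 + y.var(ddof=1) / n2)
    if large_sample:
        # Normal approximation, intended for n1 and n2 both above about 30.
        q = stats.norm.ppf(1 - alpha / 2)
    else:
        # t approximation with k = min(n1 - 1, n2 - 1) degrees of freedom.
        q = stats.t.ppf(1 - alpha / 2, df=min(n1 - 1, n2 - 1))
    diff = x.mean() - y.mean()
    return diff - q * se, diff + q * se


x = [15.0, 14.8, 15.2, 15.4, 14.9, 15.1, 15.2, 14.8]
y = [15.2, 15.0, 14.8, 15.1, 14.6, 14.8, 15.1, 14.5, 15.0]
print(ci_mean_diff_unequal_var(x, y, 0.1))
# approximately (-0.0584, 0.3584)
```

With only 8 and 9 observations the small-sample branch is the appropriate one here; the normal branch would give a narrower interval for the same data.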
For the variance ratio, the estimator $\frac{S_1^2}{S_2^2}$ of $\frac{\sigma_1^2}{\sigma_2^2}$ gives the pivot:

$$\frac{S_1^2/S_2^2}{\sigma_1^2/\sigma_2^2}\sim F(n_1-1, n_2-1)$$

The confidence interval is therefore:

$$\left(\frac{S_1^2}{S_2^2}\cdot\frac{1}{F_{\alpha/2}(n_1-1, n_2-1)},\ \frac{S_1^2}{S_2^2}\cdot\frac{1}{F_{1-\alpha/2}(n_1-1, n_2-1)}\right)$$

where $F_{p}(n_1-1, n_2-1)$ denotes the upper $p$ quantile of the $F(n_1-1, n_2-1)$ distribution.
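A minimal sketch of the variance-ratio interval, assuming NumPy and SciPy are available. The name `ci_var_ratio` is hypothetical and chosen for this illustration. Note that `scipy.stats.f.ppf` takes a lower-tail probability, so the upper $\alpha/2$ quantile corresponds to `ppf(1 - alpha / 2, ...)`.

```python
import numpy as np
from scipy import stats


def ci_var_ratio(x, y, alpha):
    """Two-sided (1 - alpha) CI for sigma1^2 / sigma2^2 with unknown means.

    Hypothetical helper written for illustration.
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n1, n2 = x.size, y.size
    ratio = x.var(ddof=1) / y.var(ddof=1)
    # Upper alpha/2 quantile F_{alpha/2}(n1 - 1, n2 - 1).
    f_hi = stats.f.ppf(1 - alpha / 2, n1 - 1, n2 - 1)
    # Upper (1 - alpha/2) quantile F_{1 - alpha/2}(n1 - 1, n2 - 1).
    f_lo = stats.f.ppf(alpha / 2, n1 - 1, n2 - 1)
    return ratio / f_hi, ratio / f_lo


x = [15.0, 14.8, 15.2, 15.4, 14.9, 15.1, 15.2, 14.8]
y = [15.2, 15.0, 14.8, 15.1, 14.6, 14.8, 15.1, 14.5, 15.0]
print(ci_var_ratio(x, y, 0.1))
# approximately (0.2271, 2.9621)
```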
Example: Two machines produce ball bearings of the same model. Eight balls are sampled from machine A and nine from machine B, and their diameters (in millimeters) are measured as follows:

Machine A: 15.0, 14.8, 15.2, 15.4, 14.9, 15.1, 15.2, 14.8
Machine B: 15.2, 15.0, 14.8, 15.1, 14.6, 14.8, 15.1, 14.5, 15.0

Let the diameters of the balls produced by the two machines be $X$ and $Y$, with $X\sim N(\mu_1, \sigma_1^2)$ and $Y\sim N(\mu_2, \sigma_2^2)$. Find the two-sided confidence intervals at confidence level 0.9:

(1) given $\sigma_1=0.18$ and $\sigma_2=0.24$, the confidence interval for $\mu_1 - \mu_2$;
(2) if $\sigma_1=\sigma_2$ and the common value is unknown, the confidence interval for $\mu_1 - \mu_2$;
(3) if $\sigma_1 \neq \sigma_2$ and both are unknown, the confidence interval for $\mu_1 - \mu_2$;
(4) if $\mu_1, \mu_2$ are unknown, the confidence interval for $\frac{\sigma_1^2}{\sigma_2^2}$.

Solution:

(1)
import numpy as np

data1 = np.array([15.0, 14.8, 15.2, 15.4, 14.9, 15.1, 15.2, 14.8])
data2 = np.array([15.2, 15.0, 14.8, 15.1, 14.6, 14.8, 15.1, 14.5, 15.0])
# Known standard deviations sigma1 = 0.18, sigma2 = 0.24; alpha = 0.1
confidence_interval_udif(data1, data2, 0.18, 0.24, 0.1)
# Result: (-0.018145559249408555, 0.31814555924941279)

(2)
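data1 = np.array([15.0, 14.8, 15.2, 15.4, 14.9, 15.1, 15.2, 14.8])
data2 = np.array([15.2, 15.0, 14.8, 15.1, 14.6, 14.8, 15.1, 14.5, 15.0])
# Passing -1, -1 for the standard deviations appears to select the
# equal-but-unknown variance case (pooled t interval); alpha = 0.1
confidence_interval_udif(data1, data2, -1, -1, 0.1)
# Result: (-0.044246980022314808, 0.34424698002231907)

(3)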
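data1 = np.array([15.0, 14.8, 15.2, 15.4, 14.9, 15.1, 15.2, 14.8])
data2 = np.array([15.2, 15.0, 14.8, 15.1, 14.6, 14.8, 15.1, 14.5, 15.0])
# Passing -1, -2 for the standard deviations appears to select the
# unequal, unknown variance case (t interval with k degrees of freedom); alpha = 0.1
confidence_interval_udif(data1, data2, -1, -2, 0.1)
# Result: (-0.058430983560407906, 0.35843098356041214)

(4)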
data1 = np.array([15.0, 14.8, 15.2, 15.4, 14.9, 15.1, 15.2, 14.8])
data2 = np.array([15.2, 15.0, 14.8, 15.1, 14.6, 14.8, 15.1, 14.5, 15.0])
# Confidence interval for the variance ratio sigma1^2 / sigma2^2; alpha = 0.1
confidence_interval_varRatio(data1, data2, alpha=0.1)
# Result: (0.22712162982480297, 2.9620673328677332)