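Data analysis is still very much in fashion. Whether it is machine learning or anything else, the first thing to do when handed a dataset is to visualize it and form a hypothesis that gives the analysis a starting point. To test that hypothesis, we run various analyses and compute various statistics.
Along the way, you sometimes find yourself wanting to know how much a statistic would vary across samples drawn from the population.
If the statistic is the mean, computing the standard error of the mean (SEM) is enough: it is the standard deviation divided by the square root of the number of data points.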
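Written out as a formula, with $s$ the sample standard deviation, $\bar{x}$ the sample mean, and $n$ the number of data points. The $n-1$ divisor is the usual textbook convention and is an assumption here, since the text does not say which divisor it has in mind; some software divides by $n$ instead, which gives a slightly smaller value.

```latex
\mathrm{SEM} = \frac{s}{\sqrt{n}},
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}
```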
In practice, however, an analysis looks at more than the mean: the median, percentiles (centiles), the correlation coefficient, and many other statistics. For many of these there is no simple formula that quantifies the variability, such as a standard error or a confidence interval.
This is where resampling, and the bootstrap in particular, comes in handy.
What is the bootstrap?
The bootstrap first uses resampling to create many similar datasets from the original data. It then computes the statistic of interest on each of those datasets. This yields a distribution of the statistic, and the standard deviation of that distribution gives the standard error of the statistic.
As an example, suppose we have IQ data for 20 people, with the values 61, 88, 89, 89, 90, 92, 93, 94, 98, 98, 101, 102, 105, 108, 109, 113, 114, 115, 120, 138.
(Figure: distribution of the 20 IQ values.)
For these data, the mean is 100.85 and the standard error of the mean is 3.45.
For distributions like this one (income is another example), though, the median tends to match people's intuition better than the mean. So this time let's use the bootstrap to estimate the variability of the median, that is, the standard error of the median.
The bootstrap algorithm
The bootstrap proceeds as follows.
1: Decide how many times to resample.
500, 1000, 10000, 100000, and so on: pick as large a number as you can. Here we will use 1000.
2: Resample to generate artificial data. Each resample draws, with replacement, as many data points as there are in the original data.
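For example, from the original data we would generate resample 1, resample 2, resample 3, ..., resample 998, resample 999, and resample 1000.
(Figure: the original data and the values drawn in each resample.)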
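In this way, we repeat the draw (with replacement, as many data points as in the original data) as many times as decided in step 1, creating artificial datasets.
Which data points end up in each resample is random. In a five-point example, sometimes all five values in a resample will be the same, and sometimes all five will be different.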
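Step 2 can be sketched in a few lines of NumPy. The five values in `data` are hypothetical (the example values from the original figure are not available), the seed is arbitrary, and only three resamples are drawn so the output stays readable.

```python
import numpy as np

# Hypothetical five-value dataset (NOT the values from the original figure).
data = np.array([3, 5, 7, 9, 11])

# Seeded generator so the sketch is reproducible; the seed itself is arbitrary.
rng = np.random.default_rng(0)

n_resamples = 3  # kept tiny for display; the post uses 1000

# Each resample draws len(data) values from data, with replacement.
resamples = [rng.choice(data, size=len(data), replace=True)
             for _ in range(n_resamples)]

for k, r in enumerate(resamples, start=1):
    print(f"resample {k} = {r}")
```

Because the draw is with replacement, a value can appear several times in one resample while another value is missing entirely.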
3: Compute the statistic of interest for each resample.
Whether it is the median or anything else, compute the statistic from each resampled dataset. With 1000 resamples, you get 1000 values of the statistic.
(Figure: the statistic computed for resample 1, resample 2, resample 3, ..., resample 998, resample 999, and resample 1000.)
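As a minimal sketch of step 3, the loop below draws 1000 resamples of the IQ data and stores the median of each one. The use of `np.random.default_rng` and the seed are choices made for this illustration, not something the post prescribes.

```python
import numpy as np

iq = np.array([61, 88, 89, 89, 90, 92, 93,
               94, 98, 98, 101, 102, 105, 108,
               109, 113, 114, 115, 120, 138])

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n_resamples = 1000

# One median per resample: 1000 resamples give 1000 values of the statistic.
medians = np.empty(n_resamples)
for k in range(n_resamples):
    resample = rng.choice(iq, size=iq.size, replace=True)
    medians[k] = np.median(resample)

print(medians.shape)   # number of medians collected
print(medians[:5])     # first few bootstrap medians
```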
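Now we have a lot of values of the statistic. The reasoning then goes like this:
"Surely the distribution of all these values is nearly the same as the sampling distribution of the statistic in the population? After all, even if we ran more experiments and collected far more data, we would mostly just get more data with similar values, right?"
In other words, the bootstrap stands in for collecting more data: instead of running more experiments, it uses a statistical trick, treating the observed sample as a proxy for the population. The assumption is that, if the sample is reasonably large, the resulting distribution is very close to the one we would get from the population.
4: Use the distribution of the computed statistic to calculate its variability (standard error, confidence interval, and so on).
For example, the standard error is the standard deviation of the statistic's sampling distribution, so we simply take the standard deviation of the bootstrap distribution of the statistic.
This way you can estimate, by yourself, the variability of whatever statistic you are interested in.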
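Step 4 might look like the sketch below. The standard error is the standard deviation of the bootstrap medians, as described above. For the confidence interval, the post does not say which construction it has in mind, so the percentile interval used here (the 2.5th and 97.5th percentiles of the bootstrap distribution) is an assumption: it is one common choice, not the only one. The vectorized index sampling, the 10,000 repeats, and the seed are also choices made for this illustration.

```python
import numpy as np

iq = np.array([61, 88, 89, 89, 90, 92, 93,
               94, 98, 98, 101, 102, 105, 108,
               109, 113, 114, 115, 120, 138])

rng = np.random.default_rng(0)  # arbitrary seed
repeats = 10_000

# Draw all resamples at once: each row holds iq.size indices in [0, iq.size).
idx = rng.integers(0, iq.size, size=(repeats, iq.size))
boot_medians = np.median(iq[idx], axis=1)

# Standard error: standard deviation of the bootstrap distribution.
se_median = boot_medians.std()

# Percentile 95% confidence interval (an assumed, commonly used construction).
ci_low, ci_high = np.percentile(boot_medians, [2.5, 97.5])

print(f"SE of the median: {se_median:.2f}")
print(f"95% percentile interval: [{ci_low:.1f}, {ci_high:.1f}]")
```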
Bootstrap in practice
I tried it in Python (Matlab/Octave code is further down).
First, import the required libraries.
# import libraries
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
Using NumPy, put the data into an array.
# get data
iq = np.array([61, 88, 89, 89, 90, 92, 93,
94, 98, 98, 101, 102, 105, 108,
109, 113, 114, 115, 120, 138])
As basic statistics, compute the mean, the standard error of the mean, and the median.
# compute mean, SEM (standard error of the mean) and median
mean_iq = np.average(iq)
# note: np.std defaults to ddof=0, i.e. it divides by n rather than n-1
sem_iq = np.std(iq)/np.sqrt(len(iq))
median_iq = np.median(iq)
This gives a mean of 100.85, a standard error of the mean of 3.45, and a median of 99.5.
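One detail worth checking is which divisor the standard deviation uses. `np.std` defaults to `ddof=0` (divide by n), which is what produces the 3.45 quoted above; the more conventional sample standard deviation (`ddof=1`, divide by n-1) gives a slightly larger value. Matlab's `std` divides by n-1 by default, so the two listings in this post will not agree to the second decimal. The short check below illustrates the difference.

```python
import numpy as np

iq = np.array([61, 88, 89, 89, 90, 92, 93,
               94, 98, 98, 101, 102, 105, 108,
               109, 113, 114, 115, 120, 138])
n = iq.size

# NumPy default: population standard deviation (divide by n).
sem_ddof0 = np.std(iq) / np.sqrt(n)

# Sample standard deviation (divide by n - 1), the usual SEM convention.
sem_ddof1 = np.std(iq, ddof=1) / np.sqrt(n)

print(round(sem_ddof0, 2))  # 3.45
print(round(sem_ddof1, 2))  # 3.54
```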
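To get the standard error of the median, we run the bootstrap. As a sanity check that the bootstrap is working correctly, the standard error of the mean is computed alongside it.
In Python, drawing n integers at random, with replacement, from 0 to n-1 (the valid indices of an array of length n) can be done as follows.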
np.random.choice(n,n,replace=True)
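To make the behaviour of that call concrete, here is a small sketch. The array `x` is hypothetical. The first half uses the legacy `np.random` interface shown above; the second half uses NumPy's newer `Generator` interface, which is offered here only as an alternative and is not what the post itself uses. Seeds are arbitrary.

```python
import numpy as np

n = 5
x = np.array([10, 20, 30, 40, 50])  # hypothetical data, for illustration only

# Legacy interface, as in the post: n indices drawn from 0..n-1 with replacement.
np.random.seed(0)  # arbitrary seed
idx_legacy = np.random.choice(n, n, replace=True)

# Newer Generator interface (an alternative): integers in [0, n).
rng = np.random.default_rng(0)  # arbitrary seed
idx_new = rng.integers(0, n, size=n)

# Indexing the data with the drawn indices gives one bootstrap resample.
print(idx_legacy, x[idx_legacy])
print(idx_new, x[idx_new])
```

Seeding either interface makes a bootstrap run repeatable, which helps when comparing results between runs.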
We use this to do the resampling step of the bootstrap.
# bootstrap to compute the SEM of the median
def bootstrap(x, repeats):
    # placeholder (row 0: mean, row 1: median)
    vec = np.zeros((2, repeats))
    for i in np.arange(repeats):
        # resample data with replacement
        re = np.random.choice(len(x), len(x), replace=True)
        re_x = x[re]
        # compute mean and median of the "new" dataset
        vec[0, i] = np.mean(re_x)
        vec[1, i] = np.median(re_x)
    # histogram of medians from the resampled datasets
    # (sns.distplot is deprecated in recent seaborn; histplot replaces it)
    sns.histplot(vec[1, :])
    # compute bootstrapped standard error of the mean,
    # and standard error of the median
    b_mean_sem = np.std(vec[0, :])
    b_median_sem = np.std(vec[1, :])
    return b_mean_sem, b_median_sem
Running this function with x = iq and repeats = 1000 shows the distribution of the 1000 medians obtained by the bootstrap.
...It looks like more repeats would have been better.
In any case, the standard error of the median can be obtained from this distribution: just compute the standard deviation of the 1000 medians.
- Bootstrapped standard error of the median: 4.22
Likewise, we compute the standard error of the mean.
- Bootstrapped standard error of the mean: 3.42
The standard error of the mean computed directly from the original data was 3.45, so the two are quite close. The bootstrap does seem to estimate the variability of a statistic properly.
Caveats
Because the bootstrap resamples at random, the result differs slightly on every run. To reduce this run-to-run variability and get a more accurate estimate, you need to increase the number of repeats (to around 10,000).
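The effect of the number of repeats can be checked with a rough experiment: estimate the standard error of the median 20 times at each setting and compare how much the estimates scatter. The helper `bootstrap_se_median`, the 20 trials, and the seed are all hypothetical choices made for this sketch.

```python
import numpy as np

iq = np.array([61, 88, 89, 89, 90, 92, 93,
               94, 98, 98, 101, 102, 105, 108,
               109, 113, 114, 115, 120, 138])

def bootstrap_se_median(x, repeats, rng):
    """Bootstrap standard error of the median from `repeats` resamples."""
    idx = rng.integers(0, x.size, size=(repeats, x.size))
    return np.median(x[idx], axis=1).std()

rng = np.random.default_rng(0)  # arbitrary seed
spread = {}
for repeats in (1_000, 10_000):
    # Repeat the whole bootstrap 20 times to see the run-to-run scatter.
    estimates = np.array([bootstrap_se_median(iq, repeats, rng)
                          for _ in range(20)])
    spread[repeats] = estimates.std()
    print(f"repeats={repeats}: mean SE = {estimates.mean():.2f}, "
          f"run-to-run SD = {spread[repeats]:.3f}")
```

The run-to-run scatter should shrink as the number of repeats grows, while the estimate itself stays in the same neighbourhood.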
Summary of the summary
With the bootstrap, resampling lets you estimate the variability of any statistic you are interested in.
In closing
I think whoever came up with resampling methods, the bootstrap included, was a genius. Approximating the sampling distribution by artificially multiplying the data frees us from the grind of endlessly collecting more data! Of course, unless you actually collect enough data to get a representative sample of the population, no amount of resampling will ever bring you closer to the population...
The source code is given below.
# import libraries
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# get data
iq = np.array([61, 88, 89, 89, 90, 92, 93,
               94, 98, 98, 101, 102, 105, 108,
               109, 113, 114, 115, 120, 138])

# compute mean, SEM (standard error of the mean) and median
mean_iq = np.average(iq)
# note: np.std defaults to ddof=0, i.e. it divides by n rather than n-1
sem_iq = np.std(iq)/np.sqrt(len(iq))
median_iq = np.median(iq)

# bootstrap to compute the SEM of the median
def bootstrap(x, repeats):
    # placeholder (row 0: mean, row 1: median)
    vec = np.zeros((2, repeats))
    for i in np.arange(repeats):
        # resample data with replacement
        re = np.random.choice(len(x), len(x), replace=True)
        re_x = x[re]
        # compute mean and median of the "new" dataset
        vec[0, i] = np.mean(re_x)
        vec[1, i] = np.median(re_x)
    # histogram of medians from the resampled datasets
    # (sns.distplot is deprecated in recent seaborn; histplot replaces it)
    sns.histplot(vec[1, :])
    # compute bootstrapped standard error of the mean,
    # and standard error of the median
    b_mean_sem = np.std(vec[0, :])
    b_median_sem = np.std(vec[1, :])
    return b_mean_sem, b_median_sem

# execute bootstrap
bootstrapped_sem = bootstrap(iq, 1000)
The Matlab/Octave version is below.
function bootstrap_demo
% data
iq = [61, 88, 89, 89, 90, 92, 93, 94, 98, 98, 101, 102, 105, 108, 109, 113, 114, 115, 120, 138];
% compute mean, SEM (standard error of the mean) and median
mean_iq = mean(iq);
sem_iq = std(iq)/sqrt(length(iq));
median_iq = median(iq);
disp(['the mean: ' num2str(mean_iq)])
disp(['the SE of the mean: ' num2str(sem_iq)])
disp(['the median: ' num2str(median_iq)])
disp('---------------------------------')
[b_mean_sem, b_median_sem] = bootstrap(iq, 1000);
disp(['bootstrapped SE of the mean: ' num2str(b_mean_sem)])
disp(['bootstrapped SE of the median: ' num2str(b_median_sem)])

% bootstrap to compute sem of the median
function [b_mean_sem, b_median_sem] = bootstrap(x, repeats)
% placeholder (row 1: mean, row 2: median)
vec = zeros(2,repeats);
for i = 1:repeats
    % resample data with replacement
    % (datasample needs the Statistics and Machine Learning Toolbox in MATLAB;
    %  x(randi(length(x), 1, length(x))) does the same draw without it)
    re_x = x(datasample(1:length(x),length(x),'Replace',true));
    % compute mean and median of the "new" dataset
    vec(1,i) = mean(re_x);
    vec(2,i) = median(re_x);
end
% histogram of median from resampled dataset
histogram(vec(2,:))
% compute bootstrapped standard error of the mean, and standard error of
% the median
b_mean_sem = std(vec(1,:));
b_median_sem = std(vec(2,:));

