
Want the variability of all kinds of statistics? A case for resampling (Bootstrap)

Posted at 2017-02-06

Data analysis still seems to be booming. Whether it is machine learning or anything else, the first thing you do when handed data is visualize it and get hold of a hypothesis that gives the analysis a foothold. To test that hypothesis, you run various analyses and compute various statistics.

While doing so, you sometimes find yourself wanting to know how much a given statistic varies in the population.

If the statistic is the mean of the data, computing the standard error of the mean is all you need. It is the standard deviation divided by the square root of the number of data points.

In practice, though, an analysis looks at more than the mean: the median, percentiles (centiles), correlation coefficients, and many other statistics. For these there is no simple formula that quantifies the variability as a standard error or a confidence interval.

That is where resampling comes in handy, and in particular the method called the bootstrap.

What is the bootstrap?

The bootstrap first uses resampling to create thousands of similar datasets from the original data. You then compute the statistic you want on each of those datasets. This gives you a distribution of that statistic, and the standard deviation of that distribution is the standard error of the statistic.

As an example, suppose we have IQ data for 20 people, with the values 61, 88, 89, 89, 90, 92, 93, 94, 98, 98, 101, 102, 105, 108, 109, 113, 114, 115, 120, 138. The distribution looks like this.

figure1.png

Here the mean is 100.85 and the standard error of the mean is 3.45.
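As a quick check of these two numbers, the arithmetic can be written out as below. The symbols are introduced here for illustration, and the standard deviation is taken in its divide-by-n form, which is what reproduces 3.45; dividing by n-1 instead would give roughly 3.54.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{2017}{20} = 100.85
\qquad
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2}
       = \sqrt{\frac{4758.55}{20}} \approx 15.42
\qquad
\mathrm{SEM} = \frac{\sigma}{\sqrt{n}} \approx \frac{15.42}{4.472} \approx 3.45
```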

That said, for distributions like this one (income is another example), the median reflects people's intuition better than the mean does. So this time let's use the bootstrap to find the variability of the median, that is, the standard error of the median.

The bootstrap algorithm

The bootstrap proceeds as follows.

1: Decide how many times to resample.
Pick a large round number such as 500, 1000, 10000, or 100000. Here we will use 1000.

2: Resample to generate artificial data. Each time, draw as many data points as there are in the original data, allowing duplicates (sampling with replacement).

For example, if the original data consist of five points X1,X2,...,X5, the datasets generated by each round of resampling might look like this.

Resample 1 = X1,X2,X3,X3,X4
Resample 2 = X1,X1,X3,X4,X4
Resample 3 = X2,X2,X5,X5,X5
......
Resample 998 = X1,X2,X3,X4,X5
Resample 999 = X1,X3,X4,X5,X5
Resample 1000 = X3,X3,X3,X3,X4

In this way you repeat, as many times as decided in step 1, the act of drawing as many data points as the original data has, with replacement, and so manufacture data artificially.

Which data points end up in each resample is random. Sometimes all five may be the same value, and sometimes all five may be different.
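Step 2 can be sketched in a few lines of NumPy. This is only an illustration, not the code used later in the article: the string labels X1 to X5 stand in for real data points, and the seed, the three resamples, and the variable names are arbitrary choices made here.

```python
import numpy as np

# Hypothetical labels standing in for five observed data points
data = np.array(["X1", "X2", "X3", "X4", "X5"])

# Seeded generator so the sketch is repeatable (the seed is arbitrary)
rng = np.random.default_rng(0)

n_resamples = 3  # step 1 would use something like 1000
resamples = [rng.choice(data, size=len(data), replace=True)
             for _ in range(n_resamples)]

for k, r in enumerate(resamples, start=1):
    # Sort only to make duplicates easy to spot
    print(f"Resample {k} = {','.join(sorted(r))}")
```

Each resample has the same length as the original data, and because `replace=True` the same point can appear more than once.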

3: Compute the statistic of interest on each resample.
Whether it is the median or something else, compute the statistic from each resampled dataset. If you resample 1000 times, you get 1000 values of the statistic.

Resample 1 = X1,X2,X3,X3,X4 → statistic = S1
Resample 2 = X1,X1,X3,X4,X4 → statistic = S2
Resample 3 = X2,X2,X5,X5,X5 → statistic = S3
......
Resample 998 = X1,X2,X3,X4,X5 → statistic = S998
Resample 999 = X1,X3,X4,X5,X5 → statistic = S999
Resample 1000 = X3,X3,X3,X3,X4 → statistic = S1000
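One way to picture steps 2 and 3 together is as a table with one row per resample and one statistic per row. The sketch below does this in a vectorized way; the toy data, the seed, and the names `idx`, `resampled`, and `stats` are assumptions made for illustration and do not appear in the article's own code.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed

x = np.array([3.0, 1.0, 4.0, 1.0, 5.0])  # toy data, not from the article
repeats = 1000

# Each row of idx is one resample: len(x) indices drawn with replacement
idx = rng.integers(0, len(x), size=(repeats, len(x)))
resampled = x[idx]                 # shape (repeats, len(x))

# One statistic per resample (here the median): S1, S2, ..., S1000
stats = np.median(resampled, axis=1)
print(stats.shape)   # (1000,)
print(stats[:5])
```

A plain loop over resamples, as in the article's `bootstrap` function, computes the same thing; the array form is just more compact.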

Now we have plenty of statistics. The reasoning then goes like this.

"We can treat the distribution of all these statistics as roughly the same as the distribution that statistic has in the population, right? After all, even if we ran more experiments and collected lots more data, we would just get more data with similar values!!"

In other words, the bootstrap increases the amount of data not experimentally but with a statistical trick. It assumes that if the original sample is reasonably large, its distribution should be very close to the distribution of the population.

4: Use the distribution of the statistic to compute its variability (standard error, confidence interval, and so on).
The standard error, for example, is the standard deviation of the statistic's sampling distribution, so you simply take the standard deviation of the distribution of statistics you obtained.

And with that, you have the variability of the statistic you care about.
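Step 4 mentions confidence intervals as well as standard errors. A minimal sketch of both follows, using the IQ values from the example above. The interval is computed with the simple percentile method, which is one common choice and an assumption made here, not something the article specifies; the seed, the 10000 repeats, and the variable names are likewise illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed

iq = np.array([61, 88, 89, 89, 90, 92, 93,
               94, 98, 98, 101, 102, 105, 108,
               109, 113, 114, 115, 120, 138])

repeats = 10000
idx = rng.integers(0, len(iq), size=(repeats, len(iq)))
boot_medians = np.median(iq[idx], axis=1)

# Standard error: standard deviation of the bootstrap distribution
se_median = np.std(boot_medians)

# 95% percentile interval: the 2.5th and 97.5th percentiles
ci_low, ci_high = np.percentile(boot_medians, [2.5, 97.5])

print(f"SE of the median: {se_median:.2f}")
print(f"95% percentile interval: [{ci_low:.1f}, {ci_high:.1f}]")
```

Other interval constructions exist; the percentile version is shown only because it follows most directly from the bootstrap distribution.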

Bootstrap in practice

Let's try it in Python (a Matlab/Octave version is further down).

First, import the libraries we need.


# import libraries
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

Put the data into an array with NumPy.


# get data
iq = np.array([61, 88, 89, 89, 90, 92, 93, 
               94, 98, 98, 101, 102, 105, 108,
               109, 113, 114, 115, 120, 138])

As basic statistics, compute the mean, the standard error of the mean, and the median.


# compute mean, SEM (standard error of the mean) and median
mean_iq = np.average(iq)
# note: np.std defaults to ddof=0 (divides by n); ddof=1 would give about 3.54
sem_iq = np.std(iq)/np.sqrt(len(iq))
median_iq = np.median(iq)

This gives a mean of 100.85, a standard error of the mean of 3.45, and a median of 99.5.

To get the standard error of the median, we run the bootstrap. As a sanity check that the bootstrap is working correctly, we also compute the standard error of the mean along the way.

In Python, you can draw n integers at random, with replacement, from the consecutive integers 0 to n-1 as follows.


np.random.choice(n,n,replace=True)
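A small sketch of what this call returns and how it turns into a resample. The values drawn are array indices from 0 to n-1, which is what NumPy indexing expects. The toy array `x` and the seed are assumptions made here for illustration.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only to make the sketch repeatable

n = 5
idx = np.random.choice(n, n, replace=True)
print(idx)      # n integers, each between 0 and n-1, duplicates allowed

x = np.array([10, 20, 30, 40, 50])  # toy data
print(x[idx])   # the corresponding resample of x
```

Indexing the data with `idx` picks out the resampled values in one step.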

Using this, we do the resampling step of the bootstrap.


# bootstrap to compute sem of the median
def bootstrap(x,repeats):
    # placeholder (row 0: mean, row 1: median)
    vec = np.zeros((2,repeats))
    for i in np.arange(repeats):
        # resample data with replacement
        re = np.random.choice(len(x),len(x),replace=True)
        re_x = x[re]
            
        # compute mean and median of the "new" dataset
        vec[0,i] = np.mean(re_x)
        vec[1,i] = np.median(re_x)
    
    # histogram of median from resampled datasets
    # (sns.distplot is deprecated in recent seaborn; histplot replaces it)
    sns.histplot(vec[1,:])
    
    # compute bootstrapped standard error of the mean,
    # and standard error of the median
    b_mean_sem = np.std(vec[0,:])
    b_median_sem = np.std(vec[1,:])
    
    return b_mean_sem, b_median_sem   

Running this function with x = iq and repeats = 1000 lets you see the distribution of the 1000 medians obtained by the bootstrap.

figure2.png

...Maybe I should have used more repeats.

In any case, the standard error of the median can be obtained from this distribution. You just compute the standard deviation of the 1000 medians.

  • Standard error of the median from the bootstrap: 4.22

The standard error of the mean is computed the same way.

  • Standard error of the mean from the bootstrap: 3.42

The standard error of the mean computed from the original data at the start was 3.45, so the two are quite close. The bootstrap seems to have estimated the variability of the statistic correctly.
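The comparison above can be reproduced with a seeded, vectorized variant of the same idea. This is a sketch under assumptions, not the article's function: the helper `bootstrap_se`, the seed, and the choice of 10000 repeats are all introduced here, and the printed values will differ slightly from the 4.22 and 3.42 quoted above because the random draws differ.

```python
import numpy as np

def bootstrap_se(x, stat, repeats, rng):
    """Bootstrap standard error of stat(x); stat must accept axis=1."""
    idx = rng.integers(0, len(x), size=(repeats, len(x)))
    return np.std(stat(x[idx], axis=1))

iq = np.array([61, 88, 89, 89, 90, 92, 93,
               94, 98, 98, 101, 102, 105, 108,
               109, 113, 114, 115, 120, 138])

rng = np.random.default_rng(0)  # arbitrary seed

# Formula-based SEM, using np.std's default (ddof=0) as the article does
analytic_sem = np.std(iq) / np.sqrt(len(iq))
boot_mean_se = bootstrap_se(iq, np.mean, 10000, rng)
boot_median_se = bootstrap_se(iq, np.median, 10000, rng)

print(f"analytic SEM:           {analytic_sem:.2f}")
print(f"bootstrap SE of mean:   {boot_mean_se:.2f}")
print(f"bootstrap SE of median: {boot_median_se:.2f}")
```

Passing the statistic in as a function keeps the resampling code identical for the mean, the median, or anything else that reduces along an axis.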

Caveat
Because the bootstrap resamples at random, the result differs slightly every time you run it. To reduce that run-to-run variability and get a more accurate estimate, you need to increase the number of repeats (~10,000).
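The effect of the number of repeats can be seen by running the bootstrap several times at two settings and comparing how much the estimates move around. The sketch below is illustrative only: the helper `median_se`, the seed, the 20 runs per setting, and the choice of 100 versus 10000 repeats are assumptions made here.

```python
import numpy as np

iq = np.array([61, 88, 89, 89, 90, 92, 93,
               94, 98, 98, 101, 102, 105, 108,
               109, 113, 114, 115, 120, 138])

rng = np.random.default_rng(0)  # arbitrary seed

def median_se(x, repeats, rng):
    """One bootstrap estimate of the standard error of the median."""
    idx = rng.integers(0, len(x), size=(repeats, len(x)))
    return np.std(np.median(x[idx], axis=1))

spread = {}
for repeats in (100, 10000):
    # Repeat the whole bootstrap 20 times and see how the estimate varies
    estimates = [median_se(iq, repeats, rng) for _ in range(20)]
    spread[repeats] = np.std(estimates)
    print(f"repeats={repeats:>6}: estimates from {min(estimates):.2f} "
          f"to {max(estimates):.2f}")
```

With more repeats the estimates from separate runs cluster more tightly, which is the reason for preferring a larger number when the computation is cheap.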

Summary of the summary

With the bootstrap, resampling lets you estimate the variability of whatever statistic you are interested in.

Closing

I think whoever came up with resampling in statistics, the bootstrap included, was a genius. Approximating the population distribution by artificially increasing the data frees you from the ordeal of endlessly collecting more of it! Of course, unless you actually collect a reasonable amount of data that is representative of the population, no amount of resampling will ever bring you closer to the population...

The source code is below.

bootstrap_demo.py

# import libraries
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# get data
iq = np.array([61, 88, 89, 89, 90, 92, 93, 
               94, 98, 98, 101, 102, 105, 108,
               109, 113, 114, 115, 120, 138])

# compute mean, SEM (standard error of the mean) and median
mean_iq = np.average(iq)
# note: np.std defaults to ddof=0 (divides by n); ddof=1 would give about 3.54
sem_iq = np.std(iq)/np.sqrt(len(iq))
median_iq = np.median(iq)

# bootstrap to compute sem of the median
def bootstrap(x,repeats):
    # placeholder (row 0: mean, row 1: median)
    vec = np.zeros((2,repeats))
    for i in np.arange(repeats):
        # resample data with replacement
        re = np.random.choice(len(x),len(x),replace=True)
        re_x = x[re]
            
        # compute mean and median of the "new" dataset
        vec[0,i] = np.mean(re_x)
        vec[1,i] = np.median(re_x)
    
    # histogram of median from resampled datasets
    # (sns.distplot is deprecated in recent seaborn; histplot replaces it)
    sns.histplot(vec[1,:])
    
    # compute bootstrapped standard error of the mean,
    # and standard error of the median
    b_mean_sem = np.std(vec[0,:])
    b_median_sem = np.std(vec[1,:])
    
    return b_mean_sem, b_median_sem   

# execute bootstrap
bootstrapped_sem = bootstrap(iq,1000)    

The Matlab/Octave version is below.

bootstrap_demo.m

function bootstrap_demo

% data
iq = [61, 88, 89, 89, 90, 92, 93,94, 98, 98, 101, 102, 105, 108,109, 113, 114, 115, 120, 138];

% compute mean, SEM (standard error of the mean) and median
mean_iq = mean(iq);
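% note: std normalizes by N-1 by default, so this gives about 3.54;
% std(iq,1) normalizes by N and matches the 3.45 from the Python version
sem_iq = std(iq)/sqrt(length(iq));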
median_iq = median(iq);
disp(['the mean: ' num2str(mean_iq)])
disp(['the SE of the mean: ' num2str(sem_iq)])
disp(['the median: ' num2str(median_iq)])
disp('---------------------------------')

[b_mean_sem, b_median_sem] = bootstrap(iq, 1000);
disp(['bootstrapped SE of the mean: ' num2str(b_mean_sem)])
disp(['bootstrapped SE of the median: ' num2str(b_median_sem)])

% bootstrap to compute sem of the median
function [b_mean_sem, b_median_sem] = bootstrap(x, repeats)

% placeholder (row 1: mean, row 2: median)
vec = zeros(2,repeats);
for i = 1:repeats
    % resample data with replacement
    re_x = x(datasample(1:length(x),length(x),'Replace',true));
    
    % compute mean and median of the "new" dataset
    vec(1,i) = mean(re_x);
    vec(2,i) = median(re_x);
    
end

% histogram of median from resampled dataset
histogram(vec(2,:))

% compute bootstrapped standard error of the mean, and standard error of
% the median
b_mean_sem = std(vec(1,:));
b_median_sem = std(vec(2,:));

katsu1110

@katsu1110
