Statistics/Concepts(+codes)

Law of Large Numbers (ํฐ ์ˆ˜์˜ ๋ฒ•์น™; LLN)

metamong 2022. 5. 5.

๐Ÿ‘จ๐Ÿพ‍๐Ÿ”ฌ sample data ์ˆ˜๊ฐ€ ์ปค์งˆ์ˆ˜๋ก, sample์˜ ํ†ต๊ณ„์น˜๋Š” ์ ์  ๋ชจ์ง‘๋‹จ์˜ ๋ชจ์ˆ˜์™€ ๊ฐ™์•„์ง„๋‹ค๋Š” ๋œป!

โ˜… ๊ตฌ์ฒด์ ์ด๊ฒŒ ๋งํ•˜๋ฉด 'the mean of your sample is going to converge to the true mean of the population or to the expected value of the random variable'

 

๐Ÿ‘จ๐Ÿพ‍๐Ÿ”ฌ ์ผ๋ฐ˜์ ์œผ๋กœ sample์˜ ์ˆ˜๊ฐ€ 30๊ฐœ ์ด์ƒ์ด๋ฉด ํฐ ์ˆ˜์˜ ๋ฒ•์น™์ด ์ ์šฉ๋œ๋‹ค๊ณ  ํ•œ๋‹ค

 

๐Ÿ‘จ๐Ÿพ‍๐Ÿ”ฌ ๋„ˆ๋ฌด๋‚˜ ๋‹น์—ฐํ•œ ๋‚ด์šฉ์ด๋ฏ€๋กœ..! ๋น ๋ฅด๊ฒŒ ํ›‘๊ณ  ๋„˜์–ด๊ฐ€์ž ๐Ÿงš๐Ÿพ

concepts

- wikipedia -

'In probability theory, the law of large numbers (LLN) is a theorem that describes the result of performing the same experiment a large number of times. According to the law, the average of the results obtained from a large number of trials should be close to the expected value and tends to become closer to the expected value as more trials are performed.'

 

๐Ÿ‘จ๐Ÿพ‍๐Ÿ”ฌ ์—„๋ฐ€ํžˆ ๋งํ•˜๋ฉด sample์˜ ํ‰๊ท ์ด sample ๊ฐฏ์ˆ˜๊ฐ€ ์ปค์งˆ์ˆ˜๋ก, ๋ฌดํ•œํžˆ ๊ฐˆ์ˆ˜๋ก ์ „์ฒด ๊ธฐ๋Œ“๊ฐ’(๋ชจ์ˆ˜)์— ๊ฐ€๊นŒ์›Œ์ง„๋‹ค๋Š” ๋œป์ด๋‹ค (ํ‰๊ท  ํ•œ์ •)

 

๐Ÿ‘จ๐Ÿพ‍๐Ÿ”ฌ stable long-term results๋ฅผ ๋ณด์žฅํ•ด์ฃผ๋Š” law์ด๊ธฐ์— ์ค‘์š”ํ•˜๊ฒŒ ์“ฐ์ž„ (ํŠนํžˆ randomํ•˜๊ฒŒ ๋ฐœ์ƒํ•˜๋Š” event์™€ ๊ด€๋ จํ•  ๋•Œ)

 

๐Ÿ‘จ๐Ÿพ‍๐Ÿ”ฌ ๋‹น์—ฐํžˆ ๋งŽ์€ ๊ด€์ธก์น˜๋“ค(์‹œ๋„๋“ค)์ด ๋ณด์žฅ๋˜์–ด์•ผ ํ•จ

 

๐Ÿ‘จ๐Ÿพ‍๐Ÿ”ฌ ์ด ๋•Œ, gambler's fallacy(Monte Carlo fallacy) ใ€ŠํŠน์ • event์˜ ๊ณผ๊ฑฐ๋นˆ๋„๊ฐ€ ๋†’์•˜๋‹ค๋ฉด ๋ฏธ๋ž˜์—๋„ ๋†’๊ฒŒ ๋ฐœ์ƒํ•˜๊ฑฐ๋‚˜, ๊ฑฐ๊พธ๋กœ ํ›จ์”ฌ ์ ๊ฒŒ ๋ฐœ์ƒํ•œ๋‹ค๊ณ  ์˜ˆ์ธกํ•˜๋Š” fallacyใ€‹์— ์˜ํ•ด ๊ณผ๊ฑฐ ์‚ฌ๊ฑด์— ์˜ํ–ฅ์„ ๋ฐ›์•„ ์˜ˆ์ธก๋œ๋‹ค๊ณ  ์ƒ๊ฐํ•ด์„œ๋Š” ์•ˆ๋œ๋‹ค. LLN์€ ๋งค event๋ผ๋ฆฌ ์„œ๋กœ ์˜ํ–ฅ์„ ์•ˆ๋ฐ›๊ณ  ๋…๋ฆฝ์ ์œผ๋กœ, ๋งค ๊ฒฐ๊ณผ๋Š” ์˜ˆ์ธก๋ถˆ๊ฐ€๋กœ randomํ•˜๊ฒŒ ๋‚˜์˜จ๋‹ค๊ณ  ๊ฐ€์ •

 

๐Ÿ‘จ๐Ÿพ‍๐Ÿ”ฌ Weak LLN vs. Strong LLN?

→ WLLN์€ ๋ฌดํ•œ๋Œ€์˜ the number of sample์ด ์กด์žฌํ•  ๊ฒฝ์šฐ ๋ชจํ‰๊ท ๊ณผ sample mean์˜ ์ฐจ์ด๊ฐ€ ๊ทธ ์–ด๋–ค ์–‘์ˆ˜ ε๋ณด๋‹ค๋„ ์ž‘์€ ๊ฒฝ์šฐ๊ฐ€ ๋ฐ˜๋“œ์‹œ(์•„๋ž˜ Pr์ด 1์ด ๋œ๋‹ค๊ณ  ์ œ์‹œ๋จ) ์กด์žฌํ•œ๋‹ค๋Š” ๋œป

→ SLLN์€ ๋ฌดํ•œ๋Œ€์˜ the number of sample์ด ์กด์žฌํ•œ๋‹ค๋ฉด sample mean์€ ๋ฌด์กฐ๊ฑด ๋ชจํ‰๊ท ์ด ๋œ๋‹ค๋Š” ๋œป

→ ๊ทธ๋ž˜์„œ WLLN์ด ์ข€ ๋” ์•ฝํ•˜๊ฒŒ ๋ชจํ‰๊ท ์„ ๋Œ€ํ‘œํ•œ๋‹ค๊ณ  ์ฃผ์žฅํ•œ๋‹ค๊ณ  ๋งํ•  ์ˆ˜ ์žˆ๋‹ค!

 

- (์™ผ์ชฝ๋ถ€ํ„ฐ) LLN Form - WLLN - SLLN -

 

 

w/code

โ‘  ํ‰๊ท ์ด 50์ด๊ณ  ํ‘œ์ค€ํŽธ์ฐจ๊ฐ€ 10์ธ 2๋งŒ๊ฐœ์˜ sample์ด ๋”ฐ๋ฅด๋Š” ์ •๊ทœ๋ถ„ํฌ๋ฅผ ๋ชจ์ง‘๋‹จ์œผ๋กœ ๊ฐ€์ •

 

import numpy as np

population = np.random.normal(50, 10, 20000)  # mean 50, std 10, 20000 samples from a normal distribution

population.mean()
#50.03223541665855

 

โ‘ก sample size๋ฅผ 5 ๊ฐ„๊ฒฉ์œผ๋กœ 5์—์„œ 19995๊นŒ์ง€ ๋Š˜๋ฆฌ๋ฉด์„œ sample์˜ mean์„ ์ธก์ • & ์‹œ๊ฐํ™”

 

dat = []

for i in np.arange(start = 5, stop = 20000, step = 5):  # stop is exclusive, so sizes run 5, 10, ..., 19995
    s = np.random.choice(population, i)  # draw a sample of size i from the population
    dat.append(s.mean())  # record the sample mean
dat

#using method chaining - when a method returns an object, another method can be called directly on that returned object

import pandas as pd

(pd
 .DataFrame(dat)
 .plot(figsize=(7,7))  # DataFrame.plot returns a matplotlib Axes
 .axhline(y = 50, color = '#F80909')  # horizontal line at the population mean
 );

 

 

โ‘ข ์‹œ๊ฐํ™” ๊ฒฐ๊ณผ> ์šฐ๋ฆฌ๋Š” sample mean์ด ์ „์ฒด ๋ชจํ‰๊ท ์ธ 50์„ ํ–ฅํ•ด ์ ์  convergeํ•จ์„ ๊ทธ๋ฆผ์„ ํ†ตํ•ด ์•Œ ์ˆ˜ ์žˆ๋‹ค! ๐Ÿคฉ


* ์ถœ์ฒ˜1) https://www.khanacademy.org/math/statistics-probability/random-variables-stats-library/expected-value-lib/v/law-of-large-numbers

* ์ถœ์ฒ˜2) https://en.wikipedia.org/wiki/Law_of_large_numbers

๋Œ“๊ธ€