
Adjusted R-Squared vs. R-Squared

metamong 2022. 6. 19.

👨🏾‍💻 We covered R-squared in a previous post; this time, let's take a deeper look by comparing it with adjusted r-squared, which compensates for its limitations.

 

 

Related post: All About Evaluation Metrics (1/2) → MSE, MAE, RMSE, R^2 (sh-avid-learner.tistory.com)

 

1> R-squared (coefficient of determination)

→ r-squared is not a metric of how accurate a model's predictions are, or of how many points it can predict well.

→ Formally, it is 'the proportion of variance of y that has been explained by the independent variables in the model'.

→ In other words, it is a measure of explanatory power: how much of the variation in y can be explained by variation in the independent variables!

 

→ As the formula in the figure above shows, $R^2 = \cfrac{SSR}{SST}$.

 

 

≫ In the figure above, the sum of squared distances (green lines) from the actual red points to the black line (the points' mean) is SST - in other words, the variance of y itself.

≫ And the sum of squared distances (red lines) from the predicted blue line (the model) to the black line (the points' mean) is SSR - in other words, the variance of the model's predictions about the mean of y.

If the sums of squared distances from the actual red points and from the predicted blue line to the per-point black mean line are equal, the predicted values equal the actual values, so SSR equals SST (the variance of y itself) and $R^2 = SSR/SST = 1$.

An R-squared of 1 therefore means that the variance of the model's predictions about the mean equals the variance of the actual values, i.e., the model has 100% explanatory power and explains the variance of the actual values perfectly.

 

※ In short, R-squared measures how well the variation in the x's explains the variation in y. For example, if R-squared is 80%, we can conclude that the x variables in the model account for 80% of the variation in y.
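To make the decomposition concrete, here is a minimal sketch on assumed toy data: it computes SST, SSR, and SSE by hand and confirms that SSR/SST agrees with the residual form 1 - SSE/SST for an ordinary least-squares fit.

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

slope, intercept = np.polyfit(x, y, 1)  # ordinary least-squares line
y_hat = slope * x + intercept

sst = np.sum((y - y.mean()) ** 2)       # total sum of squares (green lines)
ssr = np.sum((y_hat - y.mean()) ** 2)   # regression sum of squares (red lines)
sse = np.sum((y - y_hat) ** 2)          # residual sum of squares

print(ssr / sst)      # R^2 as the explained-variance ratio
print(1 - sse / sst)  # identical for OLS with an intercept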

 

※ Limitation - even a variable that has almost no effect on the dependent variable will, when added as yet another independent variable, always increase R-squared at least slightly, because ordinary least squares can always use the extra variable to shave a little off the residual sum of squares. That R-squared rises even when a factor irrelevant to the model is added is a very serious limitation.
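A minimal sketch of this limitation on assumed synthetic data: appending a pure-noise column never lowers the training r-squared of an ordinary least-squares fit.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = 3 * X[:, 0] + rng.normal(size=200)  # y depends only on the first column

r2_before = r2_score(y, LinearRegression().fit(X, y).predict(X))

X_noise = np.hstack([X, rng.normal(size=(200, 1))])  # append an irrelevant predictor
r2_after = r2_score(y, LinearRegression().fit(X_noise, y).predict(X_noise))

print(r2_after >= r2_before)  # True: r-squared never decreases as columns are added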

2> adjusted r-squared

👨🏾‍💻 This is why the adjusted r-squared metric is used: it compensates for this limitation of r-squared.

 

※ When a new term is added, there are two cases to consider, shown below.

(Here, 'adding a term' does not mean adding a term to a polynomial regression model; it refers to the number of distinct independent variables.)

 

(ํ•˜๋‹จ ๊ฒŒ์‹œ๊ธ€์„ ๋ณด๋ฉด ์•Œ๊ฒ ์ง€๋งŒ, polynomial regression model์—์„œ term ๊ฐœ์ˆ˜ ์ฆ๊ฐ€ํ•œ๋‹ค๊ณ  r-squared๊ฐ€ ๋ฌด์กฐ๊ฑด์ ์œผ๋กœ ์ฆ๊ฐ€ํ•˜๋Š” ๊ฑด ์•„๋‹˜์„ ํ™•์ธ ๊ฐ€๋Šฅํ•˜๋‹ค.)

 

Related post: Polynomial Regression Model (sh-avid-learner.tistory.com)

 

→ ① If a predictor that is likely to overfit or that actually reduces predictive power is added to the existing model, a penalty kicks in and the adjusted r^2 value decreases

→ ② Conversely, if a predictor that actually improves the model's performance is added, the adjusted r^2 value increases

 

$R_{adj}^2 = 1 - \left[\cfrac{(1-R^2)(n-1)}{n-k-1}\right]$

(k: number of independent variables, i.e., the number of terms / n: number of data points)
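As a sketch, the formula translates directly into a small helper (the name adjusted_r2 is ours, not from any library):

def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted r-squared for n data points and k independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(adjusted_r2(r2=0.80, n=100, k=5))  # ~0.7894, slightly below the plain 0.80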

 

→ In general, the adjusted r-squared value is positive in most cases, and it is always measured at or below the plain r-squared value.

→ The adjusted metric is the more widely used of the two, since it lets us weigh different combinations of independent variables and find the optimal number of independent variables to include in the model.

 

→ Reading the adjusted $R^2$ formula: in case ① the growth in k outweighs the small gain in $R^2$, so the adjusted value decisively decreases / in case ② the gain in $R^2$ outweighs the growth in k, so the adjusted value decisively increases.
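A numeric illustration of the two cases, with assumed values (n = 100 points, a baseline of 5 predictors at R^2 = 0.800):

n = 100
before = 1 - (1 - 0.800) * (n - 1) / (n - 5 - 1)  # 5 predictors, R^2 = 0.800
case1  = 1 - (1 - 0.801) * (n - 1) / (n - 6 - 1)  # 6th predictor barely helps
case2  = 1 - (1 - 0.850) * (n - 1) / (n - 6 - 1)  # 6th predictor genuinely helps

print(before, case1, case2)  # ~0.7894, ~0.7882 (falls), ~0.8403 (rises)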

Practice w/ Python>

1> Prepare the dataset + imports + preprocessing

 

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

 

# read in the data
data = pd.read_csv("./data/nba_logreg.csv")

data.dropna(inplace=True)                  # drop rows with missing values
data.drop(columns=['Name'], inplace=True)  # drop the Name column

 

2> Extract the target vector + keep only five feature columns

 

# set aside the y target vector
y = data['TARGET_5Yrs'].copy()
data.drop(columns=['TARGET_5Yrs'], inplace=True)

# keep only the first five feature columns (.copy() so new columns can be added safely)
data_cut = data.iloc[:, :5].copy()

 

3> To see the difference between adjusted r-squared and r-squared, append two randomly generated columns to the dataframe.

 

np.random.seed(11)
data_cut['random1'] = np.random.randn(len(data_cut))                 # random normal noise
data_cut['random2'] = np.random.randint(0, 100, size=len(data_cut))  # one random integer per row

 

4> Cumulatively add independent variables from the leftmost dataframe column onward - computing r-squared and adjusted r-squared at each step

 

r_squared = []
adj_r_squared = []

# cumulatively fit on the first i columns, one more predictor each iteration
for i in range(1, data_cut.shape[1] + 1):
    data_new = data_cut.iloc[:, :i].copy()

    linear_regression = LinearRegression()
    linear_regression.fit(data_new, y)
    prediction = linear_regression.predict(data_new)

    # plain r-squared on the training data
    r2 = r2_score(y_true=y, y_pred=prediction)
    r_squared.append(r2)

    # adjusted r-squared with n = len(data) points and k = i predictors
    adj_r2 = 1 - ((1 - r2) * (len(data) - 1) / (len(data) - i - 1))
    adj_r_squared.append(adj_r2)
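As a cross-check, statsmodels reports adjusted r-squared directly (rsquared_adj); a minimal sketch, assuming the same data_cut and y as above and that statsmodels is installed:

import statsmodels.api as sm

# OLS on all 7 predictors; statsmodels computes both metrics for us
X = sm.add_constant(data_cut)          # add an explicit intercept column
ols = sm.OLS(y, X).fit()
print(ols.rsquared, ols.rsquared_adj)  # should match the loop's final iteration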

 

5> Visualize the results

 

result = pd.DataFrame({'r^2': r_squared, 'adjusted_r^2': adj_r_squared})

fig = plt.figure(figsize=(7, 6))
sns.set_style('ticks')
sns.set_context("poster", font_scale=.8, rc={"grid.linewidth": 0.2})
s = sns.lineplot(data=result)
plt.axvline(4.0, color='red', linewidth=4)  # x=4 is the 5th (last real) predictor
s.set_xticks(range(7))                      # 7 models: 5 real + 2 random predictors
s.set_xticklabels(['1', '2', '3', '4', '5', '6', '7'])

 

 

6> Analyze the results

▶ The visualization shows that the adjusted value is always measured below r^2.

▶ The number of independent variables reaches 6 exactly at the point where the random, model-irrelevant data is added, and we can see with the naked eye that the adjusted value drops there, judging that those columns add nothing to model performance.

▶ By contrast, r^2 keeps increasing no matter how meaningless the added variable is, which demonstrates its limitation.

▶ Therefore, even assuming we know nothing about the data, the adjusted r^2 value alone is enough to exclude the two randomly added variables from model building. (In the graph above, the adjusted value also dropped when the 4th variable was added, so by the same logic the 4th variable alone could be excluded as well.)


* Source 1) comparison: https://www.investopedia.com/ask/answers/012615/whats-difference-between-rsquared-and-adjusted-rsquared.asp

* Source 2) adjusted r-squared concept: https://youtu.be/_I7sKr77Ci8

* Source 3) r-squared concept: https://www.youtube.com/watch?v=IMjrEeeDB-Y

* Source 4) Python practice: https://www.statology.org/adjusted-r-squared-in-python/

* Source 5) visualization code partly adapted from: https://towardsdatascience.com/demystifying-r-squared-and-adjusted-r-squared-52903c006a60

* Source 6) r-squared concept <StatQuest>: https://www.youtube.com/watch?v=2AQKmw14mHM
