Feature Selection vs. Feature Extraction

metamong 2022. 5. 18.

🧜‍♂️ With big data, the number of features is often so large that the data becomes complicated to process. So we can either select only the important features, or select a small set of new features built by recombining the existing ones.

 

🧜‍♂️ For dimensionality reduction: we run selection & extraction on the existing features in order to reduce model complexity, prevent overfitting, improve generalization performance, and improve the model's computational efficiency.

 

🧜‍♂️ If you are studying data science, these are two techniques you absolutely have to go through and learn!

 

- feature extraction (bottom left) & feature selection (bottom right) -

 

 

🧜‍♂️ Feature extraction builds new features by distilling the most important information out of the existing features, while feature selection picks a subset of the existing features and keeps them as they are.

'while the original features are maintained in the case of feature selection algorithms, the feature extraction algorithms transform the data onto a new feature space.'
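To make the quote concrete, here is a minimal sketch using only NumPy. The column indices and the projection matrix `W` are hypothetical, chosen purely for illustration: in practice a selection or extraction method would pick them from the data.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))      # 100 samples, 4 original features

# Feature selection: keep a subset of the original columns, unchanged.
selected_idx = [0, 2]              # hypothetical choice, for illustration only
X_selected = X[:, selected_idx]

# Feature extraction: every new feature is a combination of all original ones.
# W is a hypothetical 4x2 projection standing in for a learned transform.
W = rng.normal(size=(4, 2))
X_extracted = X @ W

print(X_selected.shape, X_extracted.shape)
```

Both results have two columns, but the selected columns are still the original features, while the extracted columns live in a new feature space.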

 

🧜‍♂️ feature selection - the selected features are easy to interpret. But keep in mind that a subset of features is often chosen without accounting for the relationships between features (univariate methods, for example, score each feature on its own).

 

🧜‍♂️ feature extraction - it can reduce the number of features while taking the relationships between features into account. The downside is that the features are newly created, so they are hard to interpret.

 

🧜‍♂️ The use cases are different!

(1) feature selection is mainly used when explaining the model matters (because a subset of the existing features is preserved as-is)

(2) feature extraction is mainly used when the goal is to raise the model's predictive performance (because the whole point is to recombine several features to improve performance).

1. feature selection

① regularization - the L1 norm (LASSO) can be used to select only some of the features, since it drives the coefficients of the others to exactly zero

 

② an estimator we will learn soon - with RandomForestClassifier, features can also be selected based on the feature_importances_ attribute

 

③ search algorithms over feature subsets (greedy search, genetic algorithms, etc.)

 

④ sklearn's feature_selection module - use SelectKBest
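Items ① and ② from the list above can be sketched roughly as follows. The synthetic datasets, `alpha=1.0`, and the "keep the top 3" cutoff are all assumptions made for illustration, not recommended settings.

```python
import numpy as np
from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Lasso

# (1) L1 regularization (LASSO): features whose coefficient is shrunk to zero are dropped.
X_reg, y_reg = make_regression(
    n_samples=200, n_features=10, n_informative=3, noise=1.0, random_state=0
)
lasso = Lasso(alpha=1.0).fit(X_reg, y_reg)      # alpha is an assumed value
lasso_selected = np.flatnonzero(lasso.coef_)    # indices of features with non-zero coefficients

# (2) Random forest: rank features by feature_importances_ and keep the top ones.
X_clf, y_clf = make_classification(
    n_samples=300, n_features=10, n_informative=3, n_redundant=0, random_state=0
)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_clf, y_clf)
top3 = np.argsort(forest.feature_importances_)[::-1][:3]   # assumed cutoff of 3

print("LASSO kept:", lasso_selected)
print("Forest top 3:", top3)
```

In both cases the output is a set of column indices into the original data, so the chosen features stay interpretable.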

 

feature selection (1) - selectKBest (+jointplot)

sh-avid-learner.tistory.com

 

★ For more detail on feature selection, see the scikit-learn documentation page below! (I plan to summarize it in a post soon.)

https://scikit-learn.org/stable/modules/feature_selection.html#univariate-feature-selection
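As a small taste of the univariate selection described on that page, here is a minimal SelectKBest sketch. The iris dataset, `f_classif` as the score function, and `k=2` are assumptions picked just to keep the example short.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)           # 150 samples, 4 features

# Score each feature on its own against the target, then keep the k best.
selector = SelectKBest(score_func=f_classif, k=2)
X_new = selector.fit_transform(X, y)

kept = selector.get_support(indices=True)   # indices of the kept columns
print("scores:", np.round(selector.scores_, 1))
print("kept columns:", kept, "new shape:", X_new.shape)
```

The transformed array is just the original data restricted to the kept columns, which is exactly what makes this selection rather than extraction.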

2. feature extraction

🖐 First of all, the goal of feature extraction is to compress the data as much as possible while losing as little information as possible.

 

① PCA - Principal Component Analysis - uses the eigenvectors of the data's covariance matrix to find the principal axes, then extracts features along those axes (coming up soon!)

 

โ‘ก auto-encoder
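The PCA idea in ① can be sketched with plain NumPy. The toy data and the choice of `k = 1` are assumptions for illustration. This is a bare-bones version of the eigenvector approach, not a replacement for a library implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 3 strongly correlated features driven by one hidden factor, plus noise.
latent = rng.normal(size=(200, 1))
X = np.hstack([latent, 2 * latent, -latent]) + 0.1 * rng.normal(size=(200, 3))

# 1. Center the data and compute the covariance matrix.
X_centered = X - X.mean(axis=0)
cov = np.cov(X_centered, rowvar=False)

# 2. Eigen-decomposition; eigh returns eigenvalues in ascending order, so re-sort.
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# 3. Project onto the top-k eigenvectors (the principal axes).
k = 1                                   # assumed number of components
X_pca = X_centered @ eigvecs[:, :k]

explained_ratio = eigvals[:k].sum() / eigvals.sum()
print(X_pca.shape, round(float(explained_ratio), 3))
```

Three original features are compressed into one new feature, and `explained_ratio` tells how much of the total variance that single axis retains.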

 

🖐 So then, what is the difference between feature engineering and feature extraction?

 

 

FE - Feature Engineering

sh-avid-learner.tistory.com

 

The difference is whether the existing raw data can be used as it is! Feature engineering takes features that are already usable and recombines them by hand to create new features. Feature extraction applies when the raw data cannot be used directly: the useful information has to be pulled out of that data and reassembled into new data before you can proceed.

 

* Concepts covered!

3. feature selection vs. feature extraction summary

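|                       | feature selection                                | feature extraction                                                    |
|-----------------------|--------------------------------------------------|-----------------------------------------------------------------------|
| description           | selects only a subset of the original features   | uses the information in the original features to create new features |
| creates new features? | X                                                | O                                                                     |
| drawback              | unselected features are not used in the analysis | the newly created features are hard to interpret                      |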

* Thumbnail source) https://www.vectorstock.com/royalty-free-vector/extracting-data-color-icon-vector-29110777

* Source 1) https://vitalflux.com/machine-learning-feature-selection-feature-extraction/

* Source 2) https://www.researchgate.net/figure/Difference-between-feature-extraction-and-feature-selection_fig1_339209170

* Source 3) https://stackoverflow.com/questions/39130600/what-is-the-difference-between-feature-engineering-and-feature-extraction
