
👔 Tuning fashionMNIST Classifier I: Adjusting the Number of Epochs

metamong 2024. 8. 2.

👔 We applied Convolution and Pooling to the well-known fashionMNIST classifier and saw the performance improvement for ourselves. Now let's find out whether various kinds of model tuning and adjusting the number of epochs can contribute a modest additional performance gain!

 

👔 Click GitHub for all the code!

 

👔 Default fashionMNIST classifier model: two <Convolution((3,3) × 32) + MaxPooling(2,2)> blocks → Flatten() → Dense(128, relu) → Dense(10, softmax)
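For reference, here is a minimal Keras sketch of that default architecture. The layer sizes follow the description above; the optimizer and loss settings are my assumptions, not taken from the original code.

import tensorflow as tf

# two Conv(3x3, 32 filters) + MaxPool(2x2) blocks, then Flatten -> Dense(128, relu) -> Dense(10, softmax)
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',  # assumption: integer class labels
              metrics=['accuracy'])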

① Adjusting the Number of Epochs

👔 An epoch is one complete pass over the training data: the dataset is first split into batches of the given batch_size, and training over that same set of batches is then repeated a chosen number of times. For example, 5 epochs means training runs over the same batches 5 times, and 10 epochs means 10 times. In other words, the number of epochs controls how many times the model learns from the same dataset. Increasing the number of epochs lets the model keep fine-tuning its weights and can raise training performance (the batches themselves stay fixed: if there are 4 samples 1, 2, 3, 4 split into 2 batches, say {1, 3} and {2, 4}, those same batches are simply revisited as the number of epochs grows). However, the model can overfit the training set, so excessive training should be avoided; the results below suggest exactly that.
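As a concrete example (the batch size is hypothetical): with the 60,000 fashionMNIST training images and batch_size = 32, one epoch is 1,875 weight updates over the same batches, and raising epochs simply repeats that pass.

# one epoch = one full pass over all batches of the (fixed) training set;
# epochs=10 repeats that pass 10 times, updating the weights after every batch
model.fit(training_images, training_labels,
          batch_size=32,    # 60000 / 32 -> 1875 batches per epoch
          epochs=10)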

 

👔 Let's adjust the number of epochs for the fashionMNIST classifier with Convolution and Pooling applied, and visualize how evaluation accuracy, evaluation loss, and average training time per epoch change. The number of epochs was increased from 5 to 50 in steps of 5 while tracking these three metrics.
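A rough sketch of this experiment loop (the build_model() helper and the timing bookkeeping are my own illustration, not the original code):

import time

results = []
for n_epochs in range(5, 55, 5):               # 5, 10, ..., 50
    model = build_model()                      # hypothetical helper returning a freshly compiled model
    start = time.time()
    model.fit(training_images, training_labels, epochs=n_epochs, verbose=0)
    avg_epoch_time = (time.time() - start) / n_epochs
    loss, acc = model.evaluate(test_images, test_labels, verbose=0)
    results.append((n_epochs, acc, loss, avg_epoch_time))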

① As the number of epochs increases, there is no clear gain in accuracy. Accuracy shows a roughly increasing trend from epochs = 5 up to epochs = 30, where it reaches its maximum of 0.9113; the minimum, 0.8988, occurs at epochs = 35.

 

② Comparing the lowest-accuracy case (epochs = 35, accuracy 0.8988) with the highest-accuracy case (epochs = 30, accuracy 0.9113), |max accuracy − min accuracy| = 0.0125. Relative to the maximum accuracy, the spread across all epoch settings therefore amounts to only about 1.372%.
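A quick sanity check of that arithmetic:

max_acc, min_acc = 0.9113, 0.8988
spread = max_acc - min_acc        # 0.0125
print(spread / max_acc * 100)     # ~1.372 (% of the maximum accuracy)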

 

③ For model loss, the loss clearly rises along with the number of epochs, roughly in proportion. The cause of this clear increase in loss deserves some thought.

 

Suspected cause) ③-1: possibility of overfitting

: As the number of training epochs keeps growing, the model becomes over-optimized for the training dataset, which can actually hurt it during evaluation. In particular, this dataset has only 60,000 training samples. Accuracy barely changes, but loss directly reflects the gap between predicted and true values, so the sharp rise in loss caused by overfitting shows up clearly in the graph.
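One way to confirm this (a sketch, not part of the original experiment) is to hold out part of the training data and watch validation loss diverge from training loss, stopping once it starts rising:

import tensorflow as tf

# assumption: a 10% validation split; EarlyStopping halts training once val_loss stops improving
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss',
                                              patience=3,
                                              restore_best_weights=True)
history = model.fit(training_images, training_labels,
                    validation_split=0.1,
                    epochs=50,
                    callbacks=[early_stop])
# compare history.history['loss'] with history.history['val_loss'] to see the overfitting gap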

 

Suspected cause) ③-2: high initial accuracy

: If the model already shows reasonably good accuracy early on, then adding more epochs is more likely to over-optimize it on the training set, so the loss on the test set tends to rise sharply even though accuracy barely moves.

 

Suspected cause) ③-3: imbalanced dataset (but not the cause in this case)

: In a classification task (the task above, the fashionMNIST classifier, is one), a dataset that is imbalanced from the start often shows rising loss even when accuracy does not suffer. That is not the case here, though: as the code below shows, every class has exactly the same number of samples.

for num in set(training_labels):                 # one iteration per class label (0-9)
    print(list(training_labels).count(num))      # prints 6000 for every class -> perfectly balanced

 

④ Training time per epoch shows no meaningful change across the epoch settings. We can infer that how fast the model trains each epoch is unrelated to the total number of epochs.
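Per-epoch training time can be measured with a small Keras callback (again an illustrative sketch, not the original measurement code):

import time
import tensorflow as tf

class EpochTimer(tf.keras.callbacks.Callback):
    # records the wall-clock duration of every epoch
    def on_train_begin(self, logs=None):
        self.times = []
    def on_epoch_begin(self, epoch, logs=None):
        self._start = time.time()
    def on_epoch_end(self, epoch, logs=None):
        self.times.append(time.time() - self._start)

timer = EpochTimer()
model.fit(training_images, training_labels, epochs=10, callbacks=[timer])
print(sum(timer.times) / len(timer.times))    # average seconds per epoch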

 

👔 Result of the epochs experiment: best performance at epochs = 10. The loss trends upward overall, hits a local peak at epochs = 15, and then dips slightly from epochs = 20 onward. The loss at epochs = 15 is 0.3426, a big jump from 0.2872 at epochs = 10, while the corresponding accuracies, 0.9046 and 0.9077, differ far less than the loss does. Epochs = 10 was therefore chosen as the best-performing setting, and all subsequent model-tuning experiments are run with epochs = 10.


Coursera <Intro to TensorFlow for AI, ML and DL>

