
👔 fashionMNIST Classifier

metamong 2024. 1. 15.

before building a classification model

① check the version & load the training and test split of the fashionMNIST dataset

: the fashionMNIST dataset is a collection of grayscale 28x28-pixel clothing images. Each image is associated with a label, as shown in the table below

(left) labels with descriptions / (right) image at index 915

0: T-shirt/top
1: Trouser
2: Pullover
3: Dress
4: Coat
5: Sandal
6: Shirt
7: Sneaker
8: Bag
9: Ankle boot

import tensorflow as tf

print(tf.__version__) #2.15.0

# Load the Fashion MNIST dataset
fmnist = tf.keras.datasets.fashion_mnist

# Load the training and test split of the Fashion MNIST dataset
(training_images, training_labels), (test_images, test_labels) = fmnist.load_data()

#60_000 training images
#10_000 testing images

: the data for a particular image is a 28x28 grid of integer grayscale pixel values from 0 to 255.

 

※ why is the dataset split into two parts, training & testing? A: The idea is to have one set of data for training, and another set of data that the model hasn't yet seen. The latter is used to evaluate how good the model is at classifying new values.

 

② print a training image (both as an image & as a numpy array) and its training label (e.g., index 915)

import numpy as np
import matplotlib.pyplot as plt

# You can put any index between 0 and 59999 here
index = 915

# Set number of characters per row when printing
np.set_printoptions(linewidth=320)

# Print the label and image
print(f'LABEL: {training_labels[index]}') #LABEL: 2
print(f'\nIMAGE PIXEL ARRAY:\n {training_images[index]}')
'''
IMAGE PIXEL ARRAY:
 [[  0   0   0   0   0   0   0   0   0  41 140  98  37  33  33  65 138  59   5   0   0   0   0   1   0   0   0   0]
 [  0   0   0   1   0   0  35 116 156 168 148 152 154 134 132 120  88 118 150 136  94  19   0   0   0   0   0   0]
 [  0   0   0   0   0  53 152 134 132 122 136 114  94  92  81  85  98  98 100 116 110 150  21   0   1   0   0   0]
 [  0   0   0   0   0 168 132 134 114 124 122 118 104 106  98  98  92  94 106  92  75 106  83   0   0   0   0   0]
 [  0   0   0   0   0 181 134 136 114 116 108 104  96 100  94  94  92  94 104  92  86  90 118   0   0   0   0   0]
 [  0   0   0   0  19 187 158 126 130 104 112 100 100  94  94  98  88  94  96  83  86  88 126   7   0   0   0   0]
 [  0   0   0   0  53 158 179 116 140  98 108 100 100  94  94  96  90  90  88  77  85  75  94  15   0   0   0   0]
 [  0   0   0   0  90 138 183 142 142 106  94  90  92  92  92  96  88  86  81  67  83  81  98  43   0   0   0   0]
 [  0   0   0   0 102 124 189 175 136 100  85  86  88  90  92  92  88  81  77  65  98  90  96  59   0   0   0   0]
 [  0   0   0   0 106 120 179 197 124  90  88  86  88  85  85  86  88  81  71  63  94  88  96  61   0   0   0   0]
 [  0   0   0   0 110 118 177 223  92  86  96  88  90  86  83  85  92  86  69  59 118  88  94  73   0   0   0   0]
 [  0   0   0   0 116 114 191 243  63  94  98  88  86  88  83  85  98  90  77  59 156  85  90  85   0   0   0   0]
 [  0   0   0   0 122 110 213 255  47  96 104  85  81  86  81  83  90  98  71  59 177  81  96  86   0   0   0   0]
 [  0   0   0   0 126 102 235 245  45 104 106  86  83  83  79  81  92 106  73  63 156  96 104  96   0   0   0   0]
 [  0   0   0   5 128  94 251 233  47 102 110  86  85  81  75  79  88 104  79  63 122 124  81 112   0   0   0   0]
 [  0   0   0  21 138  85 255 225  51 104 110  90  85  79  73  83  92 112  86  37 140 106  75 120   5   0   0   0]
 [  0   0   0  25 154  75 247 225  61 104 112  92  83  77  75  83  92 114  81  49 158  96  94  98  13   0   0   0]
 [  0   0   0  35 156  75 213 223  81 108 122  98  86  81  75  83  90 116  86  53 130 108 104  92  25   0   0   0]
 [  0   0   0  49 142  77 199 219  96 110 124  94  81  73  67  75  88 112  90  47 130 100 104  98  39   0   0   0]
 [  0   0   0  67 130  86 201 217  81  92 106  88  75  73  71  77  75  94  94  53 136  98  86  98  47   0   0   0]
 [  0   0   0  81 136  81 207 217  85 108 112  98  86  88  86  98  98 104 110  67 170 118  65 108  63   0   0   0]
 [  0   0   0  71 154  63 197 199 104 122 118 106  96  90  85  96 110 104 102  73 128 130  73 108  69   0   0   0]
 [  0   0   0  90 154  49 185 177 130 118 120 112 104 102  92 100 118 112 108  83 124 154  71 116  53   0   0   0]
 [  0   0   0 102 150  55 175   0   0   0   0   0   0   0   0   0   0   0   0   0   0 221  81 102  59   0   0   0]
 [  0   0   0 104 150  79 122   0   0   0   0   0   0   0   0   0   0   0   0   0   0 166 120  86  67   0   0   0]
 [  0   0   0 106 132  96  85   0   0   1   0   0   0   1   1   1   1   0   1   0   0 144 114 102  71   0   0   0]
 [  0   0   0  94 146 102  71   0   0   1   0   0   0   0   0   0   0   0   0   0   0 134 124  98  57   0   0   0]
 [  0   0   0  53 142 102  37   0   1   0   0   0   0   0   0   0   0   0   0   0   0  98 128 104  43   0   0   0]]
'''

# Visualize the image
plt.imshow(training_images[index])
plt.show()

 

③ normalization

: all of the pixel values are between 0 and 255. If you are training a neural network, especially for image processing, it will usually learn better for various reasons if you scale all values to between 0 and 1. This process is called normalization, and fortunately in Python it's easy to normalize an array without looping.

# Normalize the pixel values of the train and test images
training_images  = training_images / 255.0
test_images = test_images / 255.0

building a classification model

④ building the classification model

# Build the classification model
model = tf.keras.models.Sequential([tf.keras.layers.Flatten(), 
                                    tf.keras.layers.Dense(128, activation=tf.nn.relu), 
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

 

(1) Sequential: defines a sequence of layers in the neural network.

 

(2) Flatten: Flatten takes the 28x28 pixel matrix and turns it into a 1-dimensional array of 784 values.
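
A quick shape check makes this concrete (a minimal sketch; it assumes tf and the normalized training_images from above):

# Flatten turns each 28x28 image into a flat vector of 784 values
sample = training_images[:1]                   # shape: (1, 28, 28)
flattened = tf.keras.layers.Flatten()(sample)  # shape: (1, 784)
print(sample.shape, '->', flattened.shape)     # (1, 28, 28) -> (1, 784)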

 

(3) Dense: adds a layer of neurons. Each layer of neurons needs an activation function to tell it what to do. There are many options; two are used here:

ReLU passes only values greater than 0 to the next layer in the network; anything negative becomes 0.
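
A one-line demonstration (a minimal sketch using tf.keras.activations.relu):

# Negative inputs are zeroed out; positive inputs pass through unchanged
x = tf.constant([-2.0, -0.5, 0.0, 1.5])
print(tf.keras.activations.relu(x).numpy()) # [0.  0.  0.  1.5]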

Softmax takes a list of values and scales them so that the sum of all elements equals 1. When applied to model outputs, you can think of the scaled values as the probability for each class. For example, in your classification model, which has 10 units in the output dense layer, having the highest value at index = 4 means that the model is most confident that the input clothing image is a coat. If it is at index = 5, then it is a sandal, and so forth. See the short code block below, which demonstrates these concepts.

 

ex) softmax function demonstration

# Declare sample inputs and convert to a tensor
inputs = np.array([[1.0, 3.0, 4.0, 2.0]])
inputs = tf.convert_to_tensor(inputs)
print(f'input to softmax function: {inputs.numpy()}')

# Feed the inputs to a softmax activation function
outputs = tf.keras.activations.softmax(inputs)
print(f'output of softmax function: {outputs.numpy()}')

# Get the sum of all values after the softmax
total = tf.reduce_sum(outputs)
print(f'sum of outputs: {total}')

# Get the index with highest value
prediction = np.argmax(outputs)
print(f'class with highest probability: {prediction}')

'''
input to softmax function: [[1. 3. 4. 2.]]
output of softmax function: [[0.0320586  0.23688282 0.64391426 0.08714432]]
sum of outputs: 1.0
class with highest probability: 2
'''

: index 2, which holds the input value 4, is the largest, so its softmax output, 0.64, again takes the largest share. Because all softmax outputs sum to 1, tf.reduce_sum() confirms the total is 1, and NumPy's argmax() returns the index of the largest element.

 

⑤ compiling & fitting the model to the training data

: actually building the model. You do this by compiling it with an optimizer and loss function as before, and then you train it by calling model.fit(), asking it to fit your training data to your training labels. It will figure out the relationship between the training data and its actual labels, so in the future, if you have inputs that look like the training data, it can predict what the label for that input is.

model.compile(optimizer = tf.optimizers.Adam(),
              loss = 'sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(training_images, training_labels, epochs=5)

'''
Epoch 1/5
1875/1875 [==============================] - 3s 2ms/step - loss: 0.5013 - accuracy: 0.8230
Epoch 2/5
1875/1875 [==============================] - 3s 1ms/step - loss: 0.3780 - accuracy: 0.8628
Epoch 3/5
1875/1875 [==============================] - 3s 1ms/step - loss: 0.3374 - accuracy: 0.8757
Epoch 4/5
1875/1875 [==============================] - 2s 1ms/step - loss: 0.3156 - accuracy: 0.8844
Epoch 5/5
1875/1875 [==============================] - 2s 1ms/step - loss: 0.2969 - accuracy: 0.8895
'''

: once it's done training, you should see an accuracy value at the end of the final epoch. It might look something like 0.8895, which tells you that your neural network is about 89% accurate in classifying the training data. That is, it figured out a pattern match between images and labels that worked 89% of the time. Not great, but not bad considering it was only trained for 5 epochs, which runs quite quickly.

evaluating & predicting

⑥ evaluation

: But how would it work with unseen data? That's why we have the test images and labels. We can call model.evaluate() with this test dataset as inputs and it will report back the loss and accuracy of the model.

# Evaluate the model on unseen data
model.evaluate(test_images, test_labels)

#313/313 [==============================] - 1s 1ms/step - loss: 0.3436 - accuracy: 0.8794
#[0.34361031651496887, 0.8794000148773193]

: you can expect the accuracy here to be about 0.8794, which means it was 87.94% accurate on the entire test set. As expected, it probably would not do as well with unseen data as it did with data it was trained on!

 

⑦ predicting the image at index 915 (with the model we built above)

classifications = model.predict(test_images)

print(classifications)
print(classifications[915])

prediction = np.argmax(classifications[915])
print(f'class with highest probability: {prediction}')

'''
313/313 [==============================] - 0s 1ms/step
[[6.2409208e-06 9.0771358e-08 2.1576885e-05 ... 2.1569857e-02 8.9094639e-05 9.7474480e-01]
 [8.7240205e-06 5.8890087e-11 9.9830294e-01 ... 1.2497652e-17 9.7774171e-09 4.2074684e-17]
 [8.0920754e-06 9.9998975e-01 1.2109047e-07 ... 7.6645480e-18 2.9345939e-09 4.4589631e-15]
 ...
 [2.6315968e-03 6.0284986e-09 1.1105484e-03 ... 1.5767324e-07 9.9230075e-01 8.9485230e-10]
 [4.3416367e-06 9.9987364e-01 7.8685781e-07 ... 1.3830411e-12 1.4922641e-08 2.8705447e-08]
 [1.2723396e-04 5.3855490e-07 2.4892579e-04 ... 2.0233687e-02 3.0123170e-03 4.5927553e-04]]
[6.4266347e-03 1.8270270e-05 9.8238903e-01 4.2550218e-06 8.0206711e-03 3.6529058e-09 3.1218783e-03 1.9612825e-11 1.9301558e-05 7.1107530e-11]
class with highest probability: 2
'''

 

: printing the actual label at index 915 gives 2, so the model correctly predicted the image at index 915 as class 2. Class 2 means index 2, which the label table above identifies as a Pullover, so the prediction is correct.

print(test_labels[915]) #2
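
To turn the predicted index into a human-readable name, you can look it up in the standard fashionMNIST label list (the class_names list below is a helper defined here for illustration, not part of the dataset API):

# Standard fashionMNIST label descriptions, indexed by class number
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
print(class_names[prediction]) # Pullover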

tuning a model

⑧ 1024 neurons (the 512-neuron run is omitted)

: experiment with a different size for the dense layer: 1024 neurons instead of 128

fmnist = tf.keras.datasets.fashion_mnist

(training_images, training_labels), (test_images, test_labels) = fmnist.load_data()

training_images = training_images/255.0
test_images = test_images/255.0

model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(1024, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

model.compile(optimizer = 'adam',
              loss = 'sparse_categorical_crossentropy',
             metrics=['accuracy'])

model.fit(training_images, training_labels, epochs=5)

model.evaluate(test_images, test_labels)
'''
Epoch 1/5
1875/1875 [==============================] - 7s 4ms/step - loss: 0.4718 - accuracy: 0.8307
Epoch 2/5
1875/1875 [==============================] - 7s 4ms/step - loss: 0.3587 - accuracy: 0.8682
Epoch 3/5
1875/1875 [==============================] - 7s 4ms/step - loss: 0.3219 - accuracy: 0.8823
Epoch 4/5
1875/1875 [==============================] - 7s 4ms/step - loss: 0.2964 - accuracy: 0.8897
Epoch 5/5
1875/1875 [==============================] - 7s 4ms/step - loss: 0.2779 - accuracy: 0.8960
313/313 [==============================] - 1s 2ms/step - loss: 0.3408 - accuracy: 0.8801
'''

: the accuracy is 0.8801, a slight increase over the earlier 0.8794.

 

⑨ removing the Flatten layer / changing the number of output nodes

: training only works when Flatten has reshaped the data into the shape the neural network expects, so removing it raises an error. Likewise, the network must produce one output per class, so changing the number of output nodes raises an error as well. A rough sketch of the first failure is shown below.
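
As an illustration (a minimal sketch, assuming the normalized data from above): without Flatten, each Dense layer operates on the last axis of the (batch, 28, 28) input, so the model outputs shape (batch, 28, 10), which no longer matches the (batch,) labels, and fitting typically fails with a shape error:

bad_model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(128, activation=tf.nn.relu),   # no Flatten in front
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)  # outputs (batch, 28, 10)
])
bad_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
try:
    bad_model.fit(training_images, training_labels, epochs=1)
except Exception as e:
    print(type(e).__name__, e)  # shape mismatch between outputs and labels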

 

⑩ adding another layer between the one with 512 and the final layer with 10

fmnist = tf.keras.datasets.fashion_mnist

(training_images, training_labels), (test_images, test_labels) = fmnist.load_data()

training_images = training_images/255.0
test_images = test_images/255.0

model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(512, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(256, activation=tf.nn.relu), #added a layer
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
                                  ])

model.compile(optimizer = 'adam',
              loss = 'sparse_categorical_crossentropy', metrics=['accuracy'])

model.fit(training_images, training_labels, epochs=5)

model.evaluate(test_images, test_labels)
'''
Epoch 1/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.4679 - accuracy: 0.8304
Epoch 2/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.3581 - accuracy: 0.8685
Epoch 3/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.3174 - accuracy: 0.8830
Epoch 4/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.2970 - accuracy: 0.8889
Epoch 5/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.2783 - accuracy: 0.8962
313/313 [==============================] - 1s 2ms/step - loss: 0.3441 - accuracy: 0.8771
'''

: adding one more hidden layer in the middle gives an accuracy of 0.8771, no significant change.

 

⑪ more or fewer epochs

→ 15 epochs) 0.8801, no significant change.

#15 epochs
fmnist = tf.keras.datasets.fashion_mnist

(training_images, training_labels), (test_images, test_labels) = fmnist.load_data()

training_images = training_images/255.0
test_images = test_images/255.0

model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(128, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

model.compile(optimizer = 'adam',
              loss = 'sparse_categorical_crossentropy', metrics=['accuracy'])

model.fit(training_images, training_labels, epochs=15) # Experiment with the number of epochs

model.evaluate(test_images, test_labels)

'''
Epoch 1/15
1875/1875 [==============================] - 3s 2ms/step - loss: 0.4965 - accuracy: 0.8239
Epoch 2/15
1875/1875 [==============================] - 3s 1ms/step - loss: 0.3757 - accuracy: 0.8649
Epoch 3/15
1875/1875 [==============================] - 2s 1ms/step - loss: 0.3378 - accuracy: 0.8777
Epoch 4/15
1875/1875 [==============================] - 2s 1ms/step - loss: 0.3154 - accuracy: 0.8850
Epoch 5/15
1875/1875 [==============================] - 2s 1ms/step - loss: 0.2956 - accuracy: 0.8909
Epoch 6/15
1875/1875 [==============================] - 2s 1ms/step - loss: 0.2813 - accuracy: 0.8957
Epoch 7/15
1875/1875 [==============================] - 2s 1ms/step - loss: 0.2677 - accuracy: 0.8999
Epoch 8/15
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2569 - accuracy: 0.9051
Epoch 9/15
1875/1875 [==============================] - 2s 1ms/step - loss: 0.2470 - accuracy: 0.9079
Epoch 10/15
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2401 - accuracy: 0.9116
Epoch 11/15
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2324 - accuracy: 0.9133
Epoch 12/15
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2243 - accuracy: 0.9163
Epoch 13/15
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2170 - accuracy: 0.9189
Epoch 14/15
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2111 - accuracy: 0.9222
Epoch 15/15
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2054 - accuracy: 0.9239
313/313 [==============================] - 0s 1ms/step - loss: 0.3568 - accuracy: 0.8801
'''

 

→ 30 epochs) 0.8853, the highest accuracy measured so far.

#30 epochs
fmnist = tf.keras.datasets.fashion_mnist

(training_images, training_labels), (test_images, test_labels) = fmnist.load_data()

training_images = training_images/255.0
test_images = test_images/255.0

model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(128, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

model.compile(optimizer = 'adam',
              loss = 'sparse_categorical_crossentropy', metrics=['accuracy'])

model.fit(training_images, training_labels, epochs=30) # Experiment with the number of epochs

model.evaluate(test_images, test_labels)

'''
Epoch 1/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.5035 - accuracy: 0.8238
Epoch 2/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.3792 - accuracy: 0.8635
Epoch 3/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.3402 - accuracy: 0.8756
Epoch 4/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.3129 - accuracy: 0.8851
Epoch 5/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2930 - accuracy: 0.8914
Epoch 6/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2790 - accuracy: 0.8971
Epoch 7/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.2679 - accuracy: 0.9003
Epoch 8/30
1875/1875 [==============================] - 4s 2ms/step - loss: 0.2568 - accuracy: 0.9043
Epoch 9/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.2480 - accuracy: 0.9075
Epoch 10/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2375 - accuracy: 0.9110
Epoch 11/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.2300 - accuracy: 0.9143
Epoch 12/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2231 - accuracy: 0.9155
Epoch 13/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.2172 - accuracy: 0.9201
Epoch 14/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.2094 - accuracy: 0.9205
Epoch 15/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.2036 - accuracy: 0.9244
Epoch 16/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1980 - accuracy: 0.9265
Epoch 17/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1929 - accuracy: 0.9275
Epoch 18/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.1867 - accuracy: 0.9301
Epoch 19/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.1830 - accuracy: 0.9319
Epoch 20/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.1767 - accuracy: 0.9337
Epoch 21/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1741 - accuracy: 0.9344
Epoch 22/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1680 - accuracy: 0.9373
Epoch 23/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1667 - accuracy: 0.9375
Epoch 24/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.1616 - accuracy: 0.9383
Epoch 25/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1572 - accuracy: 0.9409
Epoch 26/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1561 - accuracy: 0.9405
Epoch 27/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1505 - accuracy: 0.9425
Epoch 28/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1490 - accuracy: 0.9432
Epoch 29/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1453 - accuracy: 0.9443
Epoch 30/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1415 - accuracy: 0.9460
313/313 [==============================] - 1s 1ms/step - loss: 0.4024 - accuracy: 0.8853
'''

 

⑫ without normalization

: running the model without the normalization step yields an accuracy of 0.8366, far lower than with normalization. The reasons:

 

(1) slower convergence: neural networks often converge faster when input data is normalized, because having features on similar scales helps the optimization algorithm find the minimum more efficiently. Skipping normalization slows the optimizer down.

 

(2) weight sensitivity: neural networks can become sensitive to the scale of input features. Without normalization, some weights may become much larger or smaller than others, potentially leading to numerical instability, and features on different scales are not reflected fairly in the model's predictions.

 

(3) model generalization: Normalization can help improve the generalization of the model to new, unseen data. If the model is trained without normalization, it might not perform as well on data outside the training set.

fmnist = tf.keras.datasets.fashion_mnist

(training_images, training_labels), (test_images, test_labels) = fmnist.load_data()

# training_images=training_images/255.0 # Experiment with removing this line
# test_images=test_images/255.0 # Experiment with removing this line
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(training_images, training_labels, epochs=5)
model.evaluate(test_images, test_labels)

'''
Epoch 1/5
1875/1875 [==============================] - 5s 3ms/step - loss: 4.6223 - accuracy: 0.7620
Epoch 2/5
1875/1875 [==============================] - 5s 3ms/step - loss: 0.5465 - accuracy: 0.8155
Epoch 3/5
1875/1875 [==============================] - 5s 3ms/step - loss: 0.5117 - accuracy: 0.8235
Epoch 4/5
1875/1875 [==============================] - 5s 3ms/step - loss: 0.5009 - accuracy: 0.8284
Epoch 5/5
1875/1875 [==============================] - 5s 3ms/step - loss: 0.4882 - accuracy: 0.8352
313/313 [==============================] - 1s 2ms/step - loss: 0.5133 - accuracy: 0.8366
'''

 

⑬ callback

: when the desired accuracy is reached mid-training and further training is judged unnecessary, stop the training early and move on to prediction

ex) in the earlier 30-epoch setup, stop immediately and proceed to prediction once the accuracy mid-training reaches 0.92 or higher. As a result, only 12 of the 30 epochs were run, and the resulting accuracy of 0.8927 is higher than the 0.8853 obtained from the full 30 epochs.

class myCallback(tf.keras.callbacks.Callback):
  def on_epoch_end(self, epoch, logs=None):
    # Stop training once the training accuracy reaches the target
    if logs and logs.get('accuracy', 0) >= 0.92: # Experiment with changing this value
      print("\nReached 92% accuracy so cancelling training!")
      self.model.stop_training = True

callbacks = myCallback()

fmnist = tf.keras.datasets.fashion_mnist
(training_images, training_labels), (test_images, test_labels) = fmnist.load_data()

training_images=training_images/255.0
test_images=test_images/255.0
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(training_images, training_labels, epochs=30, callbacks=[callbacks])
model.evaluate(test_images, test_labels)

'''
Epoch 1/30
1875/1875 [==============================] - 5s 2ms/step - loss: 0.4728 - accuracy: 0.8321
Epoch 2/30
1875/1875 [==============================] - 5s 2ms/step - loss: 0.3611 - accuracy: 0.8662
Epoch 3/30
1875/1875 [==============================] - 5s 2ms/step - loss: 0.3243 - accuracy: 0.8799
Epoch 4/30
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2987 - accuracy: 0.8894
Epoch 5/30
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2807 - accuracy: 0.8955
Epoch 6/30
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2644 - accuracy: 0.9016
Epoch 7/30
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2519 - accuracy: 0.9053
Epoch 8/30
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2413 - accuracy: 0.9112
Epoch 9/30
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2297 - accuracy: 0.9141
Epoch 10/30
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2232 - accuracy: 0.9171
Epoch 11/30
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2118 - accuracy: 0.9198
Epoch 12/30
1873/1875 [============================>.] - ETA: 0s - loss: 0.2030 - accuracy: 0.9231
Reached 92% accuracy so cancelling training!
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2030 - accuracy: 0.9231
313/313 [==============================] - 1s 2ms/step - loss: 0.3272 - accuracy: 0.8927
'''

accuracy analysis

⑭ summary of the fashionMNIST classifier's test-image accuracy

※ apart from the stated change, all other settings are identical across runs ※

test accuracy collected from the runs above:

5 epochs, Dense(128): 0.8794
5 epochs, Dense(512): 0.8747
5 epochs, Dense(1024): 0.8801
5 epochs, Dense(512) + Dense(256): 0.8771
15 epochs, Dense(128): 0.8801
30 epochs, Dense(128): 0.8853
5 epochs, Dense(512), no normalization: 0.8366
30 epochs, Dense(512), callback (stopped at epoch 12): 0.8927

 

★ the runs with epochs set to 15 or 30 perform better than the run with 5 epochs.

★ skipping normalization makes performance drop dramatically.

★ adding a hidden layer under otherwise identical conditions increased performance slightly, from 0.8747 to 0.8771.

★ with a callback set to stop training once accuracy reaches 0.92, only 12 epochs were run, and this produced the highest performance of all the experiments. Compared with the sixth run above, which used the same settings but ran all 30 epochs without the callback, the run stopped at 12 epochs by the callback generalized better. From this we can infer that the full 30-epoch run may have overfit.


Coursera <Intro to Tensorflow for AI, ML, and DL>

 
