
👔 fashionMNIST Classifier

metamong 2024. 1. 15.

before building a classification model

① check the version & load the training and test split of the fashionMNIST dataset

: the fashionMNIST dataset is a collection of grayscale 28x28-pixel clothing images. Each image is associated with a label, as shown in the table below

(left) labels with descriptions / (right) image at index 915

0: T-shirt/top
1: Trouser
2: Pullover
3: Dress
4: Coat
5: Sandal
6: Shirt
7: Sneaker
8: Bag
9: Ankle boot

import tensorflow as tf

print(tf.__version__) #2.15.0

# Load the Fashion MNIST dataset
fmnist = tf.keras.datasets.fashion_mnist

# Load the training and test split of the Fashion MNIST dataset
(training_images, training_labels), (test_images, test_labels) = fmnist.load_data()

#60_000 training images
#10_000 testing images

: the data for a particular image is a 28x28 grid of integer grayscale pixel values from 0 to 255.

 

※ why is the dataset split into two parts, training & testing? A: The idea is to have one set of data for training, and another set of data that the model hasn't yet seen. The latter is used to evaluate how good the model is at classifying new values.

 

② print a training image (both as an image & as a numpy array) and its training label (e.g., index 915)

import numpy as np
import matplotlib.pyplot as plt

# You can put any index between 0 and 59999 here
index = 915

# Set number of characters per row when printing
np.set_printoptions(linewidth=320)

# Print the label and image
print(f'LABEL: {training_labels[index]}') #LABEL: 2
print(f'\nIMAGE PIXEL ARRAY:\n {training_images[index]}')
'''
IMAGE PIXEL ARRAY:
 [[  0   0   0   0   0   0   0   0   0  41 140  98  37  33  33  65 138  59   5   0   0   0   0   1   0   0   0   0]
 [  0   0   0   1   0   0  35 116 156 168 148 152 154 134 132 120  88 118 150 136  94  19   0   0   0   0   0   0]
 [  0   0   0   0   0  53 152 134 132 122 136 114  94  92  81  85  98  98 100 116 110 150  21   0   1   0   0   0]
 [  0   0   0   0   0 168 132 134 114 124 122 118 104 106  98  98  92  94 106  92  75 106  83   0   0   0   0   0]
 [  0   0   0   0   0 181 134 136 114 116 108 104  96 100  94  94  92  94 104  92  86  90 118   0   0   0   0   0]
 [  0   0   0   0  19 187 158 126 130 104 112 100 100  94  94  98  88  94  96  83  86  88 126   7   0   0   0   0]
 [  0   0   0   0  53 158 179 116 140  98 108 100 100  94  94  96  90  90  88  77  85  75  94  15   0   0   0   0]
 [  0   0   0   0  90 138 183 142 142 106  94  90  92  92  92  96  88  86  81  67  83  81  98  43   0   0   0   0]
 [  0   0   0   0 102 124 189 175 136 100  85  86  88  90  92  92  88  81  77  65  98  90  96  59   0   0   0   0]
 [  0   0   0   0 106 120 179 197 124  90  88  86  88  85  85  86  88  81  71  63  94  88  96  61   0   0   0   0]
 [  0   0   0   0 110 118 177 223  92  86  96  88  90  86  83  85  92  86  69  59 118  88  94  73   0   0   0   0]
 [  0   0   0   0 116 114 191 243  63  94  98  88  86  88  83  85  98  90  77  59 156  85  90  85   0   0   0   0]
 [  0   0   0   0 122 110 213 255  47  96 104  85  81  86  81  83  90  98  71  59 177  81  96  86   0   0   0   0]
 [  0   0   0   0 126 102 235 245  45 104 106  86  83  83  79  81  92 106  73  63 156  96 104  96   0   0   0   0]
 [  0   0   0   5 128  94 251 233  47 102 110  86  85  81  75  79  88 104  79  63 122 124  81 112   0   0   0   0]
 [  0   0   0  21 138  85 255 225  51 104 110  90  85  79  73  83  92 112  86  37 140 106  75 120   5   0   0   0]
 [  0   0   0  25 154  75 247 225  61 104 112  92  83  77  75  83  92 114  81  49 158  96  94  98  13   0   0   0]
 [  0   0   0  35 156  75 213 223  81 108 122  98  86  81  75  83  90 116  86  53 130 108 104  92  25   0   0   0]
 [  0   0   0  49 142  77 199 219  96 110 124  94  81  73  67  75  88 112  90  47 130 100 104  98  39   0   0   0]
 [  0   0   0  67 130  86 201 217  81  92 106  88  75  73  71  77  75  94  94  53 136  98  86  98  47   0   0   0]
 [  0   0   0  81 136  81 207 217  85 108 112  98  86  88  86  98  98 104 110  67 170 118  65 108  63   0   0   0]
 [  0   0   0  71 154  63 197 199 104 122 118 106  96  90  85  96 110 104 102  73 128 130  73 108  69   0   0   0]
 [  0   0   0  90 154  49 185 177 130 118 120 112 104 102  92 100 118 112 108  83 124 154  71 116  53   0   0   0]
 [  0   0   0 102 150  55 175   0   0   0   0   0   0   0   0   0   0   0   0   0   0 221  81 102  59   0   0   0]
 [  0   0   0 104 150  79 122   0   0   0   0   0   0   0   0   0   0   0   0   0   0 166 120  86  67   0   0   0]
 [  0   0   0 106 132  96  85   0   0   1   0   0   0   1   1   1   1   0   1   0   0 144 114 102  71   0   0   0]
 [  0   0   0  94 146 102  71   0   0   1   0   0   0   0   0   0   0   0   0   0   0 134 124  98  57   0   0   0]
 [  0   0   0  53 142 102  37   0   1   0   0   0   0   0   0   0   0   0   0   0   0  98 128 104  43   0   0   0]]
'''

# Visualize the image
plt.imshow(training_images[index])
plt.show()

 

③ normalization

: all of the pixel values are between 0 and 255. If you are training a neural network, especially for image processing, it will usually learn better for various reasons if you scale all values to between 0 and 1. This process is called normalization, and fortunately in Python it's easy to normalize an array without looping.

# Normalize the pixel values of the train and test images
training_images  = training_images / 255.0
test_images = test_images / 255.0

building a classification model

④ building the classification model

# Build the classification model
model = tf.keras.models.Sequential([tf.keras.layers.Flatten(), 
                                    tf.keras.layers.Dense(128, activation=tf.nn.relu), 
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

 

(1) Sequential: defines a sequence of layers in the neural network.

 

(2) Flatten: Flatten takes the 28x28 pixel matrix and turns it into a 1-dimensional array of 784 values.
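
A quick shape check makes this concrete (a minimal sketch; it assumes tf and the normalized training_images from above):

# Flatten turns each 28x28 image into a flat vector of 784 values
sample = training_images[:1]                   # shape: (1, 28, 28)
flattened = tf.keras.layers.Flatten()(sample)  # shape: (1, 784)
print(sample.shape, '->', flattened.shape)     # (1, 28, 28) -> (1, 784)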

 

(3) Dense: adds a layer of neurons. Each layer of neurons needs an activation function to tell it what to do. There are many options; two are used here:

ReLU passes only values greater than 0 to the next layer in the network; anything negative becomes 0.
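
A one-line demonstration (a minimal sketch using tf.keras.activations.relu):

# Negative inputs are zeroed out; positive inputs pass through unchanged
x = tf.constant([-2.0, -0.5, 0.0, 1.5])
print(tf.keras.activations.relu(x).numpy()) # [0.  0.  0.  1.5]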

Softmax takes a list of values and scales them so that the sum of all elements equals 1. When applied to model outputs, you can think of the scaled values as the probability for each class. For example, in your classification model, which has 10 units in the output dense layer, having the highest value at index = 4 means that the model is most confident that the input clothing image is a coat. If it is at index = 5, then it is a sandal, and so forth. See the short code block below, which demonstrates these concepts.

 

ex) softmax function demonstration

# Declare sample inputs and convert to a tensor
inputs = np.array([[1.0, 3.0, 4.0, 2.0]])
inputs = tf.convert_to_tensor(inputs)
print(f'input to softmax function: {inputs.numpy()}')

# Feed the inputs to a softmax activation function
outputs = tf.keras.activations.softmax(inputs)
print(f'output of softmax function: {outputs.numpy()}')

# Get the sum of all values after the softmax
total = tf.reduce_sum(outputs)
print(f'sum of outputs: {total}')

# Get the index with highest value
prediction = np.argmax(outputs)
print(f'class with highest probability: {prediction}')

'''
input to softmax function: [[1. 3. 4. 2.]]
output of softmax function: [[0.0320586  0.23688282 0.64391426 0.08714432]]
sum of outputs: 1.0
class with highest probability: 2
'''

: index 2, which holds the input value 4, is the largest, so its softmax output, 0.64, again takes the largest share. Because all softmax outputs sum to 1, tf.reduce_sum() confirms the total is 1, and NumPy's argmax() returns the index of the largest element.

 

⑤ compiling & fitting the model to the training data

: actually building the model. You do this by compiling it with an optimizer and loss function as before, and then you train it by calling model.fit(), asking it to fit your training data to your training labels. It will figure out the relationship between the training data and its actual labels, so in the future, if you have inputs that look like the training data, it can predict what the label for that input is.

model.compile(optimizer = tf.optimizers.Adam(),
              loss = 'sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(training_images, training_labels, epochs=5)

'''
Epoch 1/5
1875/1875 [==============================] - 3s 2ms/step - loss: 0.5013 - accuracy: 0.8230
Epoch 2/5
1875/1875 [==============================] - 3s 1ms/step - loss: 0.3780 - accuracy: 0.8628
Epoch 3/5
1875/1875 [==============================] - 3s 1ms/step - loss: 0.3374 - accuracy: 0.8757
Epoch 4/5
1875/1875 [==============================] - 2s 1ms/step - loss: 0.3156 - accuracy: 0.8844
Epoch 5/5
1875/1875 [==============================] - 2s 1ms/step - loss: 0.2969 - accuracy: 0.8895
'''

: once it's done training, you should see an accuracy value at the end of the final epoch. It might look something like 0.8895, which tells you that your neural network is about 89% accurate in classifying the training data. That is, it figured out a pattern match between images and labels that worked 89% of the time. Not great, but not bad considering it was only trained for 5 epochs, which runs quite quickly.

evaluating & predicting

⑥ evaluation

: But how would it work with unseen data? That's why we have the test images and labels. We can call model.evaluate() with this test dataset as inputs and it will report back the loss and accuracy of the model.

# Evaluate the model on unseen data
model.evaluate(test_images, test_labels)

#313/313 [==============================] - 1s 1ms/step - loss: 0.3436 - accuracy: 0.8794
#[0.34361031651496887, 0.8794000148773193]

: you can expect the accuracy here to be about 0.8794, which means it was 87.94% accurate on the entire test set. As expected, it probably would not do as well with unseen data as it did with data it was trained on!

 

⑦ predicting the image at index 915 (with the model we built above)

classifications = model.predict(test_images)

print(classifications)
print(classifications[915])

prediction = np.argmax(classifications[915])
print(f'class with highest probability: {prediction}')

'''
313/313 [==============================] - 0s 1ms/step
[[6.2409208e-06 9.0771358e-08 2.1576885e-05 ... 2.1569857e-02 8.9094639e-05 9.7474480e-01]
 [8.7240205e-06 5.8890087e-11 9.9830294e-01 ... 1.2497652e-17 9.7774171e-09 4.2074684e-17]
 [8.0920754e-06 9.9998975e-01 1.2109047e-07 ... 7.6645480e-18 2.9345939e-09 4.4589631e-15]
 ...
 [2.6315968e-03 6.0284986e-09 1.1105484e-03 ... 1.5767324e-07 9.9230075e-01 8.9485230e-10]
 [4.3416367e-06 9.9987364e-01 7.8685781e-07 ... 1.3830411e-12 1.4922641e-08 2.8705447e-08]
 [1.2723396e-04 5.3855490e-07 2.4892579e-04 ... 2.0233687e-02 3.0123170e-03 4.5927553e-04]]
[6.4266347e-03 1.8270270e-05 9.8238903e-01 4.2550218e-06 8.0206711e-03 3.6529058e-09 3.1218783e-03 1.9612825e-11 1.9301558e-05 7.1107530e-11]
class with highest probability: 2
'''

 

: printing the actual label at index 915 gives 2, so the model correctly predicted the image at index 915 as class 2. Class 2 means index 2, which the label table above identifies as a Pullover, so the prediction is correct.

print(test_labels[915]) #2
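
To turn the predicted index into a human-readable name, you can look it up in the standard fashionMNIST label list (the class_names list below is a helper defined here for illustration, not part of the dataset API):

# Standard fashionMNIST label descriptions, indexed by class number
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
print(class_names[prediction]) # Pullover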

tuning a model

⑧ 1024 neurons (the 512-neuron run is omitted)

: experiment with a different size for the dense layer: 1024 neurons instead of 128

fmnist = tf.keras.datasets.fashion_mnist

(training_images, training_labels), (test_images, test_labels) = fmnist.load_data()

training_images = training_images/255.0
test_images = test_images/255.0

model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(1024, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

model.compile(optimizer = 'adam',
              loss = 'sparse_categorical_crossentropy',
             metrics=['accuracy'])

model.fit(training_images, training_labels, epochs=5)

model.evaluate(test_images, test_labels)
'''
Epoch 1/5
1875/1875 [==============================] - 7s 4ms/step - loss: 0.4718 - accuracy: 0.8307
Epoch 2/5
1875/1875 [==============================] - 7s 4ms/step - loss: 0.3587 - accuracy: 0.8682
Epoch 3/5
1875/1875 [==============================] - 7s 4ms/step - loss: 0.3219 - accuracy: 0.8823
Epoch 4/5
1875/1875 [==============================] - 7s 4ms/step - loss: 0.2964 - accuracy: 0.8897
Epoch 5/5
1875/1875 [==============================] - 7s 4ms/step - loss: 0.2779 - accuracy: 0.8960
313/313 [==============================] - 1s 2ms/step - loss: 0.3408 - accuracy: 0.8801
'''

: the accuracy is 0.8801, a slight increase over the earlier 0.8794.

 

⑨ removing the Flatten layer / changing the number of output nodes

: training only works when Flatten has reshaped the data into the shape the neural network expects, so removing it raises an error. Likewise, the network must produce one output per class, so changing the number of output nodes raises an error as well. A rough sketch of the first failure is shown below.
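
As an illustration (a minimal sketch, assuming the normalized data from above): without Flatten, each Dense layer operates on the last axis of the (batch, 28, 28) input, so the model outputs shape (batch, 28, 10), which no longer matches the (batch,) labels, and fitting typically fails with a shape error:

bad_model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(128, activation=tf.nn.relu),   # no Flatten in front
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)  # outputs (batch, 28, 10)
])
bad_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
try:
    bad_model.fit(training_images, training_labels, epochs=1)
except Exception as e:
    print(type(e).__name__, e)  # shape mismatch between outputs and labels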

 

⑩ adding another layer between the one with 512 and the final layer with 10

fmnist = tf.keras.datasets.fashion_mnist

(training_images, training_labels), (test_images, test_labels) = fmnist.load_data()

training_images = training_images/255.0
test_images = test_images/255.0

model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(512, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(256, activation=tf.nn.relu), #added a layer
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
                                  ])

model.compile(optimizer = 'adam',
              loss = 'sparse_categorical_crossentropy', metrics=['accuracy'])

model.fit(training_images, training_labels, epochs=5)

model.evaluate(test_images, test_labels)
'''
Epoch 1/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.4679 - accuracy: 0.8304
Epoch 2/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.3581 - accuracy: 0.8685
Epoch 3/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.3174 - accuracy: 0.8830
Epoch 4/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.2970 - accuracy: 0.8889
Epoch 5/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.2783 - accuracy: 0.8962
313/313 [==============================] - 1s 2ms/step - loss: 0.3441 - accuracy: 0.8771
'''

: adding one more hidden layer in the middle gives an accuracy of 0.8771, no significant change.

 

⑪ more or fewer epochs

→ 15 epochs) 0.8801, no significant change.

#15 epochs
fmnist = tf.keras.datasets.fashion_mnist

(training_images, training_labels), (test_images, test_labels) = fmnist.load_data()

training_images = training_images/255.0
test_images = test_images/255.0

model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(128, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

model.compile(optimizer = 'adam',
              loss = 'sparse_categorical_crossentropy', metrics=['accuracy'])

model.fit(training_images, training_labels, epochs=15) # Experiment with the number of epochs

model.evaluate(test_images, test_labels)

'''
Epoch 1/15
1875/1875 [==============================] - 3s 2ms/step - loss: 0.4965 - accuracy: 0.8239
Epoch 2/15
1875/1875 [==============================] - 3s 1ms/step - loss: 0.3757 - accuracy: 0.8649
Epoch 3/15
1875/1875 [==============================] - 2s 1ms/step - loss: 0.3378 - accuracy: 0.8777
Epoch 4/15
1875/1875 [==============================] - 2s 1ms/step - loss: 0.3154 - accuracy: 0.8850
Epoch 5/15
1875/1875 [==============================] - 2s 1ms/step - loss: 0.2956 - accuracy: 0.8909
Epoch 6/15
1875/1875 [==============================] - 2s 1ms/step - loss: 0.2813 - accuracy: 0.8957
Epoch 7/15
1875/1875 [==============================] - 2s 1ms/step - loss: 0.2677 - accuracy: 0.8999
Epoch 8/15
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2569 - accuracy: 0.9051
Epoch 9/15
1875/1875 [==============================] - 2s 1ms/step - loss: 0.2470 - accuracy: 0.9079
Epoch 10/15
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2401 - accuracy: 0.9116
Epoch 11/15
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2324 - accuracy: 0.9133
Epoch 12/15
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2243 - accuracy: 0.9163
Epoch 13/15
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2170 - accuracy: 0.9189
Epoch 14/15
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2111 - accuracy: 0.9222
Epoch 15/15
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2054 - accuracy: 0.9239
313/313 [==============================] - 0s 1ms/step - loss: 0.3568 - accuracy: 0.8801
'''

 

→ 30 epochs) 0.8853, the highest accuracy measured so far.

#30 epochs
fmnist = tf.keras.datasets.fashion_mnist

(training_images, training_labels), (test_images, test_labels) = fmnist.load_data()

training_images = training_images/255.0
test_images = test_images/255.0

model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(128, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

model.compile(optimizer = 'adam',
              loss = 'sparse_categorical_crossentropy', metrics=['accuracy'])

model.fit(training_images, training_labels, epochs=30) # Experiment with the number of epochs

model.evaluate(test_images, test_labels)

'''
Epoch 1/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.5035 - accuracy: 0.8238
Epoch 2/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.3792 - accuracy: 0.8635
Epoch 3/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.3402 - accuracy: 0.8756
Epoch 4/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.3129 - accuracy: 0.8851
Epoch 5/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2930 - accuracy: 0.8914
Epoch 6/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2790 - accuracy: 0.8971
Epoch 7/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.2679 - accuracy: 0.9003
Epoch 8/30
1875/1875 [==============================] - 4s 2ms/step - loss: 0.2568 - accuracy: 0.9043
Epoch 9/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.2480 - accuracy: 0.9075
Epoch 10/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2375 - accuracy: 0.9110
Epoch 11/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.2300 - accuracy: 0.9143
Epoch 12/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2231 - accuracy: 0.9155
Epoch 13/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.2172 - accuracy: 0.9201
Epoch 14/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.2094 - accuracy: 0.9205
Epoch 15/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.2036 - accuracy: 0.9244
Epoch 16/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1980 - accuracy: 0.9265
Epoch 17/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1929 - accuracy: 0.9275
Epoch 18/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.1867 - accuracy: 0.9301
Epoch 19/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.1830 - accuracy: 0.9319
Epoch 20/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.1767 - accuracy: 0.9337
Epoch 21/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1741 - accuracy: 0.9344
Epoch 22/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1680 - accuracy: 0.9373
Epoch 23/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1667 - accuracy: 0.9375
Epoch 24/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.1616 - accuracy: 0.9383
Epoch 25/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1572 - accuracy: 0.9409
Epoch 26/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1561 - accuracy: 0.9405
Epoch 27/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1505 - accuracy: 0.9425
Epoch 28/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1490 - accuracy: 0.9432
Epoch 29/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1453 - accuracy: 0.9443
Epoch 30/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1415 - accuracy: 0.9460
313/313 [==============================] - 1s 1ms/step - loss: 0.4024 - accuracy: 0.8853
'''

 

⑫ without normalization

: running the model without the normalization step yields an accuracy of 0.8366, far lower than with normalization. The reasons:

 

(1) slower convergence: neural networks often converge faster when input data is normalized, because having features on similar scales helps the optimization algorithm find the minimum more efficiently. Skipping normalization slows the optimizer down.

 

(2) weight sensitivity: neural networks can become sensitive to the scale of input features. Without normalization, some weights may become much larger or smaller than others, potentially leading to numerical instability, and features on different scales are not reflected fairly in the model's predictions.

 

(3) model generalization: Normalization can help improve the generalization of the model to new, unseen data. If the model is trained without normalization, it might not perform as well on data outside the training set.

fmnist = tf.keras.datasets.fashion_mnist

(training_images, training_labels), (test_images, test_labels) = fmnist.load_data()

# training_images=training_images/255.0 # Experiment with removing this line
# test_images=test_images/255.0 # Experiment with removing this line
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(training_images, training_labels, epochs=5)
model.evaluate(test_images, test_labels)

'''
Epoch 1/5
1875/1875 [==============================] - 5s 3ms/step - loss: 4.6223 - accuracy: 0.7620
Epoch 2/5
1875/1875 [==============================] - 5s 3ms/step - loss: 0.5465 - accuracy: 0.8155
Epoch 3/5
1875/1875 [==============================] - 5s 3ms/step - loss: 0.5117 - accuracy: 0.8235
Epoch 4/5
1875/1875 [==============================] - 5s 3ms/step - loss: 0.5009 - accuracy: 0.8284
Epoch 5/5
1875/1875 [==============================] - 5s 3ms/step - loss: 0.4882 - accuracy: 0.8352
313/313 [==============================] - 1s 2ms/step - loss: 0.5133 - accuracy: 0.8366
'''

 

⑬ callback

: when the desired accuracy is reached mid-training and further training is judged unnecessary, stop the training early and move on to prediction

ex) in the earlier 30-epoch setup, stop immediately and proceed to prediction once the accuracy mid-training reaches 0.92 or higher. As a result, only 12 of the 30 epochs were run, and the resulting accuracy of 0.8927 is higher than the 0.8853 obtained from the full 30 epochs.

class myCallback(tf.keras.callbacks.Callback):
  def on_epoch_end(self, epoch, logs=None):
    # Stop training once the training accuracy reaches the target
    if logs and logs.get('accuracy', 0) >= 0.92: # Experiment with changing this value
      print("\nReached 92% accuracy so cancelling training!")
      self.model.stop_training = True

callbacks = myCallback()

fmnist = tf.keras.datasets.fashion_mnist
(training_images, training_labels), (test_images, test_labels) = fmnist.load_data()

training_images=training_images/255.0
test_images=test_images/255.0
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(training_images, training_labels, epochs=30, callbacks=[callbacks])
model.evaluate(test_images, test_labels)

'''
Epoch 1/30
1875/1875 [==============================] - 5s 2ms/step - loss: 0.4728 - accuracy: 0.8321
Epoch 2/30
1875/1875 [==============================] - 5s 2ms/step - loss: 0.3611 - accuracy: 0.8662
Epoch 3/30
1875/1875 [==============================] - 5s 2ms/step - loss: 0.3243 - accuracy: 0.8799
Epoch 4/30
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2987 - accuracy: 0.8894
Epoch 5/30
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2807 - accuracy: 0.8955
Epoch 6/30
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2644 - accuracy: 0.9016
Epoch 7/30
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2519 - accuracy: 0.9053
Epoch 8/30
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2413 - accuracy: 0.9112
Epoch 9/30
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2297 - accuracy: 0.9141
Epoch 10/30
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2232 - accuracy: 0.9171
Epoch 11/30
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2118 - accuracy: 0.9198
Epoch 12/30
1873/1875 [============================>.] - ETA: 0s - loss: 0.2030 - accuracy: 0.9231
Reached 92% accuracy so cancelling training!
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2030 - accuracy: 0.9231
313/313 [==============================] - 1s 2ms/step - loss: 0.3272 - accuracy: 0.8927
'''

accuracy analysis

⑭ summary of the fashionMNIST classifier's test-image accuracy

※ apart from the stated change, all other settings are identical across runs ※

test accuracy collected from the runs above:

5 epochs, Dense(128): 0.8794
5 epochs, Dense(512): 0.8747
5 epochs, Dense(1024): 0.8801
5 epochs, Dense(512) + Dense(256): 0.8771
15 epochs, Dense(128): 0.8801
30 epochs, Dense(128): 0.8853
5 epochs, Dense(512), no normalization: 0.8366
30 epochs, Dense(512), callback (stopped at epoch 12): 0.8927

 

★ the runs with epochs set to 15 or 30 perform better than the run with 5 epochs.

★ skipping normalization makes performance drop dramatically.

★ adding a hidden layer under otherwise identical conditions increased performance slightly, from 0.8747 to 0.8771.

★ with a callback set to stop training once accuracy reaches 0.92, only 12 epochs were run, and this produced the highest performance of all the experiments. Compared with the sixth run above, which used the same settings but ran all 30 epochs without the callback, the run stopped at 12 epochs by the callback generalized better. From this we can infer that the full 30-epoch run may have overfit.


Coursera <Intro to Tensorflow for AI, ML, and DL>

 
