before building a classification model
① check the version & load the training/test split of the fashionMNIST dataset
: The fashionMNIST dataset is a collection of grayscale 28x28-pixel clothing images. Each image is associated with a label from 0 to 9, as shown in this table.
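The label table itself is not reproduced in this text. As a stand-in, the sketch below lists the ten class names in label order, as given in the Fashion MNIST dataset documentation. `CLASS_NAMES` and `label_to_name` are illustrative names, not part of the course code.

```python
# Fashion MNIST class names, ordered so that the list index equals the label.
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def label_to_name(label: int) -> str:
    """Return the clothing class name for an integer label (0-9)."""
    return CLASS_NAMES[label]


print(label_to_name(2))  # Pullover
```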
import tensorflow as tf
print(tf.__version__) #2.15.0
# Load the Fashion MNIST dataset
fmnist = tf.keras.datasets.fashion_mnist
# Load the training and test split of the Fashion MNIST dataset
(training_images, training_labels), (test_images, test_labels) = fmnist.load_data()
#60_000 training images
#10_000 testing images
: The data for a particular image is a 28x28 grid of grayscale pixel values, each from 0 to 255.
※ Why is the dataset split into two parts, training & testing? A: The idea is to have one set of data for training and another set that the model hasn't seen yet. The unseen set is used to evaluate how well the model classifies new images.
② print a training image (both as an image & a numpy array) and its training label (ex: index 915)
import numpy as np
import matplotlib.pyplot as plt
# You can put between 0 to 59999 here
index = 915
# Set number of characters per row when printing
np.set_printoptions(linewidth=320)
# Print the label and image
print(f'LABEL: {training_labels[index]}') #LABEL: 2
print(f'\nIMAGE PIXEL ARRAY:\n {training_images[index]}')
'''
IMAGE PIXEL ARRAY:
[[ 0 0 0 0 0 0 0 0 0 41 140 98 37 33 33 65 138 59 5 0 0 0 0 1 0 0 0 0]
[ 0 0 0 1 0 0 35 116 156 168 148 152 154 134 132 120 88 118 150 136 94 19 0 0 0 0 0 0]
[ 0 0 0 0 0 53 152 134 132 122 136 114 94 92 81 85 98 98 100 116 110 150 21 0 1 0 0 0]
[ 0 0 0 0 0 168 132 134 114 124 122 118 104 106 98 98 92 94 106 92 75 106 83 0 0 0 0 0]
[ 0 0 0 0 0 181 134 136 114 116 108 104 96 100 94 94 92 94 104 92 86 90 118 0 0 0 0 0]
[ 0 0 0 0 19 187 158 126 130 104 112 100 100 94 94 98 88 94 96 83 86 88 126 7 0 0 0 0]
[ 0 0 0 0 53 158 179 116 140 98 108 100 100 94 94 96 90 90 88 77 85 75 94 15 0 0 0 0]
[ 0 0 0 0 90 138 183 142 142 106 94 90 92 92 92 96 88 86 81 67 83 81 98 43 0 0 0 0]
[ 0 0 0 0 102 124 189 175 136 100 85 86 88 90 92 92 88 81 77 65 98 90 96 59 0 0 0 0]
[ 0 0 0 0 106 120 179 197 124 90 88 86 88 85 85 86 88 81 71 63 94 88 96 61 0 0 0 0]
[ 0 0 0 0 110 118 177 223 92 86 96 88 90 86 83 85 92 86 69 59 118 88 94 73 0 0 0 0]
[ 0 0 0 0 116 114 191 243 63 94 98 88 86 88 83 85 98 90 77 59 156 85 90 85 0 0 0 0]
[ 0 0 0 0 122 110 213 255 47 96 104 85 81 86 81 83 90 98 71 59 177 81 96 86 0 0 0 0]
[ 0 0 0 0 126 102 235 245 45 104 106 86 83 83 79 81 92 106 73 63 156 96 104 96 0 0 0 0]
[ 0 0 0 5 128 94 251 233 47 102 110 86 85 81 75 79 88 104 79 63 122 124 81 112 0 0 0 0]
[ 0 0 0 21 138 85 255 225 51 104 110 90 85 79 73 83 92 112 86 37 140 106 75 120 5 0 0 0]
[ 0 0 0 25 154 75 247 225 61 104 112 92 83 77 75 83 92 114 81 49 158 96 94 98 13 0 0 0]
[ 0 0 0 35 156 75 213 223 81 108 122 98 86 81 75 83 90 116 86 53 130 108 104 92 25 0 0 0]
[ 0 0 0 49 142 77 199 219 96 110 124 94 81 73 67 75 88 112 90 47 130 100 104 98 39 0 0 0]
[ 0 0 0 67 130 86 201 217 81 92 106 88 75 73 71 77 75 94 94 53 136 98 86 98 47 0 0 0]
[ 0 0 0 81 136 81 207 217 85 108 112 98 86 88 86 98 98 104 110 67 170 118 65 108 63 0 0 0]
[ 0 0 0 71 154 63 197 199 104 122 118 106 96 90 85 96 110 104 102 73 128 130 73 108 69 0 0 0]
[ 0 0 0 90 154 49 185 177 130 118 120 112 104 102 92 100 118 112 108 83 124 154 71 116 53 0 0 0]
[ 0 0 0 102 150 55 175 0 0 0 0 0 0 0 0 0 0 0 0 0 0 221 81 102 59 0 0 0]
[ 0 0 0 104 150 79 122 0 0 0 0 0 0 0 0 0 0 0 0 0 0 166 120 86 67 0 0 0]
[ 0 0 0 106 132 96 85 0 0 1 0 0 0 1 1 1 1 0 1 0 0 144 114 102 71 0 0 0]
[ 0 0 0 94 146 102 71 0 0 1 0 0 0 0 0 0 0 0 0 0 0 134 124 98 57 0 0 0]
[ 0 0 0 53 142 102 37 0 1 0 0 0 0 0 0 0 0 0 0 0 0 98 128 104 43 0 0 0]]
'''
# Visualize the image
plt.imshow(training_images[index])
③ normalization
: All of the values in the image array are integers between 0 and 255. When training a neural network, especially for image processing, it usually learns better if all values are scaled to between 0 and 1. This process is called normalization, and in Python it is easy to normalize a NumPy array without looping.
# Normalize the pixel values of the train and test images
training_images = training_images / 255.0
test_images = test_images / 255.0
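To see what the division does without loading the dataset, here is a minimal sketch on a tiny made-up array. The `pixels` values are hypothetical; only the `/ 255.0` step mirrors the code above.

```python
import numpy as np

# Hypothetical 2x3 "image" with uint8 grayscale values, like a small Fashion MNIST patch
pixels = np.array([[0, 51, 102], [153, 204, 255]], dtype=np.uint8)

# Dividing by a Python float applies to every element at once (no loop)
# and turns the integer array into a floating-point one
scaled = pixels / 255.0

print(scaled.dtype)                # float64
print(scaled.min(), scaled.max())  # 0.0 1.0
```

The shape is unchanged; only the value range and dtype differ.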
building a classification model
④ building the classification model
# Build the classification model
model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
tf.keras.layers.Dense(128, activation=tf.nn.relu),
tf.keras.layers.Dense(10, activation=tf.nn.softmax)])
(1) Sequential: defines a sequence of layers in the neural network.
(2) Flatten: takes the 28x28 pixel matrix and turns it into a 1-dimensional array of 784 values.
(3) Dense: adds a layer of neurons. Each layer of neurons needs an activation function that determines its output. There are a lot of options:
→ ReLU passes values greater than 0 on to the next layer unchanged and outputs 0 for everything else.
→ Softmax takes a list of values and scales them so that the sum of all elements equals 1. When applied to model outputs, you can think of the scaled values as the probability for each class. For example, in this classification model, which has 10 units in the output dense layer, having the highest value at index = 4 means that the model is most confident that the input clothing image is a coat. If it is at index = 5, then it is a sandal, and so forth. See the short code block below which demonstrates these concepts. You can also watch this lecture if you want to know more about the Softmax function and how the values are computed.
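As a rough illustration of what Flatten and ReLU do to the numbers, the sketch below uses plain NumPy rather than Keras layers. The all-zero `image` and the `values` array are made-up inputs.

```python
import numpy as np

# A stand-in for one 28x28 grayscale image (all zeros here)
image = np.zeros((28, 28))

# Flatten: 28 * 28 = 784 values in a single row
flat = image.reshape(-1)
print(flat.shape)  # (784,)

# ReLU: negative values become 0, non-negative values pass through unchanged
values = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
relu = np.maximum(values, 0.0)
print(relu.tolist())  # [0.0, 0.0, 0.0, 1.5, 3.0]
```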
ex) softmax function demonstration
# Declare sample inputs and convert to a tensor
inputs = np.array([[1.0, 3.0, 4.0, 2.0]])
inputs = tf.convert_to_tensor(inputs)
print(f'input to softmax function: {inputs.numpy()}')
# Feed the inputs to a softmax activation function
outputs = tf.keras.activations.softmax(inputs)
print(f'output of softmax function: {outputs.numpy()}')
# Get the sum of all values after the softmax
total = tf.reduce_sum(outputs)  # named 'total' to avoid shadowing Python's built-in sum()
print(f'sum of outputs: {total}')
# Get the index with highest value
prediction = np.argmax(outputs)
print(f'class with highest probability: {prediction}')
'''
input to softmax function: [[1. 3. 4. 2.]]
output of softmax function: [[0.0320586 0.23688282 0.64391426 0.08714432]]
sum of outputs: 1.0
class with highest probability: 2
'''
: Index 2 holds the largest input value (4.0), so it also receives the largest share of the softmax output, about 0.64. All softmax outputs sum to 1, which reduce_sum() confirms. np.argmax() returns the index of the largest element.
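The same numbers can be reproduced by hand, which shows how the values are computed: exponentiate each input, then divide by the total. This is a minimal sketch using only the standard library; the `softmax` helper is written for illustration and is not the TensorFlow implementation.

```python
import math


def softmax(values):
    """Exponentiate each value, then divide by the total so the results sum to 1."""
    # Subtracting the maximum first avoids overflow and does not change the result
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 3.0, 4.0, 2.0])
print([round(p, 4) for p in probs])  # [0.0321, 0.2369, 0.6439, 0.0871]
print(probs.index(max(probs)))       # 2
```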
⑤ compiling the model & fitting it to the training data
: Now actually build the model. You do this by compiling it with an optimizer and a loss function, as before, and then you train it by calling model.fit(), asking it to fit your training data to your training labels. It will figure out the relationship between the training data and its actual labels, so that in the future, if you have inputs that look like the training data, it can predict the label for that input.
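The loss used in this step is 'sparse_categorical_crossentropy'. "Sparse" means the labels are plain integers (0 to 9) rather than one-hot vectors. For a single example, the loss is the negative log of the probability the model gave to the true class. The sketch below is a simplified per-example version with hypothetical probabilities; Keras additionally clips probabilities and averages over each batch.

```python
import math


def sparse_categorical_crossentropy(probabilities, label):
    """Per-example loss: negative log of the probability given to the true class."""
    return -math.log(probabilities[label])


# Hypothetical softmax outputs for a 3-class problem (not taken from the model here)
confident_right = [0.05, 0.90, 0.05]
confident_wrong = [0.90, 0.05, 0.05]

print(round(sparse_categorical_crossentropy(confident_right, 1), 4))  # 0.1054
print(round(sparse_categorical_crossentropy(confident_wrong, 1), 4))  # 2.9957
```

A confident correct prediction gives a loss near 0, while a confident wrong one is penalized heavily.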
model.compile(optimizer = tf.optimizers.Adam(),
loss = 'sparse_categorical_crossentropy',
metrics=['accuracy'])
model.fit(training_images, training_labels, epochs=5)
'''
Epoch 1/5
1875/1875 [==============================] - 3s 2ms/step - loss: 0.5013 - accuracy: 0.8230
Epoch 2/5
1875/1875 [==============================] - 3s 1ms/step - loss: 0.3780 - accuracy: 0.8628
Epoch 3/5
1875/1875 [==============================] - 3s 1ms/step - loss: 0.3374 - accuracy: 0.8757
Epoch 4/5
1875/1875 [==============================] - 2s 1ms/step - loss: 0.3156 - accuracy: 0.8844
Epoch 5/5
1875/1875 [==============================] - 2s 1ms/step - loss: 0.2969 - accuracy: 0.8895
'''
: Once it's done training -- you should see an accuracy value at the end of the final epoch. It might look something like 0.8895. This tells you that your neural network is about 89% accurate in classifying the training data. That is, it figured out a pattern match between the image and the labels that worked 89% of the time. Not great, but not bad considering it was only trained for 5 epochs and done quite quickly.
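The 1875/1875 in the training log is the number of batches per epoch, not the number of images. model.fit() was called without a batch_size, so Keras uses its default of 32. A quick arithmetic check (the constant names are illustrative):

```python
import math

BATCH_SIZE = 32          # Keras default when batch_size is not passed
TRAIN_EXAMPLES = 60_000
TEST_EXAMPLES = 10_000

# Each step processes one batch; the last batch may be smaller, hence the ceiling
train_steps = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)
test_steps = math.ceil(TEST_EXAMPLES / BATCH_SIZE)

print(train_steps)  # 1875
print(test_steps)   # 313
```

The second number matches the 313/313 shown by model.evaluate() and model.predict() on the test set.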
evaluating & predicting
⑥ evaluation
: But how would it work with unseen data? That's why we have the test images and labels. We can call model.evaluate() with this test dataset as inputs and it will report back the loss and accuracy of the model.
# Evaluate the model on unseen data
model.evaluate(test_images, test_labels)
#313/313 [==============================] - 1s 1ms/step - loss: 0.3436 - accuracy: 0.8794
#[0.34361031651496887, 0.8794000148773193]
: You can expect the accuracy here to be about 0.8794, which means it was 87.94% accurate on the entire test set. As expected, the model does not do as well with unseen data as it did with the data it was trained on.
⑦ predicting the test image at index 915 (with the model we built above)
classifications = model.predict(test_images)
print(classifications)
print(classifications[915])
prediction = np.argmax(classifications[915])
print(f'class with highest probability: {prediction}')
'''
313/313 [==============================] - 0s 1ms/step
[[6.2409208e-06 9.0771358e-08 2.1576885e-05 ... 2.1569857e-02 8.9094639e-05 9.7474480e-01]
[8.7240205e-06 5.8890087e-11 9.9830294e-01 ... 1.2497652e-17 9.7774171e-09 4.2074684e-17]
[8.0920754e-06 9.9998975e-01 1.2109047e-07 ... 7.6645480e-18 2.9345939e-09 4.4589631e-15]
...
[2.6315968e-03 6.0284986e-09 1.1105484e-03 ... 1.5767324e-07 9.9230075e-01 8.9485230e-10]
[4.3416367e-06 9.9987364e-01 7.8685781e-07 ... 1.3830411e-12 1.4922641e-08 2.8705447e-08]
[1.2723396e-04 5.3855490e-07 2.4892579e-04 ... 2.0233687e-02 3.0123170e-03 4.5927553e-04]]
[6.4266347e-03 1.8270270e-05 9.8238903e-01 4.2550218e-06 8.0206711e-03 3.6529058e-09 3.1218783e-03 1.9612825e-11 1.9301558e-05 7.1107530e-11]
class with highest probability: 2
'''
: Printing the actual label of the test image at index 915 gives 2, so the model correctly predicted class 2 for this image. Class 2 corresponds to label index 2, which is Pullover according to the label table above.
print(test_labels[915]) #2
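To make the lookup from probabilities to a class name concrete, the sketch below reuses the ten probabilities printed above for test image 915. The `class_names` list follows the label order in the Fashion MNIST dataset documentation and is an addition here, since the table is not reproduced in this text.

```python
# Probabilities printed above for the test image at index 915, one per class
probabilities = [
    6.4266347e-03, 1.8270270e-05, 9.8238903e-01, 4.2550218e-06, 8.0206711e-03,
    3.6529058e-09, 3.1218783e-03, 1.9612825e-11, 1.9301558e-05, 7.1107530e-11,
]

# Class names in label order (assumed from the dataset documentation)
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# argmax without NumPy: the index whose probability is largest
predicted = max(range(len(probabilities)), key=probabilities.__getitem__)

print(predicted, class_names[predicted])  # 2 Pullover
print(f"{probabilities[predicted]:.1%}")  # 98.2%
```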
tuning a model
⑧ 1024 neurons (the 512-neuron run is omitted)
: Experiment with a different size for the dense layer, here 1024 neurons.
fmnist = tf.keras.datasets.fashion_mnist
(training_images, training_labels) , (test_images, test_labels) = fmnist.load_data()
training_images = training_images/255.0
test_images = test_images/255.0
model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
tf.keras.layers.Dense(1024, activation=tf.nn.relu),
tf.keras.layers.Dense(10, activation=tf.nn.softmax)])
model.compile(optimizer = 'adam',
loss = 'sparse_categorical_crossentropy',
metrics=['accuracy'])
model.fit(training_images, training_labels, epochs=5)
model.evaluate(test_images, test_labels)
'''
Epoch 1/5
1875/1875 [==============================] - 7s 4ms/step - loss: 0.4718 - accuracy: 0.8307
Epoch 2/5
1875/1875 [==============================] - 7s 4ms/step - loss: 0.3587 - accuracy: 0.8682
Epoch 3/5
1875/1875 [==============================] - 7s 4ms/step - loss: 0.3219 - accuracy: 0.8823
Epoch 4/5
1875/1875 [==============================] - 7s 4ms/step - loss: 0.2964 - accuracy: 0.8897
Epoch 5/5
1875/1875 [==============================] - 7s 4ms/step - loss: 0.2779 - accuracy: 0.8960
313/313 [==============================] - 1s 2ms/step - loss: 0.3408 - accuracy: 0.8801
'''
: The test accuracy is 0.8801, a slight increase over the earlier 0.8794.
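⑨ removing the Flatten layer / changing the number of output nodes
: Removing Flatten raises an error, because the 28x28 input has to be flattened before the network's output can line up with one label per image. Likewise, the output layer needs one unit per class (10 here); with fewer units than classes, some labels fall outside the output range and training raises an error as well.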
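The shape problem can be seen with plain matrix multiplication. A Dense layer multiplies the last axis of its input by a weight matrix, so an unflattened image yields ten scores per row instead of ten scores per image. This is a NumPy sketch with random stand-in weights; bias and activation are left out.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((28, 28))  # one unflattened image

# Without Flatten: the 10-unit layer acts on each 28-value row separately
weights_unflattened = rng.random((28, 10))
unflattened_out = image @ weights_unflattened
print(unflattened_out.shape)  # (28, 10): ten scores for every row

# With Flatten: one vector of 784 values gives ten scores for the whole image
weights_flattened = rng.random((784, 10))
flattened_out = image.reshape(-1) @ weights_flattened
print(flattened_out.shape)  # (10,)
```

Only the second shape lines up with a single integer label per image.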
⑩ adding another layer between the one with 512 and the final layer with 10
fmnist = tf.keras.datasets.fashion_mnist
(training_images, training_labels) , (test_images, test_labels) = fmnist.load_data()
training_images = training_images/255.0
test_images = test_images/255.0
model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
tf.keras.layers.Dense(512, activation=tf.nn.relu),
tf.keras.layers.Dense(256, activation=tf.nn.relu), #added a layer
tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer = 'adam',
loss = 'sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(training_images, training_labels, epochs=5)
model.evaluate(test_images, test_labels)
'''
Epoch 1/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.4679 - accuracy: 0.8304
Epoch 2/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.3581 - accuracy: 0.8685
Epoch 3/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.3174 - accuracy: 0.8830
Epoch 4/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.2970 - accuracy: 0.8889
Epoch 5/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.2783 - accuracy: 0.8962
313/313 [==============================] - 1s 2ms/step - loss: 0.3441 - accuracy: 0.8771
'''
: Adding one more hidden layer in the middle gives an accuracy of 0.8771, which is not a meaningful change.
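For a sense of how much larger these variants are, the trainable parameters of a stack of Dense layers can be counted by hand: each layer has inputs x units weights plus one bias per unit. The helper names below are illustrative.

```python
def dense_params(inputs: int, units: int) -> int:
    """Weights (inputs * units) plus one bias per unit."""
    return inputs * units + units


def total_params(layer_sizes):
    """Sum over consecutive Dense layers; layer_sizes starts with the input size."""
    return sum(dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:]))


print(total_params([784, 128, 10]))       # 101770
print(total_params([784, 1024, 10]))      # 814090
print(total_params([784, 512, 256, 10]))  # 535818
```

The larger models have several times more parameters than the 128-unit baseline, yet the test accuracy in these runs barely moves.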
⑪ more or fewer epochs
→ 15 epochs) 0.8801, which is not a meaningful change.
#15 epochs
fmnist = tf.keras.datasets.fashion_mnist
(training_images, training_labels) , (test_images, test_labels) = fmnist.load_data()
training_images = training_images/255.0
test_images = test_images/255.0
model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
tf.keras.layers.Dense(128, activation=tf.nn.relu),
tf.keras.layers.Dense(10, activation=tf.nn.softmax)])
model.compile(optimizer = 'adam',
loss = 'sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(training_images, training_labels, epochs=15) # Experiment with the number of epochs
model.evaluate(test_images, test_labels)
'''
Epoch 1/15
1875/1875 [==============================] - 3s 2ms/step - loss: 0.4965 - accuracy: 0.8239
Epoch 2/15
1875/1875 [==============================] - 3s 1ms/step - loss: 0.3757 - accuracy: 0.8649
Epoch 3/15
1875/1875 [==============================] - 2s 1ms/step - loss: 0.3378 - accuracy: 0.8777
Epoch 4/15
1875/1875 [==============================] - 2s 1ms/step - loss: 0.3154 - accuracy: 0.8850
Epoch 5/15
1875/1875 [==============================] - 2s 1ms/step - loss: 0.2956 - accuracy: 0.8909
Epoch 6/15
1875/1875 [==============================] - 2s 1ms/step - loss: 0.2813 - accuracy: 0.8957
Epoch 7/15
1875/1875 [==============================] - 2s 1ms/step - loss: 0.2677 - accuracy: 0.8999
Epoch 8/15
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2569 - accuracy: 0.9051
Epoch 9/15
1875/1875 [==============================] - 2s 1ms/step - loss: 0.2470 - accuracy: 0.9079
Epoch 10/15
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2401 - accuracy: 0.9116
Epoch 11/15
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2324 - accuracy: 0.9133
Epoch 12/15
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2243 - accuracy: 0.9163
Epoch 13/15
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2170 - accuracy: 0.9189
Epoch 14/15
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2111 - accuracy: 0.9222
Epoch 15/15
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2054 - accuracy: 0.9239
313/313 [==============================] - 0s 1ms/step - loss: 0.3568 - accuracy: 0.8801
'''
→ 30 epochs) 0.8853, the highest test accuracy measured so far.
#30 epochs
fmnist = tf.keras.datasets.fashion_mnist
(training_images, training_labels) , (test_images, test_labels) = fmnist.load_data()
training_images = training_images/255.0
test_images = test_images/255.0
model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
tf.keras.layers.Dense(128, activation=tf.nn.relu),
tf.keras.layers.Dense(10, activation=tf.nn.softmax)])
model.compile(optimizer = 'adam',
loss = 'sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(training_images, training_labels, epochs=30) # Experiment with the number of epochs
model.evaluate(test_images, test_labels)
'''
Epoch 1/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.5035 - accuracy: 0.8238
Epoch 2/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.3792 - accuracy: 0.8635
Epoch 3/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.3402 - accuracy: 0.8756
Epoch 4/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.3129 - accuracy: 0.8851
Epoch 5/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2930 - accuracy: 0.8914
Epoch 6/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2790 - accuracy: 0.8971
Epoch 7/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.2679 - accuracy: 0.9003
Epoch 8/30
1875/1875 [==============================] - 4s 2ms/step - loss: 0.2568 - accuracy: 0.9043
Epoch 9/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.2480 - accuracy: 0.9075
Epoch 10/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2375 - accuracy: 0.9110
Epoch 11/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.2300 - accuracy: 0.9143
Epoch 12/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.2231 - accuracy: 0.9155
Epoch 13/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.2172 - accuracy: 0.9201
Epoch 14/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.2094 - accuracy: 0.9205
Epoch 15/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.2036 - accuracy: 0.9244
Epoch 16/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1980 - accuracy: 0.9265
Epoch 17/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1929 - accuracy: 0.9275
Epoch 18/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.1867 - accuracy: 0.9301
Epoch 19/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.1830 - accuracy: 0.9319
Epoch 20/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.1767 - accuracy: 0.9337
Epoch 21/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1741 - accuracy: 0.9344
Epoch 22/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1680 - accuracy: 0.9373
Epoch 23/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1667 - accuracy: 0.9375
Epoch 24/30
1875/1875 [==============================] - 3s 2ms/step - loss: 0.1616 - accuracy: 0.9383
Epoch 25/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1572 - accuracy: 0.9409
Epoch 26/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1561 - accuracy: 0.9405
Epoch 27/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1505 - accuracy: 0.9425
Epoch 28/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1490 - accuracy: 0.9432
Epoch 29/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1453 - accuracy: 0.9443
Epoch 30/30
1875/1875 [==============================] - 3s 1ms/step - loss: 0.1415 - accuracy: 0.9460
313/313 [==============================] - 1s 1ms/step - loss: 0.4024 - accuracy: 0.8853
'''
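One way to read the epoch experiments is to compare final training accuracy with test accuracy. The sketch below only does arithmetic on numbers copied from the logs above.

```python
# (epochs, final training accuracy, test accuracy) copied from the logs above
runs = [
    (5, 0.8895, 0.8794),
    (15, 0.9239, 0.8801),
    (30, 0.9460, 0.8853),
]

gaps = []
for epochs, train_acc, test_acc in runs:
    gap = train_acc - test_acc
    gaps.append(gap)
    print(f"{epochs:>2} epochs: train {train_acc:.4f}, test {test_acc:.4f}, gap {gap:.4f}")
```

The gap grows from about 0.01 at 5 epochs to about 0.06 at 30 epochs, and the test loss in the logs rises from 0.3436 to 0.4024. Extra epochs mostly improve the fit to the training data, which is a common sign of overfitting.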
⑫ without normalization
: Running the model without the normalization step gives an accuracy of 0.8366, far lower than with normalization. Possible reasons:
(1) slower convergence: Neural networks often converge faster when input data is normalized, because having features on similar scales helps the optimization algorithm find the minimum more efficiently. Without normalization, the optimizer slows down.
(2) weight sensitivity: Neural networks can become sensitive to the scale of input features. Without normalization, some weights may become much larger or smaller than others, potentially leading to numerical instability. Because the model is affected by scale, not every feature contributes fairly to the prediction.
(3) model generalization: Normalization can help improve the generalization of the model to new, unseen data. A model trained without normalization might not perform as well on data outside the training set.
fmnist = tf.keras.datasets.fashion_mnist
(training_images, training_labels) , (test_images, test_labels) = fmnist.load_data()
# training_images=training_images/255.0 # Experiment with removing this line
# test_images=test_images/255.0 # Experiment with removing this line
model = tf.keras.models.Sequential([
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(512, activation=tf.nn.relu),
tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(training_images, training_labels, epochs=5)
model.evaluate(test_images, test_labels)
'''
Epoch 1/5
1875/1875 [==============================] - 5s 3ms/step - loss: 4.6223 - accuracy: 0.7620
Epoch 2/5
1875/1875 [==============================] - 5s 3ms/step - loss: 0.5465 - accuracy: 0.8155
Epoch 3/5
1875/1875 [==============================] - 5s 3ms/step - loss: 0.5117 - accuracy: 0.8235
Epoch 4/5
1875/1875 [==============================] - 5s 3ms/step - loss: 0.5009 - accuracy: 0.8284
Epoch 5/5
1875/1875 [==============================] - 5s 3ms/step - loss: 0.4882 - accuracy: 0.8352
313/313 [==============================] - 1s 2ms/step - loss: 0.5133 - accuracy: 0.8366
'''
⑬ callback
: When the desired accuracy is reached during training and further training is judged unnecessary, stop training early and move on to prediction.
ex) In the 30-epoch run above, stop as soon as the training accuracy reaches 0.92 instead of running the remaining epochs, then evaluate. As a result, only 12 of the 30 epochs ran, and the test accuracy was 0.8927, better than the 0.8853 obtained with the full 30 epochs.
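class myCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        # logs may be None or lack 'accuracy', so guard before comparing
        accuracy = (logs or {}).get('accuracy')
        if accuracy is not None and accuracy >= 0.92:  # Experiment with changing this value
            # NOTE: the message still says 60%, but the threshold checked above is 0.92
            print("\nReached 60% accuracy so cancelling training!")
            self.model.stop_training = True

callbacks = myCallback()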
fmnist = tf.keras.datasets.fashion_mnist
(training_images, training_labels) , (test_images, test_labels) = fmnist.load_data()
training_images=training_images/255.0
test_images=test_images/255.0
model = tf.keras.models.Sequential([
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(512, activation=tf.nn.relu),
tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(training_images, training_labels, epochs=30, callbacks=[callbacks])
model.evaluate(test_images, test_labels)
'''
Epoch 1/30
1875/1875 [==============================] - 5s 2ms/step - loss: 0.4728 - accuracy: 0.8321
Epoch 2/30
1875/1875 [==============================] - 5s 2ms/step - loss: 0.3611 - accuracy: 0.8662
Epoch 3/30
1875/1875 [==============================] - 5s 2ms/step - loss: 0.3243 - accuracy: 0.8799
Epoch 4/30
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2987 - accuracy: 0.8894
Epoch 5/30
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2807 - accuracy: 0.8955
Epoch 6/30
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2644 - accuracy: 0.9016
Epoch 7/30
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2519 - accuracy: 0.9053
Epoch 8/30
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2413 - accuracy: 0.9112
Epoch 9/30
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2297 - accuracy: 0.9141
Epoch 10/30
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2232 - accuracy: 0.9171
Epoch 11/30
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2118 - accuracy: 0.9198
Epoch 12/30
1873/1875 [============================>.] - ETA: 0s - loss: 0.2030 - accuracy: 0.9231
Reached 60% accuracy so cancelling training!
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2030 - accuracy: 0.9231
313/313 [==============================] - 1s 2ms/step - loss: 0.3272 - accuracy: 0.8927
'''
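The stopping rule can be replayed on the logged accuracies without retraining. This is a plain-Python sketch of the same threshold check; `first_epoch_reaching` is an illustrative helper, not a Keras function.

```python
# Training accuracy at the end of each epoch, copied from the log above
accuracies = [0.8321, 0.8662, 0.8799, 0.8894, 0.8955, 0.9016,
              0.9053, 0.9112, 0.9141, 0.9171, 0.9198, 0.9231]


def first_epoch_reaching(threshold, history):
    """Return the 1-based epoch at which accuracy first meets the threshold, or None."""
    for epoch, accuracy in enumerate(history, start=1):
        if accuracy >= threshold:
            return epoch
    return None


print(first_epoch_reaching(0.92, accuracies))  # 12
print(first_epoch_reaching(0.95, accuracies))  # None
```

Epoch 11 ends at 0.9198, just under the threshold, so training stops after epoch 12. Note that the callback watches training accuracy, not accuracy on held-out data.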
accuracy analysis
⑭ summary of test-image accuracy for the fashionMNIST classifier models
※ All arguments other than those listed below were set identically ※
→ The cases with 15 / 30 epochs perform better than the case with 5 epochs.
→ Without normalization, performance drops dramatically.
→ Adding a hidden layer under otherwise identical conditions slightly increases performance, from 0.8747 to 0.8771.
→ Training with a callback that stops once accuracy reaches 0.92 ran only 12 epochs and gave the best performance of all experiments. Compared with the 6th experiment (above), which used the same conditions without the callback, stopping at 12 epochs generalized better than running all 30 epochs. This suggests that the run that went to 30 epochs may have overfit.
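Since the summary table is not reproduced in this text, the sketch below collects the test accuracies from the model.evaluate() outputs shown in this post and ranks them. The configuration labels are my own shorthand for each run. The 512-unit, 5-epoch run (0.8747) is left out because its log does not appear above.

```python
# (configuration, test accuracy) taken from the evaluate() outputs in this post
results = [
    ("128 units, 5 epochs", 0.8794),
    ("1024 units, 5 epochs", 0.8801),
    ("512 + 256 units, 5 epochs", 0.8771),
    ("128 units, 15 epochs", 0.8801),
    ("128 units, 30 epochs", 0.8853),
    ("512 units, 5 epochs, no normalization", 0.8366),
    ("512 units, callback stop at epoch 12", 0.8927),
]

# Sort from best to worst test accuracy
ranked = sorted(results, key=lambda item: item[1], reverse=True)
for name, accuracy in ranked:
    print(f"{accuracy:.4f}  {name}")
```

Each number comes from a single training run, so small differences such as 0.8794 vs 0.8801 may be within run-to-run variation.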
Coursera <Intro to Tensorflow for AI, ML, and DL>