GitHub Repository: better-data-science/TensorFlow
Path: blob/main/015_CNN_008_Transfer_Learning.ipynb
Kernel: Python 3 (ipykernel)

CNN 8 - Transfer Learning

What you should know by now:

  • How to preprocess image data

  • How to load image data from a directory

  • What's a convolution, pooling, and a fully-connected layer

  • Categorical vs. binary classification

  • What is data augmentation and why is it useful

Let's start

  • We'll import the libraries first:

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

import warnings
warnings.filterwarnings('ignore')

import numpy as np
import tensorflow as tf
  • We'll have to load training and validation data from different directories throughout the notebook

  • The best practice is to declare a function for that

  • The function will also apply data augmentation to the training dataset:

def init_data(train_dir: str, valid_dir: str) -> tuple:
    # Training generator: rescale pixels to [0, 1] and apply augmentation
    train_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
        rescale=1/255.0,
        rotation_range=20,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
        fill_mode='nearest'
    )
    # Validation generator: rescale only, no augmentation
    valid_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
        rescale=1/255.0
    )

    train_data = train_datagen.flow_from_directory(
        directory=train_dir,
        target_size=(224, 224),
        class_mode='categorical',
        batch_size=64,
        seed=42
    )
    valid_data = valid_datagen.flow_from_directory(
        directory=valid_dir,
        target_size=(224, 224),
        class_mode='categorical',
        batch_size=64,
        seed=42
    )

    return train_data, valid_data
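  • With class_mode='categorical', flow_from_directory() infers the classes from the subfolder names in alphanumeric order and returns one-hot labels. Here's a minimal plain-Python sketch of that mapping, assuming subfolders named cat and dog. The helpers class_indices() and one_hot() are hypothetical names used for illustration, not Keras functions:

```python
def class_indices(class_names):
    """Map class folder names to integer indices in alphanumeric order."""
    return {name: idx for idx, name in enumerate(sorted(class_names))}


def one_hot(indices, num_classes):
    """Build one-hot rows, the layout used for 'categorical' labels."""
    return [
        [1.0 if col == idx else 0.0 for col in range(num_classes)]
        for idx in indices
    ]


# Folder names in arbitrary order; sorting decides the index of each class
mapping = class_indices(['dog', 'cat'])
print(mapping)  # {'cat': 0, 'dog': 1}

# Labels for three images: cat, dog, dog
labels = one_hot([mapping['cat'], mapping['dog'], mapping['dog']], num_classes=2)
print(labels)  # [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
```

  • On the real generator, the same mapping should be available as train_data.class_indices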
  • Let's now load our dogs and cats dataset:

train_data, valid_data = init_data(
    train_dir='data/train/',
    valid_dir='data/validation/'
)
Found 20030 images belonging to 2 classes.
Found 2488 images belonging to 2 classes.
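  • The generators yield the images in batches of 64, so one epoch is ceil(images / batch_size) steps, with a smaller final batch. A quick sketch of that arithmetic for the image counts reported above (steps_per_epoch() is a hypothetical helper name):

```python
import math


def steps_per_epoch(num_images: int, batch_size: int = 64) -> int:
    """Number of batches needed to see every image once."""
    return math.ceil(num_images / batch_size)


train_steps = steps_per_epoch(20030)
valid_steps = steps_per_epoch(2488)
print(train_steps, valid_steps)  # 313 39
```

  • len(train_data) and len(valid_data) should return the same numbers, and the training step count is what Keras shows in the progress bar during fit()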

Transfer Learning in TensorFlow

  • With transfer learning, we're basically loading a huge pretrained model without the top classification layer

  • That way, we can freeze the learned weights and only add the output layer to match our case

  • For example, most pretrained models were trained on the ImageNet dataset, which has 1000 classes

    • We only have two classes (cat and dog), so we'll need to specify that

  • We'll also add a couple of additional layers to prevent overfitting:

def build_transfer_learning_model(base_model):
    # `base_model` stands for the pretrained model
    # We want to use the learned weights, and to do so we must freeze them
    for layer in base_model.layers:
        layer.trainable = False

    # Declare a sequential model that combines the base model with custom layers
    model = tf.keras.Sequential([
        base_model,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.Dropout(rate=0.2),
        tf.keras.layers.Dense(units=2, activation='softmax')
    ])

    # Compile the model
    model.compile(
        loss='categorical_crossentropy',
        optimizer=tf.keras.optimizers.Adam(),
        metrics=['accuracy']
    )

    return model
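  • To make the effect of freezing concrete, here's a rough sketch that counts the parameters by hand, assuming VGG16 without the top as the base model (13 convolutional layers with 3x3 kernels, 512 output channels). The helper name conv_params() is made up for illustration, and the numbers are plain arithmetic, not the output of model.summary():

```python
def conv_params(c_in: int, c_out: int, kernel: int = 3) -> int:
    """Weights plus biases of a single 2D convolution layer."""
    return (kernel * kernel * c_in + 1) * c_out


# (input channels, output channels) of the 13 conv layers in VGG16
VGG16_CONVS = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]

# Frozen base model: none of these weights are updated during training
frozen_base = sum(conv_params(c_in, c_out) for c_in, c_out in VGG16_CONVS)

# Custom head on top of the 512 pooled features
features = 512
bn_trainable = 2 * features       # gamma and beta
bn_non_trainable = 2 * features   # moving mean and moving variance
dense = features * 2 + 2          # 2 output units, weights plus biases

# Pooling and dropout layers have no parameters
trainable = bn_trainable + dense
non_trainable = frozen_base + bn_non_trainable

print(frozen_base)    # 14714688
print(trainable)      # 2050
print(non_trainable)  # 14715712
```

  • Under these assumptions, only about two thousand of the roughly 14.7 million parameters are trained, which is why the model needs far less data than training a network from scratch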
# Let's use a simple and well-known architecture - VGG16
from tensorflow.keras.applications.vgg16 import VGG16

# We'll specify it as a base model
# `include_top=False` means we don't want the top classification layer
# Specify the `input_shape` to match our image size
# Specify the `weights` accordingly
vgg_model = build_transfer_learning_model(
    base_model=VGG16(include_top=False, input_shape=(224, 224, 3), weights='imagenet')
)

# Train the model for 10 epochs
vgg_hist = vgg_model.fit(
    train_data,
    validation_data=valid_data,
    epochs=10
)
Metal device set to: Apple M1 Pro
Epoch 1/10
313/313 [==============================] - 160s 510ms/step - loss: 0.3786 - accuracy: 0.8258 - val_loss: 0.3144 - val_accuracy: 0.8943
Epoch 2/10
313/313 [==============================] - 160s 510ms/step - loss: 0.2897 - accuracy: 0.8712 - val_loss: 0.1988 - val_accuracy: 0.9224
Epoch 3/10
313/313 [==============================] - 160s 510ms/step - loss: 0.2751 - accuracy: 0.8800 - val_loss: 0.1944 - val_accuracy: 0.9216
Epoch 4/10
313/313 [==============================] - 160s 510ms/step - loss: 0.2717 - accuracy: 0.8812 - val_loss: 0.1820 - val_accuracy: 0.9264
Epoch 5/10
313/313 [==============================] - 160s 511ms/step - loss: 0.2699 - accuracy: 0.8829 - val_loss: 0.1809 - val_accuracy: 0.9268
Epoch 6/10
313/313 [==============================] - 160s 511ms/step - loss: 0.2709 - accuracy: 0.8822 - val_loss: 0.1792 - val_accuracy: 0.9297
Epoch 7/10
313/313 [==============================] - 160s 511ms/step - loss: 0.2668 - accuracy: 0.8852 - val_loss: 0.1763 - val_accuracy: 0.9236
Epoch 8/10
313/313 [==============================] - 162s 516ms/step - loss: 0.2688 - accuracy: 0.8817 - val_loss: 0.1889 - val_accuracy: 0.9212
Epoch 9/10
313/313 [==============================] - 160s 511ms/step - loss: 0.2667 - accuracy: 0.8857 - val_loss: 0.1760 - val_accuracy: 0.9264
Epoch 10/10
313/313 [==============================] - 160s 511ms/step - loss: 0.2685 - accuracy: 0.8836 - val_loss: 0.1802 - val_accuracy: 0.9281
  • We got amazing accuracy right from the start!

  • We couldn't surpass 77% accuracy on the validation set with the custom architecture, and we're at 93% with the VGG16 model

  • The beauty of transfer learning isn't only that it yields highly accurate models - you can also train models with less data, as the model doesn't have to learn as much
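  • The per-epoch metrics are also stored in vgg_hist.history, a dictionary of lists, so the best epoch doesn't have to be read off the log by eye. A small sketch, with the validation values copied by hand from the training log above into a stand-in dictionary (best_epoch() is a hypothetical helper):

```python
# Stand-in for `vgg_hist.history`, values copied from the 10-epoch log
history = {
    'val_loss': [0.3144, 0.1988, 0.1944, 0.1820, 0.1809,
                 0.1792, 0.1763, 0.1889, 0.1760, 0.1802],
    'val_accuracy': [0.8943, 0.9224, 0.9216, 0.9264, 0.9268,
                     0.9297, 0.9236, 0.9212, 0.9264, 0.9281],
}


def best_epoch(values, mode='max'):
    """Return the (1-based epoch, value) pair with the best metric value."""
    pick = max if mode == 'max' else min
    idx = pick(range(len(values)), key=lambda i: values[i])
    return idx + 1, values[idx]


best_acc = best_epoch(history['val_accuracy'], mode='max')
best_loss = best_epoch(history['val_loss'], mode='min')
print(best_acc)   # (6, 0.9297)
print(best_loss)  # (9, 0.176)
```

  • Validation accuracy peaks at epoch 6 and changes little afterwards, so most of the gain comes from the first few epochs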


Transfer Learning on a 20 times smaller subset

  • We want to see if reducing the dataset size negatively affects the predictive power

  • To do so, we'll create a new directory structure for training and validation images:

import random
import pathlib
import shutil

random.seed(42)

dir_data = pathlib.Path.cwd().joinpath('data_small')
dir_train = dir_data.joinpath('train')
dir_valid = dir_data.joinpath('validation')

if not dir_data.exists():
    dir_data.mkdir()

if not dir_train.exists():
    dir_train.mkdir()

if not dir_valid.exists():
    dir_valid.mkdir()

for cls in ['cat', 'dog']:
    if not dir_train.joinpath(cls).exists():
        dir_train.joinpath(cls).mkdir()

    if not dir_valid.joinpath(cls).exists():
        dir_valid.joinpath(cls).mkdir()
  • Here's the directory structure printed:

!ls -R data_small | grep ":$" | sed -e 's/:$//' -e 's/[^-][^\/]*\//--/g' -e 's/^/ /' -e 's/-/|/'
|-train
|---cat
|---dog
|-validation
|---cat
|---dog
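  • The same directory tree can be created more compactly with mkdir(parents=True, exist_ok=True), which creates missing parent folders and does nothing if the folder is already there. A possible sketch, run inside a temporary directory so it doesn't touch the real data_small folder (make_dataset_dirs() is a hypothetical helper):

```python
import pathlib
import tempfile


def make_dataset_dirs(root, splits=('train', 'validation'), classes=('cat', 'dog')):
    """Create root/<split>/<class> folders and return their paths."""
    created = []
    for split in splits:
        for cls in classes:
            path = pathlib.Path(root) / split / cls
            # Creates missing parents and does not fail if the folder exists
            path.mkdir(parents=True, exist_ok=True)
            created.append(path)
    return created


with tempfile.TemporaryDirectory() as tmp:
    dirs = make_dataset_dirs(pathlib.Path(tmp) / 'data_small')
    # A second call is harmless because of `exist_ok=True`
    make_dataset_dirs(pathlib.Path(tmp) / 'data_small')
    relative = sorted(p.relative_to(tmp).as_posix() for p in dirs)
    all_exist = all(p.is_dir() for p in dirs)

print(relative)
```

  • In the notebook, the call would be make_dataset_dirs(pathlib.Path.cwd() / 'data_small')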
  • Now, we'll copy only a sample of images to the new folders

  • We'll declare a copy_sample() function which takes n images from the src_folder and copies them to the tgt_folder

  • We'll set n to 500 by default, which is a pretty small number:

def copy_sample(src_folder: pathlib.Path, tgt_folder: pathlib.Path, n: int = 500):
    # Randomly pick `n` files from the source folder
    imgs = random.sample(list(src_folder.iterdir()), n)

    for img in imgs:
        # `img.name` is the file name without the parent directories
        shutil.copy(
            src=img,
            dst=tgt_folder.joinpath(img.name)
        )
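  • One thing to keep in mind: random.sample() raises a ValueError if n is larger than the number of files in the folder. Here's a slightly more defensive variant as a sketch, demonstrated on empty placeholder files in a temporary directory. The name copy_sample_safe() and the seed parameter are additions for illustration, not part of the original notebook:

```python
import pathlib
import random
import shutil
import tempfile


def copy_sample_safe(src_folder, tgt_folder, n=500, seed=42):
    """Copy at most `n` randomly chosen files and return how many were copied."""
    # Sort first so the sample is reproducible for a given seed
    files = sorted(p for p in src_folder.iterdir() if p.is_file())
    rng = random.Random(seed)
    # Never ask for more files than the folder contains
    picked = rng.sample(files, min(n, len(files)))

    tgt_folder.mkdir(parents=True, exist_ok=True)
    for img in picked:
        shutil.copy(src=img, dst=tgt_folder / img.name)
    return len(picked)


with tempfile.TemporaryDirectory() as tmp:
    src = pathlib.Path(tmp) / 'src'
    src.mkdir()
    for i in range(5):
        (src / f'img_{i}.jpg').write_bytes(b'')

    copied_three = copy_sample_safe(src, pathlib.Path(tmp) / 'three', n=3)
    copied_all = copy_sample_safe(src, pathlib.Path(tmp) / 'all', n=10)
    files_in_three = len(list((pathlib.Path(tmp) / 'three').iterdir()))

print(copied_three, copied_all)  # 3 5
```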
  • Let's now copy the training and validation images

  • For the validation set, we'll copy only 100 images per class

# Train - cat
copy_sample(
    src_folder=pathlib.Path.cwd().joinpath('data/train/cat/'),
    tgt_folder=pathlib.Path.cwd().joinpath('data_small/train/cat/'),
)

# Train - dog
copy_sample(
    src_folder=pathlib.Path.cwd().joinpath('data/train/dog/'),
    tgt_folder=pathlib.Path.cwd().joinpath('data_small/train/dog/'),
)

# Valid - cat
copy_sample(
    src_folder=pathlib.Path.cwd().joinpath('data/validation/cat/'),
    tgt_folder=pathlib.Path.cwd().joinpath('data_small/validation/cat/'),
    n=100
)

# Valid - dog
copy_sample(
    src_folder=pathlib.Path.cwd().joinpath('data/validation/dog/'),
    tgt_folder=pathlib.Path.cwd().joinpath('data_small/validation/dog/'),
    n=100
)
  • Let's count the number of files in each folder to verify the images were copied successfully:

!ls data_small/train/cat/ | wc -l
500
!ls data_small/validation/cat/ | wc -l
100
!ls data_small/train/dog/ | wc -l
500
!ls data_small/validation/dog/ | wc -l
100
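  • The same check can be done in Python instead of the shell, which also works on systems without ls and wc. A minimal sketch, shown on a throwaway directory with a handful of placeholder files rather than the real images (count_images() is a hypothetical helper):

```python
import pathlib
import tempfile


def count_images(root):
    """Count the files in every <split>/<class> folder under `root`."""
    counts = {}
    root = pathlib.Path(root)
    for split_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for cls_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
            key = f'{split_dir.name}/{cls_dir.name}'
            counts[key] = sum(1 for f in cls_dir.iterdir() if f.is_file())
    return counts


with tempfile.TemporaryDirectory() as tmp:
    # Tiny stand-in tree: 3 training and 1 validation file per class
    for split, n_files in [('train', 3), ('validation', 1)]:
        for cls in ['cat', 'dog']:
            folder = pathlib.Path(tmp) / split / cls
            folder.mkdir(parents=True)
            for i in range(n_files):
                (folder / f'{cls}_{i}.jpg').write_bytes(b'')

    counts = count_images(tmp)

print(counts)
# {'train/cat': 3, 'train/dog': 3, 'validation/cat': 1, 'validation/dog': 1}
```

  • In the notebook, count_images('data_small') should report 500 for both training folders and 100 for both validation folders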
  • Now use init_data() to load in the images again:

train_data, valid_data = init_data(
    train_dir='data_small/train/',
    valid_dir='data_small/validation/'
)
Found 1000 images belonging to 2 classes.
Found 200 images belonging to 2 classes.
  • There's a total of 1000 training images

  • It will be interesting to see if we can get a decent model out of a dataset this small

  • The model architecture is the same, but we'll train for more epochs because the dataset is smaller

    • Also, we can afford to train for longer since the training time per epoch is reduced:

vgg_model = build_transfer_learning_model(
    base_model=VGG16(include_top=False, input_shape=(224, 224, 3), weights='imagenet')
)

vgg_hist = vgg_model.fit(
    train_data,
    validation_data=valid_data,
    epochs=20
)
Epoch 1/20
16/16 [==============================] - 9s 572ms/step - loss: 0.8472 - accuracy: 0.5740 - val_loss: 0.7049 - val_accuracy: 0.5100
Epoch 2/20
16/16 [==============================] - 9s 551ms/step - loss: 0.6389 - accuracy: 0.6840 - val_loss: 0.6876 - val_accuracy: 0.5150
Epoch 3/20
16/16 [==============================] - 9s 551ms/step - loss: 0.4936 - accuracy: 0.7800 - val_loss: 0.6461 - val_accuracy: 0.5300
Epoch 4/20
16/16 [==============================] - 9s 552ms/step - loss: 0.4318 - accuracy: 0.8020 - val_loss: 0.6082 - val_accuracy: 0.5850
Epoch 5/20
16/16 [==============================] - 9s 552ms/step - loss: 0.3935 - accuracy: 0.8270 - val_loss: 0.5831 - val_accuracy: 0.6450
Epoch 6/20
16/16 [==============================] - 9s 551ms/step - loss: 0.3945 - accuracy: 0.8100 - val_loss: 0.5638 - val_accuracy: 0.7000
Epoch 7/20
16/16 [==============================] - 9s 545ms/step - loss: 0.3444 - accuracy: 0.8300 - val_loss: 0.5374 - val_accuracy: 0.7350
Epoch 8/20
16/16 [==============================] - 9s 553ms/step - loss: 0.3490 - accuracy: 0.8510 - val_loss: 0.5064 - val_accuracy: 0.8100
Epoch 9/20
16/16 [==============================] - 9s 552ms/step - loss: 0.3523 - accuracy: 0.8330 - val_loss: 0.4810 - val_accuracy: 0.8500
Epoch 10/20
16/16 [==============================] - 9s 553ms/step - loss: 0.3317 - accuracy: 0.8610 - val_loss: 0.4618 - val_accuracy: 0.8650
Epoch 11/20
16/16 [==============================] - 9s 552ms/step - loss: 0.3084 - accuracy: 0.8740 - val_loss: 0.4410 - val_accuracy: 0.8800
Epoch 12/20
16/16 [==============================] - 9s 551ms/step - loss: 0.2890 - accuracy: 0.8740 - val_loss: 0.4182 - val_accuracy: 0.8850
Epoch 13/20
16/16 [==============================] - 9s 552ms/step - loss: 0.2823 - accuracy: 0.8780 - val_loss: 0.3945 - val_accuracy: 0.9200
Epoch 14/20
16/16 [==============================] - 9s 552ms/step - loss: 0.3029 - accuracy: 0.8610 - val_loss: 0.3769 - val_accuracy: 0.9100
Epoch 15/20
16/16 [==============================] - 9s 552ms/step - loss: 0.2998 - accuracy: 0.8590 - val_loss: 0.3614 - val_accuracy: 0.9150
Epoch 16/20
16/16 [==============================] - 9s 552ms/step - loss: 0.2905 - accuracy: 0.8790 - val_loss: 0.3403 - val_accuracy: 0.9300
Epoch 17/20
16/16 [==============================] - 9s 555ms/step - loss: 0.2736 - accuracy: 0.8740 - val_loss: 0.3255 - val_accuracy: 0.9400
Epoch 18/20
16/16 [==============================] - 9s 553ms/step - loss: 0.2956 - accuracy: 0.8780 - val_loss: 0.3126 - val_accuracy: 0.9200
Epoch 19/20
16/16 [==============================] - 9s 563ms/step - loss: 0.2556 - accuracy: 0.8920 - val_loss: 0.2992 - val_accuracy: 0.9150
Epoch 20/20
16/16 [==============================] - 9s 561ms/step - loss: 0.2718 - accuracy: 0.8820 - val_loss: 0.2887 - val_accuracy: 0.9150
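  • The model ends at about 92% validation accuracy (peaking at 94% in epoch 17), roughly the same as the model trained on the full training set of about 20K images, which is amazing!

  • Keep in mind that this validation set has only 200 images, so the estimate is noisier than the one from the full validation set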
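  • To put the two runs side by side, here's a back-of-the-envelope comparison using only numbers from the outputs above (image counts, epochs, batch size of 64, and final validation accuracies). The dictionary layout and the helper name are made up for illustration:

```python
import math

BATCH_SIZE = 64

# Figures copied from the two training runs above
runs = {
    'full': {'train_images': 20030, 'epochs': 10, 'final_val_acc': 0.9281},
    'small': {'train_images': 1000, 'epochs': 20, 'final_val_acc': 0.9150},
}


def weight_updates(train_images: int, epochs: int, batch_size: int = BATCH_SIZE) -> int:
    """One optimizer step per batch, so steps per epoch times epochs."""
    return math.ceil(train_images / batch_size) * epochs


updates = {name: weight_updates(r['train_images'], r['epochs']) for name, r in runs.items()}
data_ratio = runs['full']['train_images'] / runs['small']['train_images']
gap_pp = round((runs['full']['final_val_acc'] - runs['small']['final_val_acc']) * 100, 2)

print(updates)     # {'full': 3130, 'small': 320}
print(data_ratio)  # 20.03
print(gap_pp)      # 1.31
```

  • Even with twice the epochs, the small run performs roughly ten times fewer weight updates on about 20 times less data, and ends within about 1.3 percentage points of the full run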

Homework:

  • Use both models to predict the entire test set directory

  • How do the accuracies compare?
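  • As a starting point, accuracy can be computed from predicted class probabilities by taking the most likely class per image and comparing it with the true label. The sketch below uses made-up probabilities for four images purely to show the calculation; they are not results from either model. In practice, the probabilities would come from something like vgg_model.predict() on the test images, and the helper names are hypothetical:

```python
def predicted_classes(probs):
    """Index of the largest probability in every row (an argmax per image)."""
    return [max(range(len(row)), key=lambda j: row[j]) for row in probs]


def accuracy(probs, true_labels):
    """Share of images whose predicted class matches the true class."""
    preds = predicted_classes(probs)
    correct = sum(int(p == t) for p, t in zip(preds, true_labels))
    return correct / len(true_labels)


# Made-up softmax outputs, columns are [cat, dog]
probs = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.6, 0.4],
    [0.3, 0.7],
]
# True classes: 0 = cat, 1 = dog
true_labels = [0, 1, 1, 1]

preds = predicted_classes(probs)
acc = accuracy(probs, true_labels)
print(preds)  # [0, 1, 0, 1]
print(acc)    # 0.75
```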