CoCalc provides the best real-time collaborative environment for Jupyter Notebooks, LaTeX documents, and SageMath, scalable from individual users to large groups and classes!
CNN 3 - Getting started with Convolutional layers
Dataset:
The dataset isn't deep-learning-ready by default; here's how to preprocess it:
Before you start
I got TensorFlow errors during training because a couple of images were corrupted
Before continuing, please delete the following images:
data\train\cat\666.jpg
data\train\dog\11702.jpg
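You can delete them by hand, or script it. A minimal sketch, assuming the dataset lives in a local `data/` folder as shown above - the helper skips paths that don't exist, so it's safe to re-run:

```python
import os

def delete_if_exists(paths):
    """Remove each file that exists; return how many were deleted."""
    deleted = 0
    for path in paths:
        if os.path.exists(path):
            os.remove(path)
            deleted += 1
    return deleted

# The two corrupted files reported above
corrupted = [
    os.path.join('data', 'train', 'cat', '666.jpg'),
    os.path.join('data', 'train', 'dog', '11702.jpg'),
]
delete_if_exists(corrupted)
```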
Normalizing image data
Let's load in a sample image:
And check its shape
It's 281 pixels wide, 300 pixels tall, and has 3 color channels
Let's load in another image and see if the same applies:
The second image is much larger
Neural networks don't like that - they expect images (arrays) of identical size
You'll see later how to resize them on the fly
First, let's see what a single image looks like when represented as an array:
It's in a range between 0 and 255 for every single color channel (red, green, and blue)
Neural networks prefer a range between 0 and 1
We can translate it to that range by dividing each element of an array by 255.0:
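The rescaling itself is a single division. A quick sketch using a random `uint8` array in place of a real photo (the 281x300x3 shape matches the sample image):

```python
import numpy as np

# Stand-in for a loaded image: integer values in [0, 255]
img = np.random.randint(0, 256, size=(281, 300, 3), dtype=np.uint8)

# Dividing by 255.0 yields a float array with every value in [0, 1]
scaled = img / 255.0

print(scaled.min(), scaled.max())
```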
That's the only argument we'll pass to TensorFlow's `ImageDataGenerator` - rescaling
There are others available, and we'll cover them in a couple of notebooks when learning data augmentation
Data loaders
You can use the `ImageDataGenerator` class from TensorFlow to specify how the image data will be generated
We'll only apply rescaling - 1 / 255.0
We'll do this for both training and validation images:
You can now use this generator to load in data from a directory
Specify the directory path and a size to which each image will be resized
224x224 works well with neural networks, especially with transfer learning models (more on these in a couple of notebooks)
Set `class_mode='categorical'`, since we have two distinct classes
Set `batch_size=64` or anything you want - it represents the number of images shown to the neural network at once
The `seed` parameter is here so you can get the same images as I did:
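Putting those pieces together, a loader sketch might look like the following. The directory layout (`data/train`, `data/validation`) is an assumption - point it at wherever your cat and dog folders live:

```python
def make_loader(directory, batch_size=64, target_size=(224, 224), seed=42):
    # Import inside the function so the sketch loads without TensorFlow installed
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # Rescaling is the only transformation we apply for now
    gen = ImageDataGenerator(rescale=1 / 255.0)
    return gen.flow_from_directory(
        directory,
        target_size=target_size,
        class_mode='categorical',
        batch_size=batch_size,
        seed=seed,
    )

# train_data = make_loader('data/train')
# valid_data = make_loader('data/validation')
```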
There are 20030 images in the training folder divided into two classes - as reported by the loader
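As a quick sanity check on those numbers: with 20030 images and a batch size of 64, the loader yields a fixed number of batches per epoch, with a smaller final batch:

```python
import math

n_images, batch_size = 20030, 64

# Full batches plus one partial batch at the end
n_batches = math.ceil(n_images / batch_size)
last_batch = n_images - (n_batches - 1) * batch_size

print(n_batches, last_batch)  # 313 batches; the last one holds 62 images
```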
The `train_data` is basically a Python generator object
You can call `next()` on it to get the first batch:
Each batch contains images and labels
Let's check the shape:
So, a single batch contains 64 images, each being 224 pixels wide and tall with 3 color channels
There are 64 corresponding labels, each an array of two elements - the probability of the image being a cat (0) or a dog (1)
Visualizing a single batch
It's always recommended to visualize your data
The `visualize_batch()` function, well, visualizes a single batch
There are 64 images in the batch, so the function plots an 8x8 grid of images:
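One possible shape for that function - the name `visualize_batch()` comes from the text, but this implementation is a sketch:

```python
def visualize_batch(images, labels, rows=8, cols=8):
    # Lazy import so the sketch loads without matplotlib installed
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(rows, cols, figsize=(12, 12))
    for ax, img, label in zip(axes.ravel(), images, labels):
        ax.imshow(img)  # images are already rescaled to [0, 1]
        ax.set_title('cat' if label.argmax() == 0 else 'dog')
        ax.axis('off')
    plt.tight_layout()
    plt.show()

# images, labels = next(train_data)
# visualize_batch(images, labels)
```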
Some of them look a bit weird due to the change in aspect ratio, but we should be fine
Let's reset the data loaders, as we called `next()` before:
Training a Convolutional model
Just like with regular ANNs (Dense layers), Convolutional Neural Networks boil down to experimentation
You can't know beforehand how many Convolutional layers you'll need, the ideal number of filters for each, or the optimal kernel size
Convolutional layers are usually followed by a Pooling layer, to reduce the image size
When finished with Convolutional layers, make sure to add a Flatten layer
Add Dense layers as you normally would from there
Keep in mind the output layer and the loss function
Use softmax activation at output, as sigmoid only works when you have a single output node
Track loss through categorical cross entropy
We'll train the model for 10 epochs - a completely arbitrary choice:
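A model sketch under the constraints above - Conv followed by Pooling, then Flatten, Dense layers, a softmax output, and categorical cross-entropy loss. The filter count, kernel size, and Dense width here are starting guesses to experiment with, not the text's known-optimal values:

```python
def build_model():
    # Lazy import so the sketch loads without TensorFlow installed
    from tensorflow.keras import Sequential, layers

    model = Sequential([
        # Only the first Convolutional layer needs input_shape
        layers.Conv2D(32, kernel_size=(3, 3), activation='relu',
                      input_shape=(224, 224, 3)),
        layers.MaxPooling2D(pool_size=(2, 2)),  # reduce image size
        layers.Flatten(),
        layers.Dense(128, activation='relu'),
        layers.Dense(2, activation='softmax'),  # two classes: cat, dog
    ])
    model.compile(loss='categorical_crossentropy',
                  optimizer='adam', metrics=['accuracy'])
    return model

# model = build_model()
# history = model.fit(train_data, validation_data=valid_data, epochs=10)
```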
71.23% accuracy after 10 epochs
Does doubling the number of filters in our single Convolutional layer make a difference?
Maybe, but the model generally doesn't look like it's learning
Let's add another Convolutional layer
Keep in mind: only the first Convolutional layer needs the `input_shape` parameter
Much better - we're at 75% now on the validation set
Let's use this model to make predictions
Making predictions on new images
You have to apply the same preprocessing operations to the unseen images
I've forgotten to do so many times at my job, and it results in some weird and uncertain predictions (a small difference between prediction probabilities)
We'll declare a `prepare_single_image()` function which resizes an image to 224x224 and rescales it to the 0-1 range:
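One way to write it - the function name comes from the text, while the TensorFlow calls below are a sketch:

```python
def prepare_single_image(img_path, target_size=(224, 224)):
    # Lazy import so the sketch loads without TensorFlow installed
    import tensorflow as tf

    img = tf.io.read_file(img_path)
    img = tf.image.decode_image(img, channels=3, expand_animations=False)
    img = tf.image.resize(img, target_size)  # resize to 224x224
    return img / 255.0  # rescale to the 0-1 range
```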
Let's test it on a single image:
And now let's make a single prediction
Note the `reshape()` function - try removing it and see what happens
There's an easier way, and you'll see it in a bit
These are basically prediction probabilities
The model is almost 100% certain that the class at index 0 is present in the image
Remember: 0 = cat, 1 = dog
You can use the argmax function to get the index at which the array's value is highest:
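For example, on a made-up probability pair like the one above, `argmax` picks out the winning class index:

```python
import numpy as np

# Hypothetical prediction probabilities: almost certainly a cat
prediction = np.array([0.97, 0.03])

class_index = np.argmax(prediction)
print(class_index)  # 0 -> cat (remember: 0 = cat, 1 = dog)
```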
Let's make predictions for an entire folder of images
First for the cats
The top two variables will track how many predictions were made, and how many of these were correct
Note the `expand_dims()` function - it's an alternative to `reshape()`
You can use either
Prediction fails on some images, probably because they are corrupted, so wrap the code inside a `try ... except` block:
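The loop described above could be sketched as follows - the folder path, and the `model` and `prepare_single_image` objects passed in, are assumptions from the surrounding text:

```python
import os

def evaluate_folder(folder, true_class_index, model, prepare_fn):
    import numpy as np

    # Track how many predictions were made, and how many were correct
    total, correct = 0, 0
    for fname in os.listdir(folder):
        try:
            img = prepare_fn(os.path.join(folder, fname))
            # expand_dims adds the batch dimension: (224, 224, 3) -> (1, 224, 224, 3)
            pred = model.predict(np.expand_dims(img, axis=0))
        except Exception:
            continue  # skip corrupted images
        total += 1
        if np.argmax(pred) == true_class_index:
            correct += 1
    return total, correct

# total, correct = evaluate_folder('data/test/cat', 0, model, prepare_single_image)
# print(f'Total predictions made: {total}, accuracy: {correct / total:.2%}')
```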
Total predictions made:
Accuracy for cats:
Not too bad - let's do the same for dogs:
Overall, we have a much more accurate model than when we were only using Dense layers
This is just the tip of the iceberg
We haven't explored data augmentation and transfer learning
You wouldn't believe how much these will increase the accuracy