CoCalc provides the best real-time collaborative environment for Jupyter Notebooks, LaTeX documents, and SageMath, scalable from individual users to large groups and classes!
Path: blob/main/012_CNN_005_Pooling_From_Scratch.ipynb
CNN 5 - Pooling from scratch
Dataset:
The dataset isn't deep-learning-compatible by default; here's how to preprocess it:
Today we'll implement pooling from scratch in pure Python and Numpy
Pooling boils down to subsetting a 2D array into smaller chunks, which should be easy to implement
You'll need only Numpy for now:
Let's declare a simple and small 2D array that will represent an output from a convolutional layer:
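The original code cell isn't shown here, so the following is only a minimal sketch of such an array. The name `conv_output` and the values are assumptions chosen for illustration:

```python
import numpy as np

# Hypothetical 4x4 "feature map"; the values are arbitrary and only illustrative
conv_output = np.array([
    [10, 12,  8,  7],
    [ 4, 11,  5,  9],
    [18, 13,  7,  7],
    [ 3, 15,  2,  2],
])

print(conv_output.shape)  # (4, 4)
```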
To start with pooling, you'll have to select values for two hyperparameters:
pool_size - The size of the single region that slides over the image
stride - The number of pixels you want the region to move as it goes over the image
Common sizes are 2x2 for the pool size, and 2 for the stride
Choosing these values will halve the height and width of the convolutional output!
A pool size of 2x2 and a stride of 1 will reduce each dimension by only a single pixel, which doesn't make much sense
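To make the size claims concrete, here is a small sketch of the usual output-size formula for pooling without padding, `(input_size - pool_size) // stride + 1`. The helper name `pooled_size` is hypothetical and not part of the notebook:

```python
def pooled_size(input_size: int, pool_size: int, stride: int) -> int:
    """Output length along one dimension for pooling without padding."""
    return (input_size - pool_size) // stride + 1


# A 224-pixel side with pool size 2 and stride 2 is halved
print(pooled_size(224, pool_size=2, stride=2))  # 112

# With stride 1 the side shrinks by a single pixel
print(pooled_size(224, pool_size=2, stride=1))  # 223
```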
Extract pools from a 2D array
Let's first take care of extracting individual pools
Matrices of shape (pool size, pool size)
Pool size = 2
Stride = 2
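One way to sketch the extraction is a pair of nested loops that step over rows and columns by the stride. The array and variable names below are assumptions, since the original cell isn't shown:

```python
import numpy as np

# Hypothetical 4x4 convolutional output (illustrative values)
conv_output = np.array([
    [10, 12,  8,  7],
    [ 4, 11,  5,  9],
    [18, 13,  7,  7],
    [ 3, 15,  2,  2],
])
pool_size = 2
stride = 2

pools = []
# Slide the window over rows and columns, stepping by `stride`;
# the upper bound keeps every window fully inside the array
for i in range(0, conv_output.shape[0] - pool_size + 1, stride):
    for j in range(0, conv_output.shape[1] - pool_size + 1, stride):
        pools.append(conv_output[i:i + pool_size, j:j + pool_size])

pools = np.array(pools)
print(pools.shape)  # (4, 2, 2)
print(pools[0])
```

With these settings, the 4x4 array yields four non-overlapping 2x2 pools.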
Simple, right?
Let's see what happens if we change the stride value to 1
We'll keep everything else as is:
We now get many more pools, which isn't what we want
You can't go wrong by starting with the pool size of 2 and stride of 2
Let's now put all of this in a single function:
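A possible shape for that function is sketched below. The name `get_pools` and its signature are assumptions, not necessarily what the notebook uses:

```python
import numpy as np


def get_pools(img: np.ndarray, pool_size: int, stride: int) -> np.ndarray:
    """Return all (pool_size, pool_size) windows of a 2D array as one stacked array."""
    pools = []
    # Step through rows and columns by `stride`, keeping each window inside the array
    for i in range(0, img.shape[0] - pool_size + 1, stride):
        for j in range(0, img.shape[1] - pool_size + 1, stride):
            pools.append(img[i:i + pool_size, j:j + pool_size])
    return np.array(pools)


# Illustrative 4x4 input
conv_output = np.arange(16).reshape(4, 4)

print(len(get_pools(conv_output, pool_size=2, stride=2)))  # 4
print(len(get_pools(conv_output, pool_size=2, stride=1)))  # 9
```

The two calls show the effect of the stride: overlapping windows (stride 1) produce more than twice as many pools on the same input.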
MaxPooling from scratch
MaxPooling is the most common pooling type
Basically, it keeps only the largest value from a single pool
There are other types of pooling, such as AveragePooling
It's used much less in practice
To implement it, replace np.max() with np.mean()
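As a quick illustration of the difference on a single pool (the values are made up):

```python
import numpy as np

# One hypothetical 2x2 pool
pool = np.array([[10, 12],
                 [ 4, 11]])

print(np.max(pool))   # 12, what MaxPooling keeps
print(np.mean(pool))  # 9.25, what AveragePooling keeps
```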
MaxPooling logic
Get the total number of pools - length of the pools matrix (or shape[0])
Calculate target shape - image size after performing the pooling operation
Calculated as: Square root of the number of pools cast to an integer
Why? We need a square matrix
If num_pools is 16, we need a 4x4 matrix (sqrt(16) = 4)
Iterate over all pools and calculate the max - append the max to a result list
Return the result list as a Numpy array reshaped to the target shape
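The steps above can be sketched as follows. The function name `max_pooling` and the sample pools are assumptions, and the sketch relies on the number of pools being a perfect square (a square input image):

```python
import numpy as np


def max_pooling(pools: np.ndarray) -> np.ndarray:
    """Reduce each pool to its maximum and arrange the results in a square matrix."""
    # Total number of pools
    num_pools = pools.shape[0]
    # Side length of the output; assumes num_pools is a perfect square
    side = int(np.sqrt(num_pools))
    tgt_shape = (side, side)

    pooled = []
    for pool in pools:
        # Keep only the largest value from each pool
        pooled.append(np.max(pool))

    return np.array(pooled).reshape(tgt_shape)


# Four hypothetical 2x2 pools, listed in row-major order
pools = np.array([
    [[10, 12], [ 4, 11]],
    [[ 8,  7], [ 5,  9]],
    [[18, 13], [ 3, 15]],
    [[ 7,  7], [ 2,  2]],
])

print(max_pooling(pools))
# [[12  9]
#  [18  7]]
```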
Let's test it out:
Works like a charm!
Let's implement pooling on a real image next
Implement pooling on a real image
Let's import PIL and Matplotlib to make working with images easier
We'll declare two helper functions, one for visualizing a single image and one for showing two images side by side:
Let's load a sample image from our dataset
We'll pretend it is an output from a convolutional layer
It doesn't matter actually, pooling doesn't know we're faking it
To make calculations easier, we'll grayscale the image and resize it to 224x224
That's a common practice with neural networks:
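A minimal sketch of the grayscale-and-resize step with PIL is shown below. To avoid assuming a file path from the dataset, it builds a synthetic RGB image in memory; with a real file you would start from `Image.open(...)` instead:

```python
import numpy as np
from PIL import Image

# Synthetic 300x400 RGB image standing in for a dataset photo (no file path is assumed)
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(300, 400, 3), dtype=np.uint8)
img = Image.fromarray(rgb)

# "L" converts to 8-bit grayscale; resize() takes (width, height)
img = img.convert("L").resize((224, 224))

# Convert to a Numpy array for the pooling code
img_arr = np.array(img)
print(img_arr.shape)  # (224, 224)
```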
Let's get the pools next
Remember to convert the image to a Numpy array
We'll stick with a pool size of 2 and stride of 2
Let's see how many pools we have in total:
So we have 12544 pools, each being a small 2x2 matrix
Square root of 12544 is 112, which means our image will be of size 112x112 pixels after the pooling operation
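The count is easy to check on any 224x224 array. The sketch below uses a placeholder array of zeros rather than the actual cat image:

```python
import numpy as np

# Placeholder for the grayscale 224x224 image
img_arr = np.zeros((224, 224))
pool_size, stride = 2, 2

# Collect every 2x2 window, stepping by the stride in both directions
pools = np.array([
    img_arr[i:i + pool_size, j:j + pool_size]
    for i in range(0, img_arr.shape[0] - pool_size + 1, stride)
    for j in range(0, img_arr.shape[1] - pool_size + 1, stride)
])

print(pools.shape)                   # (12544, 2, 2)
print(int(np.sqrt(pools.shape[0])))  # 112
```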
Let's do the pooling:
Quickly verify the shape:
Everything looks right, but let's also visualize the cat image before and after pooling
We shouldn't have any problems recognizing a cat:
Note: The image on the right is displayed in the same figure size as the image on the left, even though it's smaller - check the X and Y axis values
It's still a cat, so we can verify the pooling worked
How do we know if we did everything correctly?
We can apply TensorFlow's pooling layer to the cat image and compare the matrices
Verification - Pooling with TensorFlow
Let's import TensorFlow to verify we calculated everything correctly:
We'll declare a Sequential model that has only a MaxPool2D layer
Note the parameters:
Pool size = 2
Strides = 2
Just as we had during the manual calculation
We don't have to train the model
Before passing in the image, we need to reshape it
Batch size, height, width, number of color channels
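For a single grayscale 224x224 image, the reshape might look like this. The array is a placeholder; Keras layers use a channels-last layout by default, so the batch dimension goes first and the channel dimension last:

```python
import numpy as np

# Placeholder for the grayscale image array
img_arr = np.zeros((224, 224), dtype=np.float32)

# One image in the batch, one color channel
batch = img_arr.reshape(1, 224, 224, 1)
print(batch.shape)  # (1, 224, 224, 1)
```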
We can now use the predict() function to apply the pooling
It will return a 1x112x112x1 tensor, so we'll reshape it to 112x112:
The matrix does look familiar
We can now use the array_equal() function from Numpy to test if our array equals TensorFlow's "prediction":
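The whole check can be sketched end to end on a small array. This is an illustration under assumptions, not the notebook's exact cell: it uses a hypothetical 4x4 input instead of the cat image and a compact manual pooling in place of the functions defined earlier:

```python
import numpy as np
import tensorflow as tf

# Hypothetical 4x4 input; float32 matches what the Keras layer returns
conv_output = np.array([
    [10, 12,  8,  7],
    [ 4, 11,  5,  9],
    [18, 13,  7,  7],
    [ 3, 15,  2,  2],
], dtype=np.float32)

# Manual 2x2, stride-2 max pooling
manual = np.array([
    [conv_output[i:i + 2, j:j + 2].max() for j in range(0, 4, 2)]
    for i in range(0, 4, 2)
])

# A model with a single pooling layer; there are no weights, so no training is needed
model = tf.keras.Sequential([
    tf.keras.layers.MaxPool2D(pool_size=2, strides=2)
])

# Input layout: (batch size, height, width, channels)
tf_out = model.predict(conv_output.reshape(1, 4, 4, 1), verbose=0)
tf_out = tf_out.reshape(2, 2)

print(np.array_equal(manual, tf_out))
```

Max pooling only selects existing values, so an exact comparison with `array_equal()` is reasonable here; an operation that does arithmetic on floats would be better compared with `np.allclose()`.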
And it does, which means we did everything correctly!
You now know how to implement convolutions and pooling from scratch
There's no need to ever do that, but it's good to know
The next notebook will cover building a more robust image classifier with TensorFlow