CNN 6 - Do Larger Models Lead to Better Performance?
Dataset:
The dataset isn't deep-learning-compatible by default; here's how to preprocess it:
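Here's a minimal sketch of that preprocessing, assuming the raw download is a flat folder of JPEGs whose filenames start with the class name (e.g. `cat.0.jpg`, the layout of the Kaggle dogs-vs-cats archive) - the `data/raw`, `data/train`, and `data/validation` paths and the class names are hypothetical, so adjust them to your dataset:

```python
import random
import shutil
from pathlib import Path

# Hypothetical paths and class names - adjust to your dataset
RAW_DIR = Path('data/raw')
OUT_DIR = Path('data')
CLASSES = ['cat', 'dog']
TRAIN_SPLIT = 0.8  # 80% train, 20% validation

random.seed(42)
for img_path in RAW_DIR.glob('*.jpg'):
    # Assumed filename convention: the class name comes first, e.g. 'cat.123.jpg'
    label = img_path.name.split('.')[0]
    if label not in CLASSES:
        continue
    subset = 'train' if random.random() < TRAIN_SPLIT else 'validation'
    dest_dir = OUT_DIR / subset / label
    dest_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy(img_path, dest_dir / img_path.name)
```

This leaves one subdirectory per class under `train/` and `validation/`, which is the layout Keras' directory loaders expect.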
What you should know by now:
How to preprocess image data
How to load image data from a directory
What's a convolution, pooling, and a fully-connected layer
Categorical vs. binary classification
First things first, let's import the libraries
The models we'll declare today have more layers than the ones from the previous notebooks
We'll import individual classes from TensorFlow
I'm using an Nvidia RTX 3060 Ti
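Something like the following import cell covers everything used below (assuming TensorFlow 2.x with the bundled Keras API):

```python
import numpy as np
import tensorflow as tf

# Individual Keras classes used throughout this notebook
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Optional: check that TensorFlow can see the GPU
print(tf.config.list_physical_devices('GPU'))
```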
Load in the data
Use `ImageDataGenerator` to scale image matrices to the 0-1 range
Load the images from directories and resize them to 224x224x3
Due to memory concerns, we'll lower the batch size:
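Here's a sketch of the data loaders - the directory paths, the batch size of 32, and the binary class mode are assumptions, not values from the original run:

```python
# Rescale pixel values from [0, 255] to [0, 1]
train_datagen = ImageDataGenerator(rescale=1.0 / 255)
valid_datagen = ImageDataGenerator(rescale=1.0 / 255)

# Hypothetical directory layout: data/train and data/validation,
# one subdirectory per class
train_data = train_datagen.flow_from_directory(
    directory='data/train',
    target_size=(224, 224),   # resize to 224x224 (x3 color channels)
    class_mode='binary',      # assumed binary task, as covered earlier
    batch_size=32,            # lowered to keep memory usage in check
    seed=42
)
valid_data = valid_datagen.flow_from_directory(
    directory='data/validation',
    target_size=(224, 224),
    class_mode='binary',
    batch_size=32,
    seed=42
)
```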
Model 1
Block 1: Conv, Conv, Pool
Block 2: Conv, Conv, Pool
Block 3: Flatten, Dense
Output
We won't mess with the hyperparameters today
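Here's one way to realize that architecture - the filter counts, dense width, optimizer, and epoch count are illustrative assumptions:

```python
model_1 = Sequential([
    # Block 1: Conv, Conv, Pool
    Conv2D(32, (3, 3), activation='relu', input_shape=(224, 224, 3)),
    Conv2D(32, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    # Block 2: Conv, Conv, Pool
    Conv2D(64, (3, 3), activation='relu'),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    # Block 3: Flatten, Dense
    Flatten(),
    Dense(128, activation='relu'),
    # Output: one sigmoid unit for the assumed binary task
    Dense(1, activation='sigmoid')
])

model_1.compile(
    loss='binary_crossentropy',
    optimizer=Adam(),
    metrics=['accuracy']
)

history_1 = model_1.fit(train_data, epochs=10, validation_data=valid_data)
```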
Not bad, but we got 75% accuracy on the validation set in notebook 010
Will adding complexity to the model increase the accuracy?
Model 2
Block 1: Conv, Conv, Pool
Block 2: Conv, Conv, Pool
Block 3: Conv, Conv, Pool
Block 4: Flatten, Dense
Output
This architecture is a bit of overkill for our dataset
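Here's a sketch of this deeper variant, reusing the assumed settings from Model 1 and adding a third convolutional block:

```python
model_2 = Sequential([
    # Block 1: Conv, Conv, Pool
    Conv2D(32, (3, 3), activation='relu', input_shape=(224, 224, 3)),
    Conv2D(32, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    # Block 2: Conv, Conv, Pool
    Conv2D(64, (3, 3), activation='relu'),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    # Block 3: Conv, Conv, Pool (the extra block)
    Conv2D(128, (3, 3), activation='relu'),
    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    # Block 4: Flatten, Dense
    Flatten(),
    Dense(128, activation='relu'),
    # Output
    Dense(1, activation='sigmoid')
])

model_2.compile(
    loss='binary_crossentropy',
    optimizer=Adam(),
    metrics=['accuracy']
)

history_2 = model_2.fit(train_data, epochs=10, validation_data=valid_data)
```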
The model isn't learning at all:
When that happens, you can try experimenting with the learning rate and other parameters
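For instance, you could recompile with a smaller learning rate (the 0.0001 below is purely illustrative, not what the original run used) and train again:

```python
# Recompile the same model with a lower learning rate
model_2.compile(
    loss='binary_crossentropy',
    optimizer=Adam(learning_rate=1e-4),  # illustrative value
    metrics=['accuracy']
)
history_2 = model_2.fit(train_data, epochs=10, validation_data=valid_data)
```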
Let's dial it down a bit next
Model 3
Block 1: Conv, Conv, Pool
Block 2: Conv, Conv, Pool
Block 3: Flatten, Dense, Dropout, Dense
Output
The first model was better than the second
We can try adding a dropout layer as a regularizer and tweaking the fully connected layers:
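Here's a sketch of Model 3, back to two convolutional blocks but with dropout between two dense layers - the dropout rate of 0.5 and the layer widths are assumed values:

```python
model_3 = Sequential([
    # Block 1: Conv, Conv, Pool
    Conv2D(32, (3, 3), activation='relu', input_shape=(224, 224, 3)),
    Conv2D(32, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    # Block 2: Conv, Conv, Pool
    Conv2D(64, (3, 3), activation='relu'),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    # Block 3: Flatten, Dense, Dropout, Dense
    Flatten(),
    Dense(256, activation='relu'),
    Dropout(0.5),  # assumed dropout rate
    Dense(128, activation='relu'),
    # Output
    Dense(1, activation='sigmoid')
])

model_3.compile(
    loss='binary_crossentropy',
    optimizer=Adam(),
    metrics=['accuracy']
)

history_3 = model_3.fit(train_data, epochs=10, validation_data=valid_data)
```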
It made the model worse
More complex models don't necessarily lead to better performance
Conclusion
There you have it - we've been focusing on the wrong thing from the start
Our model architecture in notebook 010 was solid
Adding more layers and complexity actually decreased the predictive power in our experiments
We should shift our focus to improving the dataset quality
The following notebook will teach you all about data augmentation, and you'll see how it boosts the model's performance
After that, you'll take your models to new heights with transfer learning, and you'll see why coming up with custom architectures is a waste of time in most cases