Image classifier using CNNs

Manav Mandal
7 min read · Aug 23, 2021


Cats vs Dogs

Hey Guys,

If you have checked out my previous blog about Convolutional Neural Networks (Convolutional Neural Networks (CNN) | by Manav Mandal | Medium), you might have an idea of how powerful the tool is and how vast its applications are. With CNNs, a model can genuinely recognize what is in an image, much like we do when we look at it. This blog is a tutorial for building your (maybe) first Image Classifier with CNNs and TensorFlow.

In this tutorial, we will train our model on a few thousand images to build our Image Classifier, and what better choice than images of cats and dogs. In the process we will develop our intuition about the workflow and the following concepts:

Examine and understand data

Build an input pipeline

Build our model

Train our model

Test our model

Improve our model/Repeat the process

I strongly recommend using Google Colab and setting the runtime to GPU (go to Runtime -> Change Runtime type -> GPU). So without any further delay, let's get started with the tutorial.

Importing Packages

import os
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

Let's start by downloading the dataset. The dataset we are using is a filtered version of the Dogs vs. Cats dataset from Kaggle (ultimately, this dataset is provided by Microsoft Research).

TensorFlow Datasets is a beginner-friendly tool for working with datasets. In addition to TensorFlow Datasets, we will also use the class tf.keras.preprocessing.image.ImageDataGenerator, which reads data from disk. Let's go ahead and download the dataset from the URL and unzip it inside Colab.

_URL = 'https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip'
zip_dir = tf.keras.utils.get_file('cats_and_dogs_filtered.zip', origin=_URL, extract=True)

After downloading the dataset, the file structure should look as follows:

cats_and_dogs_filtered
|__ train
|______ cats: [cat.0.jpg, cat.1.jpg, cat.2.jpg ....]
|______ dogs: [dog.0.jpg, dog.1.jpg, dog.2.jpg ...]
|__ validation
|______ cats: [cat.2000.jpg, cat.2001.jpg, cat.2002.jpg ....]
|______ dogs: [dog.2000.jpg, dog.2001.jpg, dog.2002.jpg ...]

So let's go ahead and set up the paths to the training and validation directories.

base_dir = os.path.join(os.path.dirname(zip_dir), 'cats_and_dogs_filtered')
train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')

train_cats_dir = os.path.join(train_dir, 'cats')  # directory with our training cat pictures
train_dogs_dir = os.path.join(train_dir, 'dogs')  # directory with our training dog pictures
validation_cats_dir = os.path.join(validation_dir, 'cats')  # directory with our validation cat pictures
validation_dogs_dir = os.path.join(validation_dir, 'dogs')  # directory with our validation dog pictures

Now that we have our data, let's look around and examine it.

num_cats_tr = len(os.listdir(train_cats_dir))
num_dogs_tr = len(os.listdir(train_dogs_dir))
num_cats_val = len(os.listdir(validation_cats_dir))
num_dogs_val = len(os.listdir(validation_dogs_dir))

total_train = num_cats_tr + num_dogs_tr
total_val = num_cats_val + num_dogs_val

print('total training cat images:', num_cats_tr)
print('total training dog images:', num_dogs_tr)
print('total validation cat images:', num_cats_val)
print('total validation dog images:', num_dogs_val)
print('--')
print('Total training images:', total_train)
print('Total validation images:', total_val)

Before going ahead let us set up some variables that will be used down the road to preprocess our data for training the model.

BATCH_SIZE = 100
IMG_SHAPE = 150

The batch size is the number of samples processed before the model is updated. The images in the dataset come in assorted sizes, so we will resize them all to a width and height of 150 pixels; hence we set IMG_SHAPE to 150.

The flow_from_directory method loads images from disk and applies the various transformations with a single line of code.

An important part of building any image classifier is Data Augmentation. Our model should be able to identify an object irrespective of its position, its scale, whether the image is mirrored, etc. Another advantage of Data Augmentation is that it helps you avoid Overfitting.

Overfitting often occurs when we have a small number of training examples. One solution is to augment our dataset so that it has a sufficient number and variety of training examples. With Data Augmentation we generate more data from existing training samples by applying random transformations that yield believable-looking images. Our end goal is that the model should never see the exact same picture twice during training. This exposes the model to more aspects of the data, allowing it to generalize better.

In tf.keras we can implement this using the same ImageDataGenerator class we used before. We simply pass the transformations we want as arguments, and it takes care of applying them to the dataset during training.

Before we go ahead, let's create a function to display images so that we can visualize the effects of the various augmentations. The following function plots images in a grid with 1 row and 5 columns, one image per column.

def plotImages(images_arr):
    fig, axes = plt.subplots(1, 5, figsize=(20, 20))
    axes = axes.flatten()
    for img, ax in zip(images_arr, axes):
        ax.imshow(img)
    plt.tight_layout()
    plt.show()

Flipping the images horizontally

Let's randomly apply the horizontal flip augmentation to our dataset and check how individual images look after the transformation. We can do this by passing horizontal_flip=True as an argument to the ImageDataGenerator class.

image_gen = ImageDataGenerator(rescale=1./255, horizontal_flip=True)

train_data_gen = image_gen.flow_from_directory(batch_size=BATCH_SIZE,
                                               directory=train_dir,
                                               shuffle=True,
                                               target_size=(IMG_SHAPE, IMG_SHAPE))

Let's select a random image to see the effect of the above transformation.

augmented_images = [train_data_gen[0][0][0] for i in range(5)]
plotImages(augmented_images)

Rotating the image

This augmentation will randomly rotate the image up to a specified number of degrees. Here, we’ll set it to 45.

image_gen = ImageDataGenerator(rescale=1./255, rotation_range=45)

train_data_gen = image_gen.flow_from_directory(batch_size=BATCH_SIZE,
                                               directory=train_dir,
                                               shuffle=True,
                                               target_size=(IMG_SHAPE, IMG_SHAPE))

To check out the transformation, call the same plotImages() function as we did previously:
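augmented_images = [train_data_gen[0][0][0] for i in range(5)]
plotImages(augmented_images)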

Applying Zoom

Here we will apply a random zoom of up to 50% to the images.

image_gen = ImageDataGenerator(rescale=1./255, zoom_range=0.5)

train_data_gen = image_gen.flow_from_directory(batch_size=BATCH_SIZE,
                                               directory=train_dir,
                                               shuffle=True,
                                               target_size=(IMG_SHAPE, IMG_SHAPE))
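Again, we can reuse plotImages() to eyeball a few zoomed samples:

augmented_images = [train_data_gen[0][0][0] for i in range(5)]
plotImages(augmented_images)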

Calling these functions one by one is tiresome and time-consuming, so let's put all of it together.

image_gen_train = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)

train_data_gen = image_gen_train.flow_from_directory(batch_size=BATCH_SIZE,
                                                     directory=train_dir,
                                                     shuffle=True,
                                                     target_size=(IMG_SHAPE, IMG_SHAPE),
                                                     class_mode='binary')

You might have noticed a few additional transformations; feel free to check out the documentation to see what they do, and add your own. Finally, let's look at how the dataset looks after applying all of the above transformations.

augmented_images = [train_data_gen[0][0][0] for i in range(5)]
plotImages(augmented_images)

Creating validation data generator

Normally we use data augmentation only on the training dataset; the validation set should resemble the real images the model will actually be evaluated on. So we will only add the rescaling transformation to the validation dataset.

image_gen_val = ImageDataGenerator(rescale=1./255)

val_data_gen = image_gen_val.flow_from_directory(batch_size=BATCH_SIZE,
                                                 directory=validation_dir,
                                                 target_size=(IMG_SHAPE, IMG_SHAPE),
                                                 class_mode='binary')

Model Creation

So you have your training and validation datasets ready for the model (Bravo!). Let's define the model.

The model consists of four convolution blocks, each followed by a max pooling layer. Overfitting can be a pain, so to avoid it we will use a Dropout layer with a probability of 50%, which randomly sets half of the values coming into the Dense layer to 0 during training. Further, we have a fully connected layer with 512 units and a relu activation function. Finally, since our model distinguishes between two classes (Dog or Cat), the output layer has 2 units; it emits raw logits, and the softmax is applied inside the loss function when we compile the model.

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(2)
])

Compiling the model

Let's use the Adam optimizer. Since the final layer outputs raw logits for the two classes, we use SparseCategoricalCrossentropy with from_logits=True as the loss function; it applies the softmax internally. We would also love to watch the training and validation accuracy during training, so we pass the metrics argument.

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

We have created the model, so let's take a look at all the layers.

model.summary()
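As a rough sanity check, tracing the output shapes and parameter counts by hand from the layer definitions above (my own arithmetic, so verify against the actual summary) gives:

# (150, 150, 3) -> Conv2D(32)  -> (148, 148, 32)   params: 896
#                -> MaxPool    -> (74, 74, 32)
#                -> Conv2D(64) -> (72, 72, 64)     params: 18,496
#                -> MaxPool    -> (36, 36, 64)
#                -> Conv2D(128)-> (34, 34, 128)    params: 73,856
#                -> MaxPool    -> (17, 17, 128)
#                -> Conv2D(128)-> (15, 15, 128)    params: 147,584
#                -> MaxPool    -> (7, 7, 128)
#                -> Flatten    -> (6272,)
#                -> Dense(512)                     params: 3,211,776
#                -> Dense(2)                       params: 1,026
# Total trainable params: ~3.45 million, most of them in the Dense(512) layer.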

Training the model

It is time to start training our model. Generally we use the fit method, but since our data comes from a generator (ImageDataGenerator) we will use fit_generator here. (In newer TensorFlow versions, fit accepts generators directly and fit_generator is deprecated.)

(This will take time)

epochs = 100

history = model.fit_generator(
    train_data_gen,
    steps_per_epoch=int(np.ceil(total_train / float(BATCH_SIZE))),
    epochs=epochs,
    validation_data=val_data_gen,
    validation_steps=int(np.ceil(total_val / float(BATCH_SIZE)))
)
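As a quick worked example: the filtered dataset has 2,000 training and 1,000 validation images, so with BATCH_SIZE = 100 this comes out to 20 steps per training epoch and 10 validation steps. Rechecking this arithmetic is worthwhile whenever you change the batch size.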

Yay!!! You did it, you have built an Image Classifier. But how do we test this classifier? By visualizing the results.

Visualizing results of the training

acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs_range = range(epochs)

plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()

Hurray!! You made it through! You can now go ahead and play with the model: adjust the transformations, increase the epochs, call model.predict(), etc. Check out the colab notebook here: Cats vs Dogs(CNN’s).ipynb — Colaboratory (google.com)
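If you want a starting point for model.predict(), here is a minimal sketch of classifying a single image (the file name my_pet.jpg is just a placeholder for an image of your own); note that flow_from_directory assigns class indices alphabetically, so index 0 is cat and index 1 is dog:

# Minimal inference sketch; 'my_pet.jpg' is a placeholder for your own image.
img = tf.keras.preprocessing.image.load_img('my_pet.jpg', target_size=(150, 150))
x = tf.keras.preprocessing.image.img_to_array(img) / 255.  # same rescaling as during training
x = np.expand_dims(x, axis=0)                              # the model expects a batch dimension
logits = model.predict(x)
probs = tf.nn.softmax(logits[0]).numpy()                   # the model outputs raw logits
print('cat: %.3f, dog: %.3f' % (probs[0], probs[1]))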

Thanks for reading! If you enjoyed reading this article, please click the 👏 button and share to help others find it! Feel free to leave a comment 💬 below. You can connect with me on GitHub, LinkedIn.

Have some feedback? Let’s be friends on Twitter.

All the best and happy coding!😀
