Image Classification Using Convolutional Neural Networks¶

  • The task is to apply convolutional neural networks (CNNs) to classify a sample image dataset of our choice for image recognition.
  • We use the Covid-19 Image Dataset from Kaggle to try to detect COVID-19 from chest X-rays.
  • Where relevant and possible, normal, COVID-19 and pneumonia chest X-rays can be compared; the code below uses the Covid and Normal folders.
  • Dataset from Kaggle: https://www.kaggle.com/datasets/pranavraikokte/covid19-image-dataset
  • Prerequisite: pip install opencv-python
In [167]:
# Suppress warnings to keep the notebook output clean (errors are still raised as usual)
import warnings
warnings.filterwarnings('ignore')
In [168]:
import os

import cv2  # OpenCV, used to read and resize the images
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Layers used below through the `layers` module:
# Conv2D for the convolutional layers
# MaxPooling2D for the pooling layers
# Flatten to turn the multi-dimensional feature maps into a 1-D vector
# Dense for the fully connected layers

# Classification metrics for testing and validating the model
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

Load Dataset¶

In [169]:
 
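# NOTE: at this stage the four names hold lists of file names, not image or label arrays.
# X_train / X_test list the Covid folders and y_train / y_test list the Normal folders.
# They are re-initialized as image and label arrays in the preprocessing cell further down.
X_train = os.listdir(r"C:\Users\PC\Desktop\KIbe\sem 2\Unstructured data analytics & apps\jupkibe\data\Covid19-dataset\train\Covid")
y_train = os.listdir(r"C:\Users\PC\Desktop\KIbe\sem 2\Unstructured data analytics & apps\jupkibe\data\Covid19-dataset\train\Normal")

X_test = os.listdir(r"C:\Users\PC\Desktop\KIbe\sem 2\Unstructured data analytics & apps\jupkibe\data\Covid19-dataset\test\Covid")
y_test = os.listdir(r"C:\Users\PC\Desktop\KIbe\sem 2\Unstructured data analytics & apps\jupkibe\data\Covid19-dataset\test\Normal")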
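
Before any pixels are loaded, it can help to confirm how many image files each class folder holds. The sketch below is a hypothetical helper, not part of the original notebook: the name `count_images` and the extension list are assumptions, and it expects a split directory laid out like `train/Covid`, `train/Normal`.

```python
import os

# Assumed set of image extensions; adjust if the dataset contains other formats.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")

def count_images(split_dir):
    """Return {class_name: number_of_image_files} for each sub-folder of split_dir."""
    counts = {}
    for class_name in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, class_name)
        if not os.path.isdir(class_dir):
            continue  # skip stray files that sit next to the class folders
        counts[class_name] = sum(
            1 for name in os.listdir(class_dir)
            if name.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts
```

Calling it on the notebook's `train` and `test` folders would give one count per class folder, which makes class imbalance visible before training.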

View the files in the directories of interest¶

In [170]:
print("Files in the 'Covid' directory:")
for file in X_train:
    print(file)
Files in the 'Covid' directory:
01.jpeg
010.png
012.jpeg
015.jpg
019.png
02.jpeg
020.jpg
021.jpeg
022.jpeg
024.jpeg
025.jpeg
026.jpeg
027.jpeg
03.jpeg
031.jpeg
032.jpeg
033.png
039.jpeg
04.png
040.jpeg
041.jpeg
042.jpeg
043.jpeg
044.jpeg
045.jpeg
046.jpeg
047.jpeg
048.jpeg
049.jpeg
050.jpeg
051.jpeg
052.jpeg
053.jpeg
054.jpeg
055.jpeg
056.jpg
057.jpeg
058.jpeg
059.jpeg
06.jpeg
060.jpeg
061.jpg
062.jpeg
064.jpg
065.jpeg
067.jpg
068.jpg
069.jpg
07.jpg
071.jpg
072.jpeg
073.jpg
074.jpg
076.jpg
078.jpeg
079.jpeg
08.jpeg
080.jpg
081.jpeg
082.jpg
083.jpeg
084.jpeg
085.jpeg
086.jpg
088.jpeg
089.jpg
09.png
090.jpeg
091.jpg
092.png
COVID-00001.jpg
COVID-00002.jpg
COVID-00003a.jpg
COVID-00003b.jpg
COVID-00004.jpg
COVID-00005.jpg
COVID-00006.jpg
COVID-00007.jpg
COVID-00008.jpg
COVID-00009.jpg
COVID-00010.jpg
COVID-00011.jpg
COVID-00012.jpg
COVID-00013a.jpg
COVID-00013b.jpg
COVID-00014.jpg
COVID-00015a.png
COVID-00015b.png
COVID-00016.jpg
COVID-00017.jpg
COVID-00018.jpg
COVID-00019.jpg
COVID-00020.jpg
COVID-00021.jpg
COVID-00022.jpg
COVID-00023.jpg
COVID-00024.jpg
COVID-00025.jpg
COVID-00026.jpg
COVID-00027.jpg
COVID-00028.jpg
COVID-00029.jpg
COVID-00030.jpg
COVID-00031.jpg
COVID-00032.jpg
COVID-00033.jpg
COVID-00034.jpg
COVID-00035.jpg
COVID-00036.jpg
COVID-00037.jpg
COVID-00038.jpg
In [171]:
print("Files in the 'Normal' directory:")
for file in y_train:
    print(file)
Files in the 'Normal' directory:
01.jpeg
010.jpeg
011.jpeg
012.jpeg
013.jpeg
014.jpeg
015.jpeg
016.jpeg
017.jpeg
018.jpeg
019.jpeg
02.jpeg
020.jpeg
021.jpeg
022.jpeg
023.jpeg
024.jpeg
025.jpeg
03.jpeg
04.jpeg
05.jpeg
050.jpeg
051.jpeg
052.jpeg
053.jpeg
054.jpeg
055.jpeg
056.jpeg
057.jpeg
058.jpeg
059.jpeg
06.jpeg
060.jpeg
061.jpeg
062.jpeg
063.jpeg
064.jpeg
065.jpeg
066.jpeg
067.jpeg
068.jpeg
069.jpeg
07.jpeg
070.jpeg
071.jpeg
072.jpeg
073.jpeg
074.jpeg
075.jpeg
076.jpeg
077.jpeg
079.jpeg
08.jpeg
080.jpeg
081.jpeg
082.jpeg
083.jpeg
084.jpeg
085.jpeg
086.jpeg
087.jpeg
088.jpeg
09.jpeg
091.jpeg
092.jpeg
093.jpeg
094.jpeg
095.jpeg
096.jpeg
097.jpeg
In [172]:
print("Files in the 'Covid' directory:")
for file in X_test:
    print(file)
Files in the 'Covid' directory:
0100.jpeg
0102.jpeg
0105.png
0106.jpeg
0108.jpeg
0111.jpg
0112.jpg
0113.jpg
0115.jpeg
0118.jpeg
0119.jpeg
0120.jpg
094.png
096.png
098.jpeg
auntminnie-2020_01_31_20_24_2322_2020_01_31_x-ray_coronavirus_US.jpg
auntminnie-a-2020_01_28_23_51_6665_2020_01_28_Vietnam_coronavirus.jpeg
auntminnie-b-2020_01_28_23_51_6665_2020_01_28_Vietnam_coronavirus.jpeg
auntminnie-c-2020_01_28_23_51_6665_2020_01_28_Vietnam_coronavirus.jpeg
auntminnie-d-2020_01_28_23_51_6665_2020_01_28_Vietnam_coronavirus.jpeg
COVID-00003b.jpg
COVID-00012.jpg
COVID-00022.jpg
COVID-00033.jpg
COVID-00037.jpg
radiopaedia-2019-novel-coronavirus-infected-pneumonia.jpg
In [173]:
print("Files in the 'Normal' directory:")
for file in y_test:
    print(file)
Files in the 'Normal' directory:
0101.jpeg
0102.jpeg
0103.jpeg
0105.jpeg
0106.jpeg
0107.jpeg
0108.jpeg
0109.jpeg
0110.jpeg
0111.jpeg
0112.jpeg
0114.jpeg
0115.jpeg
0116.jpeg
0117.jpeg
0118.jpeg
0119.jpeg
0120.jpeg
0121.jpeg
0122.jpeg
In [174]:
X_test[0]
Out[174]:
'0100.jpeg'
In [175]:
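# Display the first test image (X_test[0] is only a file name, so read the file before plotting)
#covid_test_dir = r"C:\Users\PC\Desktop\KIbe\sem 2\Unstructured data analytics & apps\jupkibe\data\Covid19-dataset\test\Covid"
#img = cv2.imread(os.path.join(covid_test_dir, X_test[0]))
#plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))  # OpenCV loads BGR, matplotlib expects RGB
#plt.title('CNN Image')
#plt.show()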

Load and preprocess the data:¶

  • We will need to load the image data, resize it to a consistent size, and normalize the pixel values.
In [177]:
# Common size that every image is resized to
IMG_SIZE = (224, 224)

# One-hot labels: index 0 = COVID, index 1 = Normal
COVID_LABEL = [1, 0]
NORMAL_LABEL = [0, 1]

# Raw string literals (the 'r' prefix) keep the backslashes in Windows paths intact
train_covid_dir = r"C:\Users\PC\Desktop\KIbe\sem 2\Unstructured data analytics & apps\jupkibe\data\Covid19-dataset\train\Covid"
train_normal_dir = r"C:\Users\PC\Desktop\KIbe\sem 2\Unstructured data analytics & apps\jupkibe\data\Covid19-dataset\train\Normal"
test_covid_dir = r"C:\Users\PC\Desktop\KIbe\sem 2\Unstructured data analytics & apps\jupkibe\data\Covid19-dataset\test\Covid"
test_normal_dir = r"C:\Users\PC\Desktop\KIbe\sem 2\Unstructured data analytics & apps\jupkibe\data\Covid19-dataset\test\Normal"


def load_images(directory, label):
    """Read, resize and normalize every image in a directory; pair each with the given label."""
    images, labels = [], []
    for filename in os.listdir(directory):
        img = cv2.imread(os.path.join(directory, filename))
        if img is None:  # cv2.imread returns None for files it cannot decode
            continue
        img = cv2.resize(img, IMG_SIZE)  # Resize to a common size
        images.append(img / 255.0)  # Normalize pixel values to [0, 1]
        labels.append(label)
    return images, labels


# Load and preprocess the training data (Covid images first, then Normal)
covid_images, covid_labels = load_images(train_covid_dir, COVID_LABEL)
normal_images, normal_labels = load_images(train_normal_dir, NORMAL_LABEL)
X_train = np.array(covid_images + normal_images)
y_train = np.array(covid_labels + normal_labels)

# Load and preprocess the test data in the same way
covid_images, covid_labels = load_images(test_covid_dir, COVID_LABEL)
normal_images, normal_labels = load_images(test_normal_dir, NORMAL_LABEL)
X_test = np.array(covid_images + normal_images)
y_test = np.array(covid_labels + normal_labels)
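
Once the lists are converted to arrays, a quick sanity check of shape, pixel range and class counts can catch loading mistakes early. The sketch below uses a hypothetical helper, `describe_split`, on a tiny synthetic array that stands in for the real data; neither is part of the original notebook.

```python
import numpy as np

def describe_split(X, y):
    """Summarize an image array X (N, H, W, C) and its one-hot labels y (N, num_classes)."""
    return {
        "shape": X.shape,
        "pixel_min": float(X.min()),
        "pixel_max": float(X.max()),
        # Summing one-hot rows column-wise gives the number of samples per class.
        "class_counts": y.sum(axis=0).astype(int).tolist(),
    }

# Synthetic stand-in for the real data: 3 random "images" scaled to [0, 1].
rng = np.random.default_rng(0)
X_demo = rng.integers(0, 256, size=(3, 224, 224, 3)) / 255.0
y_demo = np.array([[1, 0], [1, 0], [0, 1]])
summary = describe_split(X_demo, y_demo)
print(summary)
```

On the notebook's arrays, `describe_split(X_train, y_train)` should report class counts of 111 and 70 and `describe_split(X_test, y_test)` counts of 26 and 20, provided every file in the listings above loads.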

Split the data into training and validation sets:¶

In [178]:
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=42)
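
With so few images, a purely random split can leave the validation set with a different class ratio than the training set. One option is the `stratify` argument of `train_test_split`. The sketch below demonstrates it on synthetic data; the array sizes and the 60/40 class ratio are made up for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 100 tiny "images", 60 of class 0 and 40 of class 1 (one-hot labels).
X_demo = np.zeros((100, 8, 8, 3))
class_ids = np.array([0] * 60 + [1] * 40)
y_demo = np.eye(2, dtype=int)[class_ids]

# Passing one class id per sample (argmax of the one-hot rows) keeps the
# class ratio the same in the training and validation parts.
X_tr, X_va, y_tr, y_va = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo.argmax(axis=1)
)

print("train class counts:", y_tr.sum(axis=0))
print("val class counts:  ", y_va.sum(axis=0))
```

Applied to the notebook's data, the same idea would be `stratify=y_train.argmax(axis=1)` in the existing call.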

Define the CNN model:¶

  • We can now define a simple CNN architecture using TensorFlow/Keras.
In [179]:
model = keras.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(224, 224, 3)),
    layers.MaxPooling2D(2, 2),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D(2, 2),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(2, activation='softmax')  # 2 output classes (COVID and Normal)
])
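
To see where the model's capacity sits, the feature-map sizes and parameter counts of the layers above can be worked out by hand. The sketch below does that arithmetic in plain Python, assuming the Keras defaults of no padding and stride 1 for `Conv2D`; `conv_out` is a hypothetical helper name.

```python
def conv_out(size, kernel, stride=1):
    """Output width/height of a 'valid' (no padding) convolution or pooling layer."""
    return (size - kernel) // stride + 1

size, channels = 224, 3
params = {}

# Conv2D(32, 3x3): kernel_h * kernel_w * in_channels * filters weights, plus one bias per filter
params["conv1"] = 3 * 3 * channels * 32 + 32
size, channels = conv_out(size, 3), 32      # 222 x 222 x 32
size = conv_out(size, 2, stride=2)          # MaxPooling2D(2, 2) -> 111 x 111 x 32

# Conv2D(64, 3x3)
params["conv2"] = 3 * 3 * channels * 64 + 64
size, channels = conv_out(size, 3), 64      # 109 x 109 x 64
size = conv_out(size, 2, stride=2)          # MaxPooling2D(2, 2) -> 54 x 54 x 64

flat = size * size * channels               # Flatten -> 186624 features
params["dense1"] = flat * 128 + 128         # Dense(128)
params["dense2"] = 128 * 2 + 2              # Dense(2)

total = sum(params.values())
print(flat, params, total)
```

Nearly all of the roughly 23.9 million parameters belong to the first `Dense` layer, which is a lot for a dataset of under 200 training images. Calling `model.summary()` is a way to check these figures against Keras.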

Compile the model:¶

In [180]:
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
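
The `categorical_crossentropy` loss pairs with one-hot labels and softmax outputs: for each sample it is the negative log of the probability assigned to the true class. The NumPy sketch below illustrates the formula on two made-up predictions. It is a simplified illustration, and the clipping constant `eps` is an assumption rather than Keras's exact implementation.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_pred)) for one-hot y_true."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

y_true = np.array([[1, 0], [0, 1]])          # one COVID sample, one Normal sample
y_pred = np.array([[0.9, 0.1], [0.2, 0.8]])  # made-up softmax outputs
loss = categorical_crossentropy(y_true, y_pred)
print(loss)  # mean of -ln(0.9) and -ln(0.8)
```

Confident correct predictions give a loss near zero, while a confident wrong prediction is penalized heavily.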
In [181]:
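# Data augmentation: most useful for small datasets, and ours is small (fewer than 200 training images)
datagen = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    zoom_range=0.2,
)

# fit() is only required for featurewise_center, featurewise_std_normalization or zca_whitening;
# none of them is enabled here, so this call is optional.
datagen.fit(X_train)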
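
To make one of these transforms concrete, the sketch below shows in plain NumPy what a horizontal flip does to an image array. It only illustrates the idea: the generator applies its transforms randomly and combines them, and `horizontal_flip` here is a hypothetical helper, not the Keras implementation.

```python
import numpy as np

def horizontal_flip(img):
    """Mirror an (H, W, C) image left to right by reversing the width axis."""
    return img[:, ::-1, :]

# 2 x 3 single-channel toy image so the effect is easy to read.
img = np.arange(6).reshape(2, 3, 1)
flipped = horizontal_flip(img)
print(img[..., 0])
print(flipped[..., 0])
```

The label of an augmented image stays the same as that of the original, which is why the generator can produce extra training variety without new annotations.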

Train the model:¶

In [182]:
history = model.fit(
    datagen.flow(X_train, y_train, batch_size=32),
    validation_data=(X_val, y_val),
    epochs=10,  # You can adjust the number of epochs
    verbose=2
)
Epoch 1/10
5/5 - 14s - loss: 5.7209 - accuracy: 0.5417 - val_loss: 1.5176 - val_accuracy: 0.5946 - 14s/epoch - 3s/step
Epoch 2/10
5/5 - 12s - loss: 0.7059 - accuracy: 0.5556 - val_loss: 0.7327 - val_accuracy: 0.4054 - 12s/epoch - 2s/step
Epoch 3/10
5/5 - 12s - loss: 0.6611 - accuracy: 0.6181 - val_loss: 0.5744 - val_accuracy: 0.6486 - 12s/epoch - 2s/step
Epoch 4/10
5/5 - 12s - loss: 0.5520 - accuracy: 0.7639 - val_loss: 0.5294 - val_accuracy: 0.6757 - 12s/epoch - 2s/step
Epoch 5/10
5/5 - 12s - loss: 0.4662 - accuracy: 0.8056 - val_loss: 0.4746 - val_accuracy: 0.7297 - 12s/epoch - 2s/step
Epoch 6/10
5/5 - 12s - loss: 0.4056 - accuracy: 0.7917 - val_loss: 0.3087 - val_accuracy: 0.9189 - 12s/epoch - 2s/step
Epoch 7/10
5/5 - 13s - loss: 0.3191 - accuracy: 0.9028 - val_loss: 0.2549 - val_accuracy: 0.9189 - 13s/epoch - 3s/step
Epoch 8/10
5/5 - 13s - loss: 0.2706 - accuracy: 0.8750 - val_loss: 0.2414 - val_accuracy: 0.9189 - 13s/epoch - 3s/step
Epoch 9/10
5/5 - 13s - loss: 0.2591 - accuracy: 0.9028 - val_loss: 0.1814 - val_accuracy: 0.9189 - 13s/epoch - 3s/step
Epoch 10/10
5/5 - 13s - loss: 0.2573 - accuracy: 0.8889 - val_loss: 0.2145 - val_accuracy: 0.9189 - 13s/epoch - 3s/step
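
The log shows validation loss falling for most of the run and rising slightly in the last epoch. A small sketch for locating the best epoch, using the `val_loss` values copied from the log above:

```python
# Validation losses copied from the training log (epochs 1 to 10).
val_loss = [1.5176, 0.7327, 0.5744, 0.5294, 0.4746, 0.3087, 0.2549, 0.2414, 0.1814, 0.2145]

# Epoch numbers are 1-based, list indices are 0-based.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
print(f"lowest val_loss {val_loss[best_epoch - 1]} at epoch {best_epoch}")
```

In the notebook the same list is available as `history.history["val_loss"]`, so the values need not be typed by hand.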

Model Evaluation on the test set:¶

In [185]:
test_loss, test_accuracy = model.evaluate(X_test, y_test)
print(f'Test accuracy: {test_accuracy}')
2/2 [==============================] - 1s 380ms/step - loss: 0.0853 - accuracy: 1.0000
Test accuracy: 1.0
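
Accuracy is a single number; on a test set of only 46 images it is worth also looking at which classes are confused with which. The sketch below builds a confusion matrix in NumPy from one-hot labels and predicted probabilities. The helper name and the example values are made up for illustration.

```python
import numpy as np

def confusion_matrix_from_onehot(y_true, y_prob):
    """Rows are true classes, columns are predicted classes."""
    true_ids = y_true.argmax(axis=1)
    pred_ids = y_prob.argmax(axis=1)
    num_classes = y_true.shape[1]
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(true_ids, pred_ids):
        matrix[t, p] += 1
    return matrix

# Made-up example: 3 COVID and 2 Normal samples, one COVID sample misclassified.
y_true = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1]])
y_prob = np.array([[0.9, 0.1], [0.6, 0.4], [0.3, 0.7], [0.2, 0.8], [0.1, 0.9]])
cm = confusion_matrix_from_onehot(y_true, y_prob)
print(cm)
```

On the notebook's data the call would be `confusion_matrix_from_onehot(y_test, model.predict(X_test))`. With the accuracy of 1.0 reported above, all counts would sit on the diagonal.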
In [204]:
# Make predictions on the test data
predictions = model.predict(X_test)
predicted_classes = predictions.argmax(axis=1)

# The test set holds 46 images (26 Covid, 20 Normal); check what the model predicts for one of them.
# Class indices follow the one-hot labels: index 0 = COVID ([1, 0]), index 1 = Normal ([0, 1]).
# Prediction for a specific image (index 38 falls in the Normal part of the test set)
print("Predicted class (0: COVID, 1: Normal):", predicted_classes[38])
2/2 [==============================] - 1s 319ms/step
Predicted class (0: COVID, 1: Normal): 1
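
A bare class index is easy to misread, so it can help to map indices back to class names and report the model's confidence alongside. The sketch below assumes the label order used when loading the data; `CLASS_NAMES`, `decode_predictions` and the example probabilities are hypothetical.

```python
import numpy as np

# Index order follows the one-hot labels used when loading the data:
# [1, 0] -> index 0 = COVID, [0, 1] -> index 1 = Normal.
CLASS_NAMES = ["COVID", "Normal"]

def decode_predictions(probabilities):
    """Return (class_name, confidence) for each row of softmax outputs."""
    ids = probabilities.argmax(axis=1)
    return [(CLASS_NAMES[i], float(probabilities[n, i])) for n, i in enumerate(ids)]

# Made-up softmax outputs for two images.
demo = np.array([[0.8, 0.2], [0.05, 0.95]])
decoded = decode_predictions(demo)
print(decoded)
```

In the notebook, `decode_predictions(predictions)[38]` would give the class name and confidence for the image inspected above.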