Integrative analysis: Neural Network (NN) classification

In this example, we showcase how RamanSPy integrates with external machine learning tools by building a Neural Network (NN) model that identifies different bacteria species.

To build the model, we will use the TensorFlow Python framework, but similar integrative analyses are possible with the rest of the Python machine learning and deep learning ecosystem.

The data we will use is the Bacteria data, which is integrated into RamanSPy.

import tensorflow as tf
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.utils import shuffle
import seaborn as sns
import matplotlib.pyplot as plt

import ramanspy

First, we will use RamanSPy to load the validation ("val") and testing ("test") splits of the bacteria dataset. The validation split serves as our training set in this example.

dir_ = r"../../../../data/bacteria_data"

X_train, y_train = ramanspy.datasets.bacteria("val", folder=dir_)
X_test, y_test = ramanspy.datasets.bacteria("test", folder=dir_)

Shuffling the dataset we will use to train the model.

X_train, y_train = shuffle(X_train.flat.spectral_data, y_train)
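The point of shuffling both arrays in one call is that the spectra and their labels must be reordered with the same permutation. A minimal NumPy sketch of that idea follows; `shuffle_together` and the toy arrays are hypothetical names used only for illustration and are not part of RamanSPy or scikit-learn.

```python
import numpy as np


def shuffle_together(X, y, seed=0):
    """Apply one random permutation to both arrays so rows stay paired."""
    X = np.asarray(X)
    y = np.asarray(y)
    if X.shape[0] != y.shape[0]:
        raise ValueError("X and y must have the same number of samples")
    rng = np.random.default_rng(seed)
    order = rng.permutation(X.shape[0])  # a single index order shared by X and y
    return X[order], y[order]


# Toy data: every "spectrum" is filled with its own label value,
# so a broken pairing would be easy to spot.
y_toy = np.arange(6)
X_toy = np.repeat(y_toy[:, None], 4, axis=1).astype(float)

X_shuf, y_shuf = shuffle_together(X_toy, y_toy, seed=42)
```

After shuffling, each row of `X_shuf` still carries the label stored at the same position in `y_shuf`.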

Then, we construct the NN model, which here consists of a single fully connected layer with softmax activation.

class NN(tf.keras.Model):
    def __init__(self, input_dim, output_dim):
        super().__init__()

        self.nn = tf.keras.models.Sequential()
        self.nn.add(tf.keras.Input(shape=(input_dim,)))
        self.nn.add(tf.keras.layers.Dense(output_dim, activation='softmax'))

    def call(self, x):
        return self.nn(x)
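To make the computation of this model concrete, the following NumPy sketch reproduces the forward pass of a single dense layer with softmax activation, i.e. `softmax(x @ W + b)`. The shapes and random weights are assumptions chosen for illustration; they are not the trained parameters of the model above.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def dense_softmax(x, W, b):
    """Forward pass of one fully connected layer followed by softmax."""
    return softmax(x @ W + b)


# Hypothetical sizes: 5 spectra, 8 spectral bins, 3 classes
rng = np.random.default_rng(0)
n_samples, n_features, n_classes = 5, 8, 3
x = rng.normal(size=(n_samples, n_features))
W = rng.normal(scale=0.1, size=(n_features, n_classes))
b = np.zeros(n_classes)

probs = dense_softmax(x, W, b)  # one probability vector per spectrum
```

Each row of `probs` is non-negative and sums to one, so it can be read as a distribution over classes. A model of this form is equivalent to multinomial logistic regression.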

Initialising the model instance

learning_rate = 0.001
batch_size = 32
epochs = 15
input_dim = X_train.shape[-1]
output_dim = len(np.unique(y_train))

opt = tf.keras.optimizers.Adam(learning_rate=learning_rate)

model = NN(input_dim, output_dim)
model.compile(opt, loss='sparse_categorical_crossentropy', metrics=['accuracy'])

WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
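The `sparse_categorical_crossentropy` loss takes integer class labels and penalises the negative log-probability assigned to the true class. The plain-Python sketch below illustrates that definition under simplifying assumptions: it averages over samples and omits the numerical clipping a framework would apply. The probabilities and labels are made up for illustration.

```python
import math


def sparse_categorical_crossentropy(probs, labels):
    """Mean of -log(probability of the true class) over all samples."""
    losses = [-math.log(p[label]) for p, label in zip(probs, labels)]
    return sum(losses) / len(losses)


n_classes = 4  # hypothetical number of classes
labels = [0, 2, 3]

# A model that guesses uniformly scores log(n_classes)
uniform = [[1.0 / n_classes] * n_classes for _ in labels]
baseline = sparse_categorical_crossentropy(uniform, labels)

# A model that puts most of its probability on the true class scores much lower
confident = [
    [0.97, 0.01, 0.01, 0.01],
    [0.01, 0.01, 0.97, 0.01],
    [0.01, 0.01, 0.01, 0.97],
]
low = sparse_categorical_crossentropy(confident, labels)
```

The uniform-guess value, the natural logarithm of the number of classes, is a useful reference point when reading the loss reported in early training epochs.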

Training the NN model on the training dataset.

history = model.fit(X_train, y_train, epochs=epochs, batch_size=batch_size, verbose=1)

Epoch 1/15
94/94 [==============================] - 0s 424us/step - loss: 3.2961 - accuracy: 0.1080
Epoch 2/15
94/94 [==============================] - 0s 374us/step - loss: 2.8502 - accuracy: 0.2710
Epoch 3/15
94/94 [==============================] - 0s 370us/step - loss: 2.5477 - accuracy: 0.3453
Epoch 4/15
94/94 [==============================] - 0s 383us/step - loss: 2.3145 - accuracy: 0.4487
Epoch 5/15
94/94 [==============================] - 0s 377us/step - loss: 2.1487 - accuracy: 0.4747
Epoch 6/15
94/94 [==============================] - 0s 354us/step - loss: 1.9886 - accuracy: 0.5560
Epoch 7/15
94/94 [==============================] - 0s 355us/step - loss: 1.8573 - accuracy: 0.6170
Epoch 8/15
94/94 [==============================] - 0s 397us/step - loss: 1.7286 - accuracy: 0.6883
Epoch 9/15
94/94 [==============================] - 0s 384us/step - loss: 1.6281 - accuracy: 0.7130
Epoch 10/15
94/94 [==============================] - 0s 392us/step - loss: 1.5368 - accuracy: 0.7430
Epoch 11/15
94/94 [==============================] - 0s 373us/step - loss: 1.4515 - accuracy: 0.7687
Epoch 12/15
94/94 [==============================] - 0s 370us/step - loss: 1.3789 - accuracy: 0.7897
Epoch 13/15
94/94 [==============================] - 0s 365us/step - loss: 1.2948 - accuracy: 0.8387
Epoch 14/15
94/94 [==============================] - 0s 387us/step - loss: 1.2396 - accuracy: 0.8293
Epoch 15/15
94/94 [==============================] - 0s 356us/step - loss: 1.1803 - accuracy: 0.8403

Testing the trained model on the unseen testing dataset.

y_pred = model.predict(X_test.flat.spectral_data)
y_pred = np.argmax(y_pred, axis=1)

# accuracy_score expects the true labels first, then the predictions
print(f"The accuracy of the NN model is: {accuracy_score(y_test, y_pred)}")

94/94 [==============================] - 0s 240us/step
The accuracy of the NN model is: 0.675
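The prediction step above reduces each row of class probabilities to the index of its largest entry, and accuracy is then the fraction of samples where that index matches the true label. A dependency-free sketch of both steps is given below; the probability rows and labels are invented for illustration and do not come from the bacteria data.

```python
def argmax(row):
    """Index of the largest value in a sequence (first one on ties)."""
    return max(range(len(row)), key=row.__getitem__)


def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


# Hypothetical model outputs for four samples and three classes
toy_probs = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
toy_true = [1, 0, 1, 1]

toy_pred = [argmax(row) for row in toy_probs]
toy_acc = accuracy(toy_true, toy_pred)
```

Here three of the four toy predictions match their labels, so `toy_acc` is 0.75.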

Confusion matrix:

cf_matrix = confusion_matrix(y_test, y_pred)
sns.heatmap(cf_matrix, annot=True)
plt.show()
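Raw counts in a confusion matrix can be hard to compare when classes have different numbers of test samples. One common option, sketched here with NumPy, is to divide each row by its total so that the diagonal shows per-class recall. The helper names and toy labels are hypothetical; the sketch follows the same convention as `confusion_matrix`, with rows for true labels and columns for predictions.

```python
import numpy as np


def confusion_counts(y_true, y_pred, n_classes):
    """Count matrix with true labels on rows and predicted labels on columns."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)  # add 1 for every (true, predicted) pair
    return cm


def row_normalise(cm):
    """Divide each row by its total; rows without samples stay at zero."""
    totals = cm.sum(axis=1, keepdims=True)
    out = np.zeros(cm.shape, dtype=float)
    return np.divide(cm, totals, out=out, where=totals > 0)


# Toy labels for three classes
toy_true = np.array([0, 0, 1, 1, 1, 2])
toy_pred = np.array([0, 1, 1, 1, 0, 2])

toy_cm = confusion_counts(toy_true, toy_pred, n_classes=3)
toy_cm_norm = row_normalise(toy_cm)  # diagonal entries are per-class recall
```

The normalised matrix can be passed to `sns.heatmap` in the same way as the raw counts.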

Accuracy profile:

plt.plot(history.history['accuracy'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['train'], loc='upper left')  # only the training curve is plotted
plt.show()

Loss profile:

plt.plot(history.history['loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['train'], loc='upper left')  # only the training curve is plotted
plt.show()
