Integrative analysis: Neural Network (NN) classification

In this example, we will showcase how RamanSPy integrates with external machine learning tools by building a Neural Network (NN) model that identifies different bacteria species.

To build the model, we will use the TensorFlow Python framework, but similar integrative analyses are possible with the rest of the Python machine learning and deep learning ecosystem.

The data we will use is the Bacteria dataset, which can be loaded directly through RamanSPy.

import tensorflow as tf
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.utils import shuffle
import seaborn as sns
import matplotlib.pyplot as plt

import ramanspy

First, we will use RamanSPy to load the validation and testing splits of the bacteria dataset. In this example, the validation split serves as our training data.

dir_ = r"../../../../data/bacteria_data"

X_train, y_train = ramanspy.datasets.bacteria("val", folder=dir_)
X_test, y_test = ramanspy.datasets.bacteria("test", folder=dir_)

Next, we shuffle the spectra and labels we will use to train the model.

X_train, y_train = shuffle(X_train.flat.spectral_data, y_train)
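The key property of the shuffling step is that spectra and labels are permuted together, so every spectrum keeps its own label. A minimal NumPy sketch of the same idea, using small made-up arrays (the names `X`, `y` and `perm` are illustrative and not part of the example above):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 5 "spectra" with 2 points each; row i starts with the value 2*i.
X = np.arange(10).reshape(5, 2)
y = np.array([0, 1, 2, 3, 4])

# Draw one permutation and apply it to both arrays, so pairs stay aligned.
perm = rng.permutation(len(y))
X_shuffled, y_shuffled = X[perm], y[perm]

# Each shuffled row still matches its original label.
print(np.array_equal(X_shuffled[:, 0] // 2, y_shuffled))
```
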

Then, we construct the NN model, which consists of a single dense layer with a softmax activation.

class NN(tf.keras.Model):
    def __init__(self, input_dim, output_dim):
        super().__init__()

        self.nn = tf.keras.models.Sequential()
        self.nn.add(tf.keras.Input(shape=(input_dim,)))
        self.nn.add(tf.keras.layers.Dense(output_dim, activation='softmax'))

    def call(self, x):
        return self.nn(x)
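To make explicit what this model computes, here is a minimal NumPy sketch of the forward pass of one dense layer followed by softmax, i.e. softmax(xW + b). The array sizes and random weights are hypothetical and chosen only for illustration; they are not the dimensions of the bacteria data or the trained weights.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax; subtracting the row maximum avoids overflow."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def dense_softmax(x, W, b):
    """Forward pass of a single dense layer with softmax activation."""
    return softmax(x @ W + b)

# Hypothetical sizes, for illustration only.
rng = np.random.default_rng(0)
n_samples, n_features, n_classes = 4, 10, 3
x = rng.normal(size=(n_samples, n_features))
W = rng.normal(size=(n_features, n_classes))
b = np.zeros(n_classes)

probs = dense_softmax(x, W, b)
print(probs.shape)  # one row of class probabilities per sample
```

Each row of `probs` is a probability distribution over the classes, which is why the predicted class can later be read off with an argmax.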

Initialising the model instance and compiling it with the Adam optimiser.

learning_rate = 0.001
batch_size = 32
epochs = 15
input_dim = X_train.shape[-1]
output_dim = len(np.unique(y_train))

opt = tf.keras.optimizers.Adam(learning_rate=learning_rate)

model = NN(input_dim, output_dim)
model.compile(opt, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
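The `sparse_categorical_crossentropy` loss takes integer class labels and, for each sample, penalises the negative log of the probability assigned to the true class. A small NumPy sketch of that definition, with made-up probabilities (the function below is an illustrative reimplementation, not the Keras code):

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, probs, eps=1e-7):
    """Mean negative log-probability of the true class (integer labels)."""
    probs = np.clip(probs, eps, 1.0)                # guard against log(0)
    picked = probs[np.arange(len(y_true)), y_true]  # probability of the true class
    return float(np.mean(-np.log(picked)))

# Made-up predictions for two samples over three classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
y_true = np.array([0, 1])

loss = sparse_categorical_crossentropy(y_true, probs)
print(round(loss, 4))  # 0.2899, i.e. (-ln 0.7 - ln 0.8) / 2
```
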

Training the NN model on the training dataset.

history = model.fit(X_train, y_train, epochs=epochs, batch_size=batch_size, verbose=1)
Epoch 1/15
94/94 [==============================] - 0s 424us/step - loss: 3.2961 - accuracy: 0.1080
Epoch 2/15
94/94 [==============================] - 0s 374us/step - loss: 2.8502 - accuracy: 0.2710
Epoch 3/15
94/94 [==============================] - 0s 370us/step - loss: 2.5477 - accuracy: 0.3453
Epoch 4/15
94/94 [==============================] - 0s 383us/step - loss: 2.3145 - accuracy: 0.4487
Epoch 5/15
94/94 [==============================] - 0s 377us/step - loss: 2.1487 - accuracy: 0.4747
Epoch 6/15
94/94 [==============================] - 0s 354us/step - loss: 1.9886 - accuracy: 0.5560
Epoch 7/15
94/94 [==============================] - 0s 355us/step - loss: 1.8573 - accuracy: 0.6170
Epoch 8/15
94/94 [==============================] - 0s 397us/step - loss: 1.7286 - accuracy: 0.6883
Epoch 9/15
94/94 [==============================] - 0s 384us/step - loss: 1.6281 - accuracy: 0.7130
Epoch 10/15
94/94 [==============================] - 0s 392us/step - loss: 1.5368 - accuracy: 0.7430
Epoch 11/15
94/94 [==============================] - 0s 373us/step - loss: 1.4515 - accuracy: 0.7687
Epoch 12/15
94/94 [==============================] - 0s 370us/step - loss: 1.3789 - accuracy: 0.7897
Epoch 13/15
94/94 [==============================] - 0s 365us/step - loss: 1.2948 - accuracy: 0.8387
Epoch 14/15
94/94 [==============================] - 0s 387us/step - loss: 1.2396 - accuracy: 0.8293
Epoch 15/15
94/94 [==============================] - 0s 356us/step - loss: 1.1803 - accuracy: 0.8403
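The `94/94` counter in the log is the number of mini-batches per epoch, which is the number of training samples divided by the batch size, rounded up. The sketch below assumes 3000 training spectra; that figure is an assumption chosen to be consistent with the 94 steps shown above, not a value stated in this example.

```python
import math

def steps_per_epoch(n_samples, batch_size):
    """Number of mini-batches needed to cover the data once (last batch may be smaller)."""
    return math.ceil(n_samples / batch_size)

# 3000 is an assumed sample count; 32 is the batch size used above.
print(steps_per_epoch(3000, 32))  # 94
```
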

Testing the trained model on the unseen testing dataset.

y_pred = model.predict(X_test.flat.spectral_data)
y_pred = np.argmax(y_pred, axis=1)

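# accuracy_score expects (y_true, y_pred); the value is the same either way, but we follow the convention.
print(f"The accuracy of the NN model is: {accuracy_score(y_test, y_pred)}")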
94/94 [==============================] - 0s 240us/step
The accuracy of the NN model is: 0.675
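The two post-processing steps above (argmax over class probabilities, then comparison with the true labels) can be illustrated on a tiny made-up example. The probabilities and labels below are invented for illustration and are unrelated to the bacteria results.

```python
import numpy as np

# Made-up class probabilities for four samples over three classes.
probs = np.array([[0.1, 0.7, 0.2],
                  [0.6, 0.3, 0.1],
                  [0.2, 0.2, 0.6],
                  [0.5, 0.4, 0.1]])
y_true = np.array([1, 0, 2, 1])

# The predicted label is the index of the largest probability in each row.
y_hat = np.argmax(probs, axis=1)

# Accuracy is the fraction of predictions that match the true labels.
accuracy = float(np.mean(y_hat == y_true))
print(y_hat, accuracy)  # [1 0 2 0] 0.75
```
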

Confusion matrix:

cf_matrix = confusion_matrix(y_test, y_pred)
sns.heatmap(cf_matrix, annot=True)
plt.show()
Figure: confusion matrix heatmap for the NN model
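A confusion matrix of raw counts can be hard to read when classes differ in size, so it can help to also look at per-class recall (the diagonal divided by the row sums). The sketch below builds a small confusion matrix by hand in NumPy, using the same convention as scikit-learn (rows are true labels, columns are predicted labels); the labels are made up for illustration.

```python
import numpy as np

def confusion_counts(y_true, y_pred, n_classes):
    """Count matrix with true labels on rows and predicted labels on columns."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)  # increment cell (true, predicted) per sample
    return cm

# Made-up labels for six samples over three classes.
y_true = np.array([0, 0, 1, 1, 1, 2])
y_pred = np.array([0, 1, 1, 1, 0, 2])

cm = confusion_counts(y_true, y_pred, 3)
recall = cm.diagonal() / cm.sum(axis=1)  # fraction of each true class predicted correctly

print(cm)
print(recall)
```
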

Accuracy profile:

plt.plot(history.history['accuracy'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['train'], loc='upper left')  # only the training curve is plotted
plt.show()
Model accuracy

Loss profile:

plt.plot(history.history['loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['train'], loc='upper left')  # only the training curve is plotted
plt.show()
Model loss
