Integrative analysis: Neural Network (NN) classification
In this example, we will show how RamanSPy integrates with machine learning frameworks by training a Neural Network (NN) model to identify different bacteria species.
To build the model, we will use the TensorFlow framework, but similar integrative analyses are possible with the rest of the Python machine learning and deep learning ecosystem.
The data we will use is the Bacteria data, which is integrated into RamanSPy.
import tensorflow as tf
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.utils import shuffle
import seaborn as sns
import matplotlib.pyplot as plt
import ramanspy
First, we will use RamanSPy to load the validation and testing splits of the bacteria dataset. In this example, the validation split serves as the training data.
dir_ = r"../../../../data/bacteria_data"
X_train, y_train = ramanspy.datasets.bacteria("val", folder=dir_)
X_test, y_test = ramanspy.datasets.bacteria("test", folder=dir_)
Shuffling the dataset we will use to train the model.
X_train, y_train = shuffle(X_train.flat.spectral_data, y_train)
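`sklearn.utils.shuffle` reorders all the arrays it receives with the same random permutation, so each spectrum keeps its label. The sketch below illustrates that idea with plain NumPy on toy arrays; the arrays, sizes and seed are made up for illustration and are not the bacteria data.

```python
import numpy as np

# Toy stand-ins for the spectra (rows) and their labels; not the real data.
# Row i of X is [2*i, 2*i + 1], so the label can be recovered from the row.
X = np.arange(12, dtype=float).reshape(6, 2)
y = np.array([0, 1, 2, 3, 4, 5])

rng = np.random.default_rng(seed=0)
perm = rng.permutation(len(X))  # one shared random ordering of the row indices

# Indexing both arrays with the same permutation keeps rows and labels paired.
X_shuffled, y_shuffled = X[perm], y[perm]

print(np.array_equal(X_shuffled[:, 0] // 2, y_shuffled))  # True
```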
Then, we construct the NN model: a single dense layer with a softmax activation.
class NN(tf.keras.Model):
    def __init__(self, input_dim, output_dim):
        super().__init__()
        self.nn = tf.keras.models.Sequential()
        self.nn.add(tf.keras.Input(shape=(input_dim,)))
        self.nn.add(tf.keras.layers.Dense(output_dim, activation='softmax'))

    def call(self, x):
        return self.nn(x)
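Because the model consists of one dense layer followed by a softmax, its forward pass is equivalent to multinomial logistic regression. The following is a minimal NumPy sketch of that computation, not the Keras implementation; the function name `dense_softmax`, the random weights and the array sizes are all illustrative assumptions.

```python
import numpy as np

def dense_softmax(x, W, b):
    """Forward pass of one dense layer followed by a row-wise softmax."""
    logits = x @ W + b
    # Subtracting the row maximum leaves the softmax unchanged but avoids overflow.
    logits = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n_spectra, n_bands, n_classes = 4, 10, 3  # illustrative sizes only
x = rng.normal(size=(n_spectra, n_bands))
W = rng.normal(size=(n_bands, n_classes))
b = np.zeros(n_classes)

probs = dense_softmax(x, W, b)
print(probs.shape)        # (4, 3)
print(probs.sum(axis=1))  # every row sums to 1 (up to rounding)
```

Each row of `probs` is a probability distribution over the classes, which is the form of output that `model.predict` returns for this architecture.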
Initialising the model instance
learning_rate = 0.001
batch_size = 32
epochs = 15
input_dim = X_train.shape[-1]
output_dim = len(np.unique(y_train))
opt = tf.keras.optimizers.Adam(learning_rate=learning_rate)
model = NN(input_dim, output_dim)
model.compile(opt, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
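The `sparse_categorical_crossentropy` loss named in the compile call takes integer class labels and averages the negative log-probability the model assigns to the correct class. A minimal NumPy sketch of that definition follows; the helper function, the clipping constant `eps` and the toy probabilities are assumptions made for illustration, and Keras may differ in implementation details.

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, probs, eps=1e-7):
    """Mean negative log-probability assigned to the correct class."""
    probs = np.clip(probs, eps, 1.0)                # guard against log(0)
    picked = probs[np.arange(len(y_true)), y_true]  # probability of the true class
    return float(-np.log(picked).mean())

# Two toy predictions over three classes, with integer labels.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
y_true = np.array([0, 1])
loss = sparse_categorical_crossentropy(y_true, probs)

# A uniform guess over k classes gives a loss of ln(k); k is an illustrative value.
k = 10
uniform = np.full((1, k), 1.0 / k)
uniform_loss = sparse_categorical_crossentropy(np.array([0]), uniform)

print(loss, uniform_loss)
```

The uniform case is a useful reference point: a loss near ln(k) in the first epoch suggests the untrained model is close to guessing at random.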
Training the NN model on the training dataset.
history = model.fit(X_train, y_train, epochs=epochs, batch_size=batch_size, verbose=1)
Epoch 1/15
94/94 [==============================] - 0s 424us/step - loss: 3.2961 - accuracy: 0.1080
Epoch 2/15
94/94 [==============================] - 0s 374us/step - loss: 2.8502 - accuracy: 0.2710
Epoch 3/15
94/94 [==============================] - 0s 370us/step - loss: 2.5477 - accuracy: 0.3453
Epoch 4/15
94/94 [==============================] - 0s 383us/step - loss: 2.3145 - accuracy: 0.4487
Epoch 5/15
94/94 [==============================] - 0s 377us/step - loss: 2.1487 - accuracy: 0.4747
Epoch 6/15
94/94 [==============================] - 0s 354us/step - loss: 1.9886 - accuracy: 0.5560
Epoch 7/15
94/94 [==============================] - 0s 355us/step - loss: 1.8573 - accuracy: 0.6170
Epoch 8/15
94/94 [==============================] - 0s 397us/step - loss: 1.7286 - accuracy: 0.6883
Epoch 9/15
94/94 [==============================] - 0s 384us/step - loss: 1.6281 - accuracy: 0.7130
Epoch 10/15
94/94 [==============================] - 0s 392us/step - loss: 1.5368 - accuracy: 0.7430
Epoch 11/15
94/94 [==============================] - 0s 373us/step - loss: 1.4515 - accuracy: 0.7687
Epoch 12/15
94/94 [==============================] - 0s 370us/step - loss: 1.3789 - accuracy: 0.7897
Epoch 13/15
94/94 [==============================] - 0s 365us/step - loss: 1.2948 - accuracy: 0.8387
Epoch 14/15
94/94 [==============================] - 0s 387us/step - loss: 1.2396 - accuracy: 0.8293
Epoch 15/15
94/94 [==============================] - 0s 356us/step - loss: 1.1803 - accuracy: 0.8403
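The `94/94` counter in the training log is the number of mini-batches per epoch, which is the dataset size divided by the batch size, rounded up. The short sketch below works backwards from the logged value to the range of training set sizes consistent with it; it is only arithmetic on the numbers shown in the log and does not query the dataset itself.

```python
import math

batch_size = 32
steps_per_epoch = 94  # as reported in the training log

# Smallest and largest dataset sizes that produce exactly 94 batches of up to 32 samples.
n_min = (steps_per_epoch - 1) * batch_size + 1
n_max = steps_per_epoch * batch_size

# Both bounds round up to the logged number of steps.
assert math.ceil(n_min / batch_size) == steps_per_epoch
assert math.ceil(n_max / batch_size) == steps_per_epoch

print(n_min, n_max)  # 2977 3008
```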
Testing the trained model on the unseen testing dataset.
y_pred = model.predict(X_test.flat.spectral_data)
y_pred = np.argmax(y_pred, axis=1)
print(f"The accuracy of the NN model is: {accuracy_score(y_test, y_pred)}")
94/94 [==============================] - 0s 240us/step
The accuracy of the NN model is: 0.675
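The prediction step turns each row of class probabilities into a label with `argmax` and then measures the fraction of labels that match the ground truth. A minimal NumPy sketch of those two operations on made-up probabilities (not the model's actual output):

```python
import numpy as np

# Toy class probabilities for four samples and three classes.
probs = np.array([[0.1, 0.7, 0.2],
                  [0.6, 0.3, 0.1],
                  [0.2, 0.2, 0.6],
                  [0.5, 0.4, 0.1]])
y_true = np.array([1, 0, 2, 1])

y_hat = np.argmax(probs, axis=1)             # most probable class per row: [1, 0, 2, 0]
accuracy = float(np.mean(y_hat == y_true))   # 3 of the 4 predictions are correct

print(accuracy)  # 0.75
```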
Confusion matrix:
cf_matrix = confusion_matrix(y_test, y_pred)
sns.heatmap(cf_matrix, annot=True)
plt.show()
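In the matrix returned by `confusion_matrix`, rows correspond to true classes and columns to predicted classes, so the diagonal holds the correctly classified counts. The sketch below rebuilds such a matrix with NumPy on toy labels and derives per-class recall from it; the helper `confusion_counts` and the labels are illustrative assumptions.

```python
import numpy as np

def confusion_counts(y_true, y_pred, n_classes):
    """Count matrix with true classes on rows and predicted classes on columns."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)  # accumulate one count per (true, predicted) pair
    return cm

y_true = np.array([0, 0, 1, 1, 2, 2])
y_hat = np.array([0, 1, 1, 1, 2, 0])

cm = confusion_counts(y_true, y_hat, n_classes=3)
recall = cm.diagonal() / cm.sum(axis=1)  # fraction of each true class predicted correctly

print(cm)
print(recall)  # [0.5 1.  0.5]
```

Per-class recall can show which species the classifier confuses most often, which a single overall accuracy figure hides.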

Accuracy profile:
plt.plot(history.history['accuracy'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['train'], loc='upper left')
plt.show()

Loss profile:
plt.plot(history.history['loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['train'], loc='upper left')
plt.show()

Total running time of the script: ( 0 minutes 3.658 seconds)