Preprocessing
RamanSPy offers extensive preprocessing support that alleviates the common burdens of spectral preprocessing and enables the construction and execution of complex preprocessing procedures with minimal software requirements.
Users can access and use the variety of preprocessing techniques and protocols built into RamanSPy, as well as use the package to define custom preprocessing algorithms and pipelines.
All of the preprocessing methods and pipelines are implemented as PreprocessingStep instances, which standardises
their behaviour and allows for their easy integration into various preprocessing pipelines.
These have been designed to be as flexible and data-agnostic as possible, so that they can be applied to any type of Raman spectroscopic data loaded into the framework.
Preprocessing support is provided by the ramanspy.preprocessing module.
Algorithms
The behaviour of preprocessing procedures is defined by the PreprocessingStep class.
- class ramanspy.preprocessing.PreprocessingStep(method: Callable, **kwargs)
A class that defines preprocessing logic.
Encapsulates preprocessing methods that transform the intensity values and spectral axis of Raman data.
To define a preprocessing procedure that can be applied to any Raman spectroscopic data, wrap a predefined preprocessing method using this class, which in turn streamlines any subsequent operations.
- Parameters:
method (Callable) – A Callable object (e.g. a method) which defines how the preprocessing step alters spectral objects. Its __call__ method must have a signature of the form __call__(intensity_data, spectral_axis, *args, **kwargs), where:
- intensity_data is an ndarray of arbitrary shape defining the intensity values to process, whose last axis is the spectral axis;
- spectral_axis is a 1D ndarray defining the spectral axis to process (in Raman wavenumber units, cm⁻¹);
- *args are other positional arguments;
- **kwargs are other keyword arguments.
**kwargs – Any keyword arguments the Callable needs in its __call__ method.
Note
One has to use the PreprocessingStep class only when devising and integrating custom preprocessing methods (check Custom algorithms). All preprocessing methods built into RamanSPy can be directly accessed and used as indicated in Built-in preprocessing methods.
Example
    from ramanspy import preprocessing

    # Define a preprocessing function with the required signature
    def preprocessing_func(intensity_data, spectral_axis, **kwargs):
        # Preprocess intensity_data and spectral_axis
        ...
        return updated_intensity_data, updated_spectral_axis

    # Wrap the function in a PreprocessingStep object together with the relevant **kwargs
    preprocessing_method = preprocessing.PreprocessingStep(preprocessing_func, **kwargs)
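As a concrete illustration of the signature described above, the following is a minimal sketch of a function that could serve as the wrapped method. The name crop_to_range and its lower/upper keyword arguments are hypothetical and chosen only for this example; the sketch uses NumPy alone, so it runs without RamanSPy, and the commented lines at the end show how it would presumably be wrapped.

```python
import numpy as np

def crop_to_range(intensity_data, spectral_axis, *, lower, upper):
    """Keep only the bands whose wavenumber lies within [lower, upper]."""
    mask = (spectral_axis >= lower) & (spectral_axis <= upper)
    # The last axis of intensity_data is the spectral axis, so index it with `...`.
    return intensity_data[..., mask], spectral_axis[mask]

# Synthetic data: a 2 x 3 "image" with 5 spectral bands per pixel
spectral_axis = np.array([400.0, 800.0, 1200.0, 1600.0, 2000.0])
intensity_data = np.arange(2 * 3 * 5, dtype=float).reshape(2, 3, 5)

cropped, cropped_axis = crop_to_range(intensity_data, spectral_axis, lower=700, upper=1700)
print(cropped.shape, cropped_axis)

# With RamanSPy installed, the function would then be wrapped as documented:
# from ramanspy import preprocessing
# cropper = preprocessing.PreprocessingStep(crop_to_range, lower=700, upper=1700)
```

Because the function only touches the last axis, the same code handles single spectra, images and volumes alike.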
- final apply(raman_objects: SpectralContainer | Spectrum | SpectralImage | SpectralVolume | List[SpectralContainer | Spectrum | SpectralImage | SpectralVolume | List[SpectralContainer | Spectrum | SpectralImage | SpectralVolume]]) -> SpectralContainer | Spectrum | SpectralImage | SpectralVolume | List[SpectralContainer | Spectrum | SpectralImage | SpectralVolume | List[SpectralContainer | Spectrum | SpectralImage | SpectralVolume]]
Applies the defined preprocessing method on the Raman spectroscopic objects provided.
The single point-of-contact method of ramanspy.preprocessing.PreprocessingStep instances. The method is applied to each data container instance provided individually.
- Parameters:
raman_objects (Union[SpectralObject, List[Union[SpectralObject, List[SpectralObject]]]]) – The objects to preprocess, where SpectralObject := Union[SpectralContainer, Spectrum, SpectralImage, SpectralVolume].
- Returns:
The preprocessed objects, where SpectralObject := Union[SpectralContainer, Spectrum, SpectralImage, SpectralVolume].
- Return type:
Union[SpectralObject, List[Union[SpectralObject, List[SpectralObject]]]]
Note
When more than one ramanspy.SpectralContainer is passed, the preprocessing method is applied to each instance individually.
Example
    # Once a preprocessing method is initialised, it can be applied to different Raman data
    preprocessed_data = preprocessing_method.apply(raman_object)
    preprocessed_data = preprocessing_method.apply([raman_object, raman_spectrum, raman_image])
    # Nested lists are passed as a single argument, matching the signature above
    preprocessed_data = preprocessing_method.apply([[raman_object, raman_spectrum], raman_object, [raman_spectrum, raman_image]])
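The accepted input types suggest that the structure of the input (a single object, a list, or a list containing lists) is mirrored in the output. The sketch below shows that idea with a hypothetical helper, apply_nested, operating on plain numbers. It is not RamanSPy's implementation, and the structure-preserving behaviour is an assumption inferred from the type signature.

```python
def apply_nested(func, objects):
    """Apply func to a single object, or element-wise to a (nested) list.

    Hypothetical stand-in used only to illustrate the idea; the nesting of
    the input is preserved in the output.
    """
    if isinstance(objects, list):
        return [apply_nested(func, obj) for obj in objects]
    return func(objects)

def double(value):
    return value * 2

single = apply_nested(double, 1)
flat = apply_nested(double, [1, 2, 3])
nested = apply_nested(double, [[1, 2], 3, [4, 5]])
print(single, flat, nested)
```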
Built-in preprocessing methods
RamanSPy provides many of the commonly-used techniques for spectral preprocessing. This includes a broad collection of methods for cosmic ray removal, denoising, baseline correction, normalisation and other preprocessing procedures.
These are built into RamanSPy as classes extending PreprocessingStep and can thus be readily used via their
apply() method. The built-in methods include:
Miscellaneous
- Crop the intensity values and the shift axis associated with the band range(s) specified.
- Subtract a fixed reference background.
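As a rough illustration of fixed background subtraction, the sketch below subtracts one reference spectrum from every spectrum in an array by broadcasting along the last (spectral) axis. The function name subtract_background and its background keyword are hypothetical; this is not the built-in RamanSPy class, only a NumPy function following the callable signature described earlier.

```python
import numpy as np

def subtract_background(intensity_data, spectral_axis, *, background):
    """Subtract a fixed reference background spectrum from every spectrum."""
    background = np.asarray(background, dtype=float)
    if background.shape[-1] != intensity_data.shape[-1]:
        raise ValueError("background must have one value per spectral band")
    # Broadcasting aligns the background with the last (spectral) axis.
    return intensity_data - background, spectral_axis

spectral_axis = np.linspace(500, 1500, 4)
background = np.array([1.0, 2.0, 3.0, 4.0])
image = np.full((2, 2, 4), 10.0) + background  # constant signal plus background

corrected, _ = subtract_background(image, spectral_axis, background=background)
print(corrected[0, 0])
```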
Cosmic ray removal
- Cosmic ray removal based on modified z-score filtering.
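To give an idea of how modified z-scores can expose cosmic ray spikes, the sketch below scores the first differences of a spectrum using the median and the median absolute deviation (MAD). It is a generic illustration and not necessarily how the built-in method is implemented; the threshold of 8 and the synthetic spectrum are arbitrary choices for the example.

```python
import numpy as np

def modified_z_scores(values):
    """Deviation from the median, scaled by the median absolute deviation."""
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    return 0.6745 * (values - median) / mad

# Smooth synthetic spectrum with one artificial spike at index 20
x = np.linspace(0, 1, 50)
spectrum = np.sin(2 * np.pi * x)
spectrum[20] += 50.0

# Score the first differences so that sharp jumps stand out from smooth features.
scores = np.abs(modified_z_scores(np.diff(spectrum)))
flagged = np.flatnonzero(scores > 8.0)  # illustrative threshold
print(flagged)
```

A spike at index i shows up in the two differences on either side of it (i - 1 and i), which is how the affected point can be located and then replaced, for example by interpolating its neighbours.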
Denoising
- Denoising based on Savitzky-Golay filtering.
- Denoising based on Discrete Penalised Least Squares (a.k.a. Whittaker-Henderson smoothing).
- Denoising based on kernel/window smoothers.
- Denoising based on a Gaussian filter.
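The simplest kernel/window smoother is a moving average. The sketch below applies one along the last axis using NumPy only; the function name moving_average and the window parameter are hypothetical, and edge padding is just one of several reasonable boundary choices. It is meant to convey the idea, not to reproduce the built-in denoisers.

```python
import numpy as np

def moving_average(intensity_data, spectral_axis, *, window=5):
    """Smooth each spectrum with a uniform kernel of odd length `window`."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    kernel = np.ones(window) / window

    def smooth_one(spectrum):
        # Repeat the edge values so the output keeps the input length.
        padded = np.pad(spectrum, half, mode="edge")
        return np.convolve(padded, kernel, mode="valid")

    return np.apply_along_axis(smooth_one, -1, intensity_data), spectral_axis

rng = np.random.default_rng(0)
spectral_axis = np.linspace(0, 2 * np.pi, 200)
clean = np.sin(spectral_axis)
noisy = clean + rng.normal(0, 0.2, size=(3, 200))  # three noisy copies

smoothed, _ = moving_average(noisy, spectral_axis, window=5)
print(np.abs(noisy - clean).mean(), np.abs(smoothed - clean).mean())
```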
Baseline correction
Least squares:
- Baseline correction based on Asymmetric Least Squares (AsLS).
- Baseline correction based on Improved Asymmetric Least Squares (IAsLS).
- Baseline correction based on Adaptive Iteratively Reweighted Penalized Least Squares (airPLS).
- Baseline correction based on Asymmetrically Reweighted Penalized Least Squares (arPLS).
- Baseline correction based on Doubly Reweighted Penalized Least Squares (drPLS).
- Baseline correction based on Improved Asymmetrically Reweighted Penalized Least Squares (IarPLS).
- Baseline correction based on Adaptive Smoothness Penalized Least Squares (asPLS).
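The least-squares family shares one core idea: fit a smooth curve under the spectrum by penalising roughness and down-weighting points that lie above the current estimate. The sketch below is a small dense-matrix version of plain AsLS for a single spectrum, assuming the standard formulation with smoothness lam and asymmetry p. It is illustrative only; the parameter values are arbitrary, and practical implementations use sparse matrices.

```python
import numpy as np

def asls_baseline(spectrum, lam=1e5, p=0.01, n_iter=10):
    """Estimate a baseline with a dense Asymmetric Least Squares sketch."""
    n = spectrum.size
    # Second-order difference operator, shape (n - 2, n)
    D = np.diff(np.eye(n), n=2, axis=0)
    penalty = lam * D.T @ D
    weights = np.ones(n)
    for _ in range(n_iter):
        baseline = np.linalg.solve(np.diag(weights) + penalty, weights * spectrum)
        # Points above the baseline (likely peaks) get the small weight p.
        weights = np.where(spectrum > baseline, p, 1 - p)
    return baseline

i = np.arange(100.0)
true_baseline = 0.05 * i
peak = 10 * np.exp(-((i - 50) / 3) ** 2)
spectrum = true_baseline + peak

baseline = asls_baseline(spectrum)
corrected = spectrum - baseline
print(corrected[50], np.abs(baseline[:20] - true_baseline[:20]).max())
```

The other variants in the list differ mainly in how the weights (and, in some cases, the smoothness penalty) are updated between iterations.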
Polynomial fitting:
- Baseline correction based on polynomial fitting.
- Baseline correction based on modified polynomial fitting.
- Baseline correction based on penalised polynomial fitting.
- Baseline correction based on improved modified polynomial fitting.
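In its plainest form, polynomial baseline correction fits a low-order polynomial to the spectrum and subtracts it. The sketch below shows this with NumPy's Polynomial.fit; the function name polynomial_baseline and the degree are choices made for this example, not the built-in API.

```python
import numpy as np
from numpy.polynomial import Polynomial

def polynomial_baseline(intensity, spectral_axis, degree=3):
    """Fit a polynomial of the given degree and evaluate it on the axis."""
    fit = Polynomial.fit(spectral_axis, intensity, deg=degree)
    return fit(spectral_axis)

spectral_axis = np.linspace(400, 1800, 300)
true_baseline = 1e-6 * (spectral_axis - 400) ** 2 + 0.5
peak = 5 * np.exp(-((spectral_axis - 1000) / 10) ** 2)
spectrum = true_baseline + peak

estimated = polynomial_baseline(spectrum, spectral_axis, degree=2)
corrected = spectrum - estimated
print(corrected.max())
```

A plain fit is pulled upwards by the peaks themselves, which is the usual motivation for the modified and penalised variants that limit the influence of peak regions.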
Other:
- Baseline correction based on Goldindec.
- Baseline correction based on Iterative Reweighted Spline Quantile Regression (IRSQR).
- Baseline correction based on Corner Cutting.
- Baseline correction based on Fully automatic baseline correction (FABC).
Normalisation/Scaling
- Vector normalisation.
- Min-max normalisation.
- Max intensity normalisation.
- Area under the curve normalisation.
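The four normalisation schemes can be written in a few lines each. The sketch below assumes that each spectrum is normalised independently along the last axis; whether the built-in classes normalise per spectrum or across a whole object is not stated here, so treat that as an assumption. All function names are hypothetical.

```python
import numpy as np

def vector_normalise(intensity_data):
    """Scale each spectrum to unit Euclidean (L2) norm."""
    return intensity_data / np.linalg.norm(intensity_data, axis=-1, keepdims=True)

def minmax_normalise(intensity_data):
    """Rescale each spectrum to the range [0, 1]."""
    lo = intensity_data.min(axis=-1, keepdims=True)
    hi = intensity_data.max(axis=-1, keepdims=True)
    return (intensity_data - lo) / (hi - lo)

def max_normalise(intensity_data):
    """Divide each spectrum by its maximum intensity."""
    return intensity_data / intensity_data.max(axis=-1, keepdims=True)

def area(intensity_data, spectral_axis):
    """Trapezoidal area under each spectrum."""
    heights = (intensity_data[..., 1:] + intensity_data[..., :-1]) / 2
    return np.sum(heights * np.diff(spectral_axis), axis=-1, keepdims=True)

def auc_normalise(intensity_data, spectral_axis):
    """Scale each spectrum to unit area under the curve."""
    return intensity_data / area(intensity_data, spectral_axis)

rng = np.random.default_rng(0)
spectral_axis = np.linspace(400, 1800, 50)
spectra = rng.uniform(1, 10, size=(4, 50))

by_vector = vector_normalise(spectra)
by_minmax = minmax_normalise(spectra)
by_max = max_normalise(spectra)
by_auc = auc_normalise(spectra, spectral_axis)
```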
See also
Check the Built-in methods tutorial for more information about how to access and use the preprocessing algorithms built into RamanSPy.
Custom algorithms
Alternatively, users can use RamanSPy to create their own preprocessing methods by wrapping preprocessing functions of the
correct type within PreprocessingStep instances.
See also
Check the Custom methods tutorial for more information about how to define custom preprocessing methods using RamanSPy.
Pipelines
In most applications, several preprocessing procedures need to be performed on the experimental Raman spectroscopic data before proceeding with the subsequent analysis. Constructing and customising such complex preprocessing pipelines is usually a software-intensive and unnecessarily challenging task.
This is why RamanSPy also provides tools for the smooth development of preprocessing pipelines. This is made
possible through the introduction of the Pipeline class.
- class ramanspy.preprocessing.Pipeline(pipeline: List[PreprocessingStep])
Defines a preprocessing pipeline consisting of multiple preprocessing procedures.
- Parameters:
pipeline (list[PreprocessingStep]) – The preprocessing procedures defining the pipeline.
Example
    from ramanspy import preprocessing

    # PreprocessingStep accepts the callable plus keyword arguments only
    preprocessing_pipeline = preprocessing.Pipeline([
        preprocessing.PreprocessingStep(some_custom_preprocessing_func, **kwargs),
        preprocessing.denoise.SavGol(window_length=7, polyorder=3),
        preprocessing.normalise.Vector()
    ])
- append(step: PreprocessingStep)
Append a preprocessing procedure to the pipeline.
- Parameters:
step (PreprocessingStep) – The preprocessing procedure to append to the pipeline.
- apply(raman_objects: SpectralContainer | Spectrum | SpectralImage | SpectralVolume | List[SpectralContainer | Spectrum | SpectralImage | SpectralVolume | List[SpectralContainer | Spectrum | SpectralImage | SpectralVolume]]) -> SpectralContainer | Spectrum | SpectralImage | SpectralVolume | List[SpectralContainer | Spectrum | SpectralImage | SpectralVolume | List[SpectralContainer | Spectrum | SpectralImage | SpectralVolume]]
Preprocess Raman spectroscopic data using the initialised pipeline.
The single point-of-contact method of the Pipeline class.
- Parameters:
raman_objects (Union[SpectralObject, List[Union[SpectralObject, List[SpectralObject]]]]) – The objects to preprocess, where SpectralObject := Union[SpectralContainer, Spectrum, SpectralImage, SpectralVolume].
- Returns:
The preprocessed objects, where SpectralObject := Union[SpectralContainer, Spectrum, SpectralImage, SpectralVolume].
- Return type:
Union[SpectralObject, List[Union[SpectralObject, List[SpectralObject]]]]
Note
The PreprocessingStep objects comprising the Pipeline instance will be applied sequentially in the order provided during initialisation.
Example
    # Once a preprocessing pipeline is initialised, it can be applied to different Raman data, just like single PreprocessingStep instances
    preprocessed_data = preprocessing_pipeline.apply(raman_object)
    preprocessed_data = preprocessing_pipeline.apply([raman_object, raman_spectrum, raman_image])
    # Nested lists are passed as a single argument, matching the signature above
    preprocessed_data = preprocessing_pipeline.apply([[raman_object, raman_spectrum], raman_object, [raman_spectrum, raman_image]])
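Because steps run in the order given, reordering them generally changes the result. The sketch below makes this concrete with a hypothetical run_steps helper that chains plain functions following the (intensity_data, spectral_axis) signature. It is a stand-in for illustration and not the Pipeline implementation.

```python
import numpy as np

def subtract_offset(intensity, axis, *, offset):
    """Subtract a constant offset from every intensity value."""
    return intensity - offset, axis

def vector_normalise(intensity, axis):
    """Scale each spectrum to unit Euclidean norm."""
    return intensity / np.linalg.norm(intensity, axis=-1, keepdims=True), axis

def run_steps(steps, intensity, axis):
    """Apply (function, kwargs) pairs one after another, in list order."""
    for func, kwargs in steps:
        intensity, axis = func(intensity, axis, **kwargs)
    return intensity, axis

axis = np.linspace(400, 1800, 8)
intensity = np.arange(1.0, 9.0)

offset_step = (subtract_offset, {"offset": 1.0})
norm_step = (vector_normalise, {})

offset_then_norm, _ = run_steps([offset_step, norm_step], intensity, axis)
norm_then_offset, _ = run_steps([norm_step, offset_step], intensity, axis)
print(np.linalg.norm(offset_then_norm), np.linalg.norm(norm_then_offset))
```

Only the first ordering ends with a unit-norm spectrum, since in the second the offset is removed after normalisation.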
- extend(steps: List[PreprocessingStep])
Extend the pipeline with multiple preprocessing procedures.
- Parameters:
steps (list[PreprocessingStep]) – The preprocessing procedures to extend the pipeline with.
- insert(index: int, step: PreprocessingStep)
Insert a preprocessing procedure into the pipeline at a specified index.
- Parameters:
index (int) – The index at which to insert the preprocessing procedure.
step (PreprocessingStep) – The preprocessing procedure to insert into the pipeline.
Custom pipelines
The preprocessing methods offered by the package and other custom algorithms (wrapped as PreprocessingStep instances)
can easily be stacked together using RamanSPy into complete multi-layered preprocessing pipelines that work just as single
PreprocessingStep instances do.
To create a preprocessing pipeline, simply wrap a list of preprocessing methods within a Pipeline instance.
See also
Check the Custom pipelines tutorial for more information about how to construct and execute custom preprocessing pipelines.
Established protocols
RamanSPy also provides preprocessing protocols already proposed in the literature. This allows users to select pre-configured preprocessing pipelines without having to worry about the choice of methods and parameters. These can be accessed through the following methods:
- The first preprocessing protocol used in the paper by Georgiev et al. (2023).
- The second preprocessing protocol used in the paper by Georgiev et al. (2023).
- The third preprocessing protocol used in the paper by Georgiev et al. (2023).
- A basic preprocessing protocol approximating the one adopted in Bergholt MS et al. (2016).
See also
Check the Built-in protocols tutorial for more information about how to access and use the preprocessing protocols built into RamanSPy.