The combination of deep learning techniques and Raman spectroscopy shows
great potential for the precise and prompt identification of pathogenic
bacteria in clinical settings. However, traditional closed-set
classification approaches assume that all test samples belong to one of the
known pathogens. Their applicability is limited because the clinical
environment is inherently unpredictable and dynamic: unknown or emerging
pathogens may not be included in the available catalogs. We demonstrate that
the current state-of-the-art neural networks that identify pathogens from
Raman spectra are vulnerable to unknown inputs, resulting in an uncontrollable
false positive rate. To address this issue, we first developed a novel
ensemble of ResNet architectures combined with an attention mechanism, which
outperforms existing closed-world methods, achieving an accuracy of
87.8 ± 0.1% compared with 86.7 ± 0.4% for the best available model.
Second, by integrating feature regularization through the Objectosphere
loss function, our model both achieves high accuracy in identifying known
pathogens from the catalog and effectively separates unknown samples,
drastically reducing the false positive rate. Finally, the proposed feature
regularization method, applied during training, significantly enhances the
performance of out-of-distribution detectors during the inference phase,
improving the reliability of unknown-class detection. Our novel algorithm for Raman
spectroscopy enables the detection of unknown, uncatalogued, and emerging
pathogens, providing the flexibility to adapt to future pathogens, and has
the potential to improve the reliability of Raman-based
solutions in dynamic operating environments where accuracy is critical, such as
public safety applications.
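
As an illustration of the feature regularization mentioned above, the following is a minimal per-sample sketch of the Objectosphere loss in its commonly published form; it is not the implementation used in this work. Known samples receive a cross-entropy term plus a penalty when the feature magnitude falls below a margin, while unknown samples are pushed toward a uniform softmax output and a feature magnitude near zero. The function names, the margin `xi`, the weight `lam`, and the magnitude-threshold rejection rule are all assumptions made for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


def feature_norm(features):
    """Euclidean magnitude of the deep feature vector."""
    return math.sqrt(sum(f * f for f in features))


def objectosphere_loss(logits, features, label, xi=10.0, lam=0.01):
    """Per-sample Objectosphere loss (illustrative sketch).

    label: class index for a known sample, or None for an unknown
           (background) sample seen during training.
    xi:    assumed minimum feature magnitude for known samples.
    lam:   assumed weight of the magnitude penalty.
    """
    log_p = log_softmax(logits)
    norm = feature_norm(features)
    if label is None:
        # Unknown: encourage a uniform softmax and a small feature magnitude.
        entropic = -sum(log_p) / len(log_p)
        penalty = norm ** 2
    else:
        # Known: cross-entropy, plus a penalty if the magnitude is below xi.
        entropic = -log_p[label]
        penalty = max(xi - norm, 0.0) ** 2
    return entropic + lam * penalty


def is_unknown(features, threshold):
    """Hypothetical rejection rule: flag samples with small feature magnitude."""
    return feature_norm(features) < threshold
```

Under these assumptions, training drives known spectra to large feature magnitudes and unknown spectra to small ones, so a simple threshold at inference can reject inputs that do not belong to the catalog. Other out-of-distribution scores could be substituted for the magnitude threshold.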