17 research outputs found

    Principal Component Analysis Coupled with Artificial Neural Networks—A Combined Technique Classifying Small Molecular Structures Using a Concatenated Spectral Database

    In this paper we present several expert systems that predict the class identity of the modeled compounds based on a preprocessed spectral database. The expert systems were built using Artificial Neural Networks (ANN) and are designed to predict whether an unknown compound has the toxicological activity of amphetamines (stimulant and hallucinogen) or is a nonamphetamine. In attempts to circumvent the laws controlling drugs of abuse, new chemical structures are very frequently introduced on the black market. They are obtained by slightly modifying the controlled molecular structures, adding or changing substituents at various positions on the banned molecules. As a result, no substance similar to those forming a prohibited class may be used nowadays, even if it has not been specifically listed. Therefore, reliable, fast and accessible systems capable of modeling and then identifying similarities at the molecular level are much needed for epidemiological, clinical, and forensic purposes. To obtain the expert systems, we preprocessed a concatenated spectral database representing the GC-FTIR (gas chromatography-Fourier transform infrared spectrometry) and GC-MS (gas chromatography-mass spectrometry) spectra of 103 forensic compounds. The database was used as input for a Principal Component Analysis (PCA). The scores of the forensic compounds on the main principal components (PCs) were then used as inputs for the ANN systems. We built eight PC-ANN systems (principal component analysis coupled with artificial neural network), each with a different number of input variables: 15, 16, 17, 18, 19, 20, 21 and 22 PCs. The best expert system was found to be the ANN built with 18 PCs, which together account for an explained variance of 77%. This expert system has the best sensitivity (a rate of classification C = 100% and a rate of true positives TP = 100%), as well as a good selectivity (a rate of true negatives TN = 92.77%). A comparative analysis of the validation results of all expert systems is presented, and the input variables with the highest discrimination power are discussed.
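The true positive and true negative rates quoted in the abstract above can be illustrated with a minimal sketch. It assumes the usual confusion-matrix definitions (TP rate = TP / (TP + FN), TN rate = TN / (TN + FP)); the abstract does not define the rate of classification C, so it is left out. The class name, function names and counts below are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts from a two-class validation run (positive = amphetamine)."""
    tp: int  # positives correctly recognized
    fn: int  # positives missed
    tn: int  # negatives correctly rejected
    fp: int  # negatives wrongly flagged as positive


def true_positive_rate(c: ConfusionCounts) -> float:
    """Sensitivity in percent: share of actual positives that were recognized."""
    return 100.0 * c.tp / (c.tp + c.fn)


def true_negative_rate(c: ConfusionCounts) -> float:
    """Selectivity in percent: share of actual negatives that were rejected."""
    return 100.0 * c.tn / (c.tn + c.fp)


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = ConfusionCounts(tp=18, fn=2, tn=45, fp=5)
print(f"TP rate = {true_positive_rate(example):.2f}%")
print(f"TN rate = {true_negative_rate(example):.2f}%")
```

A screening system of this kind would typically favor a high TP rate, since a missed positive is costlier than a false alarm that a confirmatory analysis can later rule out.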

    Global Clustering Quality Coefficient Assessing the Efficiency of PCA Class Assignment

    An essential factor influencing the efficiency of predictive models built with principal component analysis (PCA) is the quality of the data clustering revealed by the score plots. The sensitivity and selectivity of the class assignment are strongly influenced by the relative position of the clusters and by their dispersion. We propose a set of indicators inspired by analytical geometry that may be used for an objective, quantitative assessment of the data clustering quality, as well as a global clustering quality coefficient (GCQC) that measures the overall predictive power of the PCA models. The use of these indicators for evaluating the efficiency of the PCA class assignment is illustrated by a comparative study that identifies the preprocessing function yielding the most efficient PCA system screening for amphetamines based on their GC-FTIR spectra. The GCQC ranking of the tested feature weights is explained based on estimated density distributions and validated using quadratic discriminant analysis (QDA).

    Computerized Detection of JWH Synthetic Cannabinoids Class Membership Based on Machine Learning Algorithms and Molecular Descriptors

    We describe and analyze an Artificial Neural Network (ANN) model that identifies JWH synthetic cannabinoids. The model was developed using a combination of topological, 3D-MoRSE (Molecule Representation of Structure based on Electron diffraction) and ADMET (Absorption, Distribution, Metabolism, Excretion and Toxicity) molecular descriptors. The validation results indicate that this computerized system has a very high potential for efficiently predicting JWH class membership and for discriminating JWH compounds from a large variety of non-JWH substances of forensic interest.

    Sensitivity Analysis of Artificial Neural Networks Identifying JWH Synthetic Cannabinoids Built with Alternative Training Strategies and Methods

    This paper presents the alternative training strategies we tested for an Artificial Neural Network (ANN) designed to detect JWH synthetic cannabinoids. To increase the model performance in terms of output sensitivity, we used the Neural Designer data science and machine learning platform combined with the Python programming language. We performed a comparative analysis of several optimization algorithms, error parameters and regularization methods. Finally, we performed a new goodness-of-fit analysis between the testing samples in the data set and the corresponding ANN outputs in order to investigate their sensitivity. The effectiveness of the new methods combined with the optimization algorithms is discussed.
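The abstract above does not say which goodness-of-fit measure was used. One common choice, shown here purely as an assumption, is a linear regression of the network outputs on the target values for the testing samples, summarized by slope, intercept and the coefficient of determination R². The function name and the sample values are hypothetical.

```python
from typing import Sequence, Tuple


def goodness_of_fit(targets: Sequence[float],
                    outputs: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit outputs = slope * targets + intercept.

    Returns (slope, intercept, r_squared). A perfect model gives
    slope 1, intercept 0 and r_squared 1.
    """
    if len(targets) != len(outputs) or len(targets) < 2:
        raise ValueError("need two equally long sequences with at least 2 samples")
    n = len(targets)
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    s_tt = sum((t - mean_t) ** 2 for t in targets)
    s_oo = sum((o - mean_o) ** 2 for o in outputs)
    s_to = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    if s_tt == 0 or s_oo == 0:
        raise ValueError("targets and outputs must each contain some variation")
    slope = s_to / s_tt
    intercept = mean_o - slope * mean_t
    r_squared = s_to ** 2 / (s_tt * s_oo)
    return slope, intercept, r_squared


# Hypothetical testing samples: 1 = JWH, 0 = non-JWH, with made-up ANN outputs.
targets = [1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
outputs = [0.95, 0.80, 0.10, 0.05, 0.90, 0.20]
slope, intercept, r2 = goodness_of_fit(targets, outputs)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r2:.3f}")
```

The closer the slope and R² are to 1 and the intercept to 0, the better the outputs track the targets on the testing set.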

    Automatic identification of hallucinogenic amphetamines based on their ATR-FTIR spectra processed with Convolutional Neural Networks

    New psychoactive drugs that lead to severe intoxications are constantly being seized on the European black market. Recent studies indicate that most of these new substances are synthetic cannabinoids and hallucinogenic amphetamines. In this study, we present the results obtained with an expert system built to automatically identify the class identity of these types of drugs of abuse, based on their Attenuated Total Reflection-Fourier Transform Infrared (ATR-FTIR) spectra processed with Convolutional Neural Networks (CNNs). CNNs have been applied with great success in recent years in various computer applications, such as image classification, but little work has been done on using this kind of deep learning model for spectral data classification. The aim of this study was to improve the detection accuracy (classification performance) that we had already obtained with other statistical mathematics and artificial intelligence techniques. The performance of the CNN system is discussed in comparison with that of the latter models.

    Artificial Neural Networks Screening for JWH Synthetic Cannabinoids: a Comparative Analysis Regarding their Specificity and Accuracy

    This study evaluates the impact of the dataset size and of the number of molecular descriptors selected to build Artificial Neural Networks (ANN) screening for JWH synthetic cannabinoids. The aim is to determine how to most economically use the available data on these illicit drugs and still avoid overfitting. The results indicate a proportional decrease in the number of inputs in terms of memory requirements, processing speed, and numerical precision by fitting a model with the same database of designer drugs and the same test set for each different sized training dataset (having 100, 50 and 25 samples respectively). The results indicate that the model trained with 100 samples performs nearly as well as the reference ANN system (built with 150 samples), but only modest results are recorded for training sets consisting of 50 or 25 samples.