5 research outputs found

    Automatic NMR-Based Identification of Chemical Reaction Types in Mixtures of Co-Occurring Reactions

    The combination of chemoinformatics approaches with NMR techniques and the increasing availability of data allow the resolution of problems far beyond the original application of NMR in structure elucidation/verification. Applications range from process monitoring and metabolic profiling to product authentication and quality control. One application related to the automatic analysis of complex mixtures concerns mixtures of chemical reactions. We encoded mixtures of chemical reactions with the difference between the ¹H NMR spectra of the products and the reactants. All the signals arising from all the reactants of the co-occurring reactions were taken together (a simulated spectrum of the mixture of reactants), and the same was done for the products. The difference spectrum is taken as the representation of the mixture of chemical reactions. A data set of 181 chemical reactions was used, each reaction manually assigned to one of 6 types. From this data set, we simulated mixtures in which two reactions of different types occur simultaneously. Automatic learning methods were trained to classify the reactions occurring in a mixture from the ¹H NMR-based descriptor of the mixture. Unsupervised learning methods (self-organizing maps) produced a reasonable clustering of the mixtures by reaction type, and allowed the correct classification of 80% and 63% of the mixtures in two independent test sets of different similarity to the training set. With random forests (RF), the percentage of correct classifications increased to 99% and 80% for the same test sets. The RF probability associated with the predictions yielded a robust indication of their reliability. This study demonstrates the possibility of applying machine learning methods to automatically identify types of co-occurring chemical reactions from NMR data. Using no explicit structural information about the reaction participants, reaction elucidation is performed without structure elucidation of the molecules in the mixtures.
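
The difference-spectrum encoding described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: representing a spectrum as a list of (chemical shift in ppm, intensity) peaks, the equal-width binning, the 0 to 12 ppm window, and all function names are assumptions made here for illustration.

```python
def binned_spectrum(peaks, lo=0.0, hi=12.0, n_bins=120):
    """Sum peak intensities into equal-width chemical-shift bins.

    peaks: iterable of (shift_ppm, intensity) pairs (assumed representation).
    Peaks outside [lo, hi) are ignored.
    """
    width = (hi - lo) / n_bins
    bins = [0.0] * n_bins
    for shift, intensity in peaks:
        if lo <= shift < hi:
            bins[int((shift - lo) / width)] += intensity
    return bins


def reaction_descriptor(reactant_peaks, product_peaks, **binning):
    """Difference spectrum: binned products minus binned reactants."""
    reactants = binned_spectrum(reactant_peaks, **binning)
    products = binned_spectrum(product_peaks, **binning)
    return [p - r for p, r in zip(products, reactants)]


def mixture_descriptor(reactions, **binning):
    """Descriptor for co-occurring reactions.

    reactions: list of (reactant_peaks, product_peaks) tuples.
    All reactant signals are pooled into one simulated spectrum, all
    product signals into another, and the difference is returned.
    """
    all_reactants = [pk for reactants, _ in reactions for pk in reactants]
    all_products = [pk for _, products in reactions for pk in products]
    return reaction_descriptor(all_reactants, all_products, **binning)
```

Under these assumptions the mixture descriptor is simply the sum of the descriptors of the individual reactions, and a vector of this kind is what a classifier such as a random forest would take as input.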

    A Survey on Evolutionary Computation Approaches to Feature Selection

    Feature selection is an important task in data mining and machine learning to reduce the dimensionality of the data and increase the performance of an algorithm, such as a classification algorithm. However, feature selection is a challenging task due mainly to the large search space. A variety of methods have been applied to solve feature selection problems, where evolutionary computation (EC) techniques have recently gained much attention and shown some success. However, there are no comprehensive guidelines on the strengths and weaknesses of alternative approaches. This leads to a disjointed and fragmented field with ultimately lost opportunities for improving performance and successful applications. This paper presents a comprehensive survey of the state-of-the-art work on EC for feature selection, which identifies the contributions of these different algorithms. In addition, current issues and challenges are also discussed to identify promising areas for future research.
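
As a rough illustration of what an EC approach to feature selection looks like, the sketch below runs a small genetic algorithm over binary feature masks. It is one simple variant among the many the survey covers, not a method taken from it. The function names, the parameter defaults, and the toy fitness function are all hypothetical; in practice the fitness would typically be something like the cross-validated accuracy of a classifier trained on the selected features.

```python
import random


def ga_feature_selection(n_features, fitness, pop_size=20, generations=30,
                         mutation_rate=0.05, seed=0):
    """Search binary feature masks with a simple genetic algorithm.

    fitness: callable taking a list of 0/1 flags and returning a score
    (higher is better). Requires n_features >= 2 and pop_size >= 2.
    Returns (best_mask, history), where history holds the best score seen
    at the start of each generation plus the final best score.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        elite = max(pop, key=fitness)
        history.append(fitness(elite))
        next_pop = [elite[:]]  # elitism: the best mask survives unchanged
        while len(next_pop) < pop_size:
            # Binary tournament selection for each parent.
            a = max(rng.sample(pop, 2), key=fitness)
            b = max(rng.sample(pop, 2), key=fitness)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randint(1, n_features - 1)
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


def toy_fitness(mask):
    """Hypothetical score: reward features 0 and 3, penalize subset size."""
    hits = mask[0] + mask[3]
    return hits - 0.1 * sum(mask)


best_mask, best_scores = ga_feature_selection(8, toy_fitness)
```

Because the best mask is copied unchanged into each new generation, the best score never decreases from one generation to the next. The size of the search space (2^n masks for n features) is the reason exhaustive search is impractical and heuristic searches like this one are used.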

    Controlling NMR spin systems for quantum computation

    Nuclear magnetic resonance is arguably both the best available quantum technology for implementing simple quantum computing experiments and the worst technology for building large scale quantum computers that has ever been seriously put forward. After a few years of rapid growth, leading to an implementation of Shor's quantum factoring algorithm in a seven-spin system, the field started to reach its natural limits and further progress became challenging. Rather than pursuing more complex algorithms on larger systems, interest has now largely moved into developing techniques for the precise and efficient manipulation of spin states with the aim of developing methods that can be applied in other more scalable technologies and within conventional NMR. However, the user friendliness of NMR implementations means that they remain popular for proof-of-principle demonstrations of simple quantum information protocols.

    Simultaneous spectrophotometric and chemometric determination of cholesterol and mono-/polyunsaturated fatty acids

    Scope and Method of Study: The ultimate goal of this research project was to complete the development of a simple, direct alternative method for the simultaneous quantitative determination of cholesterol and polyunsaturated fatty acids (PUFAs) in human serum by exploiting various chemometric algorithms, with subsequent validation against gas chromatography-mass spectrometry (GC-MS). In addition, oleic acid (OA) was added as the eighth component, and the performance of the various chemometric algorithms was compared. The study was also extended to various food and biological samples, and chemometric algorithms were applied to extract meaningful information from the data set.
    Findings and Conclusions: For the first part of the study, ridge regression (RR), P-matrix (PM), principal component regression (PCR), and partial least squares (PLS2) algorithms performed about equally well, and better than the K-matrix (KM) approach, when applied to prepared mixtures (synthetic sera) in chloroform solutions. The PLS2 model was tested on intact human serum specimens and yielded ω-3 and ω-6 PUFA results comparable to those obtained with GC-MS. Similar agreement between the methods was found for the ω-6/ω-3 ratios. The first part of the study therefore showed the dominance of PLS2 over the other chemometric models. The second part of the study showed that the PLS1 algorithm yielded the lowest root mean square error of prediction (RMSEP) for all the lipid components compared with all other algorithms. PLS1 yielded molar concentrations quite comparable with those from GC-MS in the actual human serum samples. Inclusion of OA yielded high RMSEP despite the use of the most robust algorithms, such as PLS1, PLS2, and PCR. GAPLS successfully reduced the RMSEP for all the components relative to the non-GA PLS1 approach, except for EPA and DHA. Spiking of human serum samples was also performed, but the task is considered tedious for a typical clinical setting. In a study involving OA, LA, and LNA in vegetable oils, the PCR, PLS2, and PLS1 algorithms performed about equally well on the prediction sets, and PLS2 mostly outperformed PLS1 and PCR on the unknown samples. The assay was also extended to pattern recognition of biological and food samples, where it discriminated eleven clusters corresponding to different food and biological samples. The study has shown how the Purdie assay coupled with chemometric algorithms might provide alternatives to separation methods for the direct determination of lipids in human serum, vegetable oils, and their synthetic models. The advantages of this simple technology are the reduction in time and costs.
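
The root mean square error of prediction (RMSEP) used to compare the calibration models in this abstract can be computed as in the following Python sketch. The function names are hypothetical, and the sketch assumes that each model's predictions are scored against reference values (for example from GC-MS) measured on the same prediction samples; it does not reproduce any data from the study.

```python
import math


def rmsep(predicted, reference):
    """Root mean square error of prediction over paired values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def rank_models(predictions_by_model, reference):
    """Return (model_name, RMSEP) pairs sorted from lowest to highest error.

    predictions_by_model: dict mapping a model name (e.g. "PLS1") to its
    predicted concentrations for the same samples as `reference`.
    """
    scores = [(name, rmsep(pred, reference))
              for name, pred in predictions_by_model.items()]
    return sorted(scores, key=lambda item: item[1])
```

A lower RMSEP means the predicted concentrations lie closer to the reference values, which is the sense in which one algorithm is said to yield the least RMSEP.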