1,372 research outputs found

    Big Data meets Quantum Chemistry Approximations: The Δ-Machine Learning Approach

    Chemically accurate and comprehensive studies of the virtual space of all possible molecules are severely limited by the computational cost of quantum chemistry. We introduce a composite strategy that adds machine learning corrections to computationally inexpensive approximate legacy quantum methods. After training, highly accurate predictions of enthalpies, free energies, entropies, and electron correlation energies are possible, for significantly larger molecular sets than used for training. For thermochemical properties of up to 16k constitutional isomers of C7H10O2 we present numerical evidence that chemical accuracy can be reached. We also predict electron correlation energy in post-Hartree-Fock methods, at the computational cost of Hartree-Fock, and we establish a qualitative relationship between molecular entropy and electron correlation. The transferability of our approach is demonstrated, using semi-empirical quantum chemistry and machine learning models trained on 1 and 10% of 134k organic molecules, to reproduce enthalpies of all remaining molecules at density functional theory level of accuracy.
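The composite strategy described in this abstract (a cheap baseline plus a learned correction) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the functions `cheap_method` and `reference_method` are hypothetical stand-ins with synthetic values, and a one-dimensional least-squares line stands in for whatever machine learning model the authors actually use.

```python
# Toy illustration of a delta-learning correction. All data are synthetic.

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (pure Python)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

def cheap_method(x):
    # Hypothetical inexpensive baseline property value.
    return 2.0 * x

def reference_method(x):
    # Hypothetical expensive reference: the baseline plus a systematic
    # error that happens to be linear in the descriptor x.
    return 2.0 * x + 0.5 * x + 1.0

# Train on the difference (the "delta") between reference and baseline.
train_x = [0.0, 1.0, 2.0, 3.0]
deltas = [reference_method(x) - cheap_method(x) for x in train_x]
a, b = fit_line(train_x, deltas)

def delta_corrected(x):
    """Baseline prediction plus the learned correction."""
    return cheap_method(x) + a * x + b

# Predict for a point outside the training set.
print(delta_corrected(10.0), reference_method(10.0))
```

The point of the design is that the model only has to learn the (hopefully smoother) difference between the two levels of theory, not the full property.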

    Atomistic potential for graphene and other sp² carbon systems

    We introduce a torsional force field for sp² carbon to augment an in-plane atomistic potential of a previous work (Kalosakas et al., J. Appl. Phys. 113, 134307 (2013)) so that it is applicable to out-of-plane deformations of graphene and related carbon materials. The introduced force field is fit to reproduce DFT calculation data of appropriately chosen structures. The aim is to create a force field that is as simple as possible so it can be efficient for large scale atomistic simulations of various sp² carbon structures without significant loss of accuracy. We show that the complete proposed potential reproduces characteristic properties of fullerenes and carbon nanotubes. In addition, it reproduces very accurately the out-of-plane ZA and ZO modes of graphene's phonon dispersion as well as all phonons with frequencies up to 1000 cm⁻¹. Comment: 9 pages, 6 figures

    Formula Periodic Table for the Isomer Classes of Acyclic Hydrocarbons - Enumerative and Asymptotic Characteristics

    The overall set of acyclic hydrocarbons CnH2m with classical valence structures is considered, the structural isomers are enumerated, and the results are displayed in the form of a »periodic table« with the C atom count n and H atom half-count m respectively identifying rows and columns. Asymptotic n → ∞ behaviors of these enumerations are developed, first for fixed degree u ≡ n + 1 - m of unsaturation and second for fixed number 2m of H-atoms. The first-set isomer classes increase in size exponentially fast with n, whereas with the second set, the isomer-class sizes increase sub-exponentially, as a power of n.
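As a small worked example of the definition u ≡ n + 1 - m quoted in this abstract, the sketch below computes the degree of unsaturation for a few familiar hydrocarbons CnH2m. The function name and the sample molecules are illustrative choices, not taken from the paper.

```python
def degree_of_unsaturation(n, m):
    """Degree of unsaturation u = n + 1 - m for a hydrocarbon CnH2m.

    n is the carbon count and m is half the hydrogen count, following
    the row/column convention described in the abstract.
    """
    return n + 1 - m

# Butane C4H10: n = 4, m = 5, fully saturated.
print(degree_of_unsaturation(4, 5))
# Ethene C2H4: n = 2, m = 2, one double bond.
print(degree_of_unsaturation(2, 2))
# Ethyne C2H2: n = 2, m = 1, one triple bond.
print(degree_of_unsaturation(2, 1))
```

For acyclic structures, u counts the double bonds plus twice the triple bonds, so fixing u while n grows corresponds to the first asymptotic regime mentioned above.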

    Characterization of probiotic Lactobacillus spp. isolates from commercial fermented milks

    The aim of this project was to study the identity of probiotic lactobacilli in fermented milk products from the United Kingdom/European markets and their survival during shelf-life. This in vitro study also aimed to examine some of the physiological probiotic criteria, such as resistance to stomach/intestine conditions, as well as possible functional properties of the isolates, such as antimicrobial activities, antibiotic resistance/susceptibility and antibiotic resistance genes, biofilm formation and production of conjugated linoleic acid (CLA). First, a comparative study was carried out on the selectivity of MRS-Clindamycin, MRS-Sorbitol and MRS-IM Maltose, to select the right medium for enumeration of probiotic Lactobacillus. Based on selectivity of the medium for recovery of the targeted lactobacilli and also simplicity of preparation, MRS-Clindamycin was chosen as the best medium for enumeration of probiotic Lactobacillus in fermented milks. The results of enumeration of lactobacilli showed that 22 out of a total of 36 tested products contained more than 10⁶ colony forming units/g at the end of their shelf-life, which complies with the recommended minimum therapeutic level for probiotics. Rep-PCR using primer GTG-5 was applied for initial discrimination of isolated strains, and isolates that presented different band profiles were placed in different groups. The isolated Lactobacillus spp. were identified mainly as Lactobacillus acidophilus, Lactobacillus casei and Lactobacillus paracasei by analysis of partial sequences of the 16S ribosomal RNA and rpoA genes. In order to characterize the isolates for probiotic properties, this study focused on six Lactobacillus isolates along with two commercial Lactobacillus cultures from Chr. Hansen (Lactobacillus acidophilus La5 and Lactobacillus casei C431) and three Lactobacillus type strains (Lactobacillus casei subsp. casei, Lactobacillus paracasei subsp. paracasei and Lactobacillus acidophilus), which were purchased from NCIMB. The stomach and intestine conditions were mimicked using a batch culture fermentation system, and the combined effects of pH, enzymes and bile salts on survival of the tested isolates were tested. The tested isolates were able to survive the low pH environment and the high concentrations of bile salts of the upper digestive tract. The potential of the tested isolates for biofilm formation was determined under different conditions of nutritional and physiological stress. The capability of the tested isolates to produce biofilm in nutrient-rich medium was recorded. However, growth limitation, such as nutrient shortage in diluted media or the use of inulin rather than glucose in synthetic medium, did not induce biofilm formation. Antimicrobial activities of the tested bacteria against indicator bacteria, namely Escherichia coli NCTC 12900, Salmonella enterica serovar Typhimurium DT124, Salmonella enterica serovar Enteritidis PT4 and Lactobacillus delbrueckii subsp. bulgaricus, were studied. The production of organic acids and bacteriocin was considered the key mechanism for antimicrobial activity of the tested strains. Screening the isolates' competence for production of CLA demonstrated that this feature is species dependent and also entirely related to the level of initial linoleic acid in the medium. Eleven tested isolates were also assessed for their antibiotic resistance profile by determination of minimum inhibitory concentration (MIC). Acquired resistance to cefoxitin, ceftriaxone, chloramphenicol, erythromycin, gentamicin, kanamycin, lincomycin, streptomycin, tylosin tartrate, tetracycline and vancomycin was observed in all tested isolates. Their genetic background of antibiotic resistance genes was also studied by PCR, and none of the tested isolates showed positive bands for the investigated resistance genes.

    QM7-X: A comprehensive dataset of quantum-mechanical properties spanning the chemical space of small organic molecules

    We introduce QM7-X, a comprehensive dataset of 42 physicochemical properties for ≈ 4.2 M equilibrium and non-equilibrium structures of small organic molecules with up to seven non-hydrogen (C, N, O, S, Cl) atoms. To span this fundamentally important region of chemical compound space (CCS), QM7-X includes an exhaustive sampling of (meta-)stable equilibrium structures - comprised of constitutional/structural isomers and stereoisomers, e.g., enantiomers and diastereomers (including cis-/trans- and conformational isomers) - as well as 100 non-equilibrium structural variations thereof to reach a total of ≈ 4.2 M molecular structures. Computed at the tightly converged quantum-mechanical PBE0+MBD level of theory, QM7-X contains global (molecular) and local (atom-in-a-molecule) properties ranging from ground state quantities (such as atomization energies and dipole moments) to response quantities (such as polarizability tensors and dispersion coefficients). By providing a systematic, extensive, and tightly-converged dataset of quantum-mechanically computed physicochemical properties, we expect that QM7-X will play a critical role in the development of next-generation machine-learning based models for exploring greater swaths of CCS and performing in silico design of molecules with targeted properties.

    Mycobiota and fumonisin contamination in dried fruits of different origin

    Fumonisins are carcinogenic mycotoxins which were originally identified in Fusarium verticillioides. According to recent findings, fumonisins are also produced by some black Aspergillus species including Aspergillus niger and A. awamori. Aspergilli are able to produce fumonisins in high quantities on agar media with low water activities. Data on the occurrence and role of this species in fumonisin contamination of agricultural products with high sugar content are needed to clarify the importance of A. niger in human health. The mycobiota and fumonisin contamination of various dried fruit samples collected from different countries were examined to clarify the role of black Aspergilli in fumonisin contamination of such products. All except two of the examined raisin samples were contaminated with black Aspergilli. Species assignment of the isolates was carried out using sequence analysis of part of the calmodulin gene. The range of fumonisin isomers present in the raisin samples, and produced by A. niger and A. awamori isolates collected from dried vine fruits, was also examined using reversed-phase high-performance liquid chromatography/electrospray ionization - ion trap mass spectrometry. Among the A. niger/A. awamori isolates identified, 67% produced fumonisins. The isolates produced several fumonisin isomers also present in the dried vine fruit samples, including fumonisins B1-4, 3-epi-FB3, 3-epi-FB4, iso-FB1, and two iso-FB2,3 forms. Most of these isomers have previously only been identified in Fusarium species. The average fumonisin content of the 7 dried vine fruit samples which were found to be contaminated by potential fumonisin-producing black Aspergilli was 7.22 mg kg⁻¹. Our data indicate that A. niger and A. awamori are responsible for fumonisin contamination of dried vine fruits worldwide. The observed levels of contamination are alarming and pose a new threat for food safety. Preliminary data also indicate that fumonisin contamination of other dried fruits including figs and dates, and that of onions, are also caused primarily by black Aspergillus species. Further work is in progress to examine the role of black Aspergilli in fumonisin and ochratoxin contamination of agricultural products.

    De novo sequencing of heparan sulfate saccharides using high-resolution tandem mass spectrometry

    Heparan sulfate (HS) is a class of linear, sulfated polysaccharides located on cell surfaces, in secretory granules, and in extracellular matrices found in all animal organ systems. It consists of alternately repeating disaccharide units, expressed in animal species ranging from hydra to higher vertebrates including humans. HS binds and mediates the biological activities of over 300 proteins, including growth factors, enzymes, chemokines, cytokines, adhesion and structural proteins, lipoproteins and amyloid proteins. The binding events largely depend on the fine structure - the arrangement of sulfate groups and other variations - on HS chains. With the activated electron dissociation (ExD) high-resolution tandem mass spectrometry technique, researchers acquire rich structural information about the HS molecule. Using this technique, covalent bonds of the HS oligosaccharide ions are dissociated in the mass spectrometer. However, this information is complex, owing to the large number of product ions, and contains a degree of ambiguity due to the overlapping of product ion masses and the lability of sulfate groups; as a result, there is a serious barrier to manual interpretation of the spectra. The interpretation of such data creates a serious bottleneck to the understanding of the biological roles of HS. In order to solve this problem, I designed HS-SEQ - the first HS sequencing algorithm using high-resolution tandem mass spectrometry. HS-SEQ allows rapid and confident sequencing of HS chains from millions of candidate structures, and I validated its performance using multiple known pure standards. In many cases, HS oligosaccharides exist as mixtures of sulfation positional isomers. I therefore designed MULTI-HS-SEQ, an extended version of HS-SEQ targeting spectra coming from more than one HS sequence. I also developed several pre-processing and post-processing modules to support the automatic identification of HS structure. These methods and tools demonstrated the capacity for large-scale HS sequencing, which should contribute to clarifying the rich information encoded by HS chains as well as developing tailored HS drugs to target a wide spectrum of diseases.

    Short-communication: study of fatty acid metabolites in microbial conjugated fatty acids-enrichment of milk and discovery of additional undescribed conjugated linolenic acid isomers

    Food microbially enriched in conjugated linoleic (CLA) and conjugated linolenic (CLNA) acids is intensively studied nowadays. The conversion of linoleic (LA) and α-linolenic acids (α-LNA) into these compounds may involve different fatty acid (FA) intermediates. This research aimed to investigate potential FA byproducts in milk during microbial CLA/CLNA-enrichment using Bifidobacterium breve DSM 20091. Milk fermented with pure α-LNA showed a decrease in free myristic acid, while pure LA led to an increase in free stearic acid. No additional FA compounds were found alongside CLA/CLNA isomers. The strain produced several CLA isomers from LA, but only when administered alone. Nonetheless, when α-LNA was assayed, additional CLNA isomers, never reported before for bifidobacteria, were observed. In conclusion, except for stearic acid in the presence of LA, no side-FA metabolites were released during milk microbial CLA/CLNA-enrichment. Results suggest that either CLA/CLNA production occurs in a single step or the biotransformation of intermediates is very fast.

    Computational Tools for the Untargeted Assignment of FT-MS Metabolomics Datasets

    Metabolomics is the study of metabolomes, the sets of metabolites observed in living systems. Metabolism interconverts these metabolites to provide the molecules and energy necessary for life processes. Many disease processes, including cancer, have a significant metabolic component that manifests as differences in what metabolites are present and in what quantities they are produced and utilized. Thus, using metabolomics, differences between metabolomes in disease and non-disease states can be detected, and these differences improve our understanding of disease processes at the molecular level. Despite the potential benefits of metabolomics, the comprehensive investigation of metabolomes remains difficult. A popular analytical technique for metabolomics is mass spectrometry. Advances in Fourier transform mass spectrometry (FT-MS) instrumentation have yielded simultaneous improvements in mass resolution, mass accuracy, and detection sensitivity. In the metabolomics field, these advantages permit more complicated but more informative experimental designs, such as the use of multiple isotope-labeled precursors in stable isotope-resolved metabolomics (SIRM) experiments. However, despite these potential applications, several outstanding problems hamper the use of FT-MS for metabolomics studies. First, artifacts and data quality problems in FT-MS spectra can confound downstream data analyses, confuse machine learning models, and complicate the robust detection and assignment of metabolite features. Second, the assignment of observed spectral features to metabolites remains difficult. Existing targeted approaches for assignment often employ databases of known metabolites; however, metabolite databases are incomplete, thus limiting or biasing assignment results. Additionally, FT-MS provides limited structural information for observed metabolites, which complicates the determination of metabolite class (e.g. lipid, sugar, etc.) for observed metabolite spectral features, a necessary step for many metabolomics experiments. To address these problems, a set of tools was developed. The first tool identifies artifacts with high peak density observed in many FT-MS spectra and removes them safely. Using this tool, two previously unreported types of high peak density artifact were identified in FT-MS spectra: fuzzy sites and partial ringing. Fuzzy sites were particularly problematic as they confused and reduced the accuracy of machine learning models trained on datasets containing these artifacts. Second, a tool called SMIRFE was developed to assign isotope-resolved molecular formulas to observed spectral features in an untargeted manner without a database of expected metabolites. This new untargeted method was validated on a gold-standard dataset containing both unlabeled and 15N-labeled compounds and was able to identify 18 of 18 expected spectral features. Third, a collection of machine learning models was constructed to predict if a molecular formula corresponds to one or more lipid categories. These models accurately predict the correct one of eight lipid categories on our training dataset of known lipid and non-lipid molecular formulas with precisions and accuracies over 90% for most categories. These models were used to predict lipid categories for untargeted SMIRFE-derived assignments in a non-small cell lung cancer dataset. Subsequent differential abundance analysis revealed a sub-population of non-small cell lung cancer samples with a significantly increased abundance in sterol lipids. This finding implies a possible therapeutic role of statins in the treatment and/or prevention of non-small cell lung cancer. Collectively, these tools represent a pipeline for FT-MS metabolomics datasets that is compatible with isotope labeling experiments. With these tools, more robust and untargeted metabolic analyses of disease will be possible.