
    Automatic covariate selection in logistic models for chest pain diagnosis: A new approach

    A newly established method for optimizing logistic models via a minorization-majorization procedure is applied to the problem of diagnosing acute coronary syndromes (ACS). The method provides a principled approach to covariate selection, which would otherwise require a suboptimal method owing to the size of the covariate set. A model-building strategy is proposed, and two models, one optimized for performance and one for simplicity, are derived via ten-fold cross-validation. These models confirm that a relatively small set of covariates, including clinical and electrocardiographic features, can be used successfully in this task. The performance of the models is comparable with that of previously published models built with less principled selection methods. The models prove to be portable when tested on data gathered from three other sites: whilst diagnostic accuracy and calibration diminish slightly in these new settings, they remain satisfactory overall. Building predictive models that are as simple as possible for a required level of performance is valuable if data-driven decision aids are to gain wide acceptance in clinical practice, because it minimizes the time needed to gather and enter data at the bedside.

    The Triplet Genetic Code had a Doublet Predecessor

    Information theoretic analysis of genetic languages indicates that the naturally occurring 20 amino acids and the triplet genetic code arose by duplication of 10 amino acids of class-II and a doublet genetic code having codons NNY and anticodons $\overleftarrow{\rm GNN}$. Evidence for this scenario is presented based on the properties of aminoacyl-tRNA synthetases, amino acids and nucleotide bases. Comment: 10 pages; (v2) expanded to include additional features, including the likely relation to the operational code of the tRNA-acceptor stem. Version to be published in Journal of Theoretical Biology.

    Kernel-based machine learning protocol for predicting DNA-binding proteins

    DNA-binding proteins (DNA-BPs) play a pivotal role in various intra- and extra-cellular activities ranging from DNA replication to gene expression control. Earlier attempts to identify DNA-BPs from their sequence and structural information achieved only moderate accuracy. Here we develop a machine learning protocol for the prediction of DNA-BPs in which the classifier is a support vector machine (SVM). The information used for classification is derived from characteristics that include surface and overall composition, overall charge, and positive potential patches on the protein surface. In total, 121 DNA-BPs and 238 non-binding proteins are used to build and evaluate the protocol. In the self-consistency test, an accuracy of 100% is achieved. For cross-validation (CV) optimization over the entire dataset, we report an accuracy of 90%. Using leave-1-pair holdout evaluation, an accuracy of 86.3% is achieved. When we restrict the dataset to proteins with less than 20% sequence identity to one another, the holdout accuracy is 85.8%. Furthermore, seven DNA-BPs with unbounded structures are all correctly predicted. These results are better than those published previously. The higher accuracy originates from two factors: the ability of the SVM to handle features that demonstrate a wide range of discriminatory power, and a different definition of the positive patch. Since our protocol does not rely on sequence or structural homology, it can be used to identify or predict proteins with DNA-binding function(s) regardless of their homology to known ones.

    Nanomechanics of a Hydrogen Molecule Suspended between Two Equally Charged Tips

    The geometric configuration and energy of a hydrogen molecule centered between two point-shaped tips of equal charge are calculated with the variational quantum Monte Carlo (QMC) method without the restriction of the Born-Oppenheimer (BO) approximation. The ground-state nuclear distribution, stability, and low vibrational excitation are investigated. The ground-state results predict significant deviations from the BO treatment that is based on a potential energy surface (PES) obtained with the same QMC accuracy. The quantum mechanical distribution of molecular axis direction and bond length at a sub-nanometer level is fundamental for understanding nanomechanical dynamics with embedded hydrogen. Because of the tips' arrangement, cylindrical symmetry yields a uniform azimuthal distribution of the molecular axis vector relative to the tip-tip axis. As the tips approach each other, the QMC sampling shows an increasing loss of spherical symmetry: the molecular axis remains uniformly distributed over the azimuthal angle, but it is peaked along the tip-tip direction for negative tip charge and in the equatorial plane for positive tip charge. The molecule can be switched between these two stable configurations by changing the sign of the tip charge and by controlling the tip-tip distance. This suggests an application in the field of molecular machines. Comment: 20 pages, 10 figures.