687 research outputs found

    A knowledge-based weighting framework to boost the power of genome-wide association studies

    Background: We are moving to second-wave analysis of genome-wide association studies (GWAS), characterized by comprehensive bioinformatic and statistical evaluation of genetic associations. Existing biological knowledge is very valuable for GWAS and may help improve their detection power, particularly for disease susceptibility loci of moderate effect size. However, a challenging question is how to use the available, very heterogeneous resources to quantitatively evaluate statistical significance. Methodology/Principal Findings: We present a novel knowledge-based weighting framework to boost the power of GWAS and strengthen their exploratory performance for follow-up replication and deep sequencing. Built upon diverse integrated biological knowledge, this framework directly models both the prior functional information and the association significance emerging from GWAS to optimally highlight single nucleotide polymorphisms (SNPs) for subsequent replication. In theoretical calculation and computer simulation, it shows great potential to achieve over 15% extra power to identify an association signal of moderate strength, or to approach similar power with hundreds fewer whole-genome subjects. In a proof-of-principle case study on late-onset Alzheimer disease (LOAD), it highlighted some genes that showed positive association with LOAD in previous independent studies, as well as two important LOAD-related pathways. These genes and pathways could originally have been ignored because the SNPs involved have only moderate association significance. Conclusions/Significance: With a user-friendly implementation in an open-source Java package, this powerful framework will provide an important complementary solution for identifying more true susceptibility loci with modest or even small effect size in current GWAS for complex diseases. © 2010 Li et al.

    Protein Long Local Structure Prediction

    A relevant and accurate description of three-dimensional (3D) protein structures can be achieved by characterizing recurrent local structures. In a previous study, we developed a library of 120 3D structural prototypes encompassing all known 11-residue-long local protein structures and ensuring a good quality of structural approximation. A local structure prediction method was also proposed. Here, overlapping properties of local protein structures in global ones are taken into account to characterize frequent local networks. At the same time, we propose a new long local structure prediction strategy that couples evolutionary information with Support Vector Machines (SVMs). Our prediction is evaluated by a stringent geometrical assessment: every local structure prediction with a Cα RMSD of less than 2.5 Å from the true local structure is considered correct. A global prediction rate of 63.1% is then reached, an improvement of 7.7 points over the previous strategy. Likewise, prediction improves for 88.33% of the 120 structural classes, with a mean gain of 8.65%, and 85.33% of proteins have better prediction results, with an average gain of 9.43%. An analysis of prediction rate per local network also supports the global improvement and gives insights into the potential of our method for predicting super local structures. Moreover, a confidence index for the direct estimation of prediction quality is proposed. Finally, our method proves to be very competitive with cutting-edge strategies encompassing three categories of local structure predictions. Proteins 2009. © 2009 Wiley-Liss, Inc.
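The correctness criterion in this abstract (a Cα RMSD below the 2.5 Å cutoff) can be illustrated with a minimal sketch. It assumes the predicted and true fragments are already superimposed, so no structural fitting is performed; the function names `ca_rmsd` and `is_correct` are hypothetical and are not taken from the paper.

```python
import math

RMSD_THRESHOLD = 2.5  # angstroms, the cutoff quoted in the abstract


def ca_rmsd(predicted, reference):
    """RMSD between two equal-length lists of (x, y, z) C-alpha coordinates.

    Assumption: the two fragments are already superimposed; this function
    does not perform any optimal fitting.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("fragments must be non-empty and of equal length")
    squared = sum(
        (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
        for (px, py, pz), (rx, ry, rz) in zip(predicted, reference)
    )
    return math.sqrt(squared / len(predicted))


def is_correct(predicted, reference, threshold=RMSD_THRESHOLD):
    """A prediction counts as correct when its RMSD is below the threshold."""
    return ca_rmsd(predicted, reference) < threshold
```

A real evaluation would first superimpose each predicted fragment on the true one (for example with a least-squares fit) before measuring the deviation.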

    Prediction of High-throughput Protein-Protein Interactions and Calmodulin Binding Using Short Linear Motifs

    Prediction of protein-protein interactions (PPIs) is a difficult and important problem in biology. Although high-throughput technologies have made remarkable progress, the predictions are often inaccurate and include high rates of both false positives and false negatives. In addition, prediction of calmodulin-binding (CaM-binding) proteins is a problem that has been investigated deeply, though computational approaches for their prediction are not well developed. Short linear motifs (SLiMs), on the other hand, are being used effectively as features for analyzing PPIs, though their properties have not been used in high-throughput interactions. We propose a new method for prediction of high-throughput PPIs and CaM-binding proteins based on counting SLiMs in protein sequences with specific scoring functions. The method has been tested on a positive dataset of 50 protein pairs obtained from the PrePPI database, a negative dataset of 38 protein pairs obtained from the Negatome-PDB 2.0 database, and 387 proteins from the CaM database. We used Multiple EM for Motif Elucidation (MEME) to obtain motifs for each of the positive and negative datasets. Our method shows promising results and demonstrates that information contained in SLiMs is highly relevant for accurate prediction of high-throughput PPIs and CaM-binding proteins. In addition to efficient prediction, individual SLiMs bring extra information on patterns that may be linked to specific roles in protein function.
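The idea of counting SLiMs in a protein sequence can be sketched as follows. This is only an illustration under stated assumptions: the motif patterns are placeholder regular expressions, not the MEME-derived motifs from the paper, and `density_score` is a hypothetical scoring function, since the abstract does not describe the scoring functions actually used.

```python
import re

# Illustrative motif patterns written as regular expressions; they are
# placeholders, not the motifs reported in the paper.
MOTIFS = {
    "PxxP": r"P..P",
    "RxL": r"[RK].L",
}


def count_motifs(sequence, motifs=MOTIFS):
    """Count occurrences of each motif, including overlapping ones."""
    counts = {}
    for name, pattern in motifs.items():
        # A zero-width lookahead lets successive matches overlap.
        counts[name] = len(re.findall(f"(?={pattern})", sequence))
    return counts


def density_score(sequence, motifs=MOTIFS):
    """Hypothetical score: total motif hits per 100 residues."""
    if not sequence:
        return 0.0
    total = sum(count_motifs(sequence, motifs).values())
    return 100.0 * total / len(sequence)
```

Per-motif counts like these could then serve as features for a classifier that separates positive from negative protein pairs.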

    Improved Alzheimer’s disease detection by MRI using multimodal machine learning algorithms

    Dementia is one of the major medical problems challenging the public health sector around the world. It generally occurs in older adults (age > 60). There are currently no drugs that cure the disease, and it can directly impair memory and diminish the ability to perform daily activities. Health experts and computing scientists have been researching this issue for the past twenty years. Nevertheless, there is an immediate need to find the relevant characteristics that enable the identification of dementia. The motivation behind the work presented in this thesis is to propose a sophisticated supervised machine learning model for the prediction and classification of AD in elderly people. To that end, we conducted different experiments on open-access brain imaging data, including demographic MRI data from 373 scan sessions of 150 patients. In the first two works, we applied single ML models, support vector machines (SVM) and pruned decision trees, to predict dementia on the same dataset. In the first experiment, with SVM, we achieved 70% prediction accuracy for late-stage dementia, with a precision (classification of true dementia subjects) of 75%. In the second experiment, with J48 pruned decision trees, the accuracy improved to 88.73% and the precision to 92.4%. To enhance this work, we employed multi-model approaches rather than single models. In the comparative analysis of the machine learning study, we applied a feature reduction technique, principal component analysis, to identify the highly correlated features in the dataset that are closely associated with dementia type. By applying three models simultaneously (KNN, LR, and SVM), it was possible to identify an ideal model for the classification of dementia subjects. Compared with SVM, the KNN and LR models classified AD subjects with 97.6% and 98.3% accuracy respectively, higher than in the previous experiments. However, because of the severity of AD in older adults, it is essential not to miss true AD positives. To improve the classification of true AD subjects among all subjects, we introduced three independent experiments. In this work, we incorporated two new models, Naïve Bayes (NB) and artificial neural networks, alongside SVM and KNN. In the first experiment, models were developed independently with manual feature selection. The outcome suggested that KNN 3 is the optimal model, with 91.32% classification accuracy. In the second experiment, the same models were tested with a limited set of highly correlated features. SVM produced a high classification accuracy of 96.12%, and NB produced a 98.21% classification rate for true AD subjects. Finally, in the third experiment, we combined these four models into a new hybrid model. The hybrid model's performance was validated with an AU-ROC value of 0.991 (reported as 99.1% classification accuracy). All these experimental results suggest that the ensemble modelling approach with wrapping is an optimal solution for the classification of AD subjects.
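The abstract reports accuracy and precision and stresses that true AD positives should not be missed, which corresponds to recall. A minimal sketch of how these three quantities relate to confusion-matrix counts is given below; the function names are hypothetical and the labels in the usage note are toy values, not results from the thesis.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for binary labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision and recall; undefined ratios are reported as 0.0."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }
```

For example, `classification_metrics([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 1, 1, 0])` gives 0.75 for all three metrics on this toy data. Note that an area under the ROC curve is a ranking measure and is not in general numerically equal to accuracy.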

    Computational Intelligence in Healthcare

    This book is a printed edition of the Special Issue Computational Intelligence in Healthcare that was published in Electronic

    Computational Intelligence in Healthcare

    The volume of patient health data was estimated to have reached 2314 exabytes by 2020. Traditional data analysis techniques are unsuitable for extracting useful information from such a vast quantity of data. Thus, intelligent data analysis methods that combine human expertise and computational models for accurate and in-depth data analysis are necessary. The technological revolution and medical advances made by combining vast quantities of available data, cloud computing services, and AI-based solutions can provide expert insight and analysis on a mass scale and at a relatively low cost. Computational intelligence (CI) methods, such as fuzzy models, artificial neural networks, evolutionary algorithms, and probabilistic methods, have recently emerged as promising tools for the development and application of intelligent systems in healthcare practice. CI-based systems can learn from data and evolve according to changes in their environment by taking into account the uncertainty characterizing health data, including omics data, clinical data, sensor data, and imaging data. The use of CI in healthcare can improve the processing of such data to develop intelligent solutions for prevention, diagnosis, treatment, and follow-up, as well as for the analysis of administrative processes. The present Special Issue on computational intelligence for healthcare is intended to show the potential and the practical impacts of CI techniques in challenging healthcare applications.