373 research outputs found

    Computational strategies for dissecting the high-dimensional complexity of adaptive immune repertoires

    The adaptive immune system recognizes antigens via an immense array of antigen-binding antibodies and T-cell receptors, the immune repertoire. The interrogation of immune repertoires is of high relevance for understanding the adaptive immune response in disease and infection (e.g., autoimmunity, cancer, HIV). Adaptive immune receptor repertoire sequencing (AIRR-seq) has driven the quantitative and molecular-level profiling of immune repertoires, thereby revealing the high-dimensional complexity of the immune receptor sequence landscape. Several methods for the computational and statistical analysis of large-scale AIRR-seq data have been developed to resolve immune repertoire complexity in order to understand the dynamics of adaptive immunity. Here, we review the current research on (i) diversity, (ii) clustering and network, (iii) phylogenetic and (iv) machine learning methods applied to dissect, quantify and compare the architecture, evolution, and specificity of immune repertoires. We summarize outstanding questions in computational immunology and propose future directions for systems immunology towards coupling AIRR-seq with the computational discovery of immunotherapeutics, vaccines, and immunodiagnostics. Comment: 27 pages, 2 figures

    Pacific Symposium on Biocomputing 2023

    The Pacific Symposium on Biocomputing (PSB) 2023 is an international, multidisciplinary conference for the presentation and discussion of current research in the theory and application of computational methods in problems of biological significance. Presentations are rigorously peer reviewed and are published in an archival proceedings volume. PSB 2023 will be held on January 3-7, 2023 in Kohala Coast, Hawaii. Tutorials and workshops will be offered prior to the start of the conference. PSB 2023 will bring together top researchers from the US, the Asian Pacific nations, and around the world to exchange research results and address open issues in all aspects of computational biology. It is a forum for the presentation of work in databases, algorithms, interfaces, visualization, modeling, and other computational methods, as applied to biological problems, with emphasis on applications in data-rich areas of molecular biology. The PSB has been designed to be responsive to the need for critical mass in sub-disciplines within biocomputing. For that reason, it is the only meeting whose sessions are defined dynamically each year in response to specific proposals. PSB sessions are organized by leaders of research in biocomputing's 'hot topics.' In this way, the meeting provides an early forum for serious examination of emerging methods and approaches in this rapidly changing field.

    Machine Learning for Multiclass Classification and Prediction of Alzheimer's Disease

    Alzheimer's disease (AD) is an irreversible neurodegenerative disorder and a common form of dementia. This research aims to develop machine learning algorithms that diagnose and predict the progression of AD from multimodal heterogeneous biomarkers, with a focus on early diagnosis. To meet this goal, several machine learning-based methods, each with unique characteristics for feature extraction and automated classification, prediction, and visualization, have been developed to discern subtle progression trends and predict the trajectory of disease progression. The methodology envisioned aims to enhance both the multiclass classification accuracy and prediction outcomes by effectively modeling the interplay between the multimodal biomarkers, handling the missing data challenge, and adequately extracting all the relevant features that will be fed into the machine learning framework, all in order to understand the subtle changes that happen in the different stages of the disease. This research will also investigate the notion of multitasking to discover how the two processes of multiclass classification and prediction relate to one another in terms of the features they share and whether they could learn from one another to optimize multiclass classification and prediction accuracy. This research work also delves into predicting cognitive scores of specific tests over time, using multimodal longitudinal data. The intent is to improve our ability to analyze how the different multimodal features in the input space relate to the predicted cognitive scores. Moreover, the power of modality fusion, kernelization, and tensorization has also been investigated to efficiently extract important features hidden in the lower-dimensional feature space without being distracted by those deemed irrelevant.
With the adage that a picture is worth a thousand words, this dissertation introduces a unique color-coded visualization system with a fully integrated machine learning model for the enhanced diagnosis and prognosis of Alzheimer's disease. The incentive here is to show that through visualization, the challenges imposed by both the variability and interrelatedness of the multimodal features could be overcome. Ultimately, this form of visualization via machine learning informs on the challenges faced with multiclass classification and adds insight into the decision-making process for a diagnosis and prognosis.

    Artificial intelligence (AI) in rare diseases: is the future brighter?

    The amount of data collected and managed in (bio)medicine is ever-increasing. Thus, there is a need to rapidly and efficiently collect, analyze, and characterize all this information. Artificial intelligence (AI), with an emphasis on deep learning, holds great promise in this area and is already being successfully applied to basic research, diagnosis, drug discovery, and clinical trials. Rare diseases (RDs), which are severely underrepresented in basic and clinical research, can particularly benefit from AI technologies. Of the more than 7000 RDs described worldwide, only 5% have a treatment. The ability of AI technologies to integrate and analyze data from different sources (e.g., multi-omics, patient registries, and so on) can be used to overcome RDs' challenges (e.g., low diagnostic rates, reduced number of patients, geographical dispersion, and so on). Ultimately, AI-mediated knowledge of RDs could significantly boost therapy development. AI approaches are already being used in RDs, and this review aims to collect and summarize these advances. A section dedicated to congenital disorders of glycosylation (CDG), a particular group of orphan RDs that can serve as a potential study model for other common diseases and RDs, has also been included.

    MACHINE LEARNING METHODS FOR PREDICTION OF HUMAN INFECTIOUS VIRUS AND IMPUTATION OF HLA ALLELES

    This dissertation contains three chapters. The following is a concise description of each chapter. In Chapter 1, we introduced the Random Forest, a machine learning method, to predict whether a virus is capable of infecting humans. The COVID-19 pandemic showed the importance of predicting, from its genomic sequence, whether a zoonotic virus can infect humans. We used the k-mer with k = … and k = … as features of a virus to predict whether it can infect humans. We further employed the Boruta algorithm to select the important features, then fed those features into the Random Forest method to train the model and make predictions. When tested on a dataset independent of the training dataset, the accuracy of the training step is almost the same as that of an existing model, but the accuracy in the testing step is substantially improved. Moreover, our method takes much less time than the existing model. In Chapter 2, we developed a new application of the Long Short-Term Memory (LSTM) deep learning method for human leukocyte antigen (HLA) allele imputation and implemented it in a software package called LSTM*HLA. Methods for HLA allele imputation utilize single nucleotide polymorphisms (SNPs) around HLA loci and their relationship with HLA alleles to predict HLA alleles. This is fundamentally similar to the scheme of a bidirectional LSTM. We organized several consecutive SNPs together as an element of inputs for each cell of the LSTM algorithm and made a final imputation for HLA alleles by averaging results from different sets of hyperparameters. We evaluated our method on seven real data sets and compared its performance with two commonly used methods for HLA imputation: CookHLA as the representative of conventional approaches and Deep*HLA as the representative of machine learning methods.
We find that our method not only performs well when the reference samples and the target samples are from the same ethnic group, but also achieves high accuracy when they are from different ethnicities. Moreover, because deep learning methods are less dependent on linkage disequilibrium, LSTM*HLA could enhance the imputation accuracy for low-frequency HLA alleles, which has great influence in the fields of clinical research and personal care. In Chapter 3, we investigated how two factors, the sample size and the choice of reference samples, affect the accuracy of HLA imputation, since both need to be carefully considered in real studies. As our results show, a reference panel of more than 50 individuals is highly recommended to achieve high imputation accuracy. For the choice of reference panel, a panel with the same ethnicity as the target samples is strongly suggested. Expanding the reference panel with multiple similar ethnic groups may also improve the accuracy; however, augmenting it with unrelated ethnic groups would decrease the imputation accuracy.

    Combining Molecular, Imaging, and Clinical Data Analysis for Predicting Cancer Prognosis

    Cancer is one of the most detrimental diseases globally. Accordingly, the prognosis prediction of cancer patients has become a field of interest. In this review, we have gathered 43 state-of-the-art scientific papers published in the last 6 years that built cancer prognosis predictive models using multimodal data. We have defined the multimodality of data as four main types: clinical, anatomopathological, molecular, and medical imaging; and we have expanded on the information that each modality provides. The 43 studies were divided into three categories based on the modelling approach taken, and their characteristics were further discussed together with current issues and future trends. Research in this area has evolved from survival analysis through statistical modelling using mainly clinical and anatomopathological data to the prediction of cancer prognosis through a multi-faceted data-driven approach by the integration of complex, multimodal, and high-dimensional data containing multi-omics and medical imaging information and by applying Machine Learning and, more recently, Deep Learning techniques. This review concludes that cancer prognosis predictive multimodal models are capable of better stratifying patients, which can improve clinical management and contribute to the implementation of personalised medicine as well as provide new and valuable knowledge on cancer biology and its progression.

    3D statistical shape analysis of the face in Apert syndrome

    Timely diagnosis of craniofacial syndromes as well as adequate timing and choice of surgical technique are essential for proper care management. Statistical shape models and machine learning approaches are playing an increasing role in medicine and have proven their usefulness. Frameworks that automate processes have become more popular. The use of 2D photographs for automated syndromic identification has shown its potential with the Face2Gene application. Yet, using 3D shape information without texture has not been studied in such depth. Moreover, the use of these models to understand shape change during growth and their applicability for surgical outcome measurements have not been analysed at length. This thesis presents a framework using state-of-the-art machine learning and computer vision algorithms to explore possibilities for automated syndrome identification based on shape information only. The purpose of this was to enhance understanding of the natural development of the Apert syndromic face and its abnormality as compared to a normative group. An additional method was used to objectify changes as a result of facial bipartition distraction, a common surgical correction technique, providing information on the success and the inadequacies of the procedure in terms of facial normalisation. Growth curves were constructed to further quantify facial abnormalities in Apert syndrome over time, along with 3D shape models for intuitive visualisation of the shape variations. Post-operative models were built and compared with age-matched normative data to understand where normalisation is falling short. The findings in this thesis provide markers for future translational research and may accelerate the adoption of next-generation diagnostics and surgical planning tools to further supplement the clinical decision-making process and ultimately to improve patients’ quality of life.