711 research outputs found

    MULocDeep: A deep-learning framework for protein subcellular and suborganellar localization prediction with residue-level interpretation

    Get PDF
    Prediction of protein localization plays an important role in understanding protein function and mechanisms. In this paper, we propose a general deep learning-based localization prediction framework, MULocDeep, which can predict multiple localizations of a protein at both subcellular and suborganellar levels. We collected a dataset with 44 suborganellar localization annotations in 10 major subcellular compartments—the most comprehensive suborganelle localization dataset to date. We also experimentally generated an independent dataset of mitochondrial proteins in Arabidopsis thaliana cell cultures, Solanum tuberosum tubers, and Vicia faba roots and made this dataset publicly available. Evaluations using the above datasets show that overall, MULocDeep outperforms other major methods at both subcellular and suborganellar levels. Furthermore, MULocDeep assesses each amino acid's contribution to localization, which provides insights into the mechanism of protein sorting and localization motifs. A web server can be accessed at http://mu-loc.org

    Deep Learning for Genomics: A Concise Overview

    Full text link
    Advancements in genomic research such as high-throughput sequencing techniques have driven modern genomic studies into "big data" disciplines. This data explosion is constantly challenging conventional methods used in genomics. In parallel with the urgent demand for robust algorithms, deep learning has succeeded in a variety of fields such as vision, speech, and text processing. Yet genomics entails unique challenges to deep learning since we are expecting from deep learning a superhuman intelligence that explores beyond our knowledge to interpret the genome. A powerful deep learning model should rely on insightful utilization of task-specific knowledge. In this paper, we briefly discuss the strengths of different deep learning models from a genomic perspective so as to fit each particular task with a proper deep architecture, and remark on practical considerations of developing modern deep learning architectures for genomics. We also provide a concise review of deep learning applications in various aspects of genomic research, as well as pointing out potential opportunities and obstacles for future genomics applications.Comment: Invited chapter for Springer Book: Handbook of Deep Learning Application

    DeePromoter: Robust Promoter Predictor Using Deep Learning

    Get PDF
    The promoter region is located near the transcription start sites and regulates transcription initiation of the gene by controlling the binding of RNA polymerase. Thus, promoter region recognition is an important area of interest in the field of bioinformatics. Numerous tools for promoter prediction were proposed. However, the reliability of these tools still needs to be improved. In this work, we propose a robust deep learning model, called DeePromoter, to analyze the characteristics of the short eukaryotic promoter sequences, and accurately recognize the human and mouse promoter sequences. DeePromoter combines a convolutional neural network (CNN) and a long short-term memory (LSTM). Additionally, instead of using non-promoter regions of the genome as a negative set, we derive a more challenging negative set from the promoter sequences. The proposed negative set reconstruction method improves the discrimination ability and significantly reduces the number of false positive predictions. Consequently, DeePromoter outperforms the previously proposed promoter prediction tools. In addition, a web-server for promoter prediction is developed based on the proposed methods and made available at https://home.jbnu.ac.kr/NSCL/deepromoter.htm

    Predicting Protein Therapeutic Candidates for Bovine Babesiosis Using Secondary Structure Properties and Machine Learning

    Get PDF
    Bovine babesiosis causes significant annual global economic loss in the beef and dairy cattle industry. It is a disease instigated from infection of red blood cells by haemoprotozoan parasites of the genus Babesia in the phylum Apicomplexa. Principal species are Babesia bovis, Babesia bigemina, and Babesia divergens. There is no subunit vaccine. Potential therapeutic targets against babesiosis include members of the exportome. This study investigates the novel use of protein secondary structure characteristics and machine learning algorithms to predict exportome membership probabilities. The premise of the approach is to detect characteristic differences that can help classify one protein type from another. Structural properties such as a protein's local conformational classification states, backbone torsion angles ϕ (phi) and ψ (psi), solvent-accessible surface area, contact number, and half-sphere exposure are explored here as potential distinguishing protein characteristics. The presented methods that exploit these structural properties via machine learning are shown to have the capacity to detect exportome from non-exportome Babesia bovis proteins with an 86-92% accuracy (based on 10-fold cross validation and independent testing). These methods are encapsulated in freely available Linux pipelines setup for automated, high-throughput processing. Furthermore, proposed therapeutic candidates for laboratory investigation are provided for B. bovis, B. bigemina, and two other haemoprotozoan species, Babesia canis, and Plasmodium falciparum.</i

    Modeling Taxi Drivers' Behaviour for the Next Destination Prediction

    Full text link
    In this paper, we study how to model taxi drivers' behaviour and geographical information for an interesting and challenging task: the next destination prediction in a taxi journey. Predicting the next location is a well studied problem in human mobility, which finds several applications in real-world scenarios, from optimizing the efficiency of electronic dispatching systems to predicting and reducing the traffic jam. This task is normally modeled as a multiclass classification problem, where the goal is to select, among a set of already known locations, the next taxi destination. We present a Recurrent Neural Network (RNN) approach that models the taxi drivers' behaviour and encodes the semantics of visited locations by using geographical information from Location-Based Social Networks (LBSNs). In particular, RNNs are trained to predict the exact coordinates of the next destination, overcoming the problem of producing, in output, a limited set of locations, seen during the training phase. The proposed approach was tested on the ECML/PKDD Discovery Challenge 2015 dataset - based on the city of Porto -, obtaining better results with respect to the competition winner, whilst using less information, and on Manhattan and San Francisco datasets.Comment: preprint version of a paper submitted to IEEE Transactions on Intelligent Transportation System

    Automatische Schätzung der Körperpose mit CNNs und LSTMs

    Get PDF
    In this thesis, we present an end-to-end approach to human pose estimation task that based on a deep hybrid architecture that combines convolutional neural network (CNNs) and recurrent neural networks (RNNs)
    • …
    corecore