2,521 research outputs found

    Deep Learning for Genomics: A Concise Overview

    Advancements in genomic research such as high-throughput sequencing techniques have driven modern genomic studies into "big data" disciplines. This data explosion is constantly challenging conventional methods used in genomics. In parallel with the urgent demand for robust algorithms, deep learning has succeeded in a variety of fields such as vision, speech, and text processing. Yet genomics entails unique challenges to deep learning since we are expecting from deep learning a superhuman intelligence that explores beyond our knowledge to interpret the genome. A powerful deep learning model should rely on insightful utilization of task-specific knowledge. In this paper, we briefly discuss the strengths of different deep learning models from a genomic perspective so as to fit each particular task with a proper deep architecture, and remark on practical considerations of developing modern deep learning architectures for genomics. We also provide a concise review of deep learning applications in various aspects of genomic research, as well as pointing out potential opportunities and obstacles for future genomics applications. Comment: Invited chapter for Springer Book: Handbook of Deep Learning Application

    Artificial intelligence used in genome analysis studies

    Next Generation Sequencing (NGS) or deep sequencing technology enables parallel reading of multiple individual DNA fragments, thereby enabling the identification of millions of base pairs in several hours. Recent research has clearly shown that machine learning technologies can efficiently analyse large sets of genomic data and help to identify novel gene functions and regulation regions. A deep artificial neural network consists of a group of artificial neurons that mimic the properties of living neurons. These mathematical models, termed Artificial Neural Networks (ANNs), can be used to solve artificial intelligence engineering problems in several different technological fields (e.g., biology, genomics, proteomics, and metabolomics). In practical terms, neural networks are non-linear statistical structures that are organized as modelling tools and are used to simulate complex genomic relationships between inputs and outputs. To date, Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) have been demonstrated to be the best tools for improving performance in problem solving tasks within the genomic field.

    Deep Learning Methods for Detection and Tracking of Particles in Fluorescence Microscopy Images

    Studying the dynamics of sub-cellular structures such as receptors, filaments, and vesicles is a prerequisite for investigating cellular processes at the molecular level. In addition, it is important to characterize the dynamic behavior of virus structures to gain a better understanding of infection mechanisms and to develop novel drugs. To investigate the dynamics of fluorescently labeled sub-cellular and viral structures, time-lapse fluorescence microscopy is the most often used imaging technique. Due to the limited spatial resolution of microscopes caused by diffraction, these very small structures appear as bright, blurred spots, denoted as particles, in microscopy images. To draw statistically meaningful biological conclusions, a large number of such particles need to be analyzed. However, since manual analysis of fluorescent particles is very time consuming, fully automated computer-based methods are indispensable. We introduce novel deep learning methods for detection and tracking of multiple particles in fluorescence microscopy images. We propose a particle detection method based on a convolutional neural network which performs image-to-image mapping by density map regression and uses the adaptive wing loss. For particle tracking, we present a recurrent neural network that exploits past and future information in both forward and backward direction. Assignment probabilities across multiple detections as well as the probabilities for missing detections are computed jointly. To resolve tracking ambiguities using future information, several track hypotheses are propagated to later time points. In addition, we developed a novel probabilistic deep learning method for particle tracking, which is based on a recurrent neural network mimicking classical Bayesian filtering. The method includes both aleatoric and epistemic uncertainty, and provides valuable information about the reliability of the computed trajectories. 
Short and long-term temporal dependencies of individual object dynamics are exploited for state prediction, and assigned detections are used to update the predicted states. Moreover, we developed a convolutional Long Short-Term Memory neural network for combined particle tracking and colocalization analysis in two-channel microscopy image sequences. The network determines colocalization probabilities, and colocalization information is exploited to improve tracking. Short and long-term temporal dependencies of object motion as well as image intensities are taken into account to compute assignment probabilities jointly across multiple detections. We also introduce a deep learning method for probabilistic particle detection and tracking. For particle detection, temporal information is integrated to regress a density map and determine sub-pixel particle positions. For tracking, a fully Bayesian neural network is presented that mimics classical Bayesian filtering and takes into account both aleatoric and epistemic uncertainty. Uncertainty information of individual particle detections is considered. Network training for the developed deep learning-based particle tracking methods relies only on synthetic data, avoiding the need for time-consuming manual annotation. We performed an extensive evaluation of our methods based on image data of the Particle Tracking Challenge as well as on fluorescence microscopy images displaying virus proteins of HCV and HIV, chromatin structures, and cell-surface receptors. It turned out that the methods outperform previous methods.
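
The abstract above describes particle detection as density map regression but does not say how a target density map is built. As a rough illustration only, the sketch below assumes a common construction: one unnormalized Gaussian bump per particle position. The function name `density_map`, the `sigma` parameter, and the Gaussian kernel itself are assumptions for illustration, not details taken from the work.

```python
import math


def density_map(positions, height, width, sigma=1.0):
    """Build a target density map with one Gaussian bump per particle.

    Assumption: each particle contributes an unnormalized Gaussian
    (peak value 1.0) centred on its (row, column) position, which may
    be sub-pixel. The abstract does not specify the kernel.
    """
    grid = [[0.0] * width for _ in range(height)]
    for row_pos, col_pos in positions:
        for y in range(height):
            for x in range(width):
                squared_distance = (y - row_pos) ** 2 + (x - col_pos) ** 2
                grid[y][x] += math.exp(-squared_distance / (2 * sigma ** 2))
    return grid


# One particle at row 2, column 3 of a 5 x 6 image.
target = density_map([(2.0, 3.0)], height=5, width=6)
```

A network trained for this kind of task would regress such a map from the image, and particle positions would then be read off as local maxima of the predicted map.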

    CORENup: a combination of convolutional and recurrent deep neural networks for nucleosome positioning identification

    Background: Nucleosomes wrap the DNA into the nucleus of the Eukaryote cell and regulate its transcription phase. Several studies indicate that nucleosomes are determined by the combined effects of several factors, including DNA sequence organization. Interestingly, the identification of nucleosomes on a genomic scale has been successfully performed by computational methods using DNA sequence as input data. Results: In this work, we propose CORENup, a deep learning model for nucleosome identification. CORENup processes a DNA sequence as input using one-hot representation and combines in a parallel fashion a fully convolutional neural network and a recurrent layer. These two parallel levels are devoted to catching both non-periodic and periodic DNA string features. A dense layer is devoted to their combination to give a final classification. Conclusions: Results computed on public data sets of different organisms show that CORENup is a state of the art methodology for nucleosome positioning identification based on a Deep Neural Network architecture. The comparisons have been carried out using two groups of datasets, currently adopted by the best performing methods, and CORENup has shown top performance both in terms of classification metrics and elapsed computation time.
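
The one-hot representation mentioned in the abstract above can be illustrated with a minimal sketch. The helper name `one_hot_encode`, the A/C/G/T column order, and the all-zero row for unknown bases such as N are assumptions made for this example; the abstract does not state how CORENup handles these details.

```python
# Assumed column order; the abstract does not specify one.
ALPHABET = "ACGT"


def one_hot_encode(sequence):
    """Encode a DNA string as one row of four 0/1 values per base.

    Assumption: bases outside A/C/G/T (for example N) map to an
    all-zero row.
    """
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = []
    for base in sequence.upper():
        row = [0] * len(ALPHABET)
        if base in index:
            row[index[base]] = 1
        encoded.append(row)
    return encoded


example = one_hot_encode("ACGTN")
# example == [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 0, 0]]
```

A matrix of this shape (sequence length by four) is the kind of input that both a convolutional branch and a recurrent branch can consume in parallel.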

    To Transformers and Beyond: Large Language Models for the Genome

    In the rapidly evolving landscape of genomics, deep learning has emerged as a useful tool for tackling complex computational challenges. This review focuses on the transformative role of Large Language Models (LLMs), which are mostly based on the transformer architecture, in genomics. Building on the foundation of traditional convolutional neural networks and recurrent neural networks, we explore both the strengths and limitations of transformers and other LLMs for genomics. Additionally, we contemplate the future of genomic modeling beyond the transformer architecture based on current trends in research. The paper aims to serve as a guide for computational biologists and computer scientists interested in LLMs for genomic data. We hope the paper can also serve as an educational introduction and discussion for biologists to a fundamental shift in how we will be analyzing genomic data in the future.

    Opportunities and obstacles for deep learning in biology and medicine

    Deep learning describes a class of machine learning algorithms that are capable of combining raw inputs into layers of intermediate features. These algorithms have recently shown impressive results across a variety of domains. Biology and medicine are data-rich disciplines, but the data are complex and often ill-understood. Hence, deep learning techniques may be particularly well suited to solve problems of these fields. We examine applications of deep learning to a variety of biomedical problems (patient classification, fundamental biological processes and treatment of patients) and discuss whether deep learning will be able to transform these tasks or if the biomedical sphere poses unique challenges. Following from an extensive literature review, we find that deep learning has yet to revolutionize biomedicine or definitively resolve any of the most pressing challenges in the field, but promising advances have been made on the prior state of the art. Even though improvements over previous baselines have been modest in general, the recent progress indicates that deep learning methods will provide valuable means for speeding up or aiding human investigation. Though progress has been made linking a specific neural network's prediction to input features, understanding how users should interpret these models to make testable hypotheses about the system under study remains an open challenge. Furthermore, the limited amount of labelled data for training presents problems in some domains, as do legal and privacy constraints on work with sensitive health records. Nonetheless, we foresee deep learning enabling changes at both bench and bedside with the potential to transform several areas of biology and medicine.