4,490 research outputs found

    Deep Learning for Genomics: A Concise Overview

    Full text link
    Advancements in genomic research such as high-throughput sequencing techniques have driven modern genomic studies into "big data" disciplines. This data explosion is constantly challenging conventional methods used in genomics. In parallel with the urgent demand for robust algorithms, deep learning has succeeded in a variety of fields such as vision, speech, and text processing. Yet genomics entails unique challenges to deep learning since we are expecting from deep learning a superhuman intelligence that explores beyond our knowledge to interpret the genome. A powerful deep learning model should rely on insightful utilization of task-specific knowledge. In this paper, we briefly discuss the strengths of different deep learning models from a genomic perspective so as to fit each particular task with a proper deep architecture, and remark on practical considerations of developing modern deep learning architectures for genomics. We also provide a concise review of deep learning applications in various aspects of genomic research, as well as pointing out potential opportunities and obstacles for future genomics applications.Comment: Invited chapter for Springer Book: Handbook of Deep Learning Application

    Antimicrobial peptide identification using multi-scale convolutional network

    Get PDF
    Background: Antibiotic resistance has become an increasingly serious problem in the past decades. As an alternative choice, antimicrobial peptides (AMPs) have attracted lots of attention. To identify new AMPs, machine learning methods have been commonly used. More recently, some deep learning methods have also been applied to this problem. Results: In this paper, we designed a deep learning model to identify AMP sequences. We employed the embedding layer and the multi-scale convolutional network in our model. The multi-scale convolutional network, which contains multiple convolutional layers of varying filter lengths, could utilize all latent features captured by the multiple convolutional layers. To further improve the performance, we also incorporated additional information into the designed model and proposed a fusion model. Results showed that our model outperforms the state-of-the-art models on two AMP datasets and the Antimicrobial Peptide Database (APD)3 benchmark dataset. The fusion model also outperforms the state-of-the-art model on an anti-inflammatory peptides (AIPs) dataset at the accuracy. Conclusions: Multi-scale convolutional network is a novel addition to existing deep neural network (DNN) models. The proposed DNN model and the modified fusion model outperform the state-of-the-art models for new AMP discovery. The source code and data are available at https://github.com/zhanglabNKU/APIN

    Functional Scaffolding for Musical Composition: A New Approach in Computer-Assisted Music Composition

    Get PDF
    While it is important for systems intended to enhance musical creativity to define and explore musical ideas conceived by individual users, many limit musical freedom by focusing on maintaining musical structure, thereby impeding the user\u27s freedom to explore his or her individual style. This dissertation presents a comprehensive body of work that introduces a new musical representation that allows users to explore a space of musical rules that are created from their own melodies. This representation, called functional scaffolding for musical composition (FSMC), exploits a simple yet powerful property of multipart compositions: The pattern of notes and rhythms in different instrumental parts of the same song are functionally related. That is, in principle, one part can be expressed as a function of another. Music in FSMC is represented accordingly as a functional relationship between an existing human composition, or scaffold, and an additional generated voice. This relationship is encoded by a type of artificial neural network called a compositional pattern producing network (CPPN). A human user without any musical expertise can then explore how these additional generated voices should relate to the scaffold through an interactive evolutionary process akin to animal breeding. The utility of this insight is validated by two implementations of FSMC called NEAT Drummer and MaestroGenesis, that respectively help users tailor drum patterns and complete multipart arrangements from as little as a single original monophonic track. The five major contributions of this work address the overarching hypothesis in this dissertation that functional relationships alone, rather than specialized music theory, are sufficient for generating plausible additional voices. First, to validate FSMC and determine whether plausible generated voices result from the human-composed scaffold or intrinsic properties of the CPPN, drum patterns are created with NEAT Drummer to accompany several different polyphonic pieces. Extending the FSMC approach to generate pitched voices, the second contribution reinforces the importance of functional transformations through quality assessments that indicate that some partially FSMC-generated pieces are indistinguishable from those that are fully human. While the third contribution focuses on constructing and exploring a space of plausible voices with MaestroGenesis, the fourth presents results from a two-year study where students discuss their creative experience with the program. Finally, the fifth contribution is a plugin for MaestroGenesis called MaestroGenesis Voice (MG-V) that provides users a more natural way to incorporate MaestroGenesis in their creative endeavors by allowing scaffold creation through the human voice. Together, the chapters in this dissertation constitute a comprehensive approach to assisted music generation, enabling creativity without the need for musical expertise

    Feature-based time-series analysis

    Full text link
    This work presents an introduction to feature-based time-series analysis. The time series as a data type is first described, along with an overview of the interdisciplinary time-series analysis literature. I then summarize the range of feature-based representations for time series that have been developed to aid interpretable insights into time-series structure. Particular emphasis is given to emerging research that facilitates wide comparison of feature-based representations that allow us to understand the properties of a time-series dataset that make it suited to a particular feature-based representation or analysis algorithm. The future of time-series analysis is likely to embrace approaches that exploit machine learning methods to partially automate human learning to aid understanding of the complex dynamical patterns in the time series we measure from the world.Comment: 28 pages, 9 figure

    Principal Patterns on Graphs: Discovering Coherent Structures in Datasets

    Get PDF
    Graphs are now ubiquitous in almost every field of research. Recently, new research areas devoted to the analysis of graphs and data associated to their vertices have emerged. Focusing on dynamical processes, we propose a fast, robust and scalable framework for retrieving and analyzing recurring patterns of activity on graphs. Our method relies on a novel type of multilayer graph that encodes the spreading or propagation of events between successive time steps. We demonstrate the versatility of our method by applying it on three different real-world examples. Firstly, we study how rumor spreads on a social network. Secondly, we reveal congestion patterns of pedestrians in a train station. Finally, we show how patterns of audio playlists can be used in a recommender system. In each example, relevant information previously hidden in the data is extracted in a very efficient manner, emphasizing the scalability of our method. With a parallel implementation scaling linearly with the size of the dataset, our framework easily handles millions of nodes on a single commodity server
    • …
    corecore