1,722 research outputs found

    IEEE Access special section editorial: scalable deep learning for big data

    Get PDF
    Deep learning (DL) has emerged as a key application exploiting the increasing computational power in systems such as GPUs, multicore processors, Systems-on-Chip (SoC), and distributed clusters. It has also attracted much attention in discovering correlation patterns in data in an unsupervised manner and has been applied in various domains including speech recognition, image classification, natural language processing, and computer vision. Unlike traditional machine learning (ML) approaches, DL also enables dynamic discovery of features from data. In addition, now, a number of commercial vendors also offer accelerators for deep learning systems (such as Nvidia, Intel, and Huawei)

    An Introduction to Programming for Bioscientists: A Python-based Primer

    Full text link
    Computing has revolutionized the biological sciences over the past several decades, such that virtually all contemporary research in the biosciences utilizes computer programs. The computational advances have come on many fronts, spurred by fundamental developments in hardware, software, and algorithms. These advances have influenced, and even engendered, a phenomenal array of bioscience fields, including molecular evolution and bioinformatics; genome-, proteome-, transcriptome- and metabolome-wide experimental studies; structural genomics; and atomistic simulations of cellular-scale molecular assemblies as large as ribosomes and intact viruses. In short, much of post-genomic biology is increasingly becoming a form of computational biology. The ability to design and write computer programs is among the most indispensable skills that a modern researcher can cultivate. Python has become a popular programming language in the biosciences, largely because (i) its straightforward semantics and clean syntax make it a readily accessible first language; (ii) it is expressive and well-suited to object-oriented programming, as well as other modern paradigms; and (iii) the many available libraries and third-party toolkits extend the functionality of the core language into virtually every biological domain (sequence and structure analyses, phylogenomics, workflow management systems, etc.). This primer offers a basic introduction to coding, via Python, and it includes concrete examples and exercises to illustrate the language's usage and capabilities; the main text culminates with a final project in structural bioinformatics. A suite of Supplemental Chapters is also provided. Starting with basic concepts, such as that of a 'variable', the Chapters methodically advance the reader to the point of writing a graphical user interface to compute the Hamming distance between two DNA sequences.Comment: 65 pages total, including 45 pages text, 3 figures, 4 tables, numerous exercises, and 19 pages of Supporting Information; currently in press at PLOS Computational Biolog

    A Revised Publication Model for ECML PKDD

    Full text link
    ECML PKDD is the main European conference on machine learning and data mining. Since its foundation it implemented the publication model common in computer science: there was one conference deadline; conference submissions were reviewed by a program committee; papers were accepted with a low acceptance rate. Proceedings were published in several Springer Lecture Notes in Artificial (LNAI) volumes, while selected papers were invited to special issues of the Machine Learning and Data Mining and Knowledge Discovery journals. In recent years, this model has however come under stress. Problems include: reviews are of highly variable quality; the purpose of bringing the community together is lost; reviewing workloads are high; the information content of conferences and journals decreases; there is confusion among scientists in interdisciplinary contexts. In this paper, we present a new publication model, which will be adopted for the ECML PKDD 2013 conference, and aims to solve some of the problems of the traditional model. The key feature of this model is the creation of a journal track, which is open to submissions all year long and allows for revision cycles.Comment: 13 page

    Bioinformatics and Medicine in the Era of Deep Learning

    Full text link
    Many of the current scientific advances in the life sciences have their origin in the intensive use of data for knowledge discovery. In no area this is so clear as in bioinformatics, led by technological breakthroughs in data acquisition technologies. It has been argued that bioinformatics could quickly become the field of research generating the largest data repositories, beating other data-intensive areas such as high-energy physics or astroinformatics. Over the last decade, deep learning has become a disruptive advance in machine learning, giving new live to the long-standing connectionist paradigm in artificial intelligence. Deep learning methods are ideally suited to large-scale data and, therefore, they should be ideally suited to knowledge discovery in bioinformatics and biomedicine at large. In this brief paper, we review key aspects of the application of deep learning in bioinformatics and medicine, drawing from the themes covered by the contributions to an ESANN 2018 special session devoted to this topic

    Computational Ontologies and Information Systems II: Formal Specification

    Get PDF
    This paper extends the study of ontologies in Part I of this study (Volume 14, Article 8) in the context of Information Systems. The basic foundations of computational ontologies presented in Part I are extended to formal specifications in this paper. This paper provides a review of the formalisms, languages, and tools for specifying and implementing computational ontologies Directions for future research are also provided

    The similarity metric

    Full text link
    A new class of distances appropriate for measuring similarity relations between sequences, say one type of similarity per distance, is studied. We propose a new ``normalized information distance'', based on the noncomputable notion of Kolmogorov complexity, and show that it is in this class and it minorizes every computable distance in the class (that is, it is universal in that it discovers all computable similarities). We demonstrate that it is a metric and call it the {\em similarity metric}. This theory forms the foundation for a new practical tool. To evidence generality and robustness we give two distinctive applications in widely divergent areas using standard compression programs like gzip and GenCompress. First, we compare whole mitochondrial genomes and infer their evolutionary history. This results in a first completely automatic computed whole mitochondrial phylogeny tree. Secondly, we fully automatically compute the language tree of 52 different languages.Comment: 13 pages, LaTex, 5 figures, Part of this work appeared in Proc. 14th ACM-SIAM Symp. Discrete Algorithms, 2003. This is the final, corrected, version to appear in IEEE Trans Inform. T
    • …
    corecore