    Understanding Hidden Memories of Recurrent Neural Networks

    Recurrent neural networks (RNNs) have been successfully applied to various natural language processing (NLP) tasks and achieved better results than conventional methods. However, the lack of understanding of the mechanisms behind their effectiveness limits further improvements on their architectures. In this paper, we present a visual analytics method for understanding and comparing RNN models for NLP tasks. We propose a technique to explain the function of individual hidden state units based on their expected response to input texts. We then co-cluster hidden state units and words based on the expected response and visualize co-clustering results as memory chips and word clouds to provide more structured knowledge on RNNs' hidden states. We also propose a glyph-based sequence visualization based on aggregate information to analyze the behavior of an RNN's hidden state at the sentence level. The usability and effectiveness of our method are demonstrated through case studies and reviews from domain experts. Comment: Published at IEEE Conference on Visual Analytics Science and Technology (IEEE VAST 2017)

    Spatial ecology of white-clawed crayfish Austropotamobius pallipes and signal crayfish Pacifastacus leniusculus in upland rivers, Northern England

    The American signal crayfish Pacifastacus leniusculus, an invasive species widely introduced throughout Europe, is a major threat to native European crayfish species and is causing increasing concern because of its wide impact on aquatic ecosystems. This thesis investigates the within-catchment expansion of signal crayfish populations in two upland rivers and the spatial ecology and movement of the introduced signal crayfish and the indigenous white-clawed crayfish Austropotamobius pallipes. Populations of signal crayfish are established and expanding on the upland rivers Wharfe and Ure. On the Wharfe the signal crayfish population is well established, now occupies about 30 km of river, and is currently expanding at a rate in excess of 2 km year⁻¹. On the Ure the signal crayfish population is younger, currently occupies 1.6 km, and is expanding at about 0.5 km year⁻¹. The range expansion is biased towards downstream in both rivers, by a ratio of about 3:1 (downstream:upstream). The movements and dispersal of white-clawed and signal crayfish were studied utilising a combination of radiotelemetry and internal and external Passive Integrated Transponder (PIT) tags. Radiotagged adult signal crayfish were capable of substantial active movements (maximum movement 790 m in 79 days). The level of movement of adults suggests they may have the potential to be responsible for the observed rates of population expansion. Although the movements of radiotagged adult signal crayfish within the main river channel were equally distributed upstream and downstream, in-stream barriers, both natural and artificial, were found to limit the upstream movements of PIT tagged crayfish and this may contribute to the observed downstream bias of signal crayfish population expansion. The movements and dispersal of PIT tagged white-clawed crayfish within a small upland high gradient stream were strongly biased towards downstream.
Maximum movement of radiotagged adult signal crayfish occurred during midsummer. Temperature appeared to be a major factor influencing the timing and extent of movements between tracking periods although there was a large variation between individuals. All significant downstream movements made by crayfish were active movements and not the result of passive movement during periods of high discharge. There were no sex or size differences in the dispersal and movement of radiotagged and PIT tagged signal crayfish whilst in PIT tagged white-clawed crayfish size, sex, injuries and duration of tracking influenced extent of movement. The expansion of the signal crayfish population in the River Wharfe appears to lead to the progressive loss of white-clawed crayfish populations where they come into direct contact. Limited differences in the microhabitat utilised by the two species were found where the species were syntopic, suggesting the potential exists for direct competition between the two species. In addition signal crayfish showed greater movement and dispersal than white-clawed crayfish. This may contribute to the ability of signal crayfish to colonise rivers rapidly and may also offer a competitive advantage over white-clawed crayfish thus contributing to the observed replacement. The results are discussed in the context of the conservation and management of crayfish and the ecology of invasive species

    Comparative Functional Analysis of the Caenorhabditis elegans and Drosophila melanogaster Proteomes

    The nematode Caenorhabditis elegans is a popular model system in genetics, not least because a majority of human disease genes are conserved in C. elegans. To generate a comprehensive inventory of its expressed proteome, we performed extensive shotgun proteomics and identified more than half of all predicted C. elegans proteins. This allowed us to confirm and extend genome annotations, characterize the role of operons in C. elegans, and semiquantitatively infer abundance levels for thousands of proteins. Furthermore, for the first time to our knowledge, we were able to compare two animal proteomes (C. elegans and Drosophila melanogaster). We found that the abundances of orthologous proteins in metazoans correlate remarkably well, better than protein abundance versus transcript abundance within each organism or transcript abundances across organisms; this suggests that changes in transcript abundance may have been partially offset during evolution by opposing changes in protein abundance

    Current challenges in software solutions for mass spectrometry-based quantitative proteomics

    This work was in part supported by the PRIME-XS project, grant agreement number 262067, funded by the European Union seventh Framework Programme; The Netherlands Proteomics Centre, embedded in The Netherlands Genomics Initiative; The Netherlands Bioinformatics Centre; and the Centre for Biomedical Genetics (to S.C., B.B. and A.J.R.H); by NIH grants NCRR RR001614 and RR019934 (to the UCSF Mass Spectrometry Facility, director: A.L. Burlingame, P.B.); and by grants from the MRC, CR-UK, BBSRC and Barts and the London Charity (to P.C.

    Development and Integration of Informatic Tools for Qualitative and Quantitative Characterization of Proteomic Datasets Generated by Tandem Mass Spectrometry

    Shotgun proteomic experiments provide qualitative and quantitative analytical information from biological samples ranging in complexity from simple bacterial isolates to higher eukaryotes such as plants and humans and even to communities of microbial organisms. Improvements to instrument performance, sample preparation, and informatic tools are increasing the scope and volume of data that can be analyzed by mass spectrometry (MS). To accommodate these advances, it is becoming increasingly essential to choose and/or create tools that not only scale well but also make more informed decisions using additional features within the data. Incorporating novel and existing tools into a scalable, modular workflow not only provides more accurate, contextualized perspectives of processed data, but also generates detailed, standardized outputs that can be used for future studies dedicated to mining general analytical or biological features, anomalies, and trends. This research developed cyber-infrastructure that would allow a user to seamlessly run multiple analyses, store the results, and share processed data with other users. The work represented in this dissertation demonstrates successful implementation of an enhanced bioinformatics workflow designed to analyze raw data directly generated from MS instruments and to create fully annotated reports of qualitative and quantitative protein information for large-scale proteomics experiments. Answering these questions requires several points of engagement between informatics and analytical understanding of the underlying biochemistry of the system under observation. Deriving meaningful information from analytical data can be achieved through linking together the concerted efforts of more focused, logistical questions. This study focuses on the following aspects of proteomics experiments: spectra to peptide matching, peptide to protein mapping, and protein quantification and differential expression.
The interaction and usability of these analyses and other existing tools are also described. By constructing a workflow that allows high-throughput processing of massive datasets, data collected within the past decade can be standardized and updated with the most recent analyses

    An evaluation of genotyping by sequencing (GBS) to map the Breviaristatum-e (ari-e) locus in cultivated barley

    ABSTRACT: We explored the use of genotyping by sequencing (GBS) on a recombinant inbred line population (GPMx) derived from a cross between the two-rowed barley cultivar ‘Golden Promise’ (ari-e.GP/Vrs1) and the six-rowed cultivar ‘Morex’ (Ari-e/vrs1) to map plant height. We identified three Quantitative Trait Loci (QTL), the first in a region encompassing the spike architecture gene Vrs1 on chromosome 2H, the second in an uncharacterised centromeric region on chromosome 3H, and the third in a region of chromosome 5H coinciding with the previously described dwarfing gene Breviaristatum-e (Ari-e). BACKGROUND: Barley cultivars in North-western Europe largely contain either of two dwarfing genes: Denso on chromosome 3H, a presumed ortholog of the rice green revolution gene OsSd1, or Breviaristatum-e (ari-e) on chromosome 5H. A recessive mutant allele of the latter gene, ari-e.GP, was introduced into cultivation via the cv. ‘Golden Promise’ that was a favourite of the Scottish malt whisky industry for many years and is still used in agriculture today. RESULTS: Using GBS mapping data and phenotypic measurements we show that ari-e.GP maps to a small genetic interval on chromosome 5H and that alternative alleles at a region encompassing Vrs1 on 2H along with a region on chromosome 3H also influence plant height. The location of Ari-e is supported by analysis of near-isogenic lines containing different ari-e alleles. We explored the use of GBS to populate the region with sequence contigs from the recently released physically and genetically integrated barley genome sequence assembly as a step towards Ari-e gene identification. CONCLUSIONS: GBS was an effective and relatively low-cost approach to rapidly construct a genetic map of the GPMx population that was suitable for genetic analysis of row type and height traits, allowing us to precisely position ari-e.GP on chromosome 5H. Mapping resolution was lower than we anticipated.
We found the GBS data more complex to analyse than other data types but it did directly provide linked SNP markers for subsequent higher resolution genetic analysis