
    Compressing DNA sequence databases with coil

    Background: Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression – an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. Results: We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression – the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. Conclusion: coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work.

    Unanticipated Money, Output, and Prices in the Small Economy


    Identification of Optimal Satellite Compositing Length Using GLOBE Budburst Measurements

    Phenology, the study of recurring biological cycles and their connection to climate, is a critical and growing field of global change research. In particular, scientists now recognize that regular satellite monitoring of the timing and length of the terrestrial growing season is a valuable metric of biospheric responses to short- and long-term climate variability. While many methodologies exist with which to detect growing season dynamics, most have a poorly understood relationship to actual ground vegetation conditions. GLOBE schools, through participation in the budburst protocols, are helping to bridge this gap between satellite observations and ground conditions. In this research we show how GLOBE budburst data can be used to select the optimal satellite compositing length (a technique used to reduce cloud, snow, and atmospheric contamination). One- and two-week compositing lengths produced similar results, both of which were superior to monthly compositing. The longer compositing length, contrary to popular remote sensing lore, tended to predict an earlier initiation of growth due to removal of inflection points in the satellite greenness time series. Overall, the GLOBE budburst data were extremely useful but also contained several troubling artifacts probably relating to infrequent observation, errors in date reporting, and use of exotic species.
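
    The abstract does not say which compositing rule was used. As a rough illustration only, the sketch below assumes maximum-value compositing, a common choice in which each period keeps the highest greenness value on the grounds that clouds and snow tend to lower it. The function name `max_value_composite` and the sample values are hypothetical, not taken from the study.

```python
def max_value_composite(daily_values, window):
    """Collapse a daily greenness series into one value per compositing period.

    Assumption: maximum-value compositing. Missing observations are given
    as None and are ignored; a period with no valid data yields None.
    """
    composites = []
    for start in range(0, len(daily_values), window):
        # Keep only the valid observations inside this compositing period.
        chunk = [v for v in daily_values[start:start + window] if v is not None]
        composites.append(max(chunk) if chunk else None)
    return composites


# Hypothetical one-week greenness series with one missing day.
daily = [0.2, 0.1, None, 0.5, 0.3, 0.05, 0.4]
weekly = max_value_composite(daily, window=7)
three_day = max_value_composite(daily, window=3)
```

    A longer window discards more of the low, contaminated values but also smooths away short-lived features of the time series, which is the trade-off the abstract describes.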

    Designing for Schadenfreude (or, how to express well-being and see if you're boring people)

    This position paper presents two studies of content not normally expressed in status updates (well-being and status feedback) and considers how they may be processed, valued and used for potential quality-of-life benefits in terms of personal and social reflection and awareness. Do I Tweet Good? (poor grammar intentional) is a site investigating more nuanced forms of status feedback than current microblogging sites allow, towards understanding self-identity, reflection, and online perception. Healthii is a tool for sharing physical and emotional well-being via status updates, investigating concepts of self-reflection and social awareness. Together, these projects consider furthering the value of microblogging on two fronts: 1) refining the online personal/social networking experience, and 2) using the status update for enhancing the personal/social experience in the offline world, and considering how to leverage that online/offline split. We offer results from two different methods of study and target groups (one of co-workers in an academic setting, the other of followers on Twitter) to consider how microblogging can become more than just a communication medium if it facilitates these types of reflective practice.

    Anomaly Detection in Paleoclimate Records using Permutation Entropy

    Permutation entropy techniques can be useful in identifying anomalies in paleoclimate data records, including noise, outliers, and post-processing issues. We demonstrate this using weighted and unweighted permutation entropy of water-isotope records in a deep polar ice core. In one region of these isotope records, our previous calculations revealed an abrupt change in the complexity of the traces: specifically, in the amount of new information that appeared at every time step. We conjectured that this effect was due to noise introduced by an older laboratory instrument. In this paper, we validate that conjecture by re-analyzing a section of the ice core using a more advanced version of the laboratory instrument. The anomalous noise levels are absent from the permutation entropy traces of the new data. In other sections of the core, we show that permutation entropy techniques can be used to identify anomalies in the raw data that are not associated with climatic or glaciological processes, but rather effects occurring during field work, laboratory analysis, or data post-processing. These examples make it clear that permutation entropy is a useful forensic tool for identifying sections of data that require targeted re-analysis, and can even be useful in guiding that analysis.
    Comment: 15 pages, 7 figures
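
    For readers unfamiliar with the measure, the following is a minimal sketch of normalized permutation entropy, with an optional variance-weighted variant. It is not the authors' code: the function name, the default `order` and `delay`, and the choice of window variance as the weight are assumptions made for illustration.

```python
import math
from collections import Counter


def permutation_entropy(series, order=3, delay=1, weighted=False):
    """Normalized permutation entropy of a 1-D sequence, in [0, 1].

    Each window of `order` samples (spaced `delay` apart) is mapped to the
    ordinal pattern that sorts it; the entropy of the pattern distribution
    is divided by log2(order!). With weighted=True, each window contributes
    its variance instead of a count of 1 (an assumed weighting scheme).
    """
    n_windows = len(series) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("series too short for this order and delay")

    weights = Counter()
    for i in range(n_windows):
        window = [series[i + j * delay] for j in range(order)]
        # Ordinal pattern: indices of the window in ascending value order.
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        if weighted:
            mean = sum(window) / order
            w = sum((x - mean) ** 2 for x in window) / order
        else:
            w = 1.0
        weights[pattern] += w

    total = sum(weights.values())
    if total == 0:
        return 0.0  # e.g. a constant series in the weighted case
    entropy = -sum((w / total) * math.log2(w / total)
                   for w in weights.values() if w > 0)
    return entropy / math.log2(math.factorial(order))


# Toy inputs: a monotone ramp has one ordinal pattern; a strict
# alternation at order 2 uses both patterns equally often.
ramp_pe = permutation_entropy([1, 2, 3, 4, 5, 6], order=3)
zigzag_pe = permutation_entropy([1, 2, 1, 2, 1], order=2)
```

    Computed over a sliding window along a record, a sudden jump in this value would flag a segment where the data became noisier, which is the kind of anomaly the abstract describes.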

    Chandra Observation of the Radio Source / X-ray Gas Interaction in the Cooling Flow Cluster Abell 2052

    We present a Chandra observation of Abell 2052, a cooling flow cluster with a central cD that hosts the complex radio source 3C 317. The data reveal "holes" in the X-ray emission that are coincident with the radio lobes. The holes are surrounded by bright "shells" of X-ray emission. The data are consistent with the radio source displacing and compressing, and at the same time being confined by, the X-ray gas. The compression of the X-ray shells appears to have been relatively gentle and, at most, slightly transonic. The pressure in the X-ray gas (the shells and surrounding cooler gas) is approximately an order of magnitude higher than the minimum pressure derived for the radio source, suggesting that an additional source of pressure is needed to support the radio plasma. The compression of the X-ray shells has speeded up the cooling of the shells, and optical emission line filaments are found coincident with the brightest regions of the shells.
    Comment: accepted for publication in ApJ Letters; for high-resolution color figures, see http://www.astro.virginia.edu/~elb6n/abell2052.htm

    Transcriptional repression by ApiAP2 factors is central to chronic toxoplasmosis

    Tachyzoite to bradyzoite development in Toxoplasma is marked by major changes in gene expression resulting in a parasite that expresses a new repertoire of surface antigens hidden inside a modified parasitophorous vacuole called the tissue cyst. The factors that control this important life cycle transition are not well understood. Here we describe an important transcriptional repressor mechanism controlling bradyzoite differentiation that operates in the tachyzoite stage. The ApiAP2 factor, AP2IV-4, is a nuclear factor dynamically expressed in late S phase through mitosis/cytokinesis of the tachyzoite cell cycle. Remarkably, deletion of the AP2IV-4 locus resulted in the expression of a subset of bradyzoite-specific proteins in replicating tachyzoites that included tissue cyst wall components BPK1, MCP4, CST1 and the surface antigen SRS9. In the murine animal model, the mis-timing of bradyzoite antigens in tachyzoites lacking AP2IV-4 caused a potent inflammatory monocyte immune response that effectively eliminated this parasite and prevented tissue cyst formation in mouse brain tissue. Altogether, these results indicate that suppression of bradyzoite antigens by AP2IV-4 during acute infection is required for Toxoplasma to successfully establish a chronic infection in the immune-competent host.

    Self-healing elastomer system

    A composite material includes an elastomer matrix, a set of first capsules containing a polymerizer, and a set of second capsules containing a corresponding activator for the polymerizer. The polymerizer may be a polymerizer for an elastomer. The composite material may be prepared by combining a first set of capsules containing a polymerizer, a second set of capsules containing a corresponding activator for the polymerizer, and a matrix precursor, and then solidifying the matrix precursor to form an elastomeric matrix.
