Compressing DNA sequence databases with coil
Background: Publicly available DNA sequence databases such as GenBank are large, and are
growing at an exponential rate. The sheer volume of data being dealt with presents serious storage
and data communications problems. Currently, sequence data is usually kept in large "flat files,"
which are then compressed using standard Lempel-Ziv (gzip) compression – an approach which
rarely achieves good compression ratios. While much research has been done on compressing
individual DNA sequences, surprisingly little has focused on the compression of entire databases
of such sequences. In this study we introduce the sequence database compression software coil.
Results: We have designed and implemented a portable software package, coil, for compressing
and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared
towards achieving high compression ratios at the expense of execution time and memory usage
during compression – the compression time represents a "one-off investment" whose cost is
quickly amortised if the resulting compressed file is transmitted many times. Decompression
requires little memory and is extremely fast. We demonstrate a 5% improvement in compression
ratio over state-of-the-art general-purpose compression tools for a large GenBank database file
containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental
additions to a sequence database.
Conclusion: coil presents a compelling alternative to conventional compression of flat files for the
storage and distribution of DNA sequence databases having a narrow distribution of sequence
lengths, such as EST data. Increasing compression levels for databases having a wide distribution of
sequence lengths is a direction for future work.
France and the Bretton Woods International Monetary System: 1960-1968
We reinterpret the commonly held view in the U.S. that France, by following a policy from 1965 to 1968 of deliberately converting its dollar holdings into gold, helped perpetuate the collapse of the Bretton Woods International Monetary System. We argue that French international monetary policy under Charles de Gaulle was consistent with strategies developed in the interwar period and the French Plan of 1943. France used proposals to return to an orthodox gold standard as well as conversions of its dollar reserves into gold as tactical threats to induce the United States to initiate the reform of the international monetary system towards a more symmetrical and cooperative gold-exchange standard regime.
Target enrichment of ultraconserved elements from arthropods provides a genomic perspective on relationships among Hymenoptera
Gaining a genomic perspective on phylogeny requires the collection of data
from many putatively independent loci collected across the genome. Among
insects, an increasingly common approach to collecting this class of data
involves transcriptome sequencing, because few insects have high-quality genome
sequences available; assembling new genomes remains a limiting factor; the
transcribed portion of the genome is a reasonable, reduced subset of the genome
to target; and the data collected from transcribed portions of the genome are
similar in composition to the types of data with which biologists have
traditionally worked (e.g., exons). However, molecular techniques requiring RNA
as a template are limited to using very high quality source materials, which
are often unavailable from a large proportion of biologically important insect
samples. Recent research suggests that DNA-based target enrichment of conserved
genomic elements offers another path to collecting phylogenomic data across
insect taxa, provided that conserved elements are present in and can be
collected from insect genomes. Here, we identify a large set (n = 1510) of
ultraconserved elements (UCE) shared among the insect order Hymenoptera. We use
in silico analyses to show that these loci accurately reconstruct relationships
among genome-enabled Hymenoptera, and we design a set of baits for enriching
these loci that researchers can use with DNA templates extracted from a variety
of sources. We use our UCE bait set to enrich an average of 721 UCE loci from
30 hymenopteran taxa, and we use these UCE loci to reconstruct phylogenetic
relationships spanning very old (220 MYA) to very young (1 MYA)
divergences among hymenopteran lineages. In contrast to a recent study
addressing hymenopteran phylogeny using transcriptome data, we found ants to be
sister to all remaining aculeate lineages with complete support.
Convictions Based on Character: An Empirical Test of Other-Acts Evidence
Despite the time-honored judicial principle that “we try cases, rather than persons,” courts routinely allow prosecutors to use defendants’ prior, unrelated bad acts at trial. Courts acknowledge that jurors could improperly use this other-acts evidence as proof of the defendant’s bad character. However, courts theorize that if the other acts are also relevant for a permissible purpose—such as proving the defendant’s identity as the perpetrator of the charged crime—then a cautionary instruction will cure the problem, and any prejudice is “presumed erased from the jury’s mind.” We put this judicial assumption to an empirical test. We recruited 249 participants to serve as mock jurors in a hypothetical criminal case. After reading an identical case summary, jurors were randomly assigned to one of two groups, each of which received different evidence on the issue of identity. Group A received conclusive proof, in the form of a stipulation, that if a crime was committed, the defendant was the one who committed it. Group A convicted at the rate of 33.1%. Group B received less certain evidence of identity in the form of the defendant’s somewhat similar, prior conviction, along with a cautionary instruction that this other act may not be used as evidence of the defendant’s character. Group B convicted at the much higher rate of 48.0%. The difference in conviction rates is statistically significant. Further, jurors in Group B were also more confident in their verdicts despite receiving less certain evidence of guilt and a cautionary instruction. These empirical findings demonstrate that cautionary instructions are not effective, and jurors will use other-acts evidence for impermissible purposes including, for example, the forbidden character inference. Given this, we discuss several pretrial strategies for defense counsel to limit the prejudicial impact of other-acts evidence.
The Spatial and Kinematic Distributions of Cluster Galaxies in a LCDM Universe -- Comparison with Observations
We combine dissipationless N-body simulations and semi-analytic models of
galaxy formation to study the spatial and kinematic distributions of cluster
galaxies in a LCDM cosmology. We investigate how the star formation rates,
colours and morphologies of galaxies vary as a function of distance from the
cluster centre and compare our results with the CNOC1 survey of galaxies from
15 X-ray luminous clusters in the redshift range 0.18 to 0.55. In our model,
gas no longer cools onto galaxies after they fall into the cluster and their
star formation rates decline on timescales of 1-2 Gyr. Galaxies in cluster
cores have lower star formation rates and redder colours than galaxies in the
outer regions because they were accreted earlier. Our colour and star formation
gradients agree with those derived from the data. The difference in
velocity dispersions between red and blue galaxies observed in the CNOC1
clusters is also well reproduced by the model. We assume that the morphologies
of cluster galaxies are determined solely by their merging histories.
Morphology gradients in clusters arise naturally, with the fraction of bulge-
dominated galaxies highest in cluster cores. We compare these gradients with
the CNOC1 data and find excellent agreement for bulge-dominated galaxies. The
simulated clusters contain too few galaxies of intermediate bulge-to-disk
ratio, suggesting that additional processes may influence the morphological
evolution of disk-dominated galaxies in clusters. Although the properties of
the cluster galaxies in our model agree extremely well with the data, the same
is not true of field galaxies. Both the star formation rates and the colours of
bright field galaxies appear to evolve much more strongly from redshift 0.2 to
0.4 in the CNOC1 field sample than in our simulations.
Comment: 17 pages, submitted to MNRAS. Simulation outputs, halo catalogs,
merger trees and galaxy catalogs are now available at
http://www.mpa-garching.mpg.de/GIF