Application of methods for central statistical monitoring in clinical trials
Background
On-site source data verification is a common and expensive activity, with little evidence that it is worthwhile. Central statistical monitoring (CSM) is a cheaper alternative, where data checks are performed by the coordinating centre, avoiding the need to visit all sites. Several publications have suggested methods for CSM; however, few have described their use in real trials.
Methods
R-programs were created to check data at either the subject level (7 tests within 3 programs) or site level (9 tests within 8 programs) using previously described methods or new ones we developed. These aimed to find possible data errors, such as outliers, incorrect dates, or anomalous data patterns; signs of possible fraud or procedural errors, such as digit preference, values too close to or too far from the means, unusual correlation structures, or extreme variances; and under-reporting of adverse events. The methods were applied to three trials: one that had closed and has been published, one in follow-up, and a third to which fabricated data were added. We examined how well the methods work, discussing their strengths and limitations.
Results
The R-programs produced simple tables or easy-to-read figures. Few data errors were found in the first two trials, and those added to the third were easily detected. The programs were able to identify patients with outliers based on single or multiple variables. They also detected (1) fabricated patients, generated to have values too close to the multivariate mean, or with too low variances in repeated measurements, and (2) sites which had unusual correlation structures or too few adverse events. Some methods were unreliable if applied to centres with few patients or if data were fabricated in a way which did not fit the assumptions used to create the programs. Outputs from the R-programs are interpreted using examples.
Limitations
Detecting data errors is relatively straightforward; however, there are several limitations in the detection of fraud: some programs cannot be applied to small trials or to centres with few patients (<10), and data falsified in a manner which does not fit the program’s assumptions may not be detected. In addition, many tests require a visual assessment of the output (showing flagged participants or sites) before data queries are made or on-site visits performed.
Conclusions
CSM is a worthwhile alternative to on-site data checking and may be used to limit the number of site visits by targeting only sites which are picked up by the programs. We summarise the methods, show how they are implemented and that they can be easy to interpret. The methods can identify incorrect or unusual data for a trial subject, or centres where the data considered together are too different from other centres and therefore should be reviewed, possibly through an on-site visit.
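One of the site-level checks named in the abstract above, digit preference, can be illustrated with a small sketch. The abstract's programs were written in R and are not reproduced here; the Python below is only an assumed, simplified version of the idea (a chi-square statistic of terminal digits against a uniform distribution). The function name `terminal_digit_chi2`, the two sites, and their blood pressure readings are all hypothetical.

```python
from collections import Counter


def terminal_digit_chi2(values):
    """Chi-square statistic of last digits against a uniform distribution."""
    digits = [abs(int(v)) % 10 for v in values]
    expected = len(digits) / 10  # uniform: each digit equally likely
    counts = Counter(digits)
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))


# Hypothetical systolic blood pressure readings (mmHg) from two sites.
site_a = [118, 121, 133, 127, 142, 136, 109, 124, 115, 130,
          147, 112, 126, 139, 101, 128, 134, 143, 117, 125]
site_b = [120, 130, 125, 140, 120, 135, 110, 120, 130, 125,
          140, 120, 130, 135, 120, 125, 130, 140, 120, 130]

# Chi-square critical value for 9 degrees of freedom at alpha = 0.05.
# With so few readings per site the approximation is rough; this is a
# screening flag for human review, not a formal test.
CRITICAL_9DF_5PCT = 16.92

for name, values in [("site_a", site_a), ("site_b", site_b)]:
    stat = terminal_digit_chi2(values)
    flag = "review" if stat > CRITICAL_9DF_5PCT else "ok"
    print(f"{name}: chi2 = {stat:.1f} ({flag})")
```

In this made-up data, `site_b` records only values ending in 0 or 5, so its statistic is far above the threshold, while `site_a` is close to uniform. As the abstract notes, such checks become unreliable for centres with few patients, so a flag would prompt a data query rather than a conclusion.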
Anomalous aging phenomena caused by drift velocities
We demonstrate via several examples that a uniform drift velocity gives rise
to anomalous aging, characterized by a specific form for the two-time
correlation functions, in a variety of statistical-mechanical systems far from
equilibrium. Our first example concerns the oscillatory phase observed recently
in a model of competitive learning. Further examples, where the proposed theory
is exact, include the voter model and the Ohta-Jasnow-Kawasaki theory for
domain growth in any dimension, and a theory for the smoothing of sandpile
surfaces.
Comment: 7 pages, 3 figures. To appear in Europhysics Letters
Getting the Measure of the Flatness Problem
The problem of estimating cosmological parameters such as from noisy
or incomplete data is an example of an inverse problem and, as such, generally
requires a probabilistic approach. We adopt the Bayesian interpretation of
probability for such problems and stress the connection between probability and
information which this approach makes explicit.
This connection is important even when information is "minimal" or, in
other words, when we need to argue from a state of maximum ignorance. We use
the transformation group method of Jaynes to assign a minimally informative
prior probability measure for cosmological parameters in the simple example of
a dust Friedmann model, showing that the usual statements of the cosmological
flatness problem are based on an inappropriate choice of prior. We further
demonstrate that, in the framework of a classical cosmological model, there is
no flatness problem.
Comment: 11 pages, submitted to Classical and Quantum Gravity, TeX source file, no figures
Large-scale compression of genomic sequence databases with the Burrows-Wheeler transform
Motivation
The Burrows-Wheeler transform (BWT) is the foundation of many algorithms for
compression and indexing of text data, but the cost of computing the BWT of
very large string collections has prevented these techniques from being widely
applied to the large sets of sequences often encountered as the outcome of DNA
sequencing experiments. In previous work, we presented a novel algorithm that
allows the BWT of human genome scale data to be computed on very moderate
hardware, thus enabling us to investigate the BWT as a tool for the compression
of such datasets.
Results
We first used simulated reads to explore the relationship between the level
of compression and the error rate, the length of the reads and the level of
sampling of the underlying genome, and to compare choices of second-stage
compression algorithm.
We demonstrate that compression may be greatly improved by a particular
reordering of the sequences in the collection and give a novel 'implicit
sorting' strategy that enables these benefits to be realised without the
overhead of sorting the reads. With these techniques, a 45x coverage of real
human genome sequence data compresses losslessly to under 0.5 bits per base,
allowing the 135.3Gbp of sequence to fit into only 8.2Gbytes of space (trimming
a small proportion of low-quality bases from the reads improves the compression
still further).
This is more than 4 times smaller than the size achieved by a standard
BWT-based compressor (bzip2) on the untrimmed reads, but an important further
advantage of our approach is that it facilitates the building of compressed
full text indexes such as the FM-index on large-scale DNA sequence collections.
Comment: Version here is as submitted to Bioinformatics and is the same as the
previously archived version. This submission registers the fact that the
advanced access version is now available at
http://bioinformatics.oxfordjournals.org/content/early/2012/05/02/bioinformatics.bts173.abstract.
Bioinformatics should be considered the original place of publication of
this article; please cite accordingly.
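To make the role of the BWT in compression more concrete, here is a minimal textbook sketch. It is not the algorithm from the entry above, which is designed for very large read collections on moderate hardware; this naive version sorts all rotations in memory and is only practical for short strings. The function names and the toy DNA string are illustrative assumptions.

```python
def bwt(text, sentinel="$"):
    """Naive Burrows-Wheeler transform: last column of the sorted rotations."""
    assert sentinel not in text, "sentinel must not occur in the input"
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def inverse_bwt(last_column, sentinel="$"):
    """Invert the transform by repeatedly prepending and re-sorting."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        table = sorted(c + row for c, row in zip(last_column, table))
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]  # drop the sentinel


def count_runs(s):
    """Number of maximal runs of identical characters in s."""
    return sum(1 for i, c in enumerate(s) if i == 0 or c != s[i - 1])


# Toy example: a short repetitive DNA string (hypothetical data).
reads = "ACGTACGTACGTACGT"
transformed = bwt(reads)
print(transformed)
print("runs before:", count_runs(reads), "runs after:", count_runs(transformed))
assert inverse_bwt(transformed) == reads  # the transform is lossless
```

The transform groups characters that share the same following context, so repetitive input tends to produce long runs of identical symbols. A second-stage compressor, such as run-length or entropy coding, can then exploit those runs.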
Scaling of gauge balls and static potential in the confinement phase of the pure U(1) lattice gauge theory
We investigate the scaling behaviour of gauge-ball masses and static
potential in the pure U(1) lattice gauge theory on toroidal lattices. An
extended gauge field action is used with and -0.5. Gauge-ball correlation
functions with all possible lattice quantum numbers are calculated. Most
gauge-ball masses scale with the non-Gaussian exponent .
The gauge-ball mass scales with the Gaussian value in the investigated range of correlation lengths. The static potential is
examined with Sommer's method. The long range part scales consistently with
but the short range part tends to yield smaller values of . The
-function, having a UV stable zero, is obtained from the running
coupling. These results hold for both values, supporting universality.
Consequences for the continuum limit of the theory are discussed.
Comment: Contribution to the Lattice 97 proceedings, LaTeX, 3 pages, 3 figures
A Bayesian Analogue of Gleason's Theorem
We introduce a novel notion of probability within quantum history theories
and give a Gleasonesque proof for these assignments. This involves introducing
a tentative novel axiom of probability. We also discuss how we are to interpret
these generalised probabilities as partially ordered notions of preference and
we introduce a tentative generalised notion of Shannon entropy. A Bayesian
approach to probability theory is adopted throughout, thus the axioms we use
will be minimal criteria of rationality rather than ad hoc mathematical axioms.
Comment: 14 pages, v2: minor stylistic changes, v3: changes made in line with
to-be-published version
Molecular Carbon Chains and Rings in TMC-1
We present mapping results in several rotational transitions of HC3N, C6H,
both cyclic and linear C3H2 and C3H, towards the cyanopolyyne peak of the
filamentary dense cloud TMC-1 using the IRAM 30m and MPIfR 100m telescopes. The
spatial distribution of the cumulene carbon chain propadienylidene H2C3
(hereafter l-C3H2) is found to deviate significantly from the distributions of
the cyclic isomer c-C3H2, HC3N, and C6H which in turn look very similar. The
cyclic over linear abundance ratio of C3H2 increases by a factor of 3 across
the filament, with a value of 28 at the cyanopolyyne peak. This abundance ratio
is an order of magnitude larger than the range (3 to 5) we observed in the
diffuse interstellar medium. The cyclic over linear abundance ratio of C3H also
varies by ~2.5 in TMC-1, reaching a maximum value (13) close to the
cyanopolyyne peak. These behaviors might be related to competitive processes
between ion-neutral and neutral-neutral reactions for cyclic and linear
species.
Comment: Accepted for publication in The Astrophysical Journal, part I. 24
pages, including 4 tables, 7 figures, and figure captions
Universality of the gauge-ball spectrum of the four-dimensional pure U(1) gauge theory
We continue numerical studies of the spectrum of the pure U(1) lattice gauge
theory in the confinement phase, initiated in our previous work. Using the
extended Wilson action we address the question of universality of the phase
transition line in the () plane between the confinement and the
Coulomb phases. Our present results at for the gauge-ball
spectrum are fully consistent with the previous results obtained at . Again, two different correlation length exponents,
and , are obtained in different channels. We also confirm
the stability of the values of these exponents with respect to the variation of
the distance from the critical point at which they are determined. These
results further demonstrate universal critical behaviour of the model at least
up to correlation lengths of 4 lattice spacings when the phase transition is
approached in some interval at .
Comment: 16 pages