GaGa: A parsimonious and flexible model for differential expression analysis
Hierarchical models are a powerful tool for high-throughput data with a small
to moderate number of replicates, as they allow sharing information across
units, such as genes. We propose two such models and show
their increased sensitivity in microarray differential expression applications.
We build on the gamma--gamma hierarchical model introduced by Kendziorski et
al. [Statist. Med. 22 (2003) 3899--3914] and Newton et al. [Biostatistics 5
(2004) 155--176], by addressing important limitations that may have hampered
its performance and its more widespread use. The models parsimoniously describe
the expression of thousands of genes with a small number of hyper-parameters.
This makes them easy to interpret and analytically tractable. The first model
is a simple extension that improves the fit substantially with almost no
increase in complexity. We propose a second extension that uses a mixture of
gamma distributions to further improve the fit, at the expense of increased
computational burden. We derive several approximations that significantly
reduce the computational cost. We find that our models outperform the original
formulation of the model, as well as some other popular methods for
differential expression analysis. The improved performance is especially
noticeable for the small sample sizes commonly encountered in high-throughput
experiments. Our methods are implemented in the freely available Bioconductor
gaga package.
Comment: Published at http://dx.doi.org/10.1214/09-AOAS244 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
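As a purely illustrative sketch (not the paper's implementation), the gamma-gamma hierarchy can be simulated in a few lines; the hyperparameter values below are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hyperparameters shared by all genes (the "parsimonious" part):
a, a0, nu = 10.0, 2.0, 1.0   # observation shape; hyperprior shape and rate

n_genes, n_reps = 1000, 3

# Gene-specific mean expression drawn from the gamma hyperprior
theta = rng.gamma(shape=a0, scale=1.0 / nu, size=n_genes)

# Replicate observations: x_gj ~ Gamma(a, scale=theta_g / a), so E[x_gj] = theta_g
x = rng.gamma(shape=a, scale=(theta / a)[:, None], size=(n_genes, n_reps))

print(x.shape)
```

Because every gene shares the same small set of hyperparameters, inference for one gene borrows strength from all the others, which is what drives the sensitivity gains at small sample sizes.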
Computational effectiveness of LMI design strategies for vibration control of large structures
Distributed control systems for vibration control of large structures involve a large number of actuation devices and sensors that work coordinately to produce the desired control actions. Design strategies based on linear matrix inequality (LMI) formulations allow obtaining controllers for these complex control problems, which are characterized by large dimensionality, high computational cost and severe information constraints. In this paper, we conduct a comparative study of the computational effectiveness of three different LMI-based controller design strategies: H-infinity, energy-to-peak and energy-to-componentwise-peak. The H-infinity approach is a well-known design methodology and has been widely used in the literature. The
energy-to-peak approach is a particular case of generalized H2 design that is gaining relevance in structural vibration control. Finally, the energy-to-componentwise-peak approach is a less common case of generalized H2 design that yields promising results among the three considered approaches. These controller design strategies are applied to synthesize active state-feedback controllers for the seismic protection of a five-story building and a twenty-story building, both equipped with complete systems of interstory actuation devices. To evaluate the computational effectiveness of the proposed LMI design methodologies, the corresponding
computation times are compared and a suitable set of numerical simulations is carried out to assess the performance of the obtained controllers. As positive results, two main facts can be highlighted: the computational effectiveness of the energy-to-peak control design strategy
and the particularly well-balanced behavior exhibited by the energy-to-componentwise-peak controllers. On the negative side, the considered LMI design methodologies are computationally inefficient when dealing with very-large-scale control problems.
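At its simplest, the LMI machinery behind such designs reduces to Lyapunov-type matrix conditions. As a hedged sketch unrelated to the paper's specific H-infinity and energy-to-peak formulations (the system matrices below are made-up toy values), the marginal case of the stability LMI A'P + PA < 0 can be solved as a Lyapunov equation by vectorization:

```python
import numpy as np

# Hypothetical stable 2-state system (illustrative values only)
A = np.array([[0.0, 1.0],
              [-2.0, -0.5]])
Q = np.eye(2)

# The LMI A'P + PA < 0 with P > 0 certifies stability; at the margin it is the
# Lyapunov equation A'P + PA = -Q, which vectorizes to a linear system:
n = A.shape[0]
I = np.eye(n)
K = np.kron(I, A.T) + np.kron(A.T, I)   # vec(A'P + PA) = K @ vec(P)
P = np.linalg.solve(K, -Q.reshape(-1)).reshape(n, n)

# A symmetric positive-definite P certifies stability of A
eigs = np.linalg.eigvalsh((P + P.T) / 2)
print(eigs.min() > 0)
```

Real designs of the kind compared in the paper replace this single equality with large systems of inequality constraints, which is where the dimensionality and computational-cost issues discussed above arise.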
On choosing mixture components via non-local priors
Choosing the number of mixture components remains an elusive challenge. Model
selection criteria can be either overly liberal or conservative and return
poorly-separated components of limited practical use. We formalize non-local
priors (NLPs) for mixtures and show how they lead to well-separated components
with non-negligible weight, interpretable as distinct subpopulations. We also
propose an estimator for posterior model probabilities under local and
non-local priors, showing that Bayes factors are ratios of posterior to prior
empty-cluster probabilities. The estimator is widely applicable and helps set
thresholds to drop unoccupied components in overfitted mixtures. We suggest
default prior parameters based on multi-modality for Normal/T mixtures and
minimal informativeness for categorical outcomes. We characterise theoretically
the NLP-induced sparsity and derive tractable expressions and algorithms. We fully
develop Normal, Binomial and product Binomial mixtures, but the theory,
computation and principles hold more generally. We observed a serious lack of
sensitivity in the Bayesian information criterion (BIC), insufficient parsimony
of the AIC and a local prior, and mixed behavior of the singular BIC. We also
considered overfitted mixtures; their performance was competitive but depended
on tuning parameters. Under our default prior elicitation, NLPs offered a good
compromise between sparsity and power to detect meaningfully-separated
components.
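The empty-cluster probabilities mentioned above can be made concrete with a small Monte Carlo sketch (an illustration of the general idea, not the paper's estimator; the sample sizes and Dirichlet parameters are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

def prior_empty_prob(n, k, q, draws=2000):
    """Monte Carlo estimate of the prior probability that at least one of k
    mixture components receives no observations among n allocations, under
    symmetric Dirichlet(q) weights."""
    empty = 0
    for _ in range(draws):
        w = rng.dirichlet(np.full(k, q))      # mixture weights from the prior
        counts = rng.multinomial(n, w)        # cluster occupancies
        empty += (counts == 0).any()
    return empty / draws

# Smaller q concentrates prior mass on sparse weights, so empty clusters
# become more likely a priori
p_diffuse = prior_empty_prob(n=50, k=3, q=1.0)
p_sparse = prior_empty_prob(n=50, k=3, q=0.1)
print(p_sparse > p_diffuse)
```

Comparing such a prior probability with its posterior counterpart is, per the abstract, exactly what a Bayes factor between mixture sizes amounts to, which is what makes the estimator useful for dropping unoccupied components.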
Sequential stopping for high-throughput experiments
In high-throughput experiments, the sample size is typically chosen informally. Most formal sample-size calculations depend critically on prior knowledge. We propose a sequential strategy that, by updating knowledge when new data are available, depends less critically on prior assumptions. Experiments are stopped or continued based on the potential benefits in obtaining additional data. The underlying decision-theoretic framework guarantees the design to proceed in a coherent fashion. We propose intuitively appealing, easy-to-implement utility functions. As in most sequential design problems, an exact solution is prohibitive. We propose a simulation-based approximation that uses decision boundaries. We apply the method to RNA-seq, microarray, and reverse-phase protein array studies and show its potential advantages. The approach has been added to the Bioconductor package gaga.
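The stop-or-continue logic can be sketched with a deliberately simple toy example (a hedged illustration of sequential stopping in general, not the paper's utility functions or decision boundaries; all numerical values are assumptions): keep sampling while the expected benefit of one more batch exceeds its cost.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy sequential design: estimate a mean, stop when the expected reduction in
# posterior variance no longer justifies the cost of another batch.
true_mu, sigma, batch, cost_per_batch = 1.0, 2.0, 10, 0.01

data = list(rng.normal(true_mu, sigma, batch))
while True:
    n = len(data)
    post_var = sigma**2 / n
    post_var_next = sigma**2 / (n + batch)
    gain = post_var - post_var_next        # expected benefit of one more batch
    if gain < cost_per_batch or n >= 500:  # stop when benefit < cost
        break
    data.extend(rng.normal(true_mu, sigma, batch))

print(len(data))
```

In realistic designs the benefit term is itself estimated by forward simulation, which is why the paper resorts to a simulation-based approximation with decision boundaries rather than this closed-form shortcut.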
Quantifying alternative splicing from paired-end RNA-sequencing data
RNA-sequencing has revolutionized biomedical research and, in particular, our
ability to study gene alternative splicing. The problem has important
implications for human health, as alternative splicing may be involved in
malfunctions at the cellular level and multiple diseases. However, the
high-dimensional nature of the data and the existence of experimental biases
pose serious data analysis challenges. We find that the standard data summaries
used to study alternative splicing are severely limited, as they ignore a
substantial amount of valuable information. Current data analysis methods are
based on such summaries and are hence suboptimal. Further, they have limited
flexibility in accounting for technical biases. We propose novel data summaries
and a Bayesian modeling framework that overcome these limitations and determine
biases in a nonparametric, highly flexible manner. These summaries adapt
naturally to the rapid improvements in sequencing technology. We provide
efficient point estimates and uncertainty assessments. The approach allows one to
study alternative splicing patterns for individual samples and can also be the
basis for downstream analyses. We found a severalfold improvement in estimation
mean square error compared with popular approaches in simulations, and substantially
higher consistency between replicates in experimental data. Our findings
indicate the need for adjusting the routine summarization and analysis of
alternative splicing RNA-seq studies. We provide a software implementation in
the R package casper.
Comment: Published at http://dx.doi.org/10.1214/13-AOAS687 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org). With correction.
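The core estimation idea, reads that are only partially informative about which isoform produced them, is often handled with an EM algorithm. The following is a toy sketch of that general scheme, not casper's actual model; the read classes, their conditional probabilities, and the counts are all hypothetical:

```python
import numpy as np

# Toy EM for isoform quantification: reads fall into classes, and each class
# has a known probability under each isoform (rows = read classes,
# cols = isoforms; columns sum to 1).
L = np.array([[0.6, 0.1],    # class supported mostly by isoform 1
              [0.1, 0.7],    # class supported mostly by isoform 2
              [0.3, 0.2]])   # class compatible with both
counts = np.array([60, 70, 50])  # observed reads per class

pi = np.array([0.5, 0.5])        # initial isoform proportions
for _ in range(200):
    # E-step: posterior isoform membership for each class's reads
    resp = L * pi
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: re-estimate proportions from expected read assignments
    pi = (resp * counts[:, None]).sum(axis=0)
    pi /= pi.sum()

print(np.round(pi, 3))
```

The paper's contribution is upstream of this step: richer read summaries and a nonparametric bias model change the L-matrix analogue itself, which is where the reported accuracy gains come from.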
Rossell v. County Bank
USDC for the District of Delaware
The educational effectiveness of bilingual education
Bilingual education is the use of the native tongue to instruct limited English-speaking children. The authors read studies of bilingual education from the earliest period of this literature to the most recent. Of the 300 program evaluations read, only 72 (25%) were methodologically acceptable - that is, they had a treatment and control group and a statistical control for pre-treatment differences where groups were not randomly assigned. Virtually all of the studies in the United States were of elementary or junior high school students and Spanish speakers; the few studies conducted outside the United States were almost all in Canada. The research evidence indicates that, on standardized achievement tests, transitional bilingual education (TBE) is better than regular classroom instruction in only 22% of the methodologically acceptable studies when the outcome is reading, 7% when the outcome is language, and 9% when the outcome is math. TBE is never better than structured immersion, a special program for limited English proficient children in which the children are in a self-contained classroom composed solely of English learners but the instruction is in English at a pace they can understand. Thus, the research evidence does not support transitional bilingual education as a superior form of instruction for limited English proficient children.
A design procedure for overlapped guaranteed cost controllers
© 2008 the authors. This work has been accepted to IFAC for publication under a Creative Commons Licence CC-BY-NC-ND.
In this paper, a quadratic guaranteed cost control problem for a class of linear continuous-time state-delay systems with norm-bounded uncertainties is considered. We suppose that the systems are composed of two overlapped subsystems, but the results can easily be extended to any number of subsystems. The main objective is to design overlapping guaranteed cost controllers with tridiagonal gain matrices for this kind of system by using a linear matrix inequality (LMI) approach. With this idea in mind, we present a design strategy that reduces the computational burden and increases the feasibility of the LMI problem. In this context, the use of so-called complementary matrices plays an important role. A simple example illustrating the advantages achieved by the proposed method is supplied.
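The tridiagonal gain constraint can be visualized with a minimal structural sketch (illustrative only; the dimension and gain values are made up and no actual LMI synthesis is performed): each actuator is restricted to feed back only its own and its neighboring subsystems' states.

```python
import numpy as np

n = 5  # hypothetical number of overlapped subsystems/actuators

# Structure mask: ones on the main diagonal and the two adjacent diagonals
mask = np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)

K_full = np.arange(1.0, n * n + 1).reshape(n, n)  # some dense gain (toy values)
K_tri = K_full * mask                             # enforce tridiagonal structure

# Entries outside the tridiagonal band are zeroed, reflecting the
# information constraints of the overlapping design
print(int((K_tri[np.triu_indices(n, k=2)] != 0).sum()))
```

In the LMI formulation, such a mask becomes a structural constraint on the decision variables, which is what the complementary-matrix machinery helps keep feasible.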