Improved variable selection with Forward-Lasso adaptive shrinkage
Recently, considerable interest has focused on variable selection methods in
regression situations where the number of predictors, p, is large relative to
the number of observations, n. Two commonly applied variable selection
approaches are the Lasso, which computes highly shrunk regression coefficients,
and Forward Selection, which uses no shrinkage. We propose a new approach,
"Forward-Lasso Adaptive SHrinkage" (FLASH), which includes the Lasso and
Forward Selection as special cases, and can be used in both the linear
regression and the Generalized Linear Model domains. As with the Lasso and
Forward Selection, FLASH iteratively adds one variable to the model in a
hierarchical fashion but, unlike these methods, at each step adjusts the level
of shrinkage so as to optimize the selection of the next variable. We first
present FLASH in the linear regression setting and show that it can be fitted
using a variant of the computationally efficient LARS algorithm. Then, we
extend FLASH to the GLM domain and demonstrate, through numerous simulations
and real world data sets, as well as some theoretical analysis, that FLASH
generally outperforms many competing approaches.
Comment: Published at http://dx.doi.org/10.1214/10-AOAS375 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
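FLASH interpolates between its two special cases. As a hedged illustration of those endpoints only (not the FLASH algorithm itself, which tunes the shrinkage level before each variable is added), a minimal NumPy sketch of plain Lasso coordinate descent and greedy forward selection:

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=300):
    """Lasso via cyclic coordinate descent: heavy soft-threshold shrinkage."""
    beta = np.zeros(X.shape[1])
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            # correlation of column j with the partial residual excluding j
            rho = X[:, j] @ (y - X @ beta + X[:, j] * beta[j])
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return beta

def forward_select(X, y, k):
    """Greedy forward selection: no shrinkage, least-squares refit each step."""
    active, resid = [], y.copy()
    for _ in range(k):
        score = np.abs(X.T @ resid)
        score[active] = -np.inf          # never re-pick an active variable
        active.append(int(np.argmax(score)))
        coef, *_ = np.linalg.lstsq(X[:, active], y, rcond=None)
        resid = y - X[:, active] @ coef
    return active
```

Like both endpoints, FLASH adds one variable per step; its distinguishing feature, not shown here, is choosing an intermediate amount of shrinkage at each addition.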
Functional linear regression that's interpretable
Regression models to relate a scalar to a functional predictor are
becoming increasingly common. Work in this area has concentrated on estimating
a coefficient function, β(t), with the response Y related to the predictor
X(t) through ∫β(t)X(t)dt. Regions where β(t) ≠ 0 correspond to places where
there is a relationship between X(t) and Y. Alternatively, points where
β(t) = 0 indicate no relationship. Hence, for interpretation purposes, it
is desirable for a regression procedure to be capable of producing estimates of
β(t) that are exactly zero over regions with no apparent relationship and
have simple structures over the remaining regions. Unfortunately, most fitting
procedures result in an estimate for β(t) that is rarely exactly zero and
has unnatural wiggles making the curve hard to interpret. In this article we
introduce a new approach which uses variable selection ideas, applied to
various derivatives of β(t), to produce estimates that are
interpretable, flexible, and accurate. We call our method "Functional Linear
Regression That's Interpretable" (FLiRTI) and demonstrate it on simulated and
real-world data sets. In addition, non-asymptotic theoretical bounds on the
estimation error are presented. The bounds provide strong theoretical
motivation for our approach.
Comment: Published at http://dx.doi.org/10.1214/08-AOS641 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
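The idea of applying variable selection to derivatives of β(t) can be illustrated with a simplified stand-in, not the authors' FLiRTI estimator: discretize β(t) on a grid, write it as a cumulative sum of jump sizes, and run a Lasso on the jumps, so that a sparse first derivative yields a piecewise-constant, mostly-zero estimate.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=500):
    """Cyclic coordinate-descent Lasso, used here to select sparse jumps."""
    beta = np.zeros(X.shape[1])
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            rho = X[:, j] @ (y - X @ beta + X[:, j] * beta[j])
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return beta

def fit_piecewise_beta(X_curves, y, dt, lam=0.05):
    """Fit y_i ≈ ∫ X_i(t) β(t) dt with β forced piecewise constant.

    Reparametrize β = L γ, where L is lower-triangular ones, so γ is the
    discrete first derivative of β; a Lasso on γ makes that derivative sparse.
    """
    p = X_curves.shape[1]
    L = np.tril(np.ones((p, p)))
    M = (X_curves * dt) @ L          # design matrix in the jump parametrization
    gamma = lasso_cd(M, y, lam)
    return L @ gamma                 # recover β on the grid
```

Penalizing higher derivatives (slope changes rather than jumps) works the same way with a different reparametrization matrix.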
Curve alignment by moments
A significant problem with most functional data analyses is that of
misaligned curves. Without adjustment, even an analysis as simple as estimation
of the mean will fail. One common method to synchronize a set of curves
involves equating ``landmarks'' such as peaks or troughs. The landmarks method
can work well but will fail if marker events cannot be identified or are
missing from some curves. An alternative approach, the ``continuous monotone
registration'' method, works by transforming the curves so that they are as
close as possible to a target function. This method can also perform well but
is highly dependent on identifying an accurate target function. We develop an
alignment method based on equating the ``moments'' of a given set of curves.
These moments are intended to capture the locations of important features which
may represent local behavior, such as maximums and minimums, or more global
characteristics, such as the slope of the curve averaged over time. Our method
works by equating the moments of the curves while also shrinking toward a
common shape. This allows us to capture the advantages of both the landmark and
continuous monotone registration approaches. The method is illustrated on
several data sets and a simulation study is performed.
Comment: Published at http://dx.doi.org/10.1214/07-AOAS127 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
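A hedged sketch of the core idea, moment matching by time-shifting, without the shrinkage toward a common shape described in the abstract: compute each curve's first moment (its "centre of mass" in time), then shift every curve so the moments agree.

```python
import numpy as np

def first_moment(t, f):
    """Centre of mass in time of a nonnegative curve f sampled on grid t."""
    return float((t * f).sum() / f.sum())

def align_by_first_moment(t, curves):
    """Shift each curve so its first moment equals the cross-curve average."""
    moments = np.array([first_moment(t, f) for f in curves])
    target = moments.mean()
    # shifting a curve right by d moves its first moment up by d,
    # so resample each curve on a grid displaced by (target - m)
    return np.array([np.interp(t, t + (target - m), f)
                     for f, m in zip(curves, moments)])
```

Higher moments (spread, skew) could be matched with warps richer than a pure shift; this sketch uses only the first moment for clarity.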
Sparse regulatory networks
In many organisms the expression levels of each gene are controlled by the
activation levels of known "Transcription Factors" (TF). A problem of
considerable interest is that of estimating the "Transcription Regulation
Networks" (TRN) relating the TFs and genes. While the expression levels of
genes can be observed, the activation levels of the corresponding TFs are
usually unknown, greatly increasing the difficulty of the problem. Based on
previous experimental work, it is often the case that partial information about
the TRN is available. For example, certain TFs may be known to regulate a given
gene or in other cases a connection may be predicted with a certain
probability. In general, the biology of the problem indicates there will be
very few connections between TFs and genes. Several methods have been proposed
for estimating TRNs. However, they all suffer from problems such as unrealistic
assumptions about prior knowledge of the network structure or computational
limitations. We propose a new approach that can directly utilize prior
information about the network structure in conjunction with observed gene
expression data to estimate the TRN. Our approach uses penalties on the
network to ensure a sparse structure. This has the advantage of being
computationally efficient as well as making many fewer assumptions about the
network structure. We use our methodology to construct the TRN for E. coli and
show that the estimate is biologically sensible and compares favorably with
previous estimates.
Comment: Published at http://dx.doi.org/10.1214/10-AOAS350 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
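The penalized-sparsity idea can be sketched in a deliberately simplified form. This is not the authors' estimator, which handles unobserved TF activation levels; the sketch below assumes the TF activation matrix is observed, and uses an L1 penalty whose per-edge weights could encode prior knowledge (a smaller weight for a connection believed more likely — a hypothetical weighting scheme, not the paper's).

```python
import numpy as np

def weighted_lasso(A, y, lam, w, n_iter=300):
    """Coordinate-descent Lasso with per-coefficient penalties lam * w[j]."""
    beta = np.zeros(A.shape[1])
    col_sq = (A ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(A.shape[1]):
            rho = A[:, j] @ (y - A @ beta + A[:, j] * beta[j])
            thr = lam * w[j]
            beta[j] = np.sign(rho) * max(abs(rho) - thr, 0.0) / col_sq[j]
    return beta

def estimate_trn(A, E, lam, W_prior):
    """Sparse TF -> gene regulation weights via one penalized fit per gene.

    A: (samples x TFs) activation levels (assumed observed in this sketch),
    E: (samples x genes) expression, W_prior: (TFs x genes) penalty weights.
    """
    return np.column_stack([
        weighted_lasso(A, E[:, g], lam, W_prior[:, g])
        for g in range(E.shape[1])
    ])
```

With uniform weights this is an ordinary Lasso per gene; lowering W_prior entries for experimentally supported edges makes those connections cheaper to include, which is one simple way prior network information could enter.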
Exercise redox biochemistry: conceptual, methodological and technical recommendations
Exercise redox biochemistry is of considerable interest owing to its translational value in health and disease. However, unaddressed conceptual, methodological and technical issues complicate attempts to unravel how exercise alters redox homeostasis in health and disease. Conceptual issues relate to misunderstandings that arise when the chemical heterogeneity of redox biology is disregarded; these misunderstandings often complicate attempts to use redox-active compounds and to assess redox signalling. Further, it is seldom considered that oxidised macromolecule adduct levels reflect both formation and repair. Methodological and technical issues relate to the use of outdated assays and/or inappropriate sample preparation techniques that confound biochemical redox analysis. After considering each of the aforementioned issues, we outline how each can be resolved and provide a unifying set of recommendations. We specifically recommend that investigators: consider chemical heterogeneity, use redox-active compounds judiciously, abandon flawed assays, carefully prepare samples and assay buffers, consider repair/metabolism, and use multiple biomarkers to assess oxidative damage and redox signalling.
Simple algorithm for the correction of MRI image artefacts due to random phase fluctuations
Grant support: This work was supported by EPSRC [grant numbers EP/E036775/1, EP/K020293/1] and received funding from the European Union's Horizon 2020 research and innovation programme [grant agreement No 668119, project “IDentIFY”]. Peer reviewed.
Sleep-dependent consolidation in children with comprehension and vocabulary weaknesses: it'll be alright on the night?
BACKGROUND: Vocabulary is crucial for an array of life outcomes and is frequently impaired in developmental disorders. Notably, 'poor comprehenders' (children with reading comprehension deficits but intact word reading) often have vocabulary deficits, but underlying mechanisms remain unclear. Prior research suggests intact encoding but difficulties consolidating new word knowledge. We test the hypothesis that poor comprehenders' sleep-associated vocabulary consolidation is compromised by their impoverished lexical-semantic knowledge. METHODS: Memory for new words was tracked across wake and sleep to assess encoding and consolidation in 8-to-12-year-old good and poor comprehenders. Each child participated in two sets of sessions, one beginning in the morning (AM-encoding) and the other in the evening (PM-encoding). In each case, they were taught 12 words and were trained on a spatial memory task. Memory was assessed immediately, 12- and 24-hr later via stem-completion, picture-naming, and definition tasks to probe different aspects of word knowledge. Long-term retention was assessed 1-2 months later. RESULTS: Recall of word-forms improved over sleep and postsleep wake, as measured in both stem-completion and picture-naming tasks. Counter to hypotheses, deficits for poor comprehenders were not observed in consolidation but instead were seen across measures and throughout testing, suggesting a deficit from encoding. Variability in vocabulary knowledge across the whole sample predicted sleep-associated consolidation, but only when words were learned early in the day and not when sleep followed soon after learning. CONCLUSIONS: Poor comprehenders showed weaker memory for new words than good comprehenders, but sleep-associated consolidation benefits were comparable between groups. Sleeping soon after learning had long-lasting benefits for memory and may be especially beneficial for children with weaker vocabulary. 
These results provide new insights into the breadth of poor comprehenders' vocabulary weaknesses, and ways in which learning might be better timed to remediate vocabulary difficulties.