A Bootstrap Lasso + Partial Ridge Method to Construct Confidence Intervals for Parameters in High-dimensional Sparse Linear Models
Constructing confidence intervals for the coefficients of high-dimensional
sparse linear models remains a challenge, mainly because of the complicated
limiting distributions of the widely used estimators, such as the lasso.
Several methods have been developed for constructing such intervals. Bootstrap
lasso+ols is notable for its technical simplicity, good interpretability, and
performance that is comparable with that of other more complicated methods.
However, bootstrap lasso+ols depends on the beta-min assumption, a theoretic
criterion that is often violated in practice. Thus, we introduce a new method,
called bootstrap lasso+partial ridge, to relax this assumption. Lasso+partial
ridge is a two-stage estimator. First, the lasso is used to select features.
Then, the partial ridge is used to refit the coefficients. Simulation results
show that bootstrap lasso+partial ridge outperforms bootstrap lasso+ols when
there exist small, but nonzero coefficients, a common situation that violates
the beta-min assumption. For such coefficients, the confidence intervals
constructed using bootstrap lasso+partial ridge have, on average, larger
coverage probabilities than those of bootstrap lasso+ols. Bootstrap
lasso+partial ridge also has, on average, shorter confidence interval
lengths than those of the de-sparsified lasso methods, regardless of whether
the linear models are misspecified. Additionally, we provide theoretical
guarantees for bootstrap lasso+partial ridge under appropriate conditions, and
implement it in the R package "HDCI".
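The two-stage estimator described above can be written out as a pair of optimization problems. This is a hedged sketch rather than the authors' exact formulation: the abstract does not give the scaling constants or the tuning parameters, so the factors 1/(2n) and 1/2 and the symbols \lambda_1, \lambda_2 and \hat{S} are assumptions chosen for illustration. The second stage leaves the lasso-selected coefficients unpenalized and applies a ridge penalty only to the unselected ones, which is what lets small nonzero coefficients missed by the lasso re-enter the fit.

```latex
% Stage 1: lasso selection (\lambda_1 is an assumed tuning parameter).
\[
\hat{\beta}_{\mathrm{Lasso}}
  = \arg\min_{\beta}\left\{ \frac{1}{2n}\lVert Y - X\beta\rVert_2^2
    + \lambda_1 \lVert \beta \rVert_1 \right\},
\qquad
\hat{S} = \{\, j : \hat{\beta}_{\mathrm{Lasso},j} \neq 0 \,\}
\]

% Stage 2: partial ridge refit; only coefficients outside \hat{S} are penalized.
\[
\hat{\beta}_{\mathrm{PR}}
  = \arg\min_{\beta}\left\{ \frac{1}{2n}\lVert Y - X\beta\rVert_2^2
    + \frac{\lambda_2}{2}\sum_{j \notin \hat{S}} \beta_j^2 \right\}
\]
```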
MSIQ: Joint Modeling of Multiple RNA-seq Samples for Accurate Isoform Quantification
Next-generation RNA sequencing (RNA-seq) technology has been widely used to
assess full-length RNA isoform abundance in a high-throughput manner. RNA-seq
data offer insight into gene expression levels and transcriptome structures,
enabling us to better understand the regulation of gene expression and
fundamental biological processes. Accurate isoform quantification from RNA-seq
data is challenging due to the information loss in sequencing experiments. A
recent accumulation of multiple RNA-seq data sets from the same tissue or cell
type provides new opportunities to improve the accuracy of isoform
quantification. However, existing statistical or computational methods for
multiple RNA-seq samples either pool the samples into one sample or assign
equal weights to the samples when estimating isoform abundance. These methods
ignore the possible heterogeneity in the quality of different samples and could
result in biased and unrobust estimates. In this article, we develop a method,
which we call "joint modeling of multiple RNA-seq samples for accurate isoform
quantification" (MSIQ), for more accurate and robust isoform quantification by
integrating multiple RNA-seq samples under a Bayesian framework. Our method
aims to (1) identify a consistent group of samples with homogeneous quality and
(2) improve isoform quantification accuracy by jointly modeling multiple
RNA-seq samples while allowing for higher weights on the consistent group. We show
that MSIQ provides a consistent estimator of isoform abundance, and we
demonstrate the accuracy and effectiveness of MSIQ compared with alternative
methods through simulation studies on D. melanogaster genes. We justify MSIQ's
advantages over existing approaches via application studies on real RNA-seq
data from human embryonic stem cells, brain tissues, and the HepG2 immortalized
cell line.
Hubble Space Telescope Observations of 3200 Phaethon At Closest Approach
We present Hubble Space Telescope observations of the active asteroid (and
Geminid stream parent) 3200 Phaethon when at its closest approach to Earth
(separation 0.07 AU) in 2017 December. Images were recorded within
1° of the orbital plane, providing extra sensitivity to low surface
brightness caused by scattering from a large-particle trail. We placed an upper
limit to the apparent surface brightness of such a trail at 27.2 magnitudes
per square arcsecond, corresponding to an in-plane optical depth . No co-moving sources brighter than absolute magnitude 26.3,
corresponding to circular equivalent radius 12 m (albedo 0.12 assumed),
were detected. Phaethon is too hot for near-surface ice to survive. We briefly
consider the thermodynamic stability of deeply-buried ice, finding that its
survival would require either a very small (regolith-like) thermal diffusivity
( m s), or the unexpectedly recent injection of Phaethon
(timescale 10 yr) into its present orbit, or both.
Comment: Improved the discussion of optical depth calculation and corrected an
error in the previous version. 28 pages, 5 figures, Astronomical Journal, in
press
TROM: A Testing-based Method for Finding Transcriptomic Similarity of Biological Samples
Comparative transcriptomics has gained increasing popularity in genomic
research thanks to the development of high-throughput technologies including
microarray and next-generation RNA sequencing that have generated numerous
transcriptomic data. An important question is to understand the conservation
and differentiation of biological processes in different species. We propose a
testing-based method TROM (Transcriptome Overlap Measure) for comparing
transcriptomes within or between different species, and provide a different
perspective to interpret transcriptomic similarity in contrast to traditional
correlation analyses. Specifically, the TROM method focuses on identifying
associated genes that capture molecular characteristics of biological samples,
and subsequently comparing the biological samples by testing the overlap of
their associated genes. We use simulation and real data studies to demonstrate
that TROM is more powerful in identifying similar transcriptomes and more
robust to stochastic gene expression noise than Pearson and Spearman
correlations. We apply TROM to compare the developmental stages of six
Drosophila species, C. elegans, S. purpuratus, D. rerio and mouse liver, and
find interesting correspondence patterns that imply conserved gene expression
programs in the development of these species. The TROM method is available as
an R package on CRAN (http://cran.r-project.org/) with manuals and source code
available at http://www.stat.ucla.edu/~jingyi.li/software-and-data/trom.html
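The idea of comparing two samples by testing the overlap of their associated genes can be illustrated with a small calculation. This is a minimal sketch, assuming the overlap test is a one-sided hypergeometric test; the abstract does not name the test, and the function name `overlap_pvalue` and the toy gene counts are hypothetical.

```python
from math import comb


def overlap_pvalue(n_total, n_a, n_b, n_overlap):
    """P(X >= n_overlap) when n_b genes are drawn without replacement
    from n_total genes, of which n_a are associated with sample A.

    Assumption: a one-sided hypergeometric test stands in for the
    overlap test mentioned in the abstract.
    """
    upper = min(n_a, n_b)
    denom = comb(n_total, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / denom


# Toy numbers (hypothetical): 20 genes in total, 5 associated with each
# sample, 3 of them shared between the two samples.
p = overlap_pvalue(n_total=20, n_a=5, n_b=5, n_overlap=3)
print(f"overlap p-value: {p:.4f}")
```

A small p-value means the two sets of associated genes share more members than chance alone would suggest, which is the sense in which two transcriptomes are called similar here.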
The Dust Tail of Asteroid (3200) Phaethon
We report the discovery of a comet-like tail on asteroid (3200) Phaethon when
imaged at optical wavelengths near perihelion. In both 2009 and 2012, the tail
appears >=350" (2.5x10^8 m) in length and extends approximately in the
projected anti-solar direction. We interpret the tail as being caused by dust
particles accelerated by solar radiation pressure. The sudden appearance and
the morphology of the tail indicate that the dust particles are small, with an
effective radius ~1 micrometer and a combined mass ~3x10^5 kg. These particles
are likely products of thermal fracture and/or desiccation cracking under the
very high surface temperatures (~1000 K) experienced by Phaethon at perihelion.
The existence of the tail confirms earlier inferences about activity in this
body based on the detection of anomalous brightening. Phaethon, the presumed
source of the Geminid meteoroids, is still active.
Comment: 13 pages, 4 figures. Accepted by ApJ
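The two tail lengths quoted above (350" and 2.5x10^8 m) are related by the small-angle approximation, length = distance x angle in radians. A minimal sketch of that arithmetic follows; the observer-to-object distance is not stated in the abstract, so the code only recovers the distance implied by the two quoted numbers, and the helper name `implied_distance_m` is hypothetical.

```python
import math

ARCSEC_PER_RADIAN = 180 * 3600 / math.pi  # about 206265
AU_M = 1.495978707e11  # one astronomical unit in metres


def implied_distance_m(length_m, angle_arcsec):
    """Distance at which a projected length subtends the given angle
    (small-angle approximation)."""
    angle_rad = angle_arcsec / ARCSEC_PER_RADIAN
    return length_m / angle_rad


# Values quoted in the abstract: 350 arcsec and 2.5e8 m.
d = implied_distance_m(2.5e8, 350.0)
print(f"implied distance: {d / AU_M:.2f} AU")  # close to 1 AU
```

Because the quoted length is a plane-of-sky projection, the true tail length could be larger than this conversion suggests.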
Anatomy of an Asteroid Break-Up: The Case of P/2013 R3
We present an analysis of new and published data on P/2013 R3, the first
asteroid detected while disintegrating. Thirteen discrete components are
measured in the interval between UT 2013 October 01 and 2014 February 13. We
determine a mean, pair-wise velocity dispersion amongst these components of
m s and find that their separation times are
staggered over an interval of 5 months. Dust enveloping the system has,
in the first observations, a cross-section 30 km but fades
monotonically at a rate consistent with the action of radiation pressure
sweeping. The individual components exhibit comet-like morphologies and also
fade except where secondary fragmentation is accompanied by the release of
additional dust. We find only upper limits to the radii of any embedded solid
nuclei, typically 100 to 200 m (geometric albedo 0.05 assumed). Combined,
the components of P/2013 R3 would form a single spherical body with radius
400 m, which is our best estimate of the size of the precursor
object. The observations are consistent with rotational disruption of a weak
(cohesive strength 50 to 100 N/m^2) parent body, 400 m in
radius. Estimated radiation (YORP) spin-up times of this parent are 1
Myr, shorter than the collisional lifetime. If present, water ice sublimating
at as little as 10 kg s could generate a torque on the parent
body rivaling the YORP torque. Under conservative assumptions about the
frequency of similar disruptions, the inferred asteroid debris production rate
is 10 kg s, which is at least 4% of the rate needed to
maintain the Zodiacal Cloud.
Comment: 44 pages, 13 figures, accepted by Astronomical Journal
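The precursor size quoted above follows from adding component volumes: the radius of the combined sphere is the cube root of the sum of the cubed component radii. This is a rough sketch, assuming thirteen equal-sized components at either end of the quoted 100 to 200 m range; the individual radii are only upper limits in the abstract, so the equal-size inputs are hypothetical and the function name `equivalent_radius` is illustrative.

```python
def equivalent_radius(radii_m):
    """Radius of a single sphere with the same total volume as the
    given spherical components."""
    return sum(r ** 3 for r in radii_m) ** (1.0 / 3.0)


N_COMPONENTS = 13  # number of discrete components quoted in the abstract

# Hypothetical bracketing cases: all components at 100 m, or all at 200 m.
low = equivalent_radius([100.0] * N_COMPONENTS)
high = equivalent_radius([200.0] * N_COMPONENTS)
print(f"combined radius between {low:.0f} m and {high:.0f} m")
```

The quoted 400 m estimate falls inside this bracket, which is consistent with the component radii lying toward the upper end of the range.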
Quantification of three-dimensional folding using fluvial terraces: A case study from the Mushi anticline, northern margin of the Chinese Pamir
Fold deformation in three dimensions involves shortening, uplift, and lateral growth. Fluvial terraces represent strain markers that have been widely applied to constrain a fold's shortening and uplift. For the lateral growth, however, the utility of fluvial terraces has been commonly ignored. Situated along the northern margin of the Chinese Pamir, the Mushi anticline preserves, along its northern flank, flights of passively deformed fluvial terraces that can be used to constrain three-dimensional folding history, especially lateral growth. The Mushi anticline is a geometrically simple fault-tip fold with a total shortening of 740 ± 110 m and rock uplift of ~1300 m. Geologic and geomorphic mapping and dGPS surveys reveal that terrace surfaces perpendicular to the fold's strike display increased rotation with age, implying the fold grows by progressive limb rotation. We use a pure-shear fault-tip fold model to estimate a uniform shortening rate of 1.5 +1.3/-0.5 mm/a and a rock-uplift rate of 2.3 +2.1/-0.8 mm/a. Parallel to the fold's strike, longitudinal profiles of terrace surfaces also display age-dependent increases in slopes. We present a new model to distinguish lateral growth mechanisms (lateral lengthening and/or rotation above a fixed tip). This model indicates that eastward lengthening of the Mushi anticline ceased by at least ~134 ka and its lateral growth has been dominated by rotation. Our study confirms that terrace deformation along a fold's strike not only can constrain the lateral lengthening rate but can serve to quantify the magnitude and rate of lateral rotation: attributes that are commonly difficult to define when relying on other geomorphic criteria.
Functional Genomics Profiling of Bladder Urothelial Carcinoma MicroRNAome as a Potential Biomarker.
Though bladder urothelial carcinoma is the most common form of bladder cancer, advances in its diagnosis and treatment have been modest in the past few decades. To evaluate miRNAs as putative disease markers for bladder urothelial carcinoma, this study develops a process to identify dysregulated miRNAs in cancer patients and potentially stratify patients based on the association of their microRNAome phenotype to genomic alterations. Using RNA sequencing data for 409 patients from the Cancer Genome Atlas, we examined miRNA differential expression between cancer and normal tissues and associated differentially expressed miRNAs with patient survival and clinical variables. We then correlated miRNA expression with genomic alterations using the Wilcoxon test and REVEALER. We found a panel of six miRNAs that were dysregulated in bladder cancer and correlated with patient survival. We also performed differential expression analysis and clinical variable correlations to identify miRNAs associated with tobacco smoking, the most important risk factor for bladder cancer. Two miRNAs, miR-323a and miR-431, were differentially expressed in smoking patients compared to nonsmoking patients and were associated with primary tumor size. Functional studies of these miRNAs and the genomic features we identified for potential stratification may reveal underlying mechanisms of bladder cancer carcinogenesis and further diagnosis and treatment methods for urothelial bladder carcinoma.
Disintegrating Asteroid P/2013 R3
Splitting of the nuclei of comets into multiple components has been
frequently observed but, to date, no main-belt asteroid has been observed to
break up. Using the Hubble Space Telescope, we find that main-belt asteroid
P/2013 R3 consists of 10 or more distinct components, the largest up to 200 m
in radius (assumed geometric albedo of 0.05) each of which produces a coma and
comet-like dust tail. A diffuse debris cloud with total mass roughly 2x10^8 kg
further envelops the entire system. The velocity dispersion among the
components, about V = 0.2 to 0.5 m/s, is comparable to the gravitational
escape speeds of the largest members, while their extrapolated plane-of-sky
motions suggest break-up between February and September 2013. The broadband
optical colors are those of a C-type asteroid. We find no spectral evidence for
gaseous emission, placing model-dependent upper limits to the water production
rate near 1 kg/s. Breakup may be due to a rotationally induced structural
failure of the precursor body.
Comment: 16 pages, 3 figures; accepted by ApJ
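The comparison between the velocity dispersion and the escape speed of the largest members can be checked with a rough calculation. This is a minimal sketch, assuming a homogeneous sphere and a bulk density of 1000 kg/m^3; the density is a hypothetical value that the abstract does not give, and the function name `escape_speed` is illustrative.

```python
import math

G = 6.6743e-11  # gravitational constant, m^3 kg^-1 s^-2


def escape_speed(radius_m, density_kg_m3):
    """Surface escape speed of a homogeneous sphere:
    v = sqrt(2 G M / r) with M = (4/3) pi rho r^3."""
    mass = (4.0 / 3.0) * math.pi * density_kg_m3 * radius_m ** 3
    return math.sqrt(2.0 * G * mass / radius_m)


# 200 m is the largest radius quoted in the abstract; the density is assumed.
v = escape_speed(radius_m=200.0, density_kg_m3=1000.0)
print(f"escape speed: {v:.2f} m/s")  # of order 0.1 m/s
```

Under these assumptions the escape speed is the same order of magnitude as the quoted 0.2 to 0.5 m/s dispersion. It scales linearly with radius and with the square root of density, so a denser body would bring the two closer.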