The Lasso for High-Dimensional Regression with a Possible Change-Point
We consider a high-dimensional regression model with a possible change-point
due to a covariate threshold and develop the Lasso estimator of regression
coefficients as well as the threshold parameter. Our Lasso estimator not only
selects covariates but also selects a model between linear and threshold
regression models. Under a sparsity assumption, we derive non-asymptotic oracle
inequalities for both the prediction risk and the estimation loss for
regression coefficients. Since the Lasso estimator selects variables
simultaneously, we show that oracle inequalities can be established without
pretesting the existence of the threshold effect. Furthermore, we establish
conditions under which the estimation error of the unknown threshold parameter
can be bounded by a nearly $n^{-1}$ factor even when the number of regressors
can be much larger than the sample size ($n$). We illustrate the usefulness of
our proposed estimation method via Monte Carlo simulations and an application
to real data.
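
The idea of letting the Lasso choose between a linear and a threshold model can be illustrated with a minimal sketch. It assumes the model y = x'beta + x'delta * 1{q < tau} + noise, a plain (unweighted) L1 penalty, and a grid search over tau; the function names, the coordinate-descent solver, and the simulated data are illustrative assumptions and not the authors' implementation, which may weight the penalty differently.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(cols, y, lam, n_iter=100, tol=1e-8):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1.

    `cols` is the design matrix stored as a list of columns.
    Returns the coefficients and the penalized objective value.
    """
    n, p = len(y), len(cols)
    b = [0.0] * p
    resid = list(y)
    norms = [sum(v * v for v in c) / n for c in cols]
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            if norms[j] == 0.0:
                continue
            c = cols[j]
            rho = sum(c[i] * resid[i] for i in range(n)) / n + norms[j] * b[j]
            new = soft_threshold(rho, lam) / norms[j]
            diff = new - b[j]
            if diff != 0.0:
                for i in range(n):
                    resid[i] -= diff * c[i]
                b[j] = new
                max_change = max(max_change, abs(diff))
        if max_change < tol:
            break
    obj = sum(r * r for r in resid) / (2 * n) + lam * sum(abs(v) for v in b)
    return b, obj


def threshold_lasso(X, q, y, lam, grid):
    """Grid search over tau; for each tau, Lasso on [X, X * 1{q < tau}]."""
    n, p = len(y), len(X[0])
    base = [[row[j] for row in X] for j in range(p)]
    best = None
    for tau in grid:
        ind = [1.0 if qi < tau else 0.0 for qi in q]
        extra = [[c[i] * ind[i] for i in range(n)] for c in base]
        coef, obj = lasso_cd(base + extra, y, lam)
        if best is None or obj < best[0]:
            best = (obj, tau, coef[:p], coef[p:])
    return best[1], best[2], best[3]


# Simulated example (hypothetical data): the slope on x[1] changes when q < 0.5.
rng = random.Random(0)
n, p = 150, 4
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
q = [rng.random() for _ in range(n)]
y = [
    1.0 * X[i][0] + 2.0 * X[i][1] * (1.0 if q[i] < 0.5 else 0.0)
    + 0.5 * rng.gauss(0.0, 1.0)
    for i in range(n)
]
grid = [k / 10 for k in range(1, 10)]
tau_hat, beta_hat, delta_hat = threshold_lasso(X, q, y, lam=0.05, grid=grid)
print(tau_hat, beta_hat, delta_hat)
```

If every entry of `delta_hat` is shrunk to zero, the fit reduces to a linear model, which is how this kind of estimator can select between the two models without a separate pretest.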
Factor-Driven Two-Regime Regression
We propose a novel two-regime regression model where regime switching is
driven by a vector of possibly unobservable factors. When the factors are
latent, we estimate them by the principal component analysis of a panel data
set. We show that the optimization problem can be reformulated as mixed integer
optimization, and we present two alternative computational algorithms. We
derive the asymptotic distribution of the resulting estimator under the scheme
that the threshold effect shrinks to zero. In particular, we establish a phase
transition that describes the effect of first-stage factor estimation as the
cross-sectional dimension of panel data increases relative to the time-series
dimension. Moreover, we develop bootstrap inference and illustrate our methods
via numerical studies.
Testing for threshold effects in regression models
In this article, we develop a general method for testing threshold effects in regression models, using sup-likelihood-ratio (LR)-type statistics. Although the sup-LR-type test statistic has been considered in the literature, our method for establishing the asymptotic null distribution is new and nonstandard. The standard approach in the literature for obtaining the asymptotic null distribution requires that there exist a certain quadratic approximation to the objective function. The article provides an alternative, novel method that can be used to establish the asymptotic null distribution, even when the usual quadratic approximation is intractable. We illustrate the usefulness of our approach in the examples of the maximum score estimation, maximum likelihood estimation, quantile regression, and maximum rank correlation estimation. We establish consistency and local power properties of the test. We provide some simulation results and also an empirical application to tipping in racial segregation. This article has supplementary materials online.
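
As a rough illustration of what a sup-type statistic looks like, the sketch below uses the simplest possible setting: a least-squares mean-shift model y = a + d * 1{q > gamma} + noise, tested against d = 0. This is a swapped-in textbook case and not the article's general method, which targets estimators such as maximum score or quantile regression. The function name `sup_lr_stat`, the trimming fraction, and the simulated data are assumptions made for the example, and no critical values are computed because the null distribution is nonstandard.

```python
import random


def sup_lr_stat(y, q, trim=0.15):
    """Sup over candidate thresholds of n * (SSR0 - SSR1(gamma)) / SSR1(gamma).

    SSR0 is the residual sum of squares with a single mean (no threshold);
    SSR1(gamma) allows separate means for q <= gamma and q > gamma.
    Candidate thresholds are the observed q values inside a trimmed range.
    Returns (statistic, maximizing gamma). Assumes no ties in q.
    """
    n = len(y)
    order = sorted(range(n), key=lambda i: q[i])
    ys = [y[i] for i in order]
    total = sum(ys)
    total_sq = sum(v * v for v in ys)
    ssr0 = total_sq - total * total / n
    lo, hi = int(trim * n), int((1.0 - trim) * n)
    left_sum = sum(ys[:lo])
    left_sq = sum(v * v for v in ys[:lo])
    best_stat, best_gamma = float("-inf"), None
    for k in range(lo, hi):
        left_sum += ys[k]
        left_sq += ys[k] * ys[k]
        m = k + 1  # number of observations with q <= gamma
        ssr_left = left_sq - left_sum * left_sum / m
        ssr_right = (total_sq - left_sq) - (total - left_sum) ** 2 / (n - m)
        ssr1 = ssr_left + ssr_right
        stat = n * (ssr0 - ssr1) / ssr1
        if stat > best_stat:
            best_stat, best_gamma = stat, q[order[k]]
    return best_stat, best_gamma


# Hypothetical data: one sample with no threshold effect, one with a jump at 0.5.
rng = random.Random(1)
n = 200
q = [rng.random() for _ in range(n)]
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
y_null = noise
y_alt = [noise[i] + (2.0 if q[i] > 0.5 else 0.0) for i in range(n)]

stat_null, _ = sup_lr_stat(y_null, q)
stat_alt, gamma_alt = sup_lr_stat(y_alt, q)
print(stat_null, stat_alt, gamma_alt)
```

Because the threshold is not identified under the null, the statistic is a supremum over gamma and cannot be compared with a chi-squared table; that is the reason the asymptotic null distribution needs separate treatment.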
Fast Inference for Quantile Regression with Tens of Millions of Observations
Big data analytics has opened new avenues in economic research, but the
challenge of analyzing datasets with tens of millions of observations is
substantial. Conventional econometric methods based on extremum estimators
require large amounts of computing resources and memory, which are often not
readily available. In this paper, we focus on linear quantile regression
applied to "ultra-large" datasets, such as U.S. decennial censuses. A fast
inference framework is presented, utilizing stochastic sub-gradient descent
(S-subGD) updates. The inference procedure handles cross-sectional data
sequentially: (i) updating the parameter estimate with each incoming "new
observation", (ii) aggregating it as a Polyak-Ruppert average, and (iii)
computing a pivotal statistic for inference using only a solution path. The
methodology draws from time series regression to create an asymptotically
pivotal statistic through random scaling. Our proposed test statistic is
calculated in a fully online fashion and critical values are calculated without
resampling. We conduct extensive numerical studies to showcase the
computational merits of our proposed inference. For inference problems as large
as $(n, d) \sim (10^7, 10^3)$, where $n$ is the sample size and $d$ is the
number of regressors, our method generates new insights, surpassing current
inference methods in computation. Our method specifically reveals trends in the
gender gap in the U.S. college wage premium using millions of observations,
while controlling over $10^3$ covariates to mitigate confounding effects. Comment: 45 pages, 6 figures
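
Steps (i) and (ii) of the sequential procedure can be sketched in a few lines. The example assumes the usual check-loss subgradient x * (1{y <= x'beta} - tau) and a learning rate of the form gamma0 * t^(-a); the function name `s_subgd_quantile`, the tuning values, and the simulated stream are illustrative assumptions, and the pivotal statistic of step (iii) is left out here.

```python
import random


def s_subgd_quantile(stream, dim, tau=0.5, gamma0=1.0, a=0.501):
    """One pass of stochastic sub-gradient descent for linear quantile regression.

    `stream` yields (x, y) pairs one at a time; nothing is stored except the
    current iterate and its running (Polyak-Ruppert) average.
    """
    beta = [0.0] * dim
    beta_bar = [0.0] * dim
    for t, (x, y) in enumerate(stream, start=1):
        pred = sum(b * xi for b, xi in zip(beta, x))
        # Subgradient of the check loss with respect to beta at this observation.
        g = (1.0 if y <= pred else 0.0) - tau
        step = gamma0 * t ** (-a)
        beta = [b - step * g * xi for b, xi in zip(beta, x)]
        # Running average of the iterates, updated in O(dim) per observation.
        beta_bar = [bb + (b - bb) / t for bb, b in zip(beta_bar, beta)]
    return beta_bar


def simulated_stream(n, seed=0):
    """Hypothetical data: y = 1 + 2 * x + N(0, 1), so the median line is (1, 2)."""
    rng = random.Random(seed)
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)
        yield [1.0, x1], 1.0 + 2.0 * x1 + rng.gauss(0.0, 1.0)


beta_bar = s_subgd_quantile(simulated_stream(20000), dim=2, tau=0.5)
print(beta_bar)
```

Each observation is touched once and then discarded, so memory use does not grow with the sample size, which is the property that makes this kind of update attractive for very large datasets.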
Fast and Robust Online Inference with Stochastic Gradient Descent via Random Scaling
We develop a new method of online inference for a vector of parameters
estimated by the Polyak-Ruppert averaging procedure of stochastic gradient
descent (SGD) algorithms. We leverage insights from time series regression in
econometrics and construct asymptotically pivotal statistics via random
scaling. Our approach is fully operational with online data and is rigorously
underpinned by a functional central limit theorem. Our proposed inference
method has a couple of key advantages over the existing methods. First, the
test statistic is computed in an online fashion with only SGD iterates and the
critical values can be obtained without any resampling methods, thereby
allowing for efficient implementation suitable for massive online data. Second,
there is no need to estimate the asymptotic variance and our inference method
is shown to be robust to changes in the tuning parameters for SGD algorithms in
simulation experiments with synthetic data. Comment: 16 pages, 5 figures, 5 tables
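
A scalar sketch of how a random-scaling statistic can be maintained online is given below. It assumes the scaling quantity takes the form V_n = n^(-2) * sum_{s=1}^{n} s^2 * (bar_s - bar_n)^2, where bar_s is the averaged iterate after s updates, and that the test statistic is sqrt(n) * (bar_n - theta_0) / sqrt(V_n). The class name `OnlineRandomScaling`, the learning-rate values, and the mean-estimation example are assumptions for illustration; critical values for the nonstandard limiting distribution are not included.

```python
import math
import random


class OnlineRandomScaling:
    """Scalar SGD with Polyak-Ruppert averaging and running sums for V_n."""

    def __init__(self, theta0=0.0, gamma0=0.5, a=0.505):
        self.theta = theta0
        self.theta_bar = 0.0
        self.gamma0 = gamma0
        self.a = a
        self.t = 0
        self.A = 0.0  # running sum of s^2 * bar_s^2
        self.b = 0.0  # running sum of s^2 * bar_s
        self.c = 0.0  # running sum of s^2

    def update(self, grad):
        """Take one SGD step using a stochastic gradient evaluated at self.theta."""
        self.t += 1
        s = self.t
        self.theta -= self.gamma0 * s ** (-self.a) * grad
        self.theta_bar += (self.theta - self.theta_bar) / s
        w = float(s * s)
        self.A += w * self.theta_bar ** 2
        self.b += w * self.theta_bar
        self.c += w

    def scaling(self):
        """V_n expanded so that only the three running sums are needed."""
        tb = self.theta_bar
        return (self.A - 2.0 * self.b * tb + self.c * tb * tb) / (self.t ** 2)

    def t_stat(self, theta_null):
        return math.sqrt(self.t) * (self.theta_bar - theta_null) / math.sqrt(self.scaling())


# Hypothetical example: estimate a mean by SGD on the loss 0.5 * (y - theta)^2.
rng = random.Random(1)
est = OnlineRandomScaling()
path = []  # stored only to check the online formula against a direct computation
for _ in range(5000):
    y = rng.gauss(2.0, 1.0)
    est.update(est.theta - y)
    path.append(est.theta_bar)

n = len(path)
direct = sum((s * (path[s - 1] - path[-1])) ** 2 for s in range(1, n + 1)) / n ** 2
print(est.theta_bar, est.scaling(), direct, est.t_stat(2.0))
```

Expanding the square is what removes the need to store the solution path: the final average bar_n enters only through the three accumulated sums, so no variance estimator and no resampling are required to form the statistic.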
HMGB1, a potential regulator of tumor microenvironment in KSHV-infected endothelial cells
High-mobility group box 1 (HMGB1) is a protein that binds to DNA and participates in various cellular processes, including DNA repair, transcription, and inflammation. It is also associated with cancer progression and therapeutic resistance. Despite its known role in promoting tumor growth and immune evasion in the tumor microenvironment, the contribution of HMGB1 to the development of Kaposi's sarcoma (KS) is not well understood. We investigated the effect of HMGB1 on KS pathogenesis using immortalized human endothelial cells infected with Kaposi's sarcoma-associated human herpes virus (KSHV). Our results showed that a higher amount of HMGB1 was detected in the supernatant of KSHV-infected cells compared to that of mock-infected cells, indicating that KSHV infection induced the secretion of HMGB1 in human endothelial cells. By generating HMGB1 knockout clones from immortalized human endothelial cells using CRISPR/Cas9, we elucidated the role of HMGB1 in KSHV-infected endothelial cells. Our findings indicate that the absence of HMGB1 did not induce lytic replication in KSHV-infected cells, but the cell viability of KSHV-infected cells was decreased in both 2D and 3D cultures. Through the antibody array for cytokines and growth factors, CXCL5, PDGF-AA, G-CSF, Emmprin, IL-17A, and VEGF were found to be suppressed in HMGB1 KO KSHV-infected cells compared to the KSHV-infected wild-type control. Mechanistically, phosphorylation of p38 would be associated with transcriptional regulation of CXCL5, PDGF-A and VEGF. These observations suggest that HMGB1 may play a critical role in KS pathogenesis by regulating cytokine and growth factor secretion and emphasize its potential as a therapeutic target for KS by modulating the tumor microenvironment.