Tropical Convexity and Canonical Projections
Using a potential theory on metric graphs "Gamma", we introduce the notion of
tropical convexity to the space "RDiv^d(Gamma)" of effective R-divisors of
degree d on "Gamma" and show that a natural metric can be defined on
"RDiv^d(Gamma)". In addition, we extend the notion of reduced divisors which is
conventionally defined in a complete linear system |D| with respect to a single
point in "Gamma". In our general setting, a reduced divisor is defined uniquely
as an R-divisor in a compact tropical convex subset "T" of "RDiv^d(Gamma)" with
respect to a certain R-divisor "E" of the same degree d. In this sense, we
consider reduced divisors as canonical projections onto "T". We also
investigate some basic properties of tropical convex sets using techniques
developed from general reduced divisors.
Comment: 29 pages, 2 figures
High-Dimensional Boosting: Rate of Convergence
Boosting is one of the most significant developments in machine learning.
This paper studies the rate of convergence of Boosting, which is tailored
for regression, in a high-dimensional setting. Moreover, we introduce so-called
"post-Boosting". This is a post-selection
estimator which applies ordinary least squares to the variables selected in the
first stage by Boosting. Another variant is "Orthogonal
Boosting", where an orthogonal projection is
conducted after each step. We show that both post-Boosting and Orthogonal Boosting
achieve the same rate of convergence as LASSO in a sparse, high-dimensional
setting. We show that the rate of convergence of the classical Boosting
depends on the design matrix described by a sparse eigenvalue constant. To show
the latter results, we derive new approximation results for the pure greedy
algorithm, based on analyzing the revisiting behavior of Boosting. We also
introduce feasible rules for early stopping, which can be easily implemented
and used in applied work. Our results also allow a direct comparison between
LASSO and boosting which has been missing from the literature. Finally, we
present simulation studies and applications to illustrate the relevance of our
theoretical results and to provide insights into the practical aspects of
boosting. In these simulation studies, post-Boosting clearly outperforms
LASSO.
Comment: 19 pages, 4 tables; AMS 2000 subject classifications: Primary 62J05,
62J07, 41A25; secondary 49M15, 68Q3
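The two-stage idea in this abstract can be illustrated with a minimal sketch. It assumes componentwise L2-Boosting (the pure greedy algorithm with an optional shrinkage factor) as the first stage and a fixed number of steps in place of the paper's early-stopping rules, which the abstract does not spell out. The function names, the toy design matrix, and the coefficients are all hypothetical.

```python
def l2_boost(X, y, steps=6, shrink=0.5):
    """Componentwise L2-Boosting: each step fits the residual on the single best column."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = [float(v) for v in y]
    selected = []
    for _ in range(steps):  # fixed step count: an assumption, not the paper's stopping rule
        best_j, best_coef, best_gain = None, 0.0, 0.0
        for j in range(p):
            sxx = sum(X[i][j] ** 2 for i in range(n))
            if sxx == 0.0:
                continue
            sxr = sum(X[i][j] * resid[i] for i in range(n))
            gain = sxr * sxr / sxx  # reduction in residual sum of squares
            if gain > best_gain:
                best_j, best_coef, best_gain = j, sxr / sxx, gain
        if best_j is None:  # residual is orthogonal to every column
            break
        beta[best_j] += shrink * best_coef
        for i in range(n):
            resid[i] -= shrink * best_coef * X[i][best_j]
        if best_j not in selected:
            selected.append(best_j)
    return beta, selected


def ols(X, y, cols):
    """Ordinary least squares on the columns `cols`, via the normal equations."""
    n, k = len(X), len(cols)
    A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in cols] for a in cols]
    c = [sum(X[i][a] * y[i] for i in range(n)) for a in cols]
    for col in range(k):  # Gaussian elimination with partial pivoting
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        c[col], c[piv] = c[piv], c[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for s in range(col, k):
                A[r][s] -= f * A[col][s]
            c[r] -= f * c[col]
    b = [0.0] * k
    for r in reversed(range(k)):  # back substitution
        b[r] = (c[r] - sum(A[r][s] * b[s] for s in range(r + 1, k))) / A[r][r]
    return dict(zip(cols, b))


# Toy orthogonal design (hypothetical): only columns 0 and 2 enter the response.
c0 = [1, 1, 1, 1, -1, -1, -1, -1]
c1 = [1, 1, -1, -1, 1, 1, -1, -1]
c2 = [1, -1, 1, -1, 1, -1, 1, -1]
c3 = [1, -1, -1, 1, 1, -1, -1, 1]
X = [list(row) for row in zip(c0, c1, c2, c3)]
y = [2 * row[0] - 3 * row[2] for row in X]

beta, selected = l2_boost(X, y)          # first stage: shrunken coefficients
post = ols(X, y, sorted(selected))       # second stage: OLS on the selected columns
print(beta, post)
```

In this toy run the boosted coefficients remain shrunken toward zero after a few steps, while the second-stage least-squares fit on the selected columns removes that shrinkage. The example says nothing about the convergence rates the paper derives.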
Multi-consensus Decentralized Accelerated Gradient Descent
This paper considers the decentralized optimization problem, which has
applications in large scale machine learning, sensor networks, and control
theory. We propose a novel algorithm that can achieve near optimal
communication complexity, matching the known lower bound up to a logarithmic
factor of the condition number of the problem. Our theoretical results answer
affirmatively the open problem of whether there exists an algorithm
that can achieve a communication complexity (nearly) matching the lower bound
depending on the global condition number instead of the local one. Moreover,
the proposed algorithm achieves the optimal computation complexity matching the
lower bound up to universal constants. Furthermore, to achieve a linear
convergence rate, our algorithm does not require the individual functions
to be (strongly) convex. Our method relies on a novel combination of known
techniques including Nesterov's accelerated gradient descent, multi-consensus
and gradient-tracking. The analysis is new, and may be applied to other related
problems. Empirical studies demonstrate the effectiveness of our method for
machine learning applications.
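Two of the ingredients named in this abstract, multi-consensus and gradient tracking, can be sketched on a toy problem. The sketch below is not the paper's algorithm: it omits Nesterov acceleration, and the ring network, mixing weights, step size, and scalar quadratic objectives are all assumptions chosen for illustration.

```python
def mix(values, rounds):
    """Multi-consensus: run `rounds` gossip steps on a ring network.

    Each node averages its value with its two neighbours using the weights
    (0.25, 0.5, 0.25), which form a symmetric doubly stochastic mixing matrix.
    """
    n = len(values)
    for _ in range(rounds):
        values = [0.5 * values[i] + 0.25 * values[i - 1] + 0.25 * values[(i + 1) % n]
                  for i in range(n)]
    return values


def gradient_tracking(a, b, step=0.1, rounds=3, iters=100):
    """Minimise sum_i 0.5 * a[i] * (x - b[i])**2, where node i only knows (a[i], b[i])."""
    n = len(a)

    def grad(x):
        # Local gradients, each evaluated at that node's own iterate.
        return [a[i] * (x[i] - b[i]) for i in range(n)]

    x = [0.0] * n
    g = grad(x)
    s = list(g)  # s[i] tracks the network-average gradient
    for _ in range(iters):
        x_new = mix([x[i] - step * s[i] for i in range(n)], rounds)
        g_new = grad(x_new)
        s = mix([s[i] + g_new[i] - g[i] for i in range(n)], rounds)
        x, g = x_new, g_new
    return x


# Hypothetical local objectives on four nodes; the global minimiser is
# sum(a[i] * b[i]) / sum(a[i]) = 20 / 10 = 2.
a = [1.0, 2.0, 3.0, 4.0]
b = [4.0, 3.0, 2.0, 1.0]
x_final = gradient_tracking(a, b)
print(x_final)
```

Several gossip rounds per gradient step drive the nodes close to agreement before the next update, at the cost of more communication per iteration. The abstract's complexity results concern how to balance this trade-off.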
The sorted effects method: discovering heterogeneous effects beyond their averages
Supplemental Data & Programs are available here: https://hdl.handle.net/2144/34409
The partial (ceteris paribus) effects of interest in nonlinear and interactive linear models are heterogeneous as they can vary dramatically with the underlying observed or unobserved covariates. Despite the apparent importance of heterogeneity, a common practice in modern empirical work is to largely ignore it by reporting average partial effects (or, at best, average effects for some groups). While average effects provide very convenient scalar summaries of typical effects, by definition they fail to reflect the entire variety of the heterogeneous effects. In order to discover these effects much more fully, we propose to estimate and report sorted effects -- a collection of estimated partial effects sorted in increasing order and indexed by percentiles. By construction the sorted effect curves completely represent and help visualize the range of the heterogeneous effects in one plot. They are as convenient and easy to report in practice as the conventional average partial effects. They also serve as a basis for classification analysis, where we divide the observational units into most or least affected groups and summarize their characteristics. We provide a quantification of uncertainty (standard errors and confidence bands) for the estimated sorted effects and related classification analysis, and provide confidence sets for the most and least affected groups. The derived statistical results rely on establishing key, new mathematical results on Hadamard differentiability of a multivariate sorting operator and a related classification operator, which are of independent interest. We apply the sorted effects method and classification analysis to demonstrate several striking patterns in the gender wage gap.
https://arxiv.org/abs/1512.05635
Accepted manuscript
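The core construction, partial effects sorted in increasing order and indexed by percentiles, can be sketched as follows. The sketch assumes a logit model with already-known coefficients and a small made-up covariate grid. It covers only the point estimates, not the standard errors, confidence bands, or classification analysis described in the abstract. All names and numbers are hypothetical.

```python
import math


def logistic_density(z):
    """Derivative of the logistic CDF, used for logit partial effects."""
    p = 1.0 / (1.0 + math.exp(-z))
    return p * (1.0 - p)


def partial_effects(rows, intercept, coefs, k):
    """Partial effect of covariate k on P(y = 1 | x) for each observation in `rows`."""
    return [coefs[k] * logistic_density(intercept + sum(c * x for c, x in zip(coefs, row)))
            for row in rows]


def sorted_effects(effects, percentiles):
    """Empirical u-quantile of the partial effects for each u in `percentiles`."""
    ordered = sorted(effects)
    n = len(ordered)
    return {u: ordered[min(n - 1, max(0, math.ceil(u * n) - 1))] for u in percentiles}


# Hypothetical covariate values and logit coefficients.
rows = [(x1, x2) for x1 in (-2, -1, 0, 1, 2) for x2 in (0, 1)]
effects = partial_effects(rows, intercept=-0.5, coefs=(1.0, 0.8), k=0)

ape = sum(effects) / len(effects)               # the usual scalar summary
spe = sorted_effects(effects, (0.1, 0.5, 0.9))  # the sorted-effects summary
print(ape, spe)
```

The average partial effect collapses the whole distribution into one number, whereas the percentile-indexed values show how far the effects spread across observations.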