Basis Expansions for Functional Snippets
Estimation of mean and covariance functions is fundamental for functional
data analysis. While this topic has been studied extensively in the literature,
a key assumption is that there are enough data in the domain of interest to
estimate both the mean and covariance functions. In this paper, we investigate
mean and covariance estimation for functional snippets in which observations
from a subject are available only in an interval of length strictly (and often
much) shorter than the length of the whole interval of interest. For such a
sampling plan, no data is available for direct estimation of the off-diagonal
region of the covariance function. We tackle this challenge via a basis
representation of the covariance function. The proposed approach allows one to
consistently estimate an infinite-rank covariance function from functional
snippets. We establish the convergence rates for the proposed estimators and
illustrate their finite-sample performance via simulation studies and two data
applications. Comment: 51 pages, 10 figures
Logistic Regression and Classification with non-Euclidean Covariates
We introduce a logistic regression model for data pairs consisting of a
binary response and a covariate residing in a non-Euclidean metric space
without vector structures. Based on the proposed model we also develop a binary
classifier for non-Euclidean objects. We propose a maximum likelihood estimator
for the non-Euclidean regression coefficient in the model, and provide upper
bounds on the estimation error under various metric entropy conditions that
quantify complexity of the underlying metric space. Matching lower bounds are
derived for the important metric spaces commonly seen in statistics,
establishing optimality of the proposed estimator in such spaces. Similarly, an
upper bound on the excess risk of the developed classifier is provided for
general metric spaces. A finer upper bound and a matching lower bound, and thus
optimality of the proposed classifier, are established for Riemannian
manifolds. We investigate the numerical performance of the proposed estimator
and classifier via simulation studies, and illustrate their practical merits
via an application to task-related fMRI data. Comment: This revision contains the following updates: (1) The parameter space
is allowed to be unbounded; (2) Some upper bounds are tightened
Online Algorithms for Geographical Load Balancing
It has recently been proposed that Internet energy costs, both monetary and environmental, can be reduced by exploiting temporal variations and shifting processing to data centers located in regions where energy currently has low cost. Lightly loaded data centers can then turn off surplus servers. This paper studies online algorithms for determining the number of servers to leave on in each data center, and then uses these algorithms to study the environmental potential of geographical load balancing (GLB). A commonly suggested algorithm for this setting is “receding horizon control” (RHC), which computes the provisioning for the current time by optimizing over a window of predicted future loads. We show that RHC performs well in a homogeneous setting, in which all servers can serve all jobs equally well; however, we also prove that differences in propagation delays, servers, and electricity prices can cause RHC to perform badly. We therefore introduce variants of RHC that are guaranteed to perform as well in the face of such heterogeneity. These algorithms are then used to study the feasibility of powering a continent-wide set of data centers mostly by renewable sources, and to understand what portfolio of renewable energy is most effective.