Bandwidth selection in kernel empirical risk minimization via the gradient
In this paper, we deal with the data-driven selection of multidimensional and
possibly anisotropic bandwidths in the general framework of kernel empirical
risk minimization. We propose a universal selection rule, which leads to
optimal adaptive results in a large variety of statistical models such as
nonparametric robust regression and statistical learning with errors in
variables. These results are stated in the context of smooth loss functions,
where the gradient of the risk appears as a good criterion to measure the
performance of our estimators. The selection rule consists of a comparison of
gradient empirical risks. It can be viewed as a nontrivial extension of the so-called Goldenshluger-Lepski method to nonlinear estimators. Furthermore, a main advantage of our selection rule is that it does not depend on the Hessian matrix of the risk, which is usually involved in standard adaptive procedures.
Comment: Published at http://dx.doi.org/10.1214/15-AOS1318 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
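To make the comparison rule concrete, here is a minimal one-dimensional sketch of a Goldenshluger-Lepski-type selection based on gradients of a kernel-localized empirical risk, for local robust (Huber) regression at a point x0. The Gaussian kernel, the Huber cutoff c, and the majorant kappa*sqrt(log n/(n h)) are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np
from scipy.optimize import brentq

def huber_grad(r, c=1.345):
    # Derivative of the Huber loss: linear near 0, clipped beyond c.
    return np.clip(r, -c, c)

def grad_risk(theta, x0, X, Y, h):
    # Gradient of the kernel-localized empirical risk at the point x0.
    w = np.exp(-0.5 * ((X - x0) / h) ** 2)
    return -np.mean(w * huber_grad(Y - theta))

def select_bandwidth(x0, X, Y, bandwidths, kappa=1.0):
    # x0 should lie inside the range of X so the kernel weights are nonzero.
    n = len(X)
    # Estimator for each bandwidth: root of the gradient of the risk.
    theta = {h: brentq(grad_risk, Y.min() - 1.0, Y.max() + 1.0,
                       args=(x0, X, Y, h)) for h in bandwidths}
    maj = {h: kappa * np.sqrt(np.log(n) / (n * h)) for h in bandwidths}
    crit = {}
    for h in bandwidths:
        # Compare gradient empirical risks: evaluate the risk gradient at
        # finer bandwidths h' <= h in the estimator selected with h.
        excess = max(abs(grad_risk(theta[h], x0, X, Y, hp)) - maj[hp]
                     for hp in bandwidths if hp <= h)
        crit[h] = max(excess, 0.0) + maj[h]
    return min(crit, key=crit.get)
```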
Calibration of One-Class SVM for MV set estimation
A general approach for anomaly detection or novelty detection consists in
estimating high density regions or Minimum Volume (MV) sets. The One-Class
Support Vector Machine (OCSVM) is a state-of-the-art algorithm for estimating
such regions from high dimensional data. Yet it suffers from practical
limitations. When applied to a limited number of samples, it can lead to poor performance even when picking the best hyperparameters. Moreover, the solution of the OCSVM is very sensitive to the selection of hyperparameters, which makes it hard to optimize in an unsupervised setting. We present a new approach to
estimate MV sets using the OCSVM with a different choice of the parameter
controlling the proportion of outliers. The solution function of the OCSVM is
learnt on a training set and the desired probability mass is obtained by
adjusting the offset on a test set to prevent overfitting. Models learnt on
different train/test splits are then aggregated to reduce the variance induced
by such random splits. Our approach makes it possible to tune the
hyperparameters automatically and obtain nested set estimates. Experimental
results show that our approach outperforms the standard OCSVM formulation while
suffering less from the curse of dimensionality than kernel density estimates.
Results on actual data sets are also presented.
Comment: IEEE DSAA'2015, Oct 2015, Paris, France.
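A minimal sketch of the calibration idea with scikit-learn, under assumptions not fixed by the abstract (split ratio, nu, Gaussian kernel, simple score averaging): the OCSVM decision function is learnt on each train half, its offset is recalibrated on the held-out half so that the level set {score >= 0} captures roughly a target mass alpha, and the recentred scores are averaged across random splits.

```python
import numpy as np
from sklearn.model_selection import ShuffleSplit
from sklearn.svm import OneClassSVM

def mv_set_scorer(X, alpha=0.95, n_splits=10, nu=0.5):
    """Aggregate OCSVM scores whose offsets are recalibrated on held-out
    data so that {score >= 0} captures roughly a mass alpha."""
    models, offsets = [], []
    splitter = ShuffleSplit(n_splits=n_splits, test_size=0.5, random_state=0)
    for train, test in splitter.split(X):
        clf = OneClassSVM(nu=nu, gamma="scale").fit(X[train])
        s = clf.decision_function(X[test]).ravel()
        # Offset chosen so a fraction alpha of held-out points scores >= 0.
        offsets.append(np.quantile(s, 1 - alpha))
        models.append(clf)

    def score(Xnew):
        # Average recentred scores over splits to reduce split-induced variance.
        S = np.stack([m.decision_function(Xnew).ravel() - o
                      for m, o in zip(models, offsets)])
        return S.mean(axis=0)

    return score
```

Since only the offset depends on alpha, sweeping alpha over the same fitted models yields nested set estimates.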
Economic Optimization of Fiber Optic Network Design in Anchorage
Presented to the Faculty of the University of Alaska Anchorage
in Partial Fulfillment of the Requirements for the Degree of
MASTER OF SCIENCE, ENGINEERING MANAGEMENT
The wireline telecommunications industry is currently undergoing an evolution. Growing bandwidth demands are putting pressure on the capabilities of outdated copper-based networks. These demands are being met by replacing the copper-based networks with fiber optic networks. Unfortunately, telecommunications decision makers are tasked with figuring out how best to deploy these networks with little ability to plan, organize, lead, or control these large projects.
This project introduces a novel approach to designing fiber optic access networks. By leveraging well-known clustering and routing techniques to produce a sound network design, it helps decision makers better understand how to divide service areas, where to place fiber, and how much fiber should be placed. Combining this output with other typical measures of costs and revenue, the decision maker will also be able to focus on the business areas that will provide the best outcome when undertaking this transformational evolution of physical networks.
Introduction / Background / Clustering, Routing, and the Model / Results and Analysis / Conclusion / References
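As a rough illustration of the clustering-and-routing approach (not the thesis' exact model), the sketch below groups service locations into areas with k-means and approximates the fiber required in each area by a minimum spanning tree linking the area centroid, standing in for a hypothetical cabinet site, to its premises; Euclidean distances are a stand-in for real duct routes.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform
from sklearn.cluster import KMeans

def plan_network(points, n_areas=5):
    """Cluster service locations into serving areas, then estimate the
    fiber length per area with a minimum spanning tree."""
    km = KMeans(n_clusters=n_areas, n_init=10, random_state=0).fit(points)
    plans = []
    for k in range(n_areas):
        cluster = points[km.labels_ == k]
        # Nodes: hypothetical cabinet at the centroid, then the premises.
        nodes = np.vstack([km.cluster_centers_[k], cluster])
        dist = squareform(pdist(nodes))
        mst = minimum_spanning_tree(dist)
        plans.append({"area": k, "premises": len(cluster),
                      "fiber_length": float(mst.sum())})
    return plans
```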
On methods to determine bounds on the Q-factor for a given directivity
This paper revisits and extends the interesting case of bounds on the Q-factor for a given directivity for a small antenna of arbitrary shape. A higher directivity in a small antenna is closely connected with a narrow impedance bandwidth. The relation between bandwidth and a desired directivity is still not fully understood, not even for small antennas. Initial investigations in this direction have related the radius of a circumscribing sphere to the directivity, and bounds on the Q-factor have also been derived for a partial directivity in a given direction. In this paper we derive lower bounds on the Q-factor for a total desired directivity for an arbitrarily shaped antenna in a given direction as a convex problem using semi-definite relaxation (SDR) techniques. We also show that the relaxed solution is a solution of the original problem of determining the lower Q-factor bound for a total desired directivity.
SDR can also be used to relax a class of other interesting non-convex constraints in antenna optimization, such as tuning, losses, and front-to-back ratio. We compare two new methods to determine the lowest Q-factor for arbitrarily shaped antennas for a given total directivity. We also compare our results with full EM simulations of a parasitic element antenna with high directivity.
Comment: Corrects some minor typos in the previous version.
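The core relaxation step can be sketched generically: minimizing a quadratic form x^T A x (stored energy, hence Q) subject to a quadratic directivity-type constraint x^T B x = 1 is non-convex in x, but becomes a semi-definite program in X = x x^T once the rank-one condition is dropped. The matrices below are random placeholders for the method-of-moments operators, and cvxpy with the SCS solver is an assumed toolchain.

```python
import cvxpy as cp
import numpy as np

np.random.seed(0)
n = 8
# Placeholder PSD matrices standing in for the stored-energy ("Q") and
# radiated-power/directivity quadratic forms built from EM simulation data.
M = np.random.randn(n, n); A = M @ M.T + np.eye(n)
M = np.random.randn(n, n); B = M @ M.T + np.eye(n)

X = cp.Variable((n, n), symmetric=True)
prob = cp.Problem(cp.Minimize(cp.trace(A @ X)),
                  [cp.trace(B @ X) == 1, X >> 0])
prob.solve(solver=cp.SCS)

# If the optimizer is (numerically) rank one, its top eigenvector recovers
# an x solving the original non-convex problem, mirroring the tightness
# result claimed in the abstract.
eigvals, _ = np.linalg.eigh(X.value)
print("objective:", prob.value, "top-2 eigenvalues:", eigvals[-2:])
```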
Estimator selection: a new method with applications to kernel density estimation
Estimator selection has become a crucial issue in nonparametric estimation. Two widely used methods are penalized empirical risk minimization (such as penalized log-likelihood estimation) and pairwise comparison (such as Lepski's method). Our aim in this paper is twofold. First, we explain some general ideas about the calibration issue of estimator selection methods. We review some known results, putting the emphasis on the concept of minimal penalty, which is helpful to design data-driven selection criteria. Second, we present a new method for bandwidth selection within the framework of kernel density estimation which is in some sense intermediate between the two main methods mentioned above. We provide some theoretical results which lead to a fully data-driven selection strategy.
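A minimal sketch of such an intermediate rule for Gaussian-kernel density estimation: each candidate bandwidth is compared in L2 distance to the most overfitting estimator (smallest bandwidth), and a penalty proportional to ||K_h||^2/n is added before minimizing. The grid evaluation and the tuning constant lam are assumptions; the paper's exact criterion and the calibration via the minimal penalty differ.

```python
import numpy as np

def kde_on_grid(X, grid, h):
    # Gaussian kernel density estimate evaluated on a fixed grid.
    z = (grid[:, None] - X[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(X) * h * np.sqrt(2 * np.pi))

def select_h(X, grid, bandwidths, lam=1.0):
    h_min = min(bandwidths)
    f_min = kde_on_grid(X, grid, h_min)  # the "overfitting" reference estimator
    dx = grid[1] - grid[0]
    n = len(X)
    crit = {}
    for h in bandwidths:
        f_h = kde_on_grid(X, grid, h)
        dist2 = np.sum((f_h - f_min) ** 2) * dx  # squared L2 distance
        pen = lam / (2 * n * h * np.sqrt(np.pi))  # lam * ||K_h||^2 / n, Gaussian K
        crit[h] = dist2 + pen
    return min(crit, key=crit.get)
```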
Multi-Path Alpha-Fair Resource Allocation at Scale in Distributed Software Defined Networks
The performance of computer networks relies on how bandwidth is shared among
different flows. Fair resource allocation is a challenging problem, particularly when the flows evolve over time. To address this issue, bandwidth sharing techniques that react quickly to traffic fluctuations are of interest, especially in large-scale settings with hundreds of nodes and thousands of
flows. In this context, we propose a distributed algorithm based on the
Alternating Direction Method of Multipliers (ADMM) that tackles the multi-path
fair resource allocation problem in a distributed SDN control architecture. Our
ADMM-based algorithm continuously generates a sequence of resource allocation
solutions converging to the fair allocation while always remaining feasible, a
property that standard primal-dual decomposition methods often lack. Thanks to the distribution of all compute-intensive operations, we demonstrate that we can handle large instances at scale.
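To illustrate the always-feasible property on a deliberately simplified case, here is a toy ADMM for proportional fairness (alpha = 1) of flows sharing a single link of capacity C, rather than the paper's multi-path, multi-link distributed scheme: the x-update is the closed-form prox of -log, and the z-update projects onto the capacity set, so z is a feasible allocation at every iteration.

```python
import numpy as np

def project_capacity(v, C):
    """Euclidean projection onto {z : z >= 0, sum(z) <= C}."""
    v = np.maximum(v, 0.0)
    if v.sum() <= C:
        return v
    # Otherwise project onto the scaled simplex {z >= 0, sum(z) = C}.
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - C
    k = np.nonzero(u - css / (np.arange(len(v)) + 1.0) > 0)[0][-1]
    tau = css[k] / (k + 1.0)
    return np.maximum(v - tau, 0.0)

def admm_prop_fair(n_flows, C, rho=1.0, iters=200):
    """ADMM for: maximize sum(log x) subject to sum(x) <= C."""
    x = np.full(n_flows, C / n_flows)
    z, u = x.copy(), np.zeros(n_flows)
    for _ in range(iters):
        v = z - u
        x = (v + np.sqrt(v**2 + 4.0 / rho)) / 2.0  # closed-form prox of -log
        z = project_capacity(x + u, C)  # z stays feasible at every iteration
        u += x - z
    return z
```

For identical flows the iterates converge to the equal split C/n, the proportionally fair point; e.g. admm_prop_fair(1000, C=10.0).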
The algorithm of noisy k-means
In this note, we introduce a new algorithm to deal with finite-dimensional clustering with errors in variables. The design of this algorithm is based on recent theoretical advances (see Loustau (2013a,b)) in statistical learning with errors in variables. As in the previously mentioned papers, the algorithm mixes different tools from the inverse problem literature and the machine learning community. Coarsely, it is based on a two-step procedure: (1) a deconvolution step to deal with the noisy inputs, and (2) Newton-type iterations, as in the popular k-means algorithm.
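A hedged two-step stand-in for the procedure: step (1) is replaced here by a simple linear Gaussian shrinkage toward the mean with a known noise covariance, not the paper's kernel-deconvolution estimator, and step (2) uses scikit-learn's standard k-means, whose Lloyd updates stand in for the Newton-type iterations.

```python
import numpy as np
from sklearn.cluster import KMeans

def noisy_kmeans(Y, noise_cov, n_clusters=3):
    """Cluster observations Y = X + noise with known noise covariance:
    (1) denoise by linear shrinkage, (2) run standard k-means."""
    mu = Y.mean(axis=0)
    cov_y = np.cov(Y, rowvar=False)
    # Estimated signal covariance; assumed to stay positive semi-definite.
    cov_x = cov_y - noise_cov
    W = cov_x @ np.linalg.inv(cov_y)  # linear shrinkage matrix
    X_hat = mu + (Y - mu) @ W.T       # denoised inputs
    return KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X_hat)
```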