    Bandwidth selection in kernel empirical risk minimization via the gradient

    In this paper, we deal with the data-driven selection of multidimensional and possibly anisotropic bandwidths in the general framework of kernel empirical risk minimization. We propose a universal selection rule, which leads to optimal adaptive results in a large variety of statistical models such as nonparametric robust regression and statistical learning with errors in variables. These results are stated in the context of smooth loss functions, where the gradient of the risk appears as a good criterion to measure the performance of our estimators. The selection rule consists of a comparison of gradient empirical risks. It can be viewed as a nontrivial extension of the so-called Goldenshluger-Lepski method to nonlinear estimators. Furthermore, one main advantage of our selection rule is that it does not depend on the Hessian matrix of the risk, which is usually involved in standard adaptive procedures. Comment: Published at http://dx.doi.org/10.1214/15-AOS1318 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Calibration of One-Class SVM for MV set estimation

    A general approach for anomaly detection or novelty detection consists in estimating high density regions or Minimum Volume (MV) sets. The One-Class Support Vector Machine (OCSVM) is a state-of-the-art algorithm for estimating such regions from high dimensional data. Yet it suffers from practical limitations. When applied to a limited number of samples, it can lead to poor performance even when picking the best hyperparameters. Moreover, the solution of OCSVM is very sensitive to the selection of hyperparameters, which makes it hard to optimize in an unsupervised setting. We present a new approach to estimate MV sets using the OCSVM with a different choice of the parameter controlling the proportion of outliers. The solution function of the OCSVM is learnt on a training set and the desired probability mass is obtained by adjusting the offset on a test set to prevent overfitting. Models learnt on different train/test splits are then aggregated to reduce the variance induced by such random splits. Our approach makes it possible to tune the hyperparameters automatically and obtain nested set estimates. Experimental results show that our approach outperforms the standard OCSVM formulation while suffering less from the curse of dimensionality than kernel density estimates. Results on actual data sets are also presented. Comment: IEEE DSAA' 2015, Oct 2015, Paris, France
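
    The offset calibration and split aggregation described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `fit_centroid_score` is a hypothetical stand-in for a trained OCSVM solution function, the function names and the half/half split are invented for the example, and averaging the offset-corrected scores is only one plausible way to aggregate.

```python
import random


def calibrate_offset(scores, mass):
    """Pick the offset t so that about `mass` of the held-out scores are >= t."""
    if not 0.0 < mass <= 1.0:
        raise ValueError("mass must lie in (0, 1]")
    ordered = sorted(scores, reverse=True)
    k = max(1, min(len(ordered), round(mass * len(ordered))))
    return ordered[k - 1]


def fit_centroid_score(train):
    """Hypothetical stand-in for an OCSVM: higher score means closer to the training mean."""
    center = sum(train) / len(train)
    return lambda x: -abs(x - center)


def fit_calibrated_scorer(data, fit_score, mass, n_splits=5, seed=0):
    """Learn a score on one half, calibrate its offset on the other, average over splits."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_splits):
        shuffled = list(data)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        train, held_out = shuffled[:half], shuffled[half:]
        score = fit_score(train)
        offset = calibrate_offset([score(x) for x in held_out], mass)
        models.append((score, offset))

    def decision(x):
        # Positive values mean x falls inside the aggregated set estimate.
        return sum(score(x) - offset for score, offset in models) / len(models)

    return decision


if __name__ == "__main__":
    rng = random.Random(1)
    sample = [rng.gauss(0.0, 1.0) for _ in range(400)]
    decision = fit_calibrated_scorer(sample, fit_centroid_score, mass=0.9)
    print(decision(0.0) > 0, decision(10.0) > 0)
```

    Because the offset is set on points the score function never saw, the fraction of mass inside the estimated set is not inflated by fitting to the training half. Calling the sketch with increasing values of `mass` gives sets of increasing target mass from the same learnt scores.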

    Economic Optimization of Fiber Optic Network Design in Anchorage

    Presented to the Faculty of the University of Alaska Anchorage in Partial Fulfillment of the Requirements for the Degree of MASTER OF SCIENCE, ENGINEERING MANAGEMENT. The wireline telecommunications industry is currently undergoing an evolution. Growing bandwidth demands are putting pressure on the capabilities of outdated copper-based networks. These demands are being met by replacing these copper-based networks with fiber optic networks. Unfortunately, telecommunications decision makers are tasked with figuring out how best to deploy these networks with little ability to plan, organize, lead, or control these large projects. This project introduces a novel approach to designing fiber optic access networks. By leveraging well-known clustering and routing techniques to produce sound network design, decision makers will better understand how to divide service areas, where to place fiber, and how much fiber should be placed. Combining this output with other typical measures of costs and revenue, the decision maker will also be able to focus on the business areas that will provide the best outcome when undertaking this transformational evolution of physical networks. Introduction / Background / Clustering, Routing, and the Model / Results and Analysis / Conclusion / References

    On methods to determine bounds on the Q-factor for a given directivity

    This paper revisits and extends the interesting case of bounds on the Q-factor for a given directivity for a small antenna of arbitrary shape. A higher directivity in a small antenna is closely connected with a narrow impedance bandwidth. The relation between bandwidth and a desired directivity is still not fully understood, not even for small antennas. Initial investigations in this direction have related the radius of a circumscribing sphere to the directivity, and bounds on the Q-factor have also been derived for a partial directivity in a given direction. In this paper we derive lower bounds on the Q-factor for a total desired directivity for an arbitrarily shaped antenna in a given direction as a convex problem using semi-definite relaxation (SDR) techniques. We also show that the relaxed solution is a solution of the original problem of determining the lower Q-factor bound for a total desired directivity. SDR can also be used to relax a class of other interesting non-convex constraints in antenna optimization such as tuning, losses, and front-to-back ratio. We compare two different new methods to determine the lowest Q-factor for arbitrarily shaped antennas for a given total directivity. We also compare our results with full EM-simulations of a parasitic element antenna with high directivity. Comment: Correct some minor typos in the previous version

    Estimator selection: a new method with applications to kernel density estimation

    Estimator selection has become a crucial issue in nonparametric estimation. Two widely used methods are penalized empirical risk minimization (such as penalized log-likelihood estimation) and pairwise comparison (such as Lepski's method). Our aim in this paper is twofold. First, we explain some general ideas about the calibration issue of estimator selection methods. We review some known results, putting the emphasis on the concept of minimal penalty, which is helpful to design data-driven selection criteria. Secondly, we present a new method for bandwidth selection within the framework of kernel density estimation, which is in some sense intermediate between the two main methods mentioned above. We provide some theoretical results which lead to a fully data-driven selection strategy.

    Multi-Path Alpha-Fair Resource Allocation at Scale in Distributed Software Defined Networks

    The performance of computer networks relies on how bandwidth is shared among different flows. Fair resource allocation is a challenging problem, particularly when the flows evolve over time. To address this issue, bandwidth sharing techniques that quickly react to traffic fluctuations are of interest, especially in large scale settings with hundreds of nodes and thousands of flows. In this context, we propose a distributed algorithm based on the Alternating Direction Method of Multipliers (ADMM) that tackles the multi-path fair resource allocation problem in a distributed SDN control architecture. Our ADMM-based algorithm continuously generates a sequence of resource allocation solutions converging to the fair allocation while always remaining feasible, a property that standard primal-dual decomposition methods often lack. Thanks to the distribution of all computationally intensive operations, we demonstrate that we can handle large instances at scale.
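
    For readers unfamiliar with the fairness criterion named in the title, the sketch below evaluates the standard alpha-fair utility: x^(1 - alpha) / (1 - alpha) for alpha different from 1, and log(x) for alpha equal to 1. The definition is the usual textbook one and is not quoted from this listing; the function names are assumptions made for the example, and the ADMM algorithm of the paper is not reproduced here.

```python
import math


def alpha_fair_utility(rate, alpha):
    """Standard alpha-fair utility of a single flow with a strictly positive rate."""
    if rate <= 0:
        raise ValueError("rate must be strictly positive")
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    if alpha == 1:
        # Limit case alpha -> 1: proportional fairness.
        return math.log(rate)
    return rate ** (1 - alpha) / (1 - alpha)


def total_utility(rates, alpha):
    """Objective maximized by an alpha-fair allocation: the sum of per-flow utilities."""
    return sum(alpha_fair_utility(r, alpha) for r in rates)


if __name__ == "__main__":
    even, skewed = [1.0, 1.0], [1.9, 0.1]
    for alpha in (0, 1, 2):
        print(alpha, total_utility(even, alpha), total_utility(skewed, alpha))
```

    With alpha equal to 0 the objective is plain throughput, so the two allocations of the same total rate score essentially the same. Larger values of alpha penalize the starved flow more and more heavily, which is why the even split is preferred for alpha equal to 1 or 2.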

    The algorithm of noisy k-means

    In this note, we introduce a new algorithm to deal with finite dimensional clustering with errors in variables. The design of this algorithm is based on recent theoretical advances (see Loustau (2013a,b)) in statistical learning with errors in variables. As in the previously mentioned papers, the algorithm mixes different tools from the inverse problem literature and the machine learning community. Roughly, it is based on a two-step procedure: (1) a deconvolution step to deal with noisy inputs and (2) Newton's iterations, as in the popular k-means.
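
    As a point of reference for step (2), the sketch below runs the iterations of ordinary k-means (Lloyd's algorithm) on one-dimensional data. It is plain k-means, not the noisy k-means algorithm of the note: the deconvolution step (1) is omitted entirely, and the function name and the choice of initial centers are assumptions made for the example.

```python
def kmeans_1d(points, centers, n_iter=20):
    """Ordinary Lloyd iterations on 1-D data; returns the final list of centers."""
    centers = list(centers)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda j: (p - centers[j]) ** 2)
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster (empty clusters stay put).
        new_centers = [
            sum(c) / len(c) if c else centers[j] for j, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


if __name__ == "__main__":
    data = [0.0, 0.2, 0.1, 5.0, 5.2, 5.1]
    print(kmeans_1d(data, centers=[0.0, 1.0]))
```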