40,000 research outputs found

    Online Local Learning via Semidefinite Programming

    Full text link
    In many online learning problems we are interested in predicting local information about some universe of items. For example, we may want to know whether two items are in the same cluster rather than computing an assignment of items to clusters; we may want to know which of two teams will win a game rather than computing a ranking of teams. Although finding the optimal clustering or ranking is typically intractable, it may be possible to predict the relationships between items as well as if you could solve the global optimization problem exactly. Formally, we consider an online learning problem in which a learner repeatedly guesses a pair of labels (l(x), l(y)) and receives an adversarial payoff depending on those labels. The learner's goal is to receive a payoff nearly as good as the best fixed labeling of the items. We show that a simple algorithm based on semidefinite programming can obtain asymptotically optimal regret in the case where the number of possible labels is O(1), resolving an open problem posed by Hazan, Kale, and Shalev-Schwartz. Our main technical contribution is a novel use and analysis of the log determinant regularizer, exploiting the observation that log det(A + I) upper bounds the entropy of any distribution with covariance matrix A.Comment: 10 page

    Estimation of the Rate-Distortion Function

    Full text link
    Motivated by questions in lossy data compression and by theoretical considerations, we examine the problem of estimating the rate-distortion function of an unknown (not necessarily discrete-valued) source from empirical data. Our focus is the behavior of the so-called "plug-in" estimator, which is simply the rate-distortion function of the empirical distribution of the observed data. Sufficient conditions are given for its consistency, and examples are provided to demonstrate that in certain cases it fails to converge to the true rate-distortion function. The analysis of its performance is complicated by the fact that the rate-distortion function is not continuous in the source distribution; the underlying mathematical problem is closely related to the classical problem of establishing the consistency of maximum likelihood estimators. General consistency results are given for the plug-in estimator applied to a broad class of sources, including all stationary and ergodic ones. A more general class of estimation problems is also considered, arising in the context of lossy data compression when the allowed class of coding distributions is restricted; analogous results are developed for the plug-in estimator in that case. Finally, consistency theorems are formulated for modified (e.g., penalized) versions of the plug-in, and for estimating the optimal reproduction distribution.Comment: 18 pages, no figures [v2: removed an example with an error; corrected typos; a shortened version will appear in IEEE Trans. Inform. Theory

    An informational approach to the global optimization of expensive-to-evaluate functions

    Full text link
    In many global optimization problems motivated by engineering applications, the number of function evaluations is severely limited by time or cost. To ensure that each evaluation contributes to the localization of good candidates for the role of global minimizer, a sequential choice of evaluation points is usually carried out. In particular, when Kriging is used to interpolate past evaluations, the uncertainty associated with the lack of information on the function can be expressed and used to compute a number of criteria accounting for the interest of an additional evaluation at any given point. This paper introduces minimizer entropy as a new Kriging-based criterion for the sequential choice of points at which the function should be evaluated. Based on \emph{stepwise uncertainty reduction}, it accounts for the informational gain on the minimizer expected from a new evaluation. The criterion is approximated using conditional simulations of the Gaussian process model behind Kriging, and then inserted into an algorithm similar in spirit to the \emph{Efficient Global Optimization} (EGO) algorithm. An empirical comparison is carried out between our criterion and \emph{expected improvement}, one of the reference criteria in the literature. Experimental results indicate major evaluation savings over EGO. Finally, the method, which we call IAGO (for Informational Approach to Global Optimization) is extended to robust optimization problems, where both the factors to be tuned and the function evaluations are corrupted by noise.Comment: Accepted for publication in the Journal of Global Optimization (This is the revised version, with additional details on computational problems, and some grammatical changes

    A variational approach to path estimation and parameter inference of hidden diffusion processes

    Full text link
    We consider a hidden Markov model, where the signal process, given by a diffusion, is only indirectly observed through some noisy measurements. The article develops a variational method for approximating the hidden states of the signal process given the full set of observations. This, in particular, leads to systematic approximations of the smoothing densities of the signal process. The paper then demonstrates how an efficient inference scheme, based on this variational approach to the approximation of the hidden states, can be designed to estimate the unknown parameters of stochastic differential equations. Two examples at the end illustrate the efficacy and the accuracy of the presented method.Comment: 37 pages, 2 figures, revise
    • …
    corecore