    Four lectures on probabilistic methods for data science

    Methods of high-dimensional probability play a central role in applications in statistics, signal processing, theoretical computer science and related fields. These lectures present a sample of particularly useful tools of high-dimensional probability, focusing on the classical and matrix Bernstein inequalities and the uniform matrix deviation inequality. We illustrate these tools with applications to dimension reduction, network analysis, covariance estimation, matrix completion and sparse signal recovery. The lectures are geared towards beginning graduate students who have taken a rigorous course in probability but may not have any experience in data science applications. Comment: Lectures given at 2016 PCMI Graduate Summer School in Mathematics of Data. Some typos, inaccuracies fixed.
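
    For orientation, the two Bernstein inequalities named above are commonly stated as follows. This is a sketch in one standard form; the symbols $X_i$, $K$, $\sigma^2$, $t$ and $n$ are notation assumed here, and the lectures themselves may use a different normalization or constants.

```latex
% Classical Bernstein inequality (one common form).
% Assumptions: X_1, ..., X_N are independent, mean-zero random variables
% with |X_i| <= K almost surely, and sigma^2 = sum_i E[X_i^2].
\[
  \mathbb{P}\Bigl\{ \Bigl| \sum_{i=1}^{N} X_i \Bigr| \ge t \Bigr\}
  \le 2 \exp\Bigl( - \frac{t^2/2}{\sigma^2 + K t / 3} \Bigr),
  \qquad t \ge 0 .
\]

% Matrix Bernstein inequality (one common form).
% Assumptions: X_1, ..., X_N are independent, mean-zero, symmetric
% n x n random matrices with \|X_i\| <= K almost surely (operator norm),
% and sigma^2 = \| sum_i E[X_i^2] \|.
\[
  \mathbb{P}\Bigl\{ \Bigl\| \sum_{i=1}^{N} X_i \Bigr\| \ge t \Bigr\}
  \le 2 n \exp\Bigl( - \frac{t^2/2}{\sigma^2 + K t / 3} \Bigr),
  \qquad t \ge 0 .
\]
```

    The only structural difference between the two bounds in this form is the dimensional factor $n$ in front of the exponential.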

    09111 Abstracts Collection -- Computational Geometry

    From March 8 to March 13, 2009, the Dagstuhl Seminar 09111 "Computational Geometry" was held in Schloss Dagstuhl - Leibniz Center for Informatics. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

    Combinatorial and Additive Number Theory Problem Sessions: '09--'19

    These notes are a summary of the problem session discussions at various CANT (Combinatorial and Additive Number Theory) conferences. Currently they include all years from 2009 through 2019 (inclusive); the goal is to supplement this file each year. These additions will include the problem session notes from that year, and occasionally discussions of progress on previous problems. If you are interested in pursuing any of these problems and want additional information on progress, please email the author. See http://www.theoryofnumbers.com/ for the conference homepage. Comment: Version 3.4, 58 pages, 2 figures; added 2019 problems on 5/31/2019, fixed a few issues from some presenters 6/29/201

    Mod-phi convergence I: Normality zones and precise deviations

    In this paper, we use the framework of mod-$\phi$ convergence to prove precise large or moderate deviations for quite general sequences of real valued random variables $(X_{n})_{n \in \mathbb{N}}$, which can be lattice or non-lattice distributed. We establish precise estimates of the fluctuations $P[X_{n} \in t_{n}B]$, instead of the usual estimates for the rate of exponential decay $\log(P[X_{n} \in t_{n}B])$. Our approach provides us with a systematic way to characterise the normality zone, that is the zone in which the Gaussian approximation for the tails is still valid. Besides, the residue function measures the extent to which this approximation fails to hold at the edge of the normality zone. The first sections of the article are devoted to a proof of these abstract results and comparisons with existing results. We then propose new examples covered by this theory and coming from various areas of mathematics: classical probability theory, number theory (statistics of additive arithmetic functions), combinatorics (statistics of random permutations), random matrix theory (characteristic polynomials of random matrices in compact Lie groups), graph theory (number of subgraphs in a random Erdős-Rényi graph), and non-commutative probability theory (asymptotics of random character values of symmetric groups). In particular, we complete our theory of precise deviations by a concrete method of cumulants and dependency graphs, which applies to many examples of sums of "weakly dependent" random variables. The large number as well as the variety of examples hint at a universality class for second order fluctuations. Comment: 103 pages. New (final) version: multiple small improvements; a new section on mod-Gaussian convergence coming from the factorization of the generating function; the multi-dimensional results have been moved to a forthcoming paper; and the introduction has been reworked.

    Efficient algorithms for optimization problems involving semi-algebraic range searching

    We present a general technique, based on parametric search with a twist, for solving a variety of optimization problems on a set of semi-algebraic geometric objects of constant complexity. The common feature of these problems is that they involve a `growth parameter' $r$ and a semi-algebraic predicate $\Pi(o,o';r)$ of constant complexity on pairs of input objects, which depends on $r$ and is monotone in $r$. One then defines a graph $G(r)$ whose edges are all the pairs $(o,o')$ for which $\Pi(o,o';r)$ is true, and seeks the smallest value of $r$ for which some monotone property holds for $G(r)$. Problems that fit into this context include (i) the reverse shortest path problem in unit-disk graphs, recently studied by Wang and Zhao, (ii) the same problem for weighted unit-disk graphs, with a decision procedure recently provided by Wang and Xue, (iii) extensions of these problems to three and higher dimensions, (iv) the discrete Fréchet distance with one-sided shortcuts in higher dimensions, extending the study by Ben Avraham et al., (v) perfect matchings in intersection graphs: given, e.g., a set of fat ellipses of roughly the same size, find the smallest value $r$ such that if we expand each of the ellipses by $r$, the resulting intersection graph contains a perfect matching, (vi) generalized distance selection problems: given, e.g., a set of disjoint segments, find the $k$'th smallest distance among the pairwise distances determined by the segments, for a given (sufficiently small but superlinear) parameter $k$, and (vii) the maximum-height independent towers problem, in which we want to erect vertical towers of maximum height over a 1.5-dimensional terrain so that no pair of tower tips are mutually visible. We obtain significantly improved solutions for problems (i), (ii) and (vi), and new efficient solutions to the other problems. Comment: Significantly generalized and with additional applications. Notice the change in title.
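
    The problem template in this abstract (a growth parameter $r$, a graph $G(r)$, and the smallest $r$ at which a monotone property holds) can be illustrated with a deliberately naive sketch. It is not the paper's technique: it swaps parametric search and semi-algebraic range searching for a plain binary search over all pairwise distances with a quadratic-time decision procedure. The objects (points in the plane), the predicate (distance at most $r$), the property (connectivity), and the function names `is_connected` and `smallest_r` are all assumptions chosen for illustration.

```python
import math
from itertools import combinations


def is_connected(points, r):
    """Decision procedure: is G(r) connected, where (p, q) is an edge
    whenever dist(p, q) <= r? Brute force over all pairs with union-find."""
    n = len(points)
    parent = list(range(n))

    def find(x):
        # Find the component root, halving paths along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    for i, j in combinations(range(n), 2):
        if math.dist(points[i], points[j]) <= r:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[ri] = rj
                components -= 1
    return components <= 1


def smallest_r(points):
    """Smallest r for which G(r) is connected.

    Connectivity is monotone in r, and the optimum is always one of the
    pairwise distances, so a binary search over the sorted candidate
    distances suffices."""
    candidates = sorted({math.dist(p, q) for p, q in combinations(points, 2)})
    if not candidates:  # zero or one point: trivially connected
        return 0.0
    lo, hi = 0, len(candidates) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_connected(points, candidates[mid]):
            hi = mid
        else:
            lo = mid + 1
    return candidates[lo]


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
    print(smallest_r(pts))
```

    Enumerating all candidate values of $r$ costs quadratic time and space in this sketch, which is the kind of overhead that more refined search techniques are designed to avoid.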

    Synchronization in Complex Networks Under Uncertainty

    Synchronization in networks is the music of complex systems. Collective rhythms emerging from many interacting oscillators appear across all scales of nature, from the steady heartbeat and the recurrent patterns in neuronal activity to the decentralized synchrony in power grids. The mathematics behind these processes is solid and has advanced significantly of late, especially in the mean-field problem, where all oscillators are mutually connected. However, real networks have complex interactions that hinder analytical treatment. A general framework is missing, and most existing results rely on numerical and spectral black boxes that hinder interpretation. Also, the information obtained from measurements is usually incomplete. Motivated by these limitations, in this thesis we propose a theoretical study of network-coupled oscillators under uncertainty. We apply error propagation to predict how a complex structure amplifies noise from the link weights to the synchronization onset, study the effect of balancing pairwise and higher-order interactions in synchrony optimization, and derive weight-tuning schemes to map the synchronization behavior of different structures. Also, we develop a rigorous geometric unfolding of the synchronized state to tackle decentralized scenarios and to discover optimal local rules that induce global abrupt transitions. Last, we suggest spectral shortcuts to predict critical points using linear algebra and network representations with limited information. Overall, we provide analytical tools to deal with oscillator networks under noisy conditions and prove that mechanistic explanations were hidden behind the prevalent assumptions of complete information. Relevant findings include particular networks that maximize the range of behaviors and the successful unfolding of the structure-dynamics interplay from a local perspective. This thesis advances the quest for a general theory of network synchronization built from mechanistic and geometric principles, a key missing piece in the analysis, design and control of biological and artificial neural networks and complex engineering systems.

    A new conceptual approach for systematic error correction in CNC machine tools minimizing worst case prediction error

    A new artifact-based method is presented for identifying the systematic errors in multi-axis CNC machine tools while minimizing the worst case prediction error. The closed loop volumetric error is identified by simultaneously moving the axes of the machine tool. The physical artifact is manufactured on the machine tool and later measured on a coordinate measuring machine. The artifact consists of a set of holes in the machine tool workspace at locations that minimize the worst case prediction error for a given bounded measurement error. The number of holes to be drilled depends on the degree of the polynomials used to model the systematic error and the number of axes of the machine tool. The prediction error is also a function of the number and location of the holes. The feasibility of the method is first investigated for a two-axis machine to find the best experimental setting. Finally, based on the two-axis case study, we extend the results to machine tools with any number of axes. The obtained results are very promising and require only a short time to produce the artifact.