
    Sharp Bounds for Generalized Uniformity Testing

    We study the problem of generalized uniformity testing \cite{BC17} of a discrete probability distribution: Given samples from a probability distribution $p$ over an {\em unknown} discrete domain $\mathbf{\Omega}$, we want to distinguish, with probability at least $2/3$, between the case that $p$ is uniform on some {\em subset} of $\mathbf{\Omega}$ versus $\epsilon$-far, in total variation distance, from any such uniform distribution. We establish tight bounds on the sample complexity of generalized uniformity testing. In more detail, we present a computationally efficient tester whose sample complexity is optimal, up to constant factors, and a matching information-theoretic lower bound. Specifically, we show that the sample complexity of generalized uniformity testing is $\Theta\left(1/(\epsilon^{4/3}\|p\|_3) + 1/(\epsilon^{2} \|p\|_2) \right)$.
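To get a feel for the bound stated in this abstract, the expression inside the $\Theta(\cdot)$ can be evaluated for a concrete distribution. The sketch below is only an illustration: the function names `lp_norm` and `sample_bound` are hypothetical, the hidden constant factors are ignored, and it does not implement the tester itself.

```python
import math


def lp_norm(p, k):
    """Return the l_k norm of the probability vector p."""
    return sum(abs(x) ** k for x in p) ** (1.0 / k)


def sample_bound(p, eps):
    """Evaluate 1/(eps^(4/3) * ||p||_3) + 1/(eps^2 * ||p||_2).

    This is the expression inside Theta(.) from the abstract, with
    constant factors dropped (an assumption made for illustration).
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie in (0, 1)")
    if not math.isclose(sum(p), 1.0, rel_tol=1e-9):
        raise ValueError("p must sum to 1")
    term_l3 = 1.0 / (eps ** (4.0 / 3.0) * lp_norm(p, 3))
    term_l2 = 1.0 / (eps ** 2 * lp_norm(p, 2))
    return term_l3 + term_l2


if __name__ == "__main__":
    n = 1000
    uniform = [1.0 / n] * n
    print(sample_bound(uniform, eps=0.1))
```

For a distribution that is uniform on $n$ elements, $\|p\|_2 = n^{-1/2}$ and $\|p\|_3 = n^{-2/3}$, so the expression reduces to $n^{2/3}/\epsilon^{4/3} + n^{1/2}/\epsilon^{2}$.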

    A summary of the endemic beetle genera of the West Indies (Insecta: Coleoptera); bioindicators of the evolutionary richness of this Neotropical archipelago

    The Caribbean Islands (or the West Indies) are recognized as one of the leading global biodiversity hot spots. This is based on data on species, genus, and family diversity for vascular plants and non-marine vertebrates. This paper presents data on genus-level endemicity for the most speciose (but less well publicised) group of terrestrial animals: the beetles, with 205 genera (in 25 families) now recognized as being endemic (restricted) to the West Indies. The predominant families with endemic genera are Cerambycidae (41), Chrysomelidae (28), Curculionidae (26), and Staphylinidae (25). This high level of beetle generic endemicity can be extrapolated to suggest that a total of about 700 genera of all insects could be endemic to the West Indies. This far surpasses the total of 269 endemic genera of all plants and non-marine vertebrates, and reinforces the biodiversity richness of the insect fauna of the West Indies.

    List-Decodable Robust Mean Estimation and Learning Mixtures of Spherical Gaussians

    We study the problem of list-decodable Gaussian mean estimation and the related problem of learning mixtures of separated spherical Gaussians. We develop a set of techniques that yield new efficient algorithms with significantly improved guarantees for these problems. {\bf List-Decodable Mean Estimation.} Fix any $d \in \mathbb{Z}_+$ and $0 < \alpha < 1/2$. We design an algorithm with runtime $O(\mathrm{poly}(n/\alpha)^{d})$ that outputs a list of $O(1/\alpha)$ many candidate vectors such that with high probability one of the candidates is within $\ell_2$-distance $O(\alpha^{-1/(2d)})$ from the true mean. The only previous algorithm for this problem achieved error $\tilde O(\alpha^{-1/2})$ under second moment conditions. For $d = O(1/\epsilon)$, our algorithm runs in polynomial time and achieves error $O(\alpha^{\epsilon})$. We also give a Statistical Query lower bound suggesting that the complexity of our algorithm is qualitatively close to best possible. {\bf Learning Mixtures of Spherical Gaussians.} We give a learning algorithm for mixtures of spherical Gaussians that succeeds under significantly weaker separation assumptions compared to prior work. For the prototypical case of a uniform mixture of $k$ identity covariance Gaussians we obtain: For any $\epsilon > 0$, if the pairwise separation between the means is at least $\Omega(k^{\epsilon}+\sqrt{\log(1/\delta)})$, our algorithm learns the unknown parameters within accuracy $\delta$ with sample complexity and running time $\mathrm{poly}(n, 1/\delta, (k/\epsilon)^{1/\epsilon})$. The previously best known polynomial time algorithm required separation at least $k^{1/4} \mathrm{polylog}(k/\delta)$. Our main technical contribution is a new technique, using degree-$d$ multivariate polynomials, to remove outliers from high-dimensional datasets where the majority of the points are corrupted.

    Building Blocks for Subleading Helicity Operators

    On-shell helicity methods provide powerful tools for determining scattering amplitudes, which have a one-to-one correspondence with leading power helicity operators in the Soft-Collinear Effective Theory (SCET) away from singular regions of phase space. We show that helicity-based operators are also useful for enumerating power-suppressed SCET operators, which encode subleading amplitude information about singular limits. In particular, we present a complete set of scalar helicity building blocks that are valid for constructing operators at any order in the SCET power expansion. We also describe an interesting angular momentum selection rule that restricts how these building blocks can be assembled. Comment: 22 pages without references, 2 figures. v2: Updated minor typo in Table

    Robust Learning of Fixed-Structure Bayesian Networks

    We investigate the problem of learning Bayesian networks in a robust model where an $\epsilon$-fraction of the samples are adversarially corrupted. In this work, we study the fully observable discrete case where the structure of the network is given. Even in this basic setting, previous learning algorithms either run in exponential time or lose dimension-dependent factors in their error guarantees. We provide the first computationally efficient robust learning algorithm for this problem with dimension-independent error guarantees. Our algorithm has near-optimal sample complexity, runs in polynomial time, and achieves error that scales nearly-linearly with the fraction of adversarially corrupted samples. Finally, we show on both synthetic and semi-synthetic data that our algorithm performs well in practice.

    Learning from experience: students in the international baccalaureate in natural science program are in Ecuador

    Published as vol. 23, no. 3, spring 2010 of the journal Pédagogie collégiale