1,528 research outputs found

    Decontamination of Mutual Contamination Models

    Full text link
    Many machine learning problems can be characterized by mutual contamination models. In these problems, one observes several random samples from different convex combinations of a set of unknown base distributions and the goal is to infer these base distributions. This paper considers the general setting where the base distributions are defined on arbitrary probability spaces. We examine three popular machine learning problems that arise in this general setting: multiclass classification with label noise, demixing of mixed membership models, and classification with partial labels. In each case, we give sufficient conditions for identifiability and present algorithms for the infinite and finite sample settings, with associated performance guarantees.Comment: Published in JMLR. Subsumes arXiv:1602.0623

    Curriculum Guidelines for Undergraduate Programs in Data Science

    Get PDF
    The Park City Math Institute (PCMI) 2016 Summer Undergraduate Faculty Program met for the purpose of composing guidelines for undergraduate programs in Data Science. The group consisted of 25 undergraduate faculty from a variety of institutions in the U.S., primarily from the disciplines of mathematics, statistics and computer science. These guidelines are meant to provide some structure for institutions planning for or revising a major in Data Science

    The LifeV library: engineering mathematics beyond the proof of concept

    Get PDF
    LifeV is a library for the finite element (FE) solution of partial differential equations in one, two, and three dimensions. It is written in C++ and designed to run on diverse parallel architectures, including cloud and high performance computing facilities. In spite of its academic research nature, meaning a library for the development and testing of new methods, one distinguishing feature of LifeV is its use on real world problems and it is intended to provide a tool for many engineering applications. It has been actually used in computational hemodynamics, including cardiac mechanics and fluid-structure interaction problems, in porous media, ice sheets dynamics for both forward and inverse problems. In this paper we give a short overview of the features of LifeV and its coding paradigms on simple problems. The main focus is on the parallel environment which is mainly driven by domain decomposition methods and based on external libraries such as MPI, the Trilinos project, HDF5 and ParMetis. Dedicated to the memory of Fausto Saleri.Comment: Review of the LifeV Finite Element librar

    Domain decomposition methods for the parallel computation of reacting flows

    Get PDF
    Domain decomposition is a natural route to parallel computing for partial differential equation solvers. Subdomains of which the original domain of definition is comprised are assigned to independent processors at the price of periodic coordination between processors to compute global parameters and maintain the requisite degree of continuity of the solution at the subdomain interfaces. In the domain-decomposed solution of steady multidimensional systems of PDEs by finite difference methods using a pseudo-transient version of Newton iteration, the only portion of the computation which generally stands in the way of efficient parallelization is the solution of the large, sparse linear systems arising at each Newton step. For some Jacobian matrices drawn from an actual two-dimensional reacting flow problem, comparisons are made between relaxation-based linear solvers and also preconditioned iterative methods of Conjugate Gradient and Chebyshev type, focusing attention on both iteration count and global inner product count. The generalized minimum residual method with block-ILU preconditioning is judged the best serial method among those considered, and parallel numerical experiments on the Encore Multimax demonstrate for it approximately 10-fold speedup on 16 processors
    • …
    corecore