13,488 research outputs found

    Distributed multinomial regression

    Full text link
    This article introduces a model-based approach to distributed computing for multinomial logistic (softmax) regression. We treat counts for each response category as independent Poisson regressions via plug-in estimates for fixed effects shared across categories. The work is driven by the high-dimensional-response multinomial models that are used in analysis of a large number of random counts. Our motivating applications are in text analysis, where documents are tokenized and the token counts are modeled as arising from a multinomial dependent upon document attributes. We estimate such models for a publicly available data set of reviews from Yelp, with text regressed onto a large set of explanatory variables (user, business, and rating information). The fitted models serve as a basis for exploring the connection between words and variables of interest, for reducing dimension into supervised factor scores, and for prediction. We argue that the approach herein provides an attractive option for social scientists and other text analysts who wish to bring familiar regression tools to bear on text data.Comment: Published at http://dx.doi.org/10.1214/15-AOAS831 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    A modular and interactive OLED-based lighting system

    Get PDF
    The concept of a flexible, large-area, organic light emitting diode (OLED)-based lighting system with a modular structure and built-in intelligent light management is introduced. Such a flexible, thin, portable lighting system with discreetly integrated electronics is important in order to allow the implementation of the lighting system into a variety of places, such as cars and temporary expedition areas. A modular construction of an OLED lighting panel makes it possible to control each OLED cell individually. This not only enables us to counteract aging or degradation effects in the OLED cells but it also allows individual OLED module brightness control to support human or ambient interaction based on integrated or centralized sensors. Moreover, integrating the driving electronics in the backplane of an OLED module improves the energy efficiency of operating large OLED panels. The thin, modular construction and individual, dynamic control are successfully demonstrated

    How proofs are prepared at Camelot

    Full text link
    We study a design framework for robust, independently verifiable, and workload-balanced distributed algorithms working on a common input. An algorithm based on the framework is essentially a distributed encoding procedure for a Reed--Solomon code, which enables (a) robustness against byzantine failures with intrinsic error-correction and identification of failed nodes, and (b) independent randomized verification to check the entire computation for correctness, which takes essentially no more resources than each node individually contributes to the computation. The framework builds on recent Merlin--Arthur proofs of batch evaluation of Williams~[{\em Electron.\ Colloq.\ Comput.\ Complexity}, Report TR16-002, January 2016] with the observation that {\em Merlin's magic is not needed} for batch evaluation---mere Knights can prepare the proof, in parallel, and with intrinsic error-correction. The contribution of this paper is to show that in many cases the verifiable batch evaluation framework admits algorithms that match in total resource consumption the best known sequential algorithm for solving the problem. As our main result, we show that the kk-cliques in an nn-vertex graph can be counted {\em and} verified in per-node O(n(ω+ϵ)k/6)O(n^{(\omega+\epsilon)k/6}) time and space on O(n(ω+ϵ)k/6)O(n^{(\omega+\epsilon)k/6}) compute nodes, for any constant ϵ>0\epsilon>0 and positive integer kk divisible by 66, where 2ω<2.37286392\leq\omega<2.3728639 is the exponent of matrix multiplication. This matches in total running time the best known sequential algorithm, due to Ne{\v{s}}et{\v{r}}il and Poljak [{\em Comment.~Math.~Univ.~Carolin.}~26 (1985) 415--419], and considerably improves its space usage and parallelizability. Further results include novel algorithms for counting triangles in sparse graphs, computing the chromatic polynomial of a graph, and computing the Tutte polynomial of a graph.Comment: 42 p

    Early Accurate Results for Advanced Analytics on MapReduce

    Full text link
    Approximate results based on samples often provide the only way in which advanced analytical applications on very massive data sets can satisfy their time and resource constraints. Unfortunately, methods and tools for the computation of accurate early results are currently not supported in MapReduce-oriented systems although these are intended for `big data'. Therefore, we proposed and implemented a non-parametric extension of Hadoop which allows the incremental computation of early results for arbitrary work-flows, along with reliable on-line estimates of the degree of accuracy achieved so far in the computation. These estimates are based on a technique called bootstrapping that has been widely employed in statistics and can be applied to arbitrary functions and data distributions. In this paper, we describe our Early Accurate Result Library (EARL) for Hadoop that was designed to minimize the changes required to the MapReduce framework. Various tests of EARL of Hadoop are presented to characterize the frequent situations where EARL can provide major speed-ups over the current version of Hadoop.Comment: VLDB201

    Monte Carlo Radiative Transfer

    Get PDF
    I outline methods for calculating the solution of Monte Carlo Radiative Transfer (MCRT) in scattering, absorption and emission processes of dust and gas, including polarization. I provide a bibliography of relevant papers on methods with astrophysical applications.Comment: To appear in the Chandra Centennial issue of the Bulletin of the Astronomical Society of India, volume 39 (2011), eds D.J. Saikia and Virginia Trimble; 27 pages, 1 figur
    corecore