33 research outputs found

    Density of States for a Specified Correlation Function and the Energy Landscape

    Full text link
    The degeneracy of two-phase disordered microstructures consistent with a specified correlation function is analyzed by mapping it to a ground-state degeneracy. We determine for the first time the associated density of states via a Monte Carlo algorithm. Our results are described in terms of the roughness of the energy landscape, defined on a hypercubic configuration space. The use of a Hamming distance in this space enables us to define a roughness metric, which is calculated from the correlation function alone and related quantitatively to the structural degeneracy. This relation is validated for a wide variety of disordered systems. Comment: Accepted for publication in Physical Review Letters.
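
    The mapping can be made concrete with a minimal sketch; the 1-D periodic setting and all names below are illustrative assumptions, not the paper's actual setup. The squared deviation of a trial configuration's two-point correlation function from a specified target serves as the energy, so every configuration consistent with the target is a degenerate ground state, and a Hamming distance compares corners of the hypercubic configuration space.

```python
import numpy as np

def s2(config):
    """Two-point correlation S2(r) of a periodic 1-D binary configuration,
    estimated for every separation r via the FFT autocorrelation."""
    n = config.size
    f = np.fft.fft(config)
    return np.real(np.fft.ifft(f * np.conj(f))) / n

def energy(config, target_s2):
    """Squared deviation from the target correlation function: every
    configuration matching the target is a degenerate ground state (E = 0)."""
    return np.sum((s2(config) - target_s2) ** 2)

def hamming(a, b):
    """Hamming distance between two corners of the hypercubic configuration space."""
    return int(np.count_nonzero(a != b))

rng = np.random.default_rng(0)
reference = rng.integers(0, 2, 64)     # a reference two-phase microstructure
target = s2(reference)
trial = rng.permutation(reference)     # a trial state at the same volume fraction
print(energy(trial, target), hamming(reference, trial))
```

    A density-of-states estimator (e.g. a Wang-Landau-style Monte Carlo walk) would then be run over this energy; the sketch stops at the energy and metric definitions.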

    Flexible constrained sampling with guarantees for pattern mining

    Get PDF
    Pattern sampling has been proposed as a potential solution to the infamous pattern explosion. Instead of enumerating all patterns that satisfy the constraints, individual patterns are sampled proportionally to a given quality measure. Several sampling algorithms have been proposed, but each has its limitations when it comes to 1) flexibility in terms of the quality measures and constraints that can be used, and/or 2) guarantees with respect to sampling accuracy. We therefore present Flexics, the first flexible pattern sampler that supports a broad class of quality measures and constraints while providing strong guarantees regarding sampling accuracy. To achieve this, we leverage the perspective on pattern mining as a constraint satisfaction problem and build upon the latest advances in sampling solutions in SAT as well as existing pattern mining algorithms. Furthermore, the proposed algorithm is applicable to a variety of pattern languages, which allows us to introduce and tackle the novel task of sampling sets of patterns. We introduce and empirically evaluate two variants of Flexics: 1) a generic variant that addresses the well-known itemset sampling task and the novel pattern set sampling task, as well as a wide range of expressive constraints within these tasks, and 2) a specialized variant that exploits existing frequent itemset techniques to achieve substantial speed-ups. Experiments show that Flexics is both accurate and efficient, making it a useful tool for pattern-based data exploration. Comment: Accepted for publication in the Data Mining and Knowledge Discovery journal (ECML/PKDD 2017 journal track).
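
    To make the sampling semantics explicit, the sketch below implements a naive exact sampler over a toy database: patterns that satisfy a minimum-support constraint are drawn with probability proportional to their support. Flexics provides the same distributional guarantee without the exhaustive enumeration, via hashing-based sampling of SAT solutions; the dataset, the names, and the choice of support as the quality measure are all illustrative assumptions.

```python
from itertools import combinations
import random

# Toy transaction database; Flexics targets databases where enumerating
# all patterns, as done below, is intractable.
transactions = [{'a', 'b', 'c'}, {'a', 'b'}, {'b', 'c'}, {'a', 'c'}, {'a', 'b', 'c'}]
items = sorted(set().union(*transactions))

def support(itemset):
    """Quality measure: the number of transactions containing the itemset."""
    return sum(itemset <= t for t in transactions)

# Constraint: minimum support of 2. Enumerate the feasible patterns...
min_sup = 2
patterns = [frozenset(c)
            for r in range(1, len(items) + 1)
            for c in combinations(items, r)
            if support(frozenset(c)) >= min_sup]

# ...and draw patterns with probability proportional to their quality.
draws = random.choices(patterns, weights=[support(p) for p in patterns], k=5)
print(draws)
```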

    Neural Network Compression for Noisy Storage Devices

    Full text link
    Compression and efficient storage of neural network (NN) parameters is critical for applications that run on resource-constrained devices. Although NN model compression has made significant progress, there has been considerably less investigation into the actual physical storage of NN parameters. Conventionally, model compression and physical storage are decoupled, as digital storage media with error-correcting codes (ECCs) provide robust, error-free storage. This decoupled approach is inefficient, as it forces the storage to treat each bit of the compressed model equally and to dedicate the same amount of resources to each bit. We propose a radically different approach that (i) employs analog memories to maximize the capacity of each memory cell, and (ii) jointly optimizes model compression and physical storage to maximize memory utility. We investigate the challenges of analog storage by studying model storage on phase-change memory (PCM) arrays and develop a variety of robust coding strategies for NN model storage. We demonstrate the efficacy of our approach on the MNIST, CIFAR-10, and ImageNet datasets for both existing and novel compression methods. Compared to conventional error-free digital storage, our method has the potential to reduce the memory size by an order of magnitude without significantly compromising the stored model's accuracy. Comment: 19 pages, 9 figures.
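
    A minimal simulation can illustrate the storage model; the Gaussian read noise, the cell range, and every name below are assumptions for illustration (real PCM noise is more structured, with drift and asymmetries). Each parameter occupies one noisy analog cell, and dedicating more cells to a parameter buys robustness at the cost of capacity, which is the kind of trade-off a joint compression-and-storage optimizer can allocate unevenly.

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.1, 10_000)   # stand-in for a compressed layer

def store_analog(w, sigma=0.05):
    """Write each parameter to a single analog cell and read it back with
    additive Gaussian noise, a crude stand-in for PCM write noise and drift."""
    lo, hi = w.min(), w.max()
    cell = (w - lo) / (hi - lo)                            # map into the cell range [0, 1]
    read = np.clip(cell + rng.normal(0.0, sigma, w.shape), 0.0, 1.0)
    return read * (hi - lo) + lo                           # map back to weight space

def store_redundant(w, k=4, sigma=0.05):
    """Spend k cells per parameter and average the reads: the noise variance
    falls roughly by 1/k, trading memory capacity for robustness."""
    return np.mean([store_analog(w, sigma) for _ in range(k)], axis=0)

for name, read in [("1 cell ", store_analog(weights)),
                   ("4 cells", store_redundant(weights))]:
    print(name, "MSE:", float(np.mean((weights - read) ** 2)))
```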

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Get PDF
    Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 450 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; and social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.
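
    Calibration here refers to how well a model's stated confidence matches its accuracy. One common way to quantify it is the expected calibration error, sketched below; whether BIG-bench uses exactly this formulation is an assumption, and all names are illustrative.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: the gap between mean confidence and accuracy within each
    confidence bin, averaged with weights equal to the bin's share of
    all predictions."""
    confidences = np.asarray(confidences)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap
    return ece

# Toy example: overconfident predictions yield a large ECE.
conf = np.array([0.9, 0.9, 0.8, 0.95, 0.7])
hits = np.array([1, 0, 1, 0, 1])
print(expected_calibration_error(conf, hits))
```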

    Trust estimation in autonomic networks: a statistical mechanics approach

    No full text

    Stencil Autotuning with Ordinal Regression: Extended Abstract

    No full text

    Challenges in KDD and ML for sustainable development

    No full text
    ACM SIGKDD Conference on Knowledge Discovery and Data Mining, [online] Singapore, SGP, 14/08/2021 - 18/08/2021. Artificial intelligence and machine learning techniques can offer powerful tools for addressing the greatest challenges facing humanity and helping society adapt to a rapidly changing climate, respond to disasters and pandemic crises, and reach the United Nations (UN) Sustainable Development Goals (SDGs) by 2030. In recent approaches to mitigation and adaptation, data analytics and ML are only one part of a solution that requires interdisciplinary and methodological research and innovation. For example, challenges include multi-modal and multi-source data fusion to combine satellite imagery with other relevant data, handling noisy and missing ground data at various spatio-temporal scales, and ensembling multiple physical and ML models to improve prediction accuracy. Despite recognized successes, there are many areas where ML is not applicable, performs poorly, or gives insights that are not actionable. This tutorial will survey recent and significant contributions in KDD and ML for sustainable development and will highlight current challenges that need to be addressed to transform and equip sustainability science with robust ML-based tools to support actionable decision-making for a more sustainable future.
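
    One of the listed challenges, ensembling physical and ML models, can be sketched in a few lines; the toy signal, the stand-in models, and the inverse-error weighting below are illustrative assumptions, not methods from the tutorial.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 10, 200)
truth = np.sin(t) + 0.1 * t                  # the unknown process (illustrative)

physical = np.sin(t)                         # physics-based model: right shape, misses the trend
ml = truth + rng.normal(0, 0.3, t.shape)     # ML model: unbiased but noisy (stand-in)

# Weighted ensemble: weights inversely proportional to each model's
# validation error, one simple way to combine physical and ML predictions.
errs = np.array([np.mean((truth - m) ** 2) for m in (physical, ml)])
w = (1.0 / errs) / np.sum(1.0 / errs)
ensemble = w[0] * physical + w[1] * ml
print("physical MSE:", errs[0], "ml MSE:", errs[1],
      "ensemble MSE:", float(np.mean((truth - ensemble) ** 2)))
```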