On statistics, computation and scalability
How should statistical procedures be designed so as to be scalable
computationally to the massive datasets that are increasingly the norm? When
coupled with the requirement that an answer to an inferential question be
delivered within a certain time budget, this question has significant
repercussions for the field of statistics. With the goal of identifying
"time-data tradeoffs," we investigate some of the statistical consequences of
computational perspectives on scalability, in particular divide-and-conquer
methodology and hierarchies of convex relaxations.
Comment: Published at http://dx.doi.org/10.3150/12-BEJSP17 in the Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
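The divide-and-conquer methodology mentioned in the abstract can be illustrated with a minimal sketch. This toy example (all parameters are illustrative assumptions, not from the paper) splits a large sample into blocks, computes a local estimate on each, and averages the results:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: estimate a population mean from one million samples.
data = rng.normal(loc=2.0, scale=1.0, size=1_000_000)

# Full-data estimator: one pass over everything.
full_estimate = data.mean()

# Divide-and-conquer: split into m blocks, estimate on each, average.
# Each block could be processed on a separate machine; only the m local
# estimates need to be communicated.
m = 100
blocks = np.array_split(data, m)
local_estimates = [block.mean() for block in blocks]
dc_estimate = np.mean(local_estimates)
```

For a linear statistic such as the mean, the two estimators coincide up to floating point; for nonlinear estimators they generally differ, and quantifying the statistical cost of that splitting against its computational savings is precisely the kind of time-data tradeoff the abstract studies.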
Receptor uptake arrays for vitamin B12, siderophores and glycans shape bacterial communities
Molecular variants of vitamin B12, siderophores and glycans occur. To take up
variant forms, bacteria may express an array of receptors. The gut microbe
Bacteroides thetaiotaomicron has three different receptors to take up variants
of vitamin B12 and 88 receptors to take up various glycans. The design of
receptor arrays reflects key processes that shape cellular evolution.
Competition may focus each species on a subset of the available nutrient
diversity. Some gut bacteria can take up only a narrow range of carbohydrates,
whereas species such as B. thetaiotaomicron can digest many different complex
glycans. Comparison of different nutrients, habitats, and genomes provides an
opportunity to test hypotheses about the breadth of receptor arrays. Another
important process concerns fluctuations in nutrient availability. Such
fluctuations enhance the value of cellular sensors, which gain information
about environmental availability and adjust receptor deployment. Bacteria often
adjust receptor expression in response to fluctuations of particular
carbohydrate food sources. Some species may adjust expression of uptake
receptors for specific siderophores. How do cells use sensor information to
control the response to fluctuations? That question about regulatory wiring
relates to problems that arise in control theory and artificial intelligence.
Control theory clarifies how to analyze environmental fluctuations in relation
to the design of sensors and response systems. Recent advances in deep learning
studies of artificial intelligence focus on the architecture of regulatory
wiring and the ways in which complex control networks represent and classify
environmental states. I emphasize the similar design problems that arise in
cellular evolution, control theory, and artificial intelligence. I connect
those broad concepts to testable hypotheses for bacterial uptake of B12,
siderophores and glycans.
Comment: Added many new references, edited throughout
Universality caused: the case of renormalization group explanation
Recently, many have argued that there are certain kinds of abstract mathematical explanations that are noncausal. In particular, the irrelevancy approach suggests that abstracting away irrelevant causal details can leave us with a noncausal explanation. In this paper, I argue that the common example of Renormalization Group explanations of universality used to motivate the irrelevancy approach deserves more critical attention. I argue that the reasons given by those who hold up RG as noncausal do not stand up to critical scrutiny. As a result, the irrelevancy approach and the line between causal and noncausal explanation deserve more scrutiny.
Support Recovery of Sparse Signals
We consider the problem of exact support recovery of sparse signals via noisy
measurements. The main focus is the sufficient and necessary conditions on the
number of measurements for support recovery to be reliable. By drawing an
analogy between the problem of support recovery and the problem of channel
coding over the Gaussian multiple access channel, and exploiting mathematical
tools developed for the latter problem, we obtain an information theoretic
framework for analyzing the performance limits of support recovery. Sharp
sufficient and necessary conditions on the number of measurements in terms of
the signal sparsity level and the measurement noise level are derived.
Specifically, when the number of nonzero entries is held fixed, the exact
asymptotics on the number of measurements for support recovery are developed.
When the number of nonzero entries increases in certain manners, we obtain
sufficient conditions tighter than existing results. In addition, we show that
the proposed methodology can deal with a variety of models of sparse signal
recovery, hence demonstrating its potential as an effective analytical tool.
Comment: 33 pages
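The support-recovery problem the abstract analyzes can be sketched concretely. The toy instance below (dimensions, noise level, and the choice of orthogonal matching pursuit as the recovery algorithm are all illustrative assumptions; the paper's analysis is information-theoretic and not tied to this algorithm) draws noisy linear measurements of a sparse signal and attempts to recover its support:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical instance: a k-sparse signal x in R^p observed through
# n noisy linear measurements y = A x + w.
n, p, k = 200, 400, 5
A = rng.normal(size=(n, p)) / np.sqrt(n)  # columns have roughly unit norm
support = rng.choice(p, size=k, replace=False)
x = np.zeros(p)
x[support] = 2.0 * rng.choice([-1.0, 1.0], size=k)  # well-separated nonzeros
y = A @ x + 0.01 * rng.normal(size=n)

# Orthogonal matching pursuit: greedily pick the column most correlated
# with the current residual, then re-fit by least squares on the
# selected set.
selected = []
residual = y.copy()
for _ in range(k):
    j = int(np.argmax(np.abs(A.T @ residual)))
    selected.append(j)
    A_S = A[:, selected]
    coef, *_ = np.linalg.lstsq(A_S, y, rcond=None)
    residual = y - A_S @ coef

exact_recovery = set(selected) == set(support.tolist())
```

At this sparsity and noise level, exact support recovery succeeds; the abstract's sharp conditions characterize exactly how many measurements n are needed as k, p, and the noise scale vary.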
Computational and statistical tradeoffs via convex relaxation
Modern massive datasets create a fundamental problem at the intersection of the computational and statistical sciences: how to provide guarantees on the quality of statistical inference given bounds on computational resources, such as time or space. Our approach to this problem is to define a notion of “algorithmic weakening,” in which a hierarchy of algorithms is ordered by both computational efficiency and statistical efficiency, allowing the growing strength of the data at scale to be traded off against the need for sophisticated processing. We illustrate this approach in the setting of denoising problems, using convex relaxation as the core inferential tool. Hierarchies of convex relaxations have been widely used in theoretical computer science to yield tractable approximation algorithms to many computationally intractable tasks. In the current paper, we show how to endow such hierarchies with a statistical characterization and thereby obtain concrete tradeoffs relating algorithmic runtime to amount of data.
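The denoising setting the abstract describes can be illustrated with the simplest convex relaxation of sparsity: replace the combinatorial sparsity constraint with an l1 penalty, whose proximal solution is soft thresholding. This sketch (signal dimensions, sparsity, and noise level are illustrative assumptions, not taken from the paper) uses the classical Donoho-Johnstone universal threshold:

```python
import numpy as np

rng = np.random.default_rng(2)

# Solve min_z 0.5*||y - z||^2 + lam*||z||_1 in closed form:
# the l1 relaxation of a sparsity constraint yields soft thresholding.
def soft_threshold(y, lam):
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)

# Hypothetical denoising instance: observe y = x + w for a sparse x.
p, k = 1000, 20
x = np.zeros(p)
x[rng.choice(p, size=k, replace=False)] = 5.0
sigma = 0.5
y = x + sigma * rng.normal(size=p)

# Universal threshold lam = sigma * sqrt(2 log p).
lam = sigma * np.sqrt(2.0 * np.log(p))
x_hat = soft_threshold(y, lam)

mse_denoised = np.mean((x_hat - x) ** 2)
mse_raw = np.mean((y - x) ** 2)
```

Soft thresholding is the cheapest member of the kind of relaxation hierarchy the paper studies; the paper's contribution is to characterize statistically when such a weak, fast relaxation suffices because the dataset is large, versus when a stronger, slower relaxation is needed.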
MLPerf Inference Benchmark
Machine-learning (ML) hardware and software system demand is burgeoning.
Driven by ML applications, the number of different ML inference systems has
exploded. Over 100 organizations are building ML inference chips, and the
systems that incorporate existing models span at least three orders of
magnitude in power consumption and five orders of magnitude in performance;
they range from embedded devices to data-center solutions. Fueling the hardware
are a dozen or more software frameworks and libraries. The myriad combinations
of ML hardware and ML software make assessing ML-system performance in an
architecture-neutral, representative, and reproducible manner challenging.
There is a clear need for industry-wide standard ML benchmarking and evaluation
criteria. MLPerf Inference answers that call. In this paper, we present our
benchmarking method for evaluating ML inference systems. Driven by more than 30
organizations as well as more than 200 ML engineers and practitioners, MLPerf
prescribes a set of rules and best practices to ensure comparability across
systems with wildly differing architectures. The first call for submissions
garnered more than 600 reproducible inference-performance measurements from 14
organizations, representing over 30 systems that showcase a wide range of
capabilities. The submissions attest to the benchmark's flexibility and
adaptability.
Comment: ISCA 2020