Lattice QCD Thermodynamics on the Grid
We describe how we used nodes of the EGEE Grid simultaneously, accumulating
ca. 300 CPU-years in 2-3 months, to determine an important property of Quantum
Chromodynamics. We explain how Grid resources were exploited efficiently and
with ease, using a user-level overlay based on the Ganga and DIANE tools on top
of the standard Grid software stack. Application-specific scheduling and
resource selection based on simple but powerful heuristics allowed us to
improve the efficiency of the processing and obtain the desired scientific
results by a specified deadline. This is also a demonstration of the combined use
of supercomputers, to calculate the initial state of the QCD system, and Grids,
to perform the subsequent massively distributed simulations. The QCD simulation
was performed on a lattice. Keeping the strange quark mass at
its physical value, we reduced the masses of the up and down quarks until,
under an increase of temperature, the system underwent a second-order phase
transition to a quark-gluon plasma. Then we measured the response of this
system to an increase in the quark density. We find that the transition is
smoothed rather than sharpened. If confirmed on a finer lattice, this finding
makes it unlikely that ongoing experimental searches will find a QCD critical
point at small chemical potential.
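
The scheduling idea in this abstract (favor resources with a good recent track
record, and stop dispatching work that cannot finish in time) can be sketched
as a simple heuristic. The Python below is a minimal illustration, not the
actual Ganga/DIANE interface; the Worker record, its scoring rule, and the
task signature are assumptions made for this example.

```python
import heapq
import time
from dataclasses import dataclass, field

@dataclass(order=True)
class Worker:
    # Heap key: a lower score means a faster, more reliable worker.
    score: float
    name: str = field(compare=False)
    tasks_done: int = field(compare=False, default=0)
    cpu_seconds: float = field(compare=False, default=0.0)
    failures: int = field(compare=False, default=0)

    def update(self, seconds: float, ok: bool) -> None:
        """Re-score from running history: mean task time, inflated per failure."""
        if ok:
            self.tasks_done += 1
            self.cpu_seconds += seconds
        else:
            self.failures += 1
        mean = self.cpu_seconds / max(self.tasks_done, 1)
        self.score = mean * (1.0 + self.failures)

def dispatch(tasks, workers, deadline, est_task_seconds):
    """Greedily hand each task to the historically best worker; stop
    dispatching once a task could no longer finish before the deadline."""
    heap = list(workers)
    heapq.heapify(heap)
    results = []
    for task in tasks:
        if time.time() + est_task_seconds > deadline:
            break                          # too late: stop dispatching
        worker = heapq.heappop(heap)       # best-scored worker
        ok, seconds = task(worker.name)    # run task: returns (success, wall time)
        worker.update(seconds, ok)
        heapq.heappush(heap, worker)       # re-rank with the refreshed score
        results.append((task, worker.name, ok))
    return results
```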
Patterns of Scalable Bayesian Inference
Datasets are growing not just in size but in complexity, creating a demand
for rich models and quantification of uncertainty. Bayesian methods are an
excellent fit for this demand, but scaling Bayesian inference is a challenge.
In response to this challenge, there has been considerable recent work based on
varying assumptions about model structure, underlying computational resources,
and the importance of asymptotic correctness. As a result, there is a zoo of
ideas with few clear overarching principles.
In this paper, we seek to identify unifying principles, patterns, and
intuitions for scaling Bayesian inference. We review existing work on utilizing
modern computing resources with both MCMC and variational approximation
techniques. From this taxonomy of ideas, we characterize the general principles
that have proven successful for designing scalable inference procedures and
comment on the path forward.
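
One of the patterns surveyed here, trading exact MCMC transitions for cheap
minibatch updates, is exemplified by stochastic gradient Langevin dynamics
(Welling and Teh, 2011). Below is a minimal sketch for inferring a Gaussian
mean; the synthetic data, the prior, and the step-size schedule are
assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=10_000)  # synthetic observations
N, batch = len(data), 100

def grad_log_prior(theta):
    return -theta / 100.0          # N(0, 10^2) prior on the mean

def grad_log_lik(theta, x):
    return np.sum(x - theta)       # Gaussian likelihood, known unit variance

theta, samples = 0.0, []
for t in range(1, 5001):
    eps = 1e-4 / t ** 0.55                          # decaying step size
    idx = rng.choice(N, size=batch, replace=False)
    # Unbiased minibatch estimate of the full log-posterior gradient
    g = grad_log_prior(theta) + (N / batch) * grad_log_lik(theta, data[idx])
    theta += 0.5 * eps * g + rng.normal(scale=np.sqrt(eps))  # Langevin update
    samples.append(theta)

print(np.mean(samples[1000:]))     # posterior-mean estimate, close to 2.0
```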
INFaaS: A Model-less and Managed Inference Serving System
Despite existing work in machine learning inference serving, ease-of-use and
cost efficiency remain challenges at large scales. Developers must manually
search through thousands of model-variants -- versions of already-trained
models that differ in hardware, resource footprints, latencies, costs, and
accuracies -- to meet the diverse application requirements. Since requirements,
query load, and applications themselves evolve over time, these decisions need
to be made dynamically for each inference query to avoid excessive costs
through naive autoscaling. To avoid navigating through the large and complex
trade-off space of model-variants, developers often fix a variant across
queries, and replicate it when load increases. However, given the diversity
across variants and hardware platforms in the cloud, a lack of understanding of
the trade-off space can incur significant costs to developers.
This paper introduces INFaaS, a managed and model-less system for distributed
inference serving, where developers simply specify the performance and accuracy
requirements for their applications without having to choose a specific
model-variant for each query. INFaaS generates model-variants and efficiently
navigates the large trade-off space of model-variants on behalf of developers
to meet application-specific objectives: (a) for each query, it selects a
model, hardware architecture, and model optimizations; (b) it combines VM-level
horizontal autoscaling with model-level autoscaling, where multiple, different
model-variants are used to serve queries within each machine. By leveraging
diverse variants and sharing hardware resources across models, INFaaS achieves
1.3x higher throughput, violates latency objectives 1.6x less often, and saves
up to 21.6x in cost (8.5x on average) compared to state-of-the-art inference
serving systems on AWS EC2.
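
The per-query selection in objective (a) can be pictured as constrained search
over the variant trade-off space: keep only the variants that satisfy the
query's latency and accuracy requirements, then take the cheapest. The sketch
below is hypothetical, not INFaaS code; the Variant fields, the numbers, and
the select_variant helper are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Variant:
    """One model-variant: a trained model compiled for specific hardware."""
    name: str
    hardware: str          # e.g. "cpu" or "gpu"
    latency_ms: float      # measured query latency
    accuracy: float        # e.g. top-1 accuracy
    cost_per_1k: float     # dollars per 1000 queries

def select_variant(variants, max_latency_ms, min_accuracy):
    """Cheapest variant meeting the query's requirements;
    None if nothing qualifies (caller may relax or scale up)."""
    feasible = [v for v in variants
                if v.latency_ms <= max_latency_ms and v.accuracy >= min_accuracy]
    return min(feasible, key=lambda v: v.cost_per_1k, default=None)

zoo = [
    Variant("resnet50-cpu", "cpu", 180.0, 0.76, 0.09),
    Variant("resnet50-gpu", "gpu",  12.0, 0.76, 0.40),
    Variant("resnet18-cpu", "cpu",  60.0, 0.70, 0.03),
]
print(select_variant(zoo, max_latency_ms=100.0, min_accuracy=0.70))
# -> resnet18-cpu: meets both constraints at the lowest cost
```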