5,939 research outputs found
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
Machine Learning models are often composed of pipelines of transformations.
While this design allows to efficiently execute single model components at
training time, prediction serving has different requirements such as low
latency, high throughput and graceful performance degradation under heavy load.
Current prediction serving systems consider models as black boxes, whereby
prediction-time-specific optimizations are ignored in favor of ease of
deployment. In this paper, we present PRETZEL, a prediction serving system
introducing a novel white box architecture enabling both end-to-end and
multi-model optimizations. Using production-like model pipelines, our
experiments show that PRETZEL is able to introduce performance improvements
over different dimensions; compared to state-of-the-art approaches PRETZEL is
on average able to reduce 99th percentile latency by 5.5x while reducing memory
footprint by 25x, and increasing throughput by 4.7x.Comment: 16 pages, 14 figures, 13th USENIX Symposium on Operating Systems
Design and Implementation (OSDI), 201
Visual Chunking: A List Prediction Framework for Region-Based Object Detection
We consider detecting objects in an image by iteratively selecting from a set
of arbitrarily shaped candidate regions. Our generic approach, which we term
visual chunking, reasons about the locations of multiple object instances in an
image while expressively describing object boundaries. We design an
optimization criterion for measuring the performance of a list of such
detections as a natural extension to a common per-instance metric. We present
an efficient algorithm with provable performance for building a high-quality
list of detections from any candidate set of region-based proposals. We also
develop a simple class-specific algorithm to generate a candidate region
instance in near-linear time in the number of low-level superpixels that
outperforms other region generating methods. In order to make predictions on
novel images at testing time without access to ground truth, we develop
learning approaches to emulate these algorithms' behaviors. We demonstrate that
our new approach outperforms sophisticated baselines on benchmark datasets.Comment: to appear at ICRA 201
- …