PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
Machine Learning models are often composed of pipelines of transformations.
While this design allows single model components to be executed efficiently at
training time, prediction serving has different requirements such as low
latency, high throughput and graceful performance degradation under heavy load.
Current prediction serving systems consider models as black boxes, whereby
prediction-time-specific optimizations are ignored in favor of ease of
deployment. In this paper, we present PRETZEL, a prediction serving system
introducing a novel white box architecture enabling both end-to-end and
multi-model optimizations. Using production-like model pipelines, our
experiments show that PRETZEL improves performance along several dimensions:
compared to state-of-the-art approaches, PRETZEL on average reduces 99th
percentile latency by 5.5x while reducing memory
footprint by 25x, and increasing throughput by 4.7x.
Comment: 16 pages, 14 figures, 13th USENIX Symposium on Operating Systems
Design and Implementation (OSDI), 201
A MULTI-COMMODITY NETWORK FLOW APPROACH FOR SEQUENCING REFINED PRODUCTS IN PIPELINE SYSTEMS
In the oil industry, there is a special class of pipelines used for the transportation of refined products. The problem of sequencing the inputs to be pumped through this type of pipeline seeks to generate the optimal sequence of batches of products and their destination as well as the amount of product to be pumped such that the total operational cost of the system, or another operational objective, is optimized while satisfying the product demands according to the requirements set by the customers. This dissertation introduces a new modeling approach and proposes a solution methodology for this problem capable of dealing with the topology of all the scenarios reported in the literature so far.
The system representation is based on a 0-1 multi-commodity network flow formulation that models the dynamics of the system, including aspects such as conservation of product flow at the depots, the travel time of products from the refinery to their destination depots, and what happens upstream and downstream of a given depot whenever one product is being received there while another is being injected into the line at the refinery. It is assumed that the products are already available at the refinery and that their demand at each depot is deterministic and known beforehand. The model provides the sequence, the amounts, the destinations and the traceability of the shipped batches of different products from their sources to their destinations over the entire planning horizon, while seeking to minimize pumping and inventory holding costs subject to the time window constraints.
A survey of the available literature is presented. Given the problem structure, a decomposition-based solution procedure is explored with the intention of exploiting the network structure using the network simplex method. A branch and bound algorithm is proposed that exploits the dynamics of the system by assigning branching priorities to a selected set of variables. Computational results, obtained via GAMS/CPLEX, are presented for random instances of the problem of different sizes. Future research directions in this field are proposed.
Chippe : a system for constraint driven behavioral synthesis
This report describes the Chippe system, gives some background on previous work, and describes several sample design runs of the system. Also presented are the sources of the design tradeoffs used by Chippe, an overview of the internal design model, and experiences using the system.
The "MIND" Scalable PIM Architecture
MIND (Memory, Intelligence, and Network Device) is an advanced parallel computer architecture for high performance computing and scalable embedded processing. It is a Processor-in-Memory (PIM) architecture integrating both DRAM bit cells and CMOS logic devices on the same silicon die. MIND is multicore with multiple memory/processor nodes on each chip and supports global shared memory across systems of MIND components. MIND is distinguished from other PIM architectures in that it incorporates mechanisms for efficient support of a global parallel execution model based on the semantics of message-driven multithreaded split-transaction processing. MIND is designed to operate either in conjunction with other conventional microprocessors or in standalone arrays of like devices. It also incorporates mechanisms for fault tolerance, real time execution, and active power management. This paper describes the major elements and operational methods of the MIND architecture.