39 research outputs found

    System Design for Intelligent Web Services

    Full text link
    The devices and software systems we interact with on a daily basis are more intelligent than ever. The computing required to deliver these experiences for end-users is hosted in Warehouse Scale Computers (WSC) where intelligent web services are employed to process user images, speech, and text. These intelligent web services are emerging as one of the fastest growing class of web services. Given the expectation of users moving forward is an experience that uses intelligent web services, the demand for this type of processing is only going to increase. However, today’s cloud infrastructures, tuned for traditional workloads such as Web Search and social networks, are not adequately equipped to sustain this increase in demand. This dissertation shows that applications that use intelligent web service processing on the path of a single query require orders of magnitude more computational resources than traditional Web Search. Intelligent web services use large pretrained machine learning models to process image, speech, and text based inputs and generate a prediction. As this dissertation investigates, we find that hosting intelligent web services in today’s infrastructures exposes three critical problems: 1) current infrastructures are computationally inadequate to host this new class of services, 2) system designers are unaware of the bottlenecks exposed by these services and the implications on future designs, 3) the rapid algorithmic churn of these intelligent services deprecates current designs at an even faster rate. This dissertation investigates and addresses each of these problems. After building a representative workload to show the computational resources required by an application composed of three intelligent web services, this dissertation first argues that hardware acceleration is required on the path of a query to sustain demand moving forward. We show that GPU- and FPGA-accelerated servers can improve the query latency on average by 10x and 16x. Leveraging the latency reduction, GPU- and FPGA-accelerated servers reduce the Total Cost of Ownership (TCO) by 2.6x and 1.4x, respectively. Second, we focus on Deep Neural Networks (DNN), a state-of-the- art algorithm for intelligent web services and design a DNN-as-a-Service infrastructure enabling application-agnostic acceleration and single-point of optimization. We identify compute bottlenecks that inform the design of a Graphics Processing Unit (GPU) based system; addressing the compute bottlenecks translates to a throughput improvement of 133x across seven DNN based applications. GPU-enabled datacenters show a TCO improvement over CPU-only designs by 4-20x. Finally, we design a runtime system based on a GPU equipped server that improves current systems accounting for recent advances in intelligent web service algorithms. Specifically, we identify asynchronous processing key for accelerating dynamically configured in- telligent services. We achieve on average 7.6x throughput improvements over an optimized CPU baseline and 2.8x over the current GPU system. By thoroughly addressing these problems, we produce designs for WSCs that are equipped to handle the future demand for intelligent web services. The investigations in this thesis address significant computational bottlenecks and lead to system designs that are more efficient and cost-effective for this new class of web services.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/137055/1/jahausw_1.pd

    Search for dark matter produced in association with bottom or top quarks in √s = 13 TeV pp collisions with the ATLAS detector

    Get PDF
    A search for weakly interacting massive particle dark matter produced in association with bottom or top quarks is presented. Final states containing third-generation quarks and miss- ing transverse momentum are considered. The analysis uses 36.1 fb−1 of proton–proton collision data recorded by the ATLAS experiment at √s = 13 TeV in 2015 and 2016. No significant excess of events above the estimated backgrounds is observed. The results are in- terpreted in the framework of simplified models of spin-0 dark-matter mediators. For colour- neutral spin-0 mediators produced in association with top quarks and decaying into a pair of dark-matter particles, mediator masses below 50 GeV are excluded assuming a dark-matter candidate mass of 1 GeV and unitary couplings. For scalar and pseudoscalar mediators produced in association with bottom quarks, the search sets limits on the production cross- section of 300 times the predicted rate for mediators with masses between 10 and 50 GeV and assuming a dark-matter mass of 1 GeV and unitary coupling. Constraints on colour- charged scalar simplified models are also presented. Assuming a dark-matter particle mass of 35 GeV, mediator particles with mass below 1.1 TeV are excluded for couplings yielding a dark-matter relic density consistent with measurements

    Search for single production of vector-like quarks decaying into Wb in pp collisions at s=8\sqrt{s} = 8 TeV with the ATLAS detector

    Get PDF

    Measurements of top-quark pair differential cross-sections in the eμe\mu channel in pppp collisions at s=13\sqrt{s} = 13 TeV using the ATLAS detector

    Get PDF

    Measurement of the W boson polarisation in ttˉt\bar{t} events from pp collisions at s\sqrt{s} = 8 TeV in the lepton + jets channel with ATLAS

    Get PDF

    Measurement of the charge asymmetry in top-quark pair production in the lepton-plus-jets final state in pp collision data at s=8TeV\sqrt{s}=8\,\mathrm TeV{} with the ATLAS detector

    Get PDF

    Search for dark matter in association with a Higgs boson decaying to bb-quarks in pppp collisions at s=13\sqrt s=13 TeV with the ATLAS detector

    Get PDF

    Charged-particle distributions at low transverse momentum in s=13\sqrt{s} = 13 TeV pppp interactions measured with the ATLAS detector at the LHC

    Get PDF
    corecore