
    Virtual Design to Hide the Complexities and Burdens of the Users in Open Nets

    The user only needs to submit an SQL-like query; the system takes care of compiling the query, generating the execution plan, and evaluating it in the crowdsourcing marketplace. We study the query optimization problem in declarative crowdsourcing systems. Declarative crowdsourcing is designed to hide the complexities and relieve the user of the burden of dealing with the crowd. In this paper, we propose CrowdOp, a cost-based query optimization approach for declarative crowdsourcing systems. CrowdOp considers both cost and latency in its query optimization objectives and generates query plans that provide a good balance between the two. We develop efficient algorithms in CrowdOp for optimizing three kinds of queries: selection queries, join queries, and complex selection-join queries. We validate our approach via extensive experiments, both in simulation and with the real crowd on Amazon Mechanical Turk. A given query may have several execution plans, and the difference in crowdsourcing cost between the best and worst plans can be several orders of magnitude. Therefore, as in relational database systems, query optimization is essential to crowdsourcing systems that offer declarative query interfaces
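    The cost/latency trade-off described above can be illustrated with a minimal sketch. All names here (the `Plan` record, `pick_plan`, the example plans and numbers) are invented for illustration, not CrowdOp's actual API: candidate plans are enumerated with estimated monetary cost and latency, and the cheapest plan that fits a latency budget is chosen.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    name: str
    cost: float     # estimated crowdsourcing cost (e.g. dollars paid to workers)
    latency: float  # estimated rounds of crowd interaction

def pick_plan(plans, latency_budget):
    """Return the cheapest plan whose latency fits the budget."""
    feasible = [p for p in plans if p.latency <= latency_budget]
    if not feasible:
        # Nothing fits the budget: fall back to the fastest plan.
        return min(plans, key=lambda p: p.latency)
    return min(feasible, key=lambda p: p.cost)

# Hypothetical plans for one selection-join query.
plans = [
    Plan("select-then-join", cost=12.0, latency=3),
    Plan("join-then-select", cost=40.0, latency=2),
    Plan("fully-parallel",   cost=55.0, latency=1),
]
best = pick_plan(plans, latency_budget=2)
```

    With a latency budget of 2 rounds, the sketch rejects the cheapest plan (too slow) and picks the cheapest feasible one, mirroring the balance between cost and latency the abstract describes.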

    ReStore: Reusing Results of MapReduce Jobs

    Analyzing large scale data has emerged as an important activity for many organizations in the past few years. This large scale data analysis is facilitated by the MapReduce programming and execution model and its implementations, most notably Hadoop. Users of MapReduce often have analysis tasks that are too complex to express as individual MapReduce jobs. Instead, they use high-level query languages such as Pig, Hive, or Jaql to express their complex tasks. The compilers of these languages translate queries into workflows of MapReduce jobs. Each job in these workflows reads its input from the distributed file system used by the MapReduce system and produces output that is stored in this distributed file system and read as input by the next job in the workflow. The current practice is to delete these intermediate results from the distributed file system at the end of executing the workflow. One way to improve the performance of workflows of MapReduce jobs is to keep these intermediate results and reuse them for future workflows submitted to the system. In this paper, we present ReStore, a system that manages the storage and reuse of such intermediate results. ReStore can reuse the output of whole MapReduce jobs that are part of a workflow, and it can also create additional reuse opportunities by materializing and storing the output of query execution operators that are executed within a MapReduce job. We have implemented ReStore as an extension to the Pig dataflow system on top of Hadoop, and we experimentally demonstrate significant speedups on queries from the PigMix benchmark. Comment: VLDB201
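    The reuse idea at ReStore's core can be sketched in a few lines. This is a toy model with invented names, not ReStore's implementation: each job's output is keyed by a fingerprint of its input and its transformation, and when a later workflow submits a matching job, the stored result is returned instead of re-executing.

```python
import hashlib


class ResultStore:
    """Toy store mapping (input, job) fingerprints to materialized outputs."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _fingerprint(input_path, job_script):
        h = hashlib.sha256()
        h.update(input_path.encode())
        h.update(job_script.encode())
        return h.hexdigest()

    def run_job(self, input_path, job_script, execute):
        """Reuse a stored output if one matches; otherwise execute and store."""
        key = self._fingerprint(input_path, job_script)
        if key in self._store:
            return self._store[key], True        # reused
        output = execute(input_path)
        self._store[key] = output                # keep the intermediate result
        return output, False                     # freshly computed

store = ResultStore()
out1, reused1 = store.run_job("/data/logs", "GROUP logs BY user", lambda p: "agg-result")
out2, reused2 = store.run_job("/data/logs", "GROUP logs BY user", lambda p: "agg-result")
```

    The second submission of an identical job hits the store and skips execution, which is the speedup mechanism the abstract describes (ReStore additionally handles sub-job operator outputs and eviction, which this sketch omits).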

    Deductive Optimization of Relational Data Storage

    The physical storage of data and its retrieval are two key database management problems. In this paper, we propose a language that can express a wide range of physical database layouts, going well beyond the row- and column-based methods that are widely used in database management systems. We use deductive synthesis to turn a high-level relational representation of a database query into a highly optimized low-level implementation which operates on a specialized layout of the dataset. We build a compiler for this language and conduct experiments using a popular database benchmark, which show that the performance of these specialized queries is competitive with a state-of-the-art in-memory compiled database system
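    The row- versus column-based layouts the abstract mentions can be illustrated concretely. This is a minimal example (not the paper's layout language): the same relation is stored both row-wise and column-wise, and an aggregate over one attribute only needs to touch a single array in the columnar layout.

```python
# One relation, two physical layouts.
rows = [(1, "alice", 30), (2, "bob", 25), (3, "carol", 41)]

# Column-wise layout: one array per attribute.
columns = {
    "id":   [r[0] for r in rows],
    "name": [r[1] for r in rows],
    "age":  [r[2] for r in rows],
}

def avg_age_rowwise(rows):
    # Scans every whole tuple even though only one attribute is needed.
    return sum(r[2] for r in rows) / len(rows)

def avg_age_columnar(columns):
    # Touches only the one column the query needs.
    ages = columns["age"]
    return sum(ages) / len(ages)
```

    Both layouts answer the query identically; the point of a layout language is to let a compiler pick (or synthesize) the physical form that makes a given query cheap.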

    06472 Abstracts Collection - XQuery Implementation Paradigms

    From 19.11.2006 to 22.11.2006, the Dagstuhl Seminar 06472 ``XQuery Implementation Paradigms'' was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar, as well as abstracts of seminar results and ideas, are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided where available

    PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems

    Machine Learning models are often composed of pipelines of transformations. While this design allows single model components to be executed efficiently at training time, prediction serving has different requirements, such as low latency, high throughput, and graceful performance degradation under heavy load. Current prediction serving systems consider models as black boxes, whereby prediction-time-specific optimizations are ignored in favor of ease of deployment. In this paper, we present PRETZEL, a prediction serving system introducing a novel white box architecture enabling both end-to-end and multi-model optimizations. Using production-like model pipelines, our experiments show that PRETZEL is able to introduce performance improvements over different dimensions; compared to state-of-the-art approaches, PRETZEL is on average able to reduce 99th percentile latency by 5.5x while reducing memory footprint by 25x and increasing throughput by 4.7x. Comment: 16 pages, 14 figures, 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 201
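    One multi-model optimization a white-box view enables is sharing work across pipelines that contain an identical stage. This is a hedged sketch with invented names, not PRETZEL's architecture: two pipelines declare the same tokenization stage, so its output is computed once and reused, instead of each opaque pipeline recomputing it.

```python
class SharedStageCache:
    """Toy cache that memoizes the output of a named pipeline stage per input."""

    def __init__(self):
        self._cache = {}

    def run(self, stage_id, fn, x):
        key = (stage_id, x)
        if key not in self._cache:
            self._cache[key] = fn(x)
        return self._cache[key]

cache = SharedStageCache()
calls = {"n": 0}  # counts actual stage executions

def tokenize(s):
    calls["n"] += 1
    return tuple(s.lower().split())

# Two different pipelines share the identical "tokenize-v1" stage.
def pipeline_a(text):
    tokens = cache.run("tokenize-v1", tokenize, text)
    return len(tokens)

def pipeline_b(text):
    tokens = cache.run("tokenize-v1", tokenize, text)
    return tokens[0] if tokens else ""

n_tokens = pipeline_a("Hello World")
first = pipeline_b("Hello World")
```

    Serving the same input through both pipelines executes the shared stage once, the kind of cross-model saving a black-box serving system cannot see.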