393 research outputs found

    Reify Your Collection Queries for Modularity and Speed!

    Full text link
    Modularity and efficiency are often contradicting requirements, such that programers have to trade one for the other. We analyze this dilemma in the context of programs operating on collections. Performance-critical code using collections need often to be hand-optimized, leading to non-modular, brittle, and redundant code. In principle, this dilemma could be avoided by automatic collection-specific optimizations, such as fusion of collection traversals, usage of indexing, or reordering of filters. Unfortunately, it is not obvious how to encode such optimizations in terms of ordinary collection APIs, because the program operating on the collections is not reified and hence cannot be analyzed. We propose SQuOpt, the Scala Query Optimizer--a deep embedding of the Scala collections API that allows such analyses and optimizations to be defined and executed within Scala, without relying on external tools or compiler extensions. SQuOpt provides the same "look and feel" (syntax and static typing guarantees) as the standard collections API. We evaluate SQuOpt by re-implementing several code analyses of the Findbugs tool using SQuOpt, show average speedups of 12x with a maximum of 12800x and hence demonstrate that SQuOpt can reconcile modularity and efficiency in real-world applications.Comment: 20 page

    Tabling with Support for Relational Features in a Deductive Database

    Get PDF
    Tabling has been acknowledged as a useful technique in the logic programming arena for enhancing both performance and declarative properties of programs. As well, deductive database implementations benefit from this technique for implementing query solving engines. In this paper, we show how unusual operations in deductive systems can be integrated with tabling.Such operations come from relational database systems in the form of null-related (outer) joins, duplicate support and duplicate elimination. The proposal has been implemented as a proof of concept rather than an efficient system in the Datalog Educational System (DES) using Prolog as a development language and its dynamic database

    Theoretical optimal trajectories for reducing the environmental impact of commercial aircraft operations

    Get PDF
    This work describes initial results obtained from an ongoing research involving the development of optimization algorithms which are capable of performing multi-disciplinary aircraft trajectory optimization processes. A short description of both the rationale behind the initial selection of a suitable optimization technique and the status of the optimization algorithms is firstly presented. The optimization algorithms developed are subsequently utilized to analyze different case studies involving one or more flight phases present in actual aircraft flight profiles. Several optimization processes focusing on the minimization of total flight time, fuel burned and oxides of nitrogen (NOx) emissions are carried out and their results are presented and discussed. When compared with others obtained using commercially available optimizers, results of these optimization processes show atisfactory level of accuracy (average discrepancies ~2%). It is expected that these optimization algorithms can be utilized in future to efficiently compute realistic, optimal and ‘greener’ aircraft trajectories, thereby minimizing the environmental impact of commercial aircraft operations

    Large Language Models for Software Engineering: A Systematic Literature Review

    Full text link
    Large Language Models (LLMs) have significantly impacted numerous domains, notably including Software Engineering (SE). Nevertheless, a well-rounded understanding of the application, effects, and possible limitations of LLMs within SE is still in its early stages. To bridge this gap, our systematic literature review takes a deep dive into the intersection of LLMs and SE, with a particular focus on understanding how LLMs can be exploited in SE to optimize processes and outcomes. Through a comprehensive review approach, we collect and analyze a total of 229 research papers from 2017 to 2023 to answer four key research questions (RQs). In RQ1, we categorize and provide a comparative analysis of different LLMs that have been employed in SE tasks, laying out their distinctive features and uses. For RQ2, we detail the methods involved in data collection, preprocessing, and application in this realm, shedding light on the critical role of robust, well-curated datasets for successful LLM implementation. RQ3 allows us to examine the specific SE tasks where LLMs have shown remarkable success, illuminating their practical contributions to the field. Finally, RQ4 investigates the strategies employed to optimize and evaluate the performance of LLMs in SE, as well as the common techniques related to prompt optimization. Armed with insights drawn from addressing the aforementioned RQs, we sketch a picture of the current state-of-the-art, pinpointing trends, identifying gaps in existing research, and flagging promising areas for future study

    An Algebraic Approach to XQuery Optimization

    Get PDF
    As more data is stored in XML and more applications need to process this data, XML query optimization becomes performance critical. While optimization techniques for relational databases have been developed over the last thirty years, the optimization of XML queries poses new challenges. Query optimizers for XQuery, the standard query language for XML data, need to consider both document order and sequence order. Nevertheless, algebraic optimization proved powerful in query optimizers in relational and object oriented databases. Thus, this dissertation presents an algebraic approach to XQuery optimization. In this thesis, an algebra over sequences is presented that allows for a simple translation of XQuery into this algebra. The formal definitions of the operators in this algebra allow us to reason formally about algebraic optimizations. This thesis leverages the power of this formalism when unnesting nested XQuery expressions. In almost all cases unnesting nested queries in XQuery reduces query execution times from hours to seconds or milliseconds. Moreover, this dissertation presents three basic algebraic patterns of nested queries. For every basic pattern a decision tree is developed to select the most effective unnesting equivalence for a given query. Query unnesting extends the search space that can be considered during cost-based optimization of XQuery. As a result, substantially more efficient query execution plans may be detected. This thesis presents two more important cases where the number of plan alternatives leads to substantially shorter query execution times: join ordering and reordering location steps in path expressions. Our algebraic framework detects cases where document order or sequence order is destroyed. However, state-of-the-art techniques for order optimization in cost-based query optimizers have efficient mechanisms to repair order in these cases. The results obtained for query unnesting and cost-based optimization of XQuery underline the need for an algebraic approach to XQuery optimization for efficient XML query processing. Moreover, they are applicable to optimization in relational databases where order semantics are considered

    Histogram techniques for cost estimation in query optimization.

    Get PDF
    Yu Xiaohui.Thesis (M.Phil.)--Chinese University of Hong Kong, 2001.Includes bibliographical references (leaves 98-115).Abstracts in English and Chinese.Chapter 1 --- Introduction --- p.1Chapter 2 --- Related Work --- p.6Chapter 2.1 --- Query Optimization --- p.6Chapter 2.2 --- Query Rewriting --- p.8Chapter 2.2.1 --- Optimizing Multi-Block Queries --- p.8Chapter 2.2.2 --- Semantic Query Optimization --- p.13Chapter 2.2.3 --- Query Rewriting in Starburst --- p.15Chapter 2.3 --- Plan Generation --- p.16Chapter 2.3.1 --- Dynamic Programming Approach --- p.16Chapter 2.3.2 --- Join Query Processing --- p.17Chapter 2.3.3 --- Queries with Aggregates --- p.23Chapter 2.4 --- Statistics and Cost Estimation --- p.24Chapter 2.5 --- Histogram Techniques --- p.27Chapter 2.5.1 --- Definitions --- p.28Chapter 2.5.2 --- Trivial Histograms --- p.29Chapter 2.5.3 --- Heuristic-based Histograms --- p.29Chapter 2.5.4 --- V-Optimal Histograms --- p.32Chapter 2.5.5 --- Wavelet-based Histograms --- p.35Chapter 2.5.6 --- Multidimensional Histograms --- p.35Chapter 2.5.7 --- Global Histograms --- p.37Chapter 3 --- New Histogram Techniques --- p.39Chapter 3.1 --- Piecewise Linear Histograms --- p.39Chapter 3.1.1 --- Construction --- p.41Chapter 3.1.2 --- Usage --- p.43Chapter 3.1.3 --- Error Measures --- p.43Chapter 3.1.4 --- Experiments --- p.45Chapter 3.1.5 --- Conclusion --- p.51Chapter 3.2 --- A-Optimal Histograms --- p.54Chapter 3.2.1 --- A-Optimal(mean) Histograms --- p.56Chapter 3.2.2 --- A-Optimal(median) Histograms --- p.58Chapter 3.2.3 --- A-Optimal(median-cf) Histograms --- p.59Chapter 3.2.4 --- Experiments --- p.60Chapter 4 --- Global Histograms --- p.64Chapter 4.1 --- Wavelet-based Global Histograms --- p.65Chapter 4.1.1 --- Wavelet-based Global Histograms I --- p.66Chapter 4.1.2 --- Wavelet-based Global Histograms II --- p.68Chapter 4.2 --- Piecewise Linear Global Histograms --- p.70Chapter 4.3 --- A-Optimal Global Histograms --- p.72Chapter 4.3.1 --- Experiments --- p.74Chapter 5 --- Dynamic Maintenance --- p.81Chapter 5.1 --- Problem Definition --- p.83Chapter 5.2 --- Refining Bucket Coefficients --- p.84Chapter 5.3 --- Restructuring --- p.86Chapter 5.4 --- Experiments --- p.91Chapter 6 --- Conclusions --- p.95Bibliography --- p.9
    • …
    corecore