
    Self-Learning Cloud Controllers: Fuzzy Q-Learning for Knowledge Evolution

    Cloud controllers aim to respond to application demands by automatically scaling compute resources at runtime to meet performance guarantees and minimize resource costs. Existing cloud controllers often resort to scaling strategies that are codified as a set of adaptation rules. However, for a cloud provider, applications running on top of the cloud infrastructure are more or less black boxes, making it difficult at design time to define optimal or pre-emptive adaptation rules. Thus, the burden of making adaptation decisions is often delegated to the cloud application. Yet, in most cases, application developers in turn have limited knowledge of the cloud infrastructure. In this paper, we propose learning adaptation rules at runtime. To this end, we introduce FQL4KE, a self-learning fuzzy cloud controller. In particular, FQL4KE learns and modifies fuzzy rules at runtime. The benefit is that designers of cloud controllers do not have to rely solely on precise design-time knowledge, which may be difficult to acquire. FQL4KE empowers users to specify cloud controllers by simply adjusting weights representing priorities among system goals instead of specifying complex adaptation rules. The applicability of FQL4KE has been experimentally assessed as part of the cloud application framework ElasticBench. The experimental results indicate that FQL4KE outperforms our previously developed fuzzy controller without learning mechanisms as well as the native Azure auto-scaling.
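
    As a rough illustration of the fuzzy Q-learning mechanism such a controller builds on, the sketch below combines a small fuzzy rule base with a Q-learning update over discrete scaling actions. The state variables, membership functions, reward, and learning parameters are invented for the example and are not FQL4KE's actual configuration.

```python
import random

# Illustrative fuzzy Q-learning step for an auto-scaler (invented setup,
# not FQL4KE's rule base): state = (workload, latency), actions = VM deltas.
ACTIONS = [-1, 0, +1]  # remove / keep / add a VM

def tri(x, a, b, c):
    """Triangular membership function with peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

# Two fuzzy sets per input -> 4 rules; Q holds one value per (rule, action).
WORKLOAD = [lambda w: tri(w, -1, 0, 0.6), lambda w: tri(w, 0.4, 1, 2)]
LATENCY = [lambda l: tri(l, -1, 0, 0.6), lambda l: tri(l, 0.4, 1, 2)]
Q = [[0.0] * len(ACTIONS) for _ in range(4)]

def firing(state):
    """Degree to which each of the 4 rules applies to the state."""
    w, l = state
    return [WORKLOAD[i](w) * LATENCY[j](l) for i in range(2) for j in range(2)]

def act(state, eps=0.1):
    """Each rule picks an action (epsilon-greedy); the controller output is
    the firing-strength-weighted combination of the chosen actions."""
    phi = firing(state)
    choice = [random.randrange(len(ACTIONS)) if random.random() < eps
              else max(range(len(ACTIONS)), key=lambda a: Q[i][a])
              for i in range(4)]
    s = sum(phi) or 1.0
    return choice, round(sum(phi[i] * ACTIONS[choice[i]] for i in range(4)) / s)

def update(state, choice, reward, next_state, eta=0.1, gamma=0.9):
    """Distribute the TD error over the rules in proportion to their firing."""
    phi, phi2 = firing(state), firing(next_state)
    v_next = sum(phi2[i] * max(Q[i]) for i in range(4)) / (sum(phi2) or 1.0)
    q_sa = sum(phi[i] * Q[i][choice[i]] for i in range(4)) / (sum(phi) or 1.0)
    td = reward + gamma * v_next - q_sa
    for i in range(4):
        Q[i][choice[i]] += eta * td * phi[i]

# One illustrative control step: high load and latency, then a penalty.
choice, delta = act((0.8, 0.9))
update((0.8, 0.9), choice, reward=-0.5, next_state=(0.7, 0.6))
```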

    Scalable Statistical Modeling and Query Processing over Large Scale Uncertain Databases

    The past decade has witnessed a large number of novel applications that generate imprecise, uncertain and incomplete data. Examples include monitoring infrastructures such as RFIDs, sensor networks and web-based applications such as information extraction, data integration, social networking and so on. In my dissertation, I addressed several challenges in managing such data and developed algorithms for efficiently executing queries over large volumes of such data. Specifically, I focused on the following challenges. First, for meaningful analysis of such data, we need the ability to remove noise and infer useful information from uncertain data. To address this challenge, I first developed a declarative system for applying dynamic probabilistic models to databases and data streams. The output of such probabilistic modeling is probabilistic data, i.e., data annotated with probabilities of correctness/existence. Often, the data also exhibits strong correlations. Although there is prior work in managing and querying such probabilistic data using probabilistic databases, those approaches largely assume independence and cannot handle probabilistic data with rich correlation structures. Hence, I built a probabilistic database system that can manage large-scale correlations and developed algorithms for efficient query evaluation. Our system allows users to provide uncertain data as input and to specify arbitrary correlations among the entries in the database. In the back end, we represent correlations as a forest of junction trees, an alternative representation for probabilistic graphical models (PGM). We execute queries over the probabilistic database by transforming them into message passing algorithms (inference) over the junction tree. However, traditional algorithms over junction trees typically require accessing the entire tree, even for small queries. Hence, I developed an index data structure over the junction tree called INDSEP that allows us to circumvent this process and thereby scalably evaluate inference queries, aggregation queries and SQL queries over the probabilistic database. Finally, query evaluation in probabilistic databases typically returns output tuples along with their probability values. However, the existing query evaluation model provides very little intuition to the users: for instance, a user might want to know "Why is this tuple in my result?", "Why does this output tuple have such high probability?", or "Which are the most influential input tuples for my query?" Hence, I designed a query evaluation model, and a suite of algorithms, that provide users with explanations for query results, and enable users to perform sensitivity analysis to better understand the query results.
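
    To make the junction-tree message passing concrete, the following toy sketch computes the probability of an output tuple over a two-clique junction tree with binary tuple-existence variables. The cliques, separator, and factor values are invented for illustration; the dissertation's INDSEP index, which avoids touching the whole tree, is not shown.

```python
from itertools import product

# Toy message passing over a two-clique junction tree: clique C1 over (X, Y),
# clique C2 over (Y, Z), separator {Y}. X, Y, Z are binary tuple-existence
# variables; the factor values below are invented for illustration.
phi1 = {(x, y): p for (x, y), p in
        zip(product([0, 1], repeat=2), [0.3, 0.1, 0.2, 0.4])}
phi2 = {(y, z): p for (y, z), p in
        zip(product([0, 1], repeat=2), [0.5, 0.1, 0.1, 0.3])}

# Message C1 -> C2: marginalize X out of phi1, leaving a function of Y.
msg = {y: sum(phi1[(x, y)] for x in (0, 1)) for y in (0, 1)}

# Belief at C2 after absorbing the message, then the normalized marginal
# P(Z=1), e.g. the probability that output tuple Z exists.
belief = {(y, z): phi2[(y, z)] * msg[y]
          for y, z in product((0, 1), repeat=2)}
total = sum(belief.values())
p_z1 = sum(v for (y, z), v in belief.items() if z == 1) / total
print(f"P(Z=1) = {p_z1:.3f}")
```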

    Using features for automated problem solving

    We motivate and present an architecture for problem solving where an abstraction layer of "features" plays the key role in determining which methods to apply. The system is presented in the context of theorem proving with Isabelle, and we demonstrate how this approach to encoding control knowledge is expressively different from other common techniques. We look closely at two areas where the feature layer may offer benefits to theorem proving — semi-automation and learning — and find strong evidence that in these particular domains the approach shows compelling promise. The system includes a graphical theorem-proving user interface for Eclipse ProofGeneral and is available from the project web page, http://feasch.heneveld.org.
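
    The feature-layer idea can be sketched schematically: features are predicates over a goal, and control knowledge is a weighted association from features to methods that can be adjusted from proof attempts. The sketch below uses Python with invented feature and method names purely for illustration; the actual system is built on Isabelle, not Python.

```python
from collections import defaultdict

# Schematic feature layer for method selection (all names invented).
# A feature is a predicate over a goal; control knowledge maps features
# to proof methods with weights that learning can adjust.
FEATURES = {
    "has_arithmetic": lambda goal: any(op in goal for op in "+-*"),
    "is_equation": lambda goal: "=" in goal,
    "has_quantifier": lambda goal: "ALL" in goal or "EX" in goal,
}

# weights[feature][method]: how strongly a feature recommends a method.
weights = defaultdict(lambda: defaultdict(float))
weights["has_arithmetic"]["arith"] = 1.0
weights["is_equation"]["simp"] = 0.8
weights["has_quantifier"]["blast"] = 0.6

def rank_methods(goal):
    """Score each method by the features the goal exhibits."""
    scores = defaultdict(float)
    for name, pred in FEATURES.items():
        if pred(goal):
            for method, w in weights[name].items():
                scores[method] += w
    return sorted(scores.items(), key=lambda kv: -kv[1])

def reinforce(goal, method, success, eta=0.1):
    """Learning: adjust feature->method weights after a proof attempt."""
    for name, pred in FEATURES.items():
        if pred(goal):
            weights[name][method] += eta if success else -eta

print(rank_methods("EX x. x + 1 = 2"))  # arith ranked first
```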

    Learning Models over Relational Data using Sparse Tensors and Functional Dependencies

    Integrated solutions for analytics over relational databases are of great practical importance, as they avoid the costly repeated loop data scientists have to deal with on a daily basis: select features from data residing in relational databases using feature extraction queries involving joins, projections, and aggregations; export the training dataset defined by such queries; convert this dataset into the format of an external learning tool; and train the desired model using this tool. These integrated solutions are also a fertile ground for theoretically fundamental and challenging problems at the intersection of relational and statistical data models. This article introduces a unified framework for training and evaluating a class of statistical learning models over relational databases. This class includes ridge linear regression, polynomial regression, factorization machines, and principal component analysis. We show that, by synergizing key tools from database theory, such as schema information, query structure, functional dependencies, and recent advances in query evaluation algorithms, with tools from linear algebra, such as tensor and matrix operations, one can formulate relational analytics problems and design efficient (query and data) structure-aware algorithms to solve them. This theoretical development informed the design and implementation of the AC/DC system for structure-aware learning. We benchmark the performance of AC/DC against R, MADlib, libFM, and TensorFlow. For typical retail forecasting and advertisement planning applications, AC/DC can learn polynomial regression models and factorization machines with at least the same accuracy as its competitors and up to three orders of magnitude faster, whenever the competitors do not run out of memory, exceed a 24-hour timeout, or encounter internal design limitations.
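
    For ridge linear regression, one way to see why such structure-aware computation pays off is that the closed-form solution needs only the aggregates sum(x xᵀ) and sum(x y) over the join result, never the join result itself. The sketch below illustrates this aggregate-based view with a materialized feature matrix for clarity; factorized engines such as AC/DC compute the same aggregates while pushing them past the joins. The data and penalty value are invented.

```python
import numpy as np

# Aggregate-based view of ridge regression: the model needs only
# Sigma = sum x x^T and c = sum x*y over the (conceptual) join result.
# Here the "join output" X is materialized for clarity; a factorized
# engine would compute Sigma and c without materializing it.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))  # features from a feature extraction query
y = X @ np.array([2.0, -1.0, 0.5, 3.0]) + rng.normal(scale=0.1, size=1000)

Sigma = X.T @ X                 # d x d aggregate, one pass over the data
c = X.T @ y                     # d-vector aggregate, same pass
lam = 0.1                       # ridge penalty (illustrative value)
theta = np.linalg.solve(Sigma + lam * np.eye(4), c)
print(theta)                    # approximately [2, -1, 0.5, 3]
```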

    Investigation into the use of evolutionary algorithms for fully automated planning

    This thesis presents a new approach to the Artificial Intelligence (AI) problem of fully automated planning. Planning is the act of deliberation before acting that guides rational behaviour and is a core area of AI. Many practical real-world problems can be classed as planning problems, so practical and theoretical developments in AI planning are well motivated. Unfortunately, planning for even toy domains is hard; many different search algorithms have been proposed, and new approaches are actively encouraged. The approach taken in this thesis is to adopt ideas from Evolutionary Algorithms (EAs) and apply the techniques to fully automated plan synthesis. EA methods have enjoyed great success in many problem areas of AI. They are a kind of search technique founded on the principles of evolution. Previous attempts to apply EAs to plan synthesis have produced encouraging results, but have been ad hoc and piecemeal. This thesis thoroughly investigates the approach of applying evolutionary search to the fully automated planning problem. This is achieved by developing and modifying a proof-of-concept planner called GENPLAN. Before EA-based systems can be used, a thorough examination of various parameter settings must be carried out. Once this was completed, the performance of GENPLAN was evaluated using a selection of benchmark domains and compared against other competition-style planners. The difficulties raised by the benchmark domains, and the extent to which they cause problems for the approach, are highlighted along with problems associated with EA search. Modifications are proposed and experimented with in an attempt to alleviate some of the identified problems. EAs offer a flexible framework for fully automated planning, but demonstrate a clear weakness across a range of currently used benchmark domains for plan synthesis.
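
    A bare-bones version of evolutionary plan synthesis in the spirit described, though not GENPLAN itself, can be sketched as follows: individuals are fixed-length action sequences, fitness measures how close the simulated end state is to the goal, and search proceeds by selection, crossover, and mutation. The toy domain and all parameters are invented.

```python
import random

# Bare-bones evolutionary plan synthesis (illustrative; not GENPLAN).
# Toy domain: move a counter from 0 to TARGET using +1 / -1 actions.
ACTIONS, TARGET, LENGTH, POP, GENS = ["inc", "dec"], 6, 12, 40, 200

def simulate(plan):
    """Execute the action sequence from the initial state 0."""
    state = 0
    for a in plan:
        state += 1 if a == "inc" else -1
    return state

def fitness(plan):
    # Closer final state to the goal => higher fitness (0 is a solution).
    return -abs(TARGET - simulate(plan))

def mutate(plan, rate=0.1):
    return [random.choice(ACTIONS) if random.random() < rate else a
            for a in plan]

def crossover(p1, p2):
    cut = random.randrange(1, LENGTH)
    return p1[:cut] + p2[cut:]

pop = [[random.choice(ACTIONS) for _ in range(LENGTH)] for _ in range(POP)]
for _ in range(GENS):
    pop.sort(key=fitness, reverse=True)
    if fitness(pop[0]) == 0:
        break  # found a plan reaching the goal
    parents = pop[: POP // 2]  # truncation selection
    pop = parents + [mutate(crossover(*random.sample(parents, 2)))
                     for _ in range(POP - len(parents))]
pop.sort(key=fitness, reverse=True)
print(pop[0], fitness(pop[0]))
```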

    Automaton Meets Algebra: A Hybrid Paradigm for Efficiently Processing XQuery over XML Stream

    XML stream applications bring the challenge of efficiently processing queries on sequentially accessible, token-based data streams. The automaton paradigm is naturally suited for pattern retrieval on tokenized XML streams, but requires patches to implement the filtering and restructuring functionalities common in XML query languages. In contrast, the algebraic paradigm is well established for processing self-contained tuples, but does not traditionally support token inputs. This dissertation proposes a framework called Raindrop, which accommodates both the automaton and algebra paradigms to take advantage of each. First, we propose an architecture for Raindrop. Raindrop is an algebra framework that models queries at different abstraction levels. We represent the token-based automaton computations as an algebraic subplan at the high level while exposing the automaton details at the low level. The algebraic subplan modeling automaton computations can thus be integrated with the algebraic subplan modeling the non-automaton computations. Second, we explore a novel optimization opportunity. Other XML stream processing systems always retrieve all the patterns in a query in the automaton. In contrast, Raindrop allows a plan to perform some of the pattern retrieval in the automaton and some outside it. This opens up an automaton-in-or-out optimization opportunity. We study this optimization in two types of run-time environments, one with stable data characteristics and one with fluctuating data characteristics, and provide search strategies catering to each environment. We also describe how to migrate from a currently running plan to a new plan at run time. Third, we optimize the automaton computations using schema knowledge. A set of criteria is established to decide which schema constraints are useful to a given query. Optimization rules utilizing different types of schema constraints are proposed based on these criteria. We design a rule application algorithm which ensures both completeness (i.e., no optimization is missed) and minimality (i.e., no redundant optimization is introduced). Experiments on both real and synthetic data illustrate that these techniques bring significant performance improvement with little overhead.
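
    The automaton side of such a system can be suggested with a toy token-driven matcher: a stack tracks the currently open elements, and a pattern fires when the path matches, emitting values for downstream algebraic operators. The token encoding and pattern below are simplified inventions; the plan-level algebra and the in-or-out plan migration are omitted.

```python
# Toy token-driven pattern retrieval over an XML stream (a stack-based
# path matcher standing in for the automaton; illustrative only).
# Tokens: ("open", tag), ("text", value), ("close", tag).
PATTERN = ["orders", "order", "price"]  # path /orders/order/price

def match_stream(tokens):
    stack, results = [], []
    for kind, payload in tokens:
        if kind == "open":
            stack.append(payload)
        elif kind == "close":
            stack.pop()
        elif kind == "text" and stack == PATTERN:
            results.append(payload)  # pattern fired: emit the value
    return results

stream = [("open", "orders"),
          ("open", "order"), ("open", "price"), ("text", "9.99"),
          ("close", "price"), ("close", "order"),
          ("open", "order"), ("open", "price"), ("text", "4.50"),
          ("close", "price"), ("close", "order"),
          ("close", "orders")]
print(match_stream(stream))  # ['9.99', '4.50']
```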

    On the Completeness of Replacing Primitive Actions with Macro-actions and its Generalization to Planning Operators and Macro-operators

    Automated planning, which deals with the problem of generating sequences of actions, is an emerging research topic due to its potentially wide range of real-world application domains. Besides developing and improving planning engines, acquiring domain-specific knowledge is a promising way to improve the planning process. Domain-specific knowledge can be encoded into a modelling language that a range of planning engines can accept. This makes the encoding planner-independent, and entails reformulating the domain models and/or problem specifications. While many encouraging practical results have been derived from such reformulation methods (e.g. learning macro-actions), little attention has been paid to theoretical properties such as completeness (preserving the solvability of reformulated problems). In this paper, we focus on a special case: removing primitive actions that have been replaced by macro-actions. We provide a theoretical study and derive conditions under which it is safe to remove primitive actions, so that the completeness of the reformulation is preserved. We also extend this study to planning operators (actions are instances of operators).
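
    The macro-actions this result concerns are built by the standard STRIPS composition of consecutive actions, sketched below with an invented blocks-world pair. The paper's contribution, the conditions under which the composed-over primitives can then be safely removed, is a meta-level property not captured by the code.

```python
from dataclasses import dataclass

# Standard STRIPS macro composition (illustrative blocks-world pair;
# the paper's result concerns when the primitives may then be removed).
@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset   # preconditions
    add: frozenset   # add effects
    dele: frozenset  # delete effects ("del" is a Python keyword)

def compose(a1: Action, a2: Action) -> Action:
    """Macro a1;a2: needs a1's preconditions plus whatever a2 needs that
    a1 does not provide; effects are a1's effects overridden by a2's."""
    # a1 must not destroy a precondition of a2 that it does not re-add.
    assert not ((a2.pre - a1.add) & a1.dele), "incompatible action pair"
    return Action(
        name=f"{a1.name};{a2.name}",
        pre=a1.pre | (a2.pre - a1.add),
        add=(a1.add - a2.dele) | a2.add,
        dele=(a1.dele - a2.add) | a2.dele,
    )

pickup = Action("pickup", frozenset({"handempty", "clear_a"}),
                frozenset({"holding_a"}),
                frozenset({"handempty", "clear_a"}))
stack = Action("stack", frozenset({"holding_a", "clear_b"}),
               frozenset({"on_a_b", "handempty"}),
               frozenset({"holding_a", "clear_b"}))
print(compose(pickup, stack))
```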

    Modeling of systems

    This handbook covers the fundamentals of modeling complex systems. A classification of mathematical models is presented, and methods for their construction are given. Analytical modeling of the basic types of processes in complex systems is considered, and the principles of simulation, statistical, and business-process modeling are described. The handbook is intended for students of higher education institutions pursuing degrees in "Software engineering" and "Computer science", as well as for lecturers and specialists in the domain of computer modeling.

    Mining Query Plans for Finding Candidate Queries and Sub-Queries for Materialized Views in BI Systems Without Cube Generation

    Materialized views are important for optimizing Business Intelligence (BI) systems that are designed without data cubes. Selecting candidate queries for materialized views from a large number of queries is a challenging task. Most past work involves finding frequent queries in the historical workload and creating materialized views from them, either by manually analyzing the workload or by applying approximate string matching to the query text. Moreover, most existing methods suggest complete queries but ignore query components, such as sub-queries, as candidates for materialized views. This paper presents a novel method to determine the queries and query components on which materialized views can be created to optimize aggregate and join queries, by mining a database of query execution plans represented as binary trees. The proposed algorithm optimizes significantly more queries than traditional methods because it selects queries based on their execution plan trees rather than on query text. To select a correct set of queries to be optimized with materialized views, the paper proposes an efficient, specialized frequent-tree-component mining algorithm with novel heuristics to prune the search space. The mined frequent components determine the set of candidate queries for the creation of materialized views. Experiments on standard, real, and synthetic data sets, together with the theoretical analysis, show that the proposed method optimizes a large number of queries with fewer materialized views and achieves a significant performance improvement over traditional methods.
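
    A much-simplified version of the mining step can be sketched as follows: represent each execution plan as a nested tuple, enumerate its complete subtrees, and report those whose support in the workload crosses a threshold as view candidates. The operators and workload are invented, and the paper's pruning heuristics are not reproduced.

```python
from collections import Counter

# Toy frequent-subtree counting over query-plan trees (illustrative; the
# paper uses a specialized miner with search-space pruning heuristics).
# A plan node is (operator, *children); a leaf is a table name string.
def subtrees(plan):
    """Yield every complete subtree of a plan tree, including the root."""
    yield plan
    if isinstance(plan, tuple):
        for child in plan[1:]:
            yield from subtrees(child)

def frequent_components(plans, min_support):
    """Subtrees (excluding bare leaves) occurring at least min_support times."""
    counts = Counter(s for p in plans for s in subtrees(p))
    return {s: n for s, n in counts.items()
            if n >= min_support and isinstance(s, tuple)}

workload = [
    ("AGG", ("JOIN", "sales", "stores")),
    ("PROJECT", ("JOIN", "sales", "stores")),
    ("JOIN", ("JOIN", "sales", "stores"), "items"),
]
# The shared join appears in all three plans -> materialized view candidate.
print(frequent_components(workload, min_support=2))
# {('JOIN', 'sales', 'stores'): 3}
```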

    Management Information Sources and Corporate Intelligence Systems

    In this book the word “intelligence” is used in several different contexts. Intelligence can refer to the process of gathering data; it can refer to the data itself; and it can refer to the application of knowledge to produce useful information from the data. We will see in this chapter how the computer can be used in business to further all three aspects of intelligence: capturing the data, storing the data in an accessible form, and adding value to the data by transforming it into useful information for decision making. This chapter is organized according to these three areas of computer support for business intelligence: 1. Transaction processing and intelligence capture; 2. Data-base management and intelligence storage and retrieval; 3. Decision support systems and intelligence processing. We provide an overview of the concepts in transaction processing, data-base management, and decision support systems. References are listed in each of these areas for further details. Our purpose is to provide the perspective for the executive to determine the use of these concepts for his or her company and to understand the choices open to him or her in today’s technology.