    A Framework for the Automatic Physical Configuration and Tuning of a MySQL Community Server

    Manual physical configuration and tuning of database servers is a complicated task requiring a high level of expertise. Database administrators must consider numerous possibilities to determine a candidate configuration for implementation. In recent times, database vendors have responded to this problem by providing solutions that can automatically configure and tune their products. Poor configuration choices that degrade performance, commonplace in manual configurations, have been significantly reduced in these solutions. However, no such solution exists for MySQL Community Server. This thesis proposes a novel framework for automatically tuning a MySQL Community Server. A first iteration of the framework has been built and is presented in this paper together with its performance measurements.

    SMOQE: A System for Providing Secure Access to XML

    XML views have been widely used to enforce access control, support data integration, and speed up query answering. In many applications, e.g., XML security enforcement, it is prohibitively expensive to materialize and maintain a large number of views. Therefore, views are necessarily virtual. An immediate question then is how to answer queries on XML virtual views. A common approach is to rewrite a query on the view to an equivalent one on the underlying document, and evaluate the rewritten query. This is the approach used in the Secure MOdular Query Engine (SMOQE). The demo presents SMOQE, the first system to provide efficient support for answering queries over virtual and possibly recursively defined XML views. We demonstrate a set of novel techniques for the specification of views and for the rewriting, evaluation, and optimization of XML queries. Moreover, we provide insights into the internals of the engine through a set of visual tools.

    Saving Space and Time Using Index Merging

    Managing digital information is an integral part of our society. Efficient access to data is supported through the use of indices. Although indices can reduce the cost of answering queries, they have two significant drawbacks: they take additional storage space and their maintenance can become a bottleneck. We address these challenges by introducing search data structures that reduce the need for storing redundant data among indices. Our experimental results with the main-memory version of these data structures show that our approach can halve the storage space and can improve performance, with the highest performance improvement achieved for workloads with high update ratios. Our experimental results with the secondary-storage version of the data structures show that our approach produces a solution that can outperform both IBM DB2 and Microsoft SQL Server on the popular TPC-C workload.

    Data Mining the SDSS SkyServer Database

    An earlier paper (Szalay et al., "Designing and Mining MultiTerabyte Astronomy Archives: The Sloan Digital Sky Survey," ACM SIGMOD 2000) described the Sloan Digital Sky Survey's (SDSS) data management needs by defining twenty database queries and twelve data visualization tasks that a good data management system should support. We built a database and interfaces to support both the query load and a website for ad-hoc access. This paper reports on the database design, describes the data loading pipeline, and reports on the query implementation and performance. The queries typically translated to a single SQL statement. Most queries run in less than 20 seconds, allowing scientists to interactively explore the database. This paper is an in-depth tour of those queries. Readers should first have studied the companion overview paper, Szalay et al., "The SDSS SkyServer, Public Access to the Sloan Digital Sky Server Data," ACM SIGMOD 2002. Comment: 40 pages. Original source is at http://research.microsoft.com/~gray/Papers/MSR_TR_O2_01_20_queries.do

    Enterprise Data Mining & Machine Learning Framework on Cloud Computing for Investment Platforms

    Machine Learning and Data Mining are two key components in decision-making systems that can quickly provide valuable insights into huge data sets. Turning raw data into meaningful information and converting it into actionable tasks helps organizations remain profitable and withstand intense competition. In the past decade we saw an increase in Data Mining algorithms and tools for financial market analysis, consumer products, manufacturing, the insurance industry, social networks, scientific discoveries, and warehousing. With vast amounts of data available for analysis, the traditional tools and techniques are outdated for data analysis and decision support. Organizations are investing considerable resources in Data Mining Frameworks in order to emerge as market leaders. Machine Learning is a natural evolution of Data Mining. The existing Machine Learning techniques rely heavily on the underlying Data Mining techniques, in which Pattern Recognition is an essential component. Building an efficient Data Mining Framework is expensive and usually culminates in a multi-year project for an organization. Organizations pay a heavy price for any delay or for an inefficient Data Mining foundation. In this research, we propose to build a cost-effective and efficient Data Mining (DM) and Machine Learning (ML) Framework on a cloud computing environment to solve the inherent limitations in the existing design methodologies. The elasticity of the cloud architecture solves the hardware constraint on businesses. Our research is focused on refining and enhancing the current Data Mining frameworks to build an enterprise data mining and machine learning framework. Our initial studies and techniques produced very promising results by reducing the existing build time considerably.
    Our technique of dividing the DM and ML Frameworks into several individual components (5 sub-components), which can be reused at several phases of the final enterprise build, is efficient and saves operational costs for the organization. Effective aggregation using selective cuboids and parallel computation using Azure Cloud Services are a few of the many techniques proposed in our research. Our research produced a nimble, scalable, portable architecture for enterprise-wide implementation of DM and ML frameworks.

    Poster session: Constrained dynamic physical database design

    Physical design has always been an important part of database administration. Today's commercial database management systems offer physical design tools, which recommend a physical design for a given workload. However, these tools work only with static workloads and ignore the fact that workloads, and physical designs, may change over time. Research has now begun to focus on dynamic physical design, which can account for time-varying workloads. In this paper, we consider a dynamic but constrained approach to physical design. The goal is to recommend dynamic physical designs that reflect major workload trends but that are not tailored too closely to the details of the input workloads. To achieve this, we constrain the number of changes that are permitted in the recommended design. We present our definition of the constrained dynamic physical design problem and discuss several techniques for solving it.
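
    The idea of limiting the number of design changes can be illustrated with a small dynamic program. This is only a sketch under stated assumptions and not the authors' technique: the workload is assumed to be split into consecutive steps, `cost[t][d]` is a hypothetical cost of running step `t` under candidate design `d`, transition costs are ignored, and the function name `constrained_design_sequence` is invented for illustration.

```python
from typing import List, Tuple


def constrained_design_sequence(
    cost: List[List[float]], max_changes: int
) -> Tuple[float, List[int]]:
    """Pick one design per workload step, changing design at most max_changes times.

    cost[t][d] is an assumed (hypothetical) cost of step t under design d.
    Returns (total cost, chosen design index for each step).
    """
    steps, designs = len(cost), len(cost[0])
    inf = float("inf")

    # best[c][d] = (cheapest total cost, design sequence) over the steps seen
    # so far, ending in design d after exactly c design changes.
    best = [[(inf, [])] * designs for _ in range(max_changes + 1)]
    for d in range(designs):
        best[0][d] = (cost[0][d], [d])

    for t in range(1, steps):
        nxt = [[(inf, [])] * designs for _ in range(max_changes + 1)]
        for c in range(max_changes + 1):
            for d in range(designs):
                # Option 1: keep the same design as in the previous step.
                candidates = [best[c][d]]
                # Option 2: switch from a different design, spending one change.
                if c > 0:
                    candidates += [best[c - 1][p] for p in range(designs) if p != d]
                prev_cost, prev_path = min(candidates, key=lambda item: item[0])
                if prev_cost < inf:
                    nxt[c][d] = (prev_cost + cost[t][d], prev_path + [d])
        best = nxt

    # The answer may use fewer changes than the budget allows.
    return min(
        (best[c][d] for c in range(max_changes + 1) for d in range(designs)),
        key=lambda item: item[0],
    )


if __name__ == "__main__":
    # Two designs whose relative cost flips at every step (made-up numbers).
    example = [[1, 5], [5, 1], [1, 5], [5, 1]]
    for budget in (0, 1, 3):
        print(budget, constrained_design_sequence(example, budget))
```

    With a small change budget, the chosen sequence follows only the dominant trend in the per-step costs; with a large budget, it tracks every fluctuation, which is the overfitting the abstract says the constraint is meant to avoid.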