413 research outputs found

    Searching of images stored in a database using content and pixel based methods

    In this paper we consider combined pixel- and content-based searching of images. We propose an application of Stone's method of progressive wavelet correlation using Fourier methods for pixel-based searching of images stored in a database. The proposed interface between the Matlab workspace and the database is described. The Oracle Database and IBM QBIC are used for the investigation.
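
    As an illustration of why the Fourier route is attractive for pixel-based matching, here is a minimal Python/NumPy sketch of plain FFT-based cross-correlation between a query image and candidates fetched from the database. It shows only the correlation-theorem idea; Stone's progressive wavelet refinement and the Matlab/database interface are not reproduced.

```python
# Minimal sketch: pixel-based matching via FFT cross-correlation.
# Assumes equally sized grayscale images held as NumPy arrays; this is an
# illustration of the Fourier idea, not the paper's implementation.
import numpy as np

def fft_correlation_peak(query: np.ndarray, candidate: np.ndarray) -> float:
    """Peak of the normalized circular cross-correlation, via FFT."""
    q = (query - query.mean()) / (query.std() + 1e-12)
    c = (candidate - candidate.mean()) / (candidate.std() + 1e-12)
    # Pointwise product with a conjugate in the frequency domain equals
    # circular cross-correlation in the spatial domain (correlation theorem).
    corr = np.fft.ifft2(np.fft.fft2(q) * np.conj(np.fft.fft2(c))).real
    return float(corr.max() / q.size)

def best_match(query: np.ndarray, stored: dict) -> str:
    """Rank images fetched from the database by correlation peak."""
    return max(stored, key=lambda k: fft_correlation_peak(query, stored[k]))
```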

    Performance Tuning of Database Systems Using a Context-aware Approach

    Database system performance problems have a cascading effect on all aspects of an enterprise application. Database vendors and application developers provide guidelines, best practices, and even initial database settings for good performance. But database performance tuning is not a one-off task: database administrators have to keep a constant eye on performance, since tuning work carried out earlier can be invalidated for a multitude of reasons. Before engaging in a performance tuning endeavor, a database administrator must prioritize which tuning tasks to carry out first. This prioritization is based on which tuning action is predicted to yield the highest performance benefit; however, that prediction is not always accurate. Experiment-based performance tuning methodologies have been introduced as an alternative to prediction-based approaches. Experimenting on a representative system similar to the production one allows a database administrator to accurately gauge the performance gain of a particular tuning task. In this paper we propose a novel approach to experiment-based performance tuning using a context-aware application model. With a proof-of-concept implementation, we show how it can be used to automate the detection of performance changes and the creation of experiments, and to evaluate performance tuning outcomes for mixed workload types through database configuration parameter changes.
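
    The following is a minimal sketch of the experiment-based idea described above, assuming hypothetical apply_config and replay_workload callbacks that stand in for the paper's context-aware machinery: each candidate configuration is applied to a representative test system, the workload is replayed, and the measured run time decides the priority.

```python
# Toy sketch of experiment-based tuning. apply_config and replay_workload
# are hypothetical placeholders, not part of the paper's implementation.
import time

def evaluate_candidates(candidates, apply_config, replay_workload):
    """Run one experiment per candidate configuration and time it."""
    results = {}
    for name, config in candidates.items():
        apply_config(config)          # push parameters to the test database
        start = time.perf_counter()
        replay_workload()             # replay a representative mixed workload
        results[name] = time.perf_counter() - start
    best = min(results, key=results.get)  # shortest run = biggest gain
    return best, results

# Stubbed usage: a real setup would reconfigure and exercise a database.
best, timings = evaluate_candidates(
    {"bigger_cache": {"cache_mb": 512}, "baseline": {"cache_mb": 128}},
    apply_config=lambda cfg: None,
    replay_workload=lambda: time.sleep(0.01),
)
```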

    Survey of Autonomic Computing and Experiments on JMX-based Autonomic Features

    Autonomic Computing (AC) aims to solve the problem of managing the rapidly growing complexity of Information Technology systems by creating self-managing systems. In this thesis, we survey the progress of the AC field and study the requirements, models, and architectures of AC. The commonly recognized AC requirements are four properties: self-configuring, self-healing, self-optimizing, and self-protecting. The recommended software architecture is the MAPE-K model, containing four modules (monitor, analyze, plan, and execute) together with a knowledge repository. In the modern software marketplace, Java Management Extensions (JMX) has facilitated one of the AC requirements: monitoring. Using JMX, we implemented a package that assists programming for AC features including socket management, logging, and recovery of distributed computation. In the experiments, we not only exercised powerful Java capabilities that are unknown to many educators, but also illustrated the feasibility of learning AC in senior computer science courses.
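
    The MAPE-K loop itself is language-neutral, so here is a minimal Python sketch of the architecture (the thesis works in Java, where a JMX-based monitor would read metrics from MBeans; read_latency and restart_worker are hypothetical placeholders).

```python
# Minimal MAPE-K control loop sketch; illustrative only.
import random
import time

knowledge = {"max_latency_ms": 200, "history": []}   # knowledge repository (K)

def read_latency():
    # Stand-in sensor; a JMX-based implementation would query MBeans instead.
    return random.uniform(50, 400)

def monitor():                          # M: collect metrics from the resource
    return {"latency_ms": read_latency()}

def analyze(symptoms):                  # A: detect a policy violation
    return symptoms["latency_ms"] > knowledge["max_latency_ms"]

def plan(symptoms):                     # P: choose a corrective action
    return {"action": "restart_worker"}

def execute(change):                    # E: apply it to the managed resource
    print("applying", change["action"])

for _ in range(3):                      # a few iterations of the loop
    data = monitor()
    knowledge["history"].append(data)   # K accumulates observations
    if analyze(data):
        execute(plan(data))
    time.sleep(0.1)
```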

    Database Performance Tuning Methods for Manufacturing Execution System

    In the manufacturing industry, where data are produced and shared every day, data volumes can grow large enough for database performance to become an issue. A Manufacturing Execution System (MES) is a system that cannot tolerate poor database performance, as it relies heavily on real-time reporting that requires instant query responses. Product quality and production targets can be affected as a result of delayed queries. Therefore, the need to maintain an acceptable level of database performance in this domain is crucial. One task in maintaining database performance is the identification and diagnosis of the root causes of delayed queries. Poor query design has been identified as one major cause of delayed queries that affect real-time reporting. Nevertheless, as various methods are available to deal with poor query design, it is important for a database administrator to decide which method, or combination of methods, works best. In this paper, we present a case study on the methods used by a real manufacturing company, Silterra, and on the methods proposed in the literature to deal with poor query design. For each method, we elicit its strengths and weaknesses and analyse its practical implementation.
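
    As a concrete illustration of one common method against poor query design (indexing the column a frequent report filters on), here is a small sketch using SQLite as a stand-in for the MES database; the lot_history table is hypothetical, not Silterra's schema.

```python
# Illustrative only: show how a missing index turns a frequent report
# query into a full-table scan, and how adding one fixes it.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE lot_history (lot_id TEXT, step TEXT, ts REAL)")
con.executemany("INSERT INTO lot_history VALUES (?, ?, ?)",
                [(f"L{i}", "etch", float(i)) for i in range(100_000)])

query = "SELECT * FROM lot_history WHERE lot_id = ?"
print(con.execute("EXPLAIN QUERY PLAN " + query, ("L42",)).fetchall())
# -> SCAN of the whole table: every real-time report pays the full cost

con.execute("CREATE INDEX idx_lot ON lot_history(lot_id)")
print(con.execute("EXPLAIN QUERY PLAN " + query, ("L42",)).fetchall())
# -> SEARCH using idx_lot: the delayed query becomes an index lookup
```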

    Design and Implementation of an Enterprise Data Warehouse

    The reporting and sharing of information has been synonymous with databases for as long as there have been systems to host them. Now more than ever, users expect information to be shared in an immediate, efficient, and secure manner. However, due to the sheer number of databases within the enterprise, getting the data out effectively requires a coordinated effort between the existing systems. There is a very real need today for a single location for the storage and sharing of data, one that users can easily use to make better business decisions instead of traversing the multiple databases that exist today; an enterprise data warehouse provides exactly that. The thesis describes data warehousing techniques, design, expectations, and challenges regarding cleansing and transforming existing data, as well as other challenges associated with extracting from transactional databases. It also includes a technical piece discussing database requirements and the technologies used to create and refresh the data warehouse, and it covers the system architecture by which data from databases and other data warehouses across different departments can be integrated. In addition, specific data marts within the warehouse are discussed that satisfy specific needs. Finally, the thesis explains how users will consume the data in the enterprise data warehouse, such as through reporting and other business intelligence. A developed Enterprise Data Warehouse prototype shows how two different databases undergo the Extract, Transform and Load (ETL) process and are loaded into an actual set of star schemas, which makes reporting easier. Separately, an important piece of the thesis takes an actual example of data and compares performance by running the same queries against two separate databases, one transactional and one data warehouse. As the queries grow in difficulty, the gap between the recorded execution times in the two environments widens.
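
    A minimal sketch of the ETL step such a prototype performs, using SQLite and a hypothetical one-fact, one-dimension star schema; the real prototype works at a larger scale against a pair of source databases.

```python
# Minimal ETL sketch (hypothetical schema): extract rows from a transactional
# source, transform them, and load a star schema with one fact table and
# one dimension table.
import sqlite3

src = sqlite3.connect(":memory:")            # stands in for the OLTP database
src.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
src.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(1, "Acme", 10.0), (2, "Acme", 25.0), (3, "Beta", 7.5)])

dw = sqlite3.connect(":memory:")             # stands in for the warehouse
dw.execute("CREATE TABLE dim_customer "
           "(customer_key INTEGER PRIMARY KEY, name TEXT UNIQUE)")
dw.execute("CREATE TABLE fact_sales (customer_key INTEGER, amount REAL)")

for _, customer, amount in src.execute("SELECT * FROM orders"):   # Extract
    # Transform: resolve the surrogate key for the dimension row.
    dw.execute("INSERT OR IGNORE INTO dim_customer(name) VALUES (?)", (customer,))
    key = dw.execute("SELECT customer_key FROM dim_customer WHERE name = ?",
                     (customer,)).fetchone()[0]
    dw.execute("INSERT INTO fact_sales VALUES (?, ?)", (key, amount))  # Load

# Reporting against the star schema is now a simple join and aggregate.
print(dw.execute("""SELECT name, SUM(amount) FROM fact_sales
                    JOIN dim_customer USING (customer_key)
                    GROUP BY name""").fetchall())
```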

    A Framework for the Automatic Physical Configuration and Tuning of a Mysql Community Server

    Manual physical configuration and tuning of database servers is a complicated task requiring a high level of expertise. Database administrators must consider numerous possibilities to determine a candidate configuration for implementation. In recent times database vendors have responded to this problem by providing solutions which can automatically configure and tune their products. Poor configuration choices resulting in performance degradation, commonplace in manual configurations, have been significantly reduced by these solutions. However, no such solution exists for MySQL Community Server. This thesis proposes a novel framework for automatically tuning a MySQL Community Server. A first iteration of the framework has been built and is presented in this paper together with its performance measurements.
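
    The core loop of such a framework can be sketched as follows, assuming a disposable test server reachable with mysql-connector-python; the credentials and benchmark query are placeholders, while innodb_buffer_pool_size is a real MySQL server variable that has been dynamically resizable since MySQL 5.7.

```python
# Sketch under stated assumptions: try candidate values for one MySQL
# server variable and benchmark each. Not the thesis's actual framework.
import time
import mysql.connector  # pip install mysql-connector-python

conn = mysql.connector.connect(host="localhost", user="tuner",
                               password="secret")   # placeholder credentials
cur = conn.cursor()

candidates = [256 * 1024**2, 512 * 1024**2, 1024**3]  # 256 MiB, 512 MiB, 1 GiB
timings = {}
for size in candidates:
    # Dynamic in MySQL 5.7+; requires sufficient privileges on a test server.
    cur.execute("SET GLOBAL innodb_buffer_pool_size = %s", (size,))
    start = time.perf_counter()
    cur.execute("SELECT COUNT(*) FROM test.bench_table")  # placeholder workload
    cur.fetchall()
    timings[size] = time.perf_counter() - start

best = min(timings, key=timings.get)
print(f"best buffer pool size: {best} bytes")
```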

    SODA: Generating SQL for Business Users

    The purpose of data warehouses is to enable business analysts to make better decisions. Over the years the technology has matured and data warehouses have become extremely successful. As a consequence, more and more data has been added to the data warehouses and their schemas have become increasingly complex. These systems still work well for generating pre-canned reports. However, with their current complexity, they tend to be a poor match for non-tech-savvy business analysts who need answers to ad-hoc queries that were not anticipated. This paper describes the design, implementation, and experience of the SODA system (Search over DAta Warehouse). SODA bridges the gap between the business needs of analysts and the technical complexity of current data warehouses. SODA enables a Google-like search experience for data warehouses by taking keyword queries of business users and automatically generating executable SQL. The key idea is to use a graph pattern matching algorithm over the metadata model of the data warehouse. Our results with real data from a global player in the financial services industry show that SODA produces queries with high precision and recall, and makes it much easier for business users to interactively explore highly complex data warehouses.
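
    A toy illustration of the key idea, not SODA's actual algorithm: keywords are matched against a metadata model, the tables they land on are selected, and SQL is emitted along a known join path. The schema and join metadata below are invented for the example.

```python
# Toy keyword-to-SQL generator over a hand-written metadata model.
metadata = {
    "customer": {"table": "customers", "column": "name"},
    "revenue":  {"table": "trades",    "column": "amount"},
}
joins = {("customers", "trades"): "customers.id = trades.customer_id"}

def keywords_to_sql(query: str) -> str:
    """Match keywords to metadata entries and emit SQL over a join path."""
    hits = [metadata[w] for w in query.lower().split() if w in metadata]
    tables = sorted({h["table"] for h in hits})
    select = ", ".join(f"{h['table']}.{h['column']}" for h in hits)
    sql = f"SELECT {select} FROM {', '.join(tables)}"
    if len(tables) == 2:
        cond = joins.get((tables[0], tables[1]))
        if cond:
            sql += f" WHERE {cond}"   # join condition from the metadata model
    return sql

print(keywords_to_sql("revenue per customer"))
# -> SELECT trades.amount, customers.name FROM customers, trades
#    WHERE customers.id = trades.customer_id
```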

    SiteWit Corporation: SQL or NoSQL that is the Question

    This teaching case focuses on a start-up company in the Web analytics and online advertising space that faces a database scaling challenge. The case covers the rapidly emerging NoSQL database products that can be used to implement very large distributed databases. These are exciting times in the database marketplace, with a flock of new companies offering scalable database systems for the cloud. These products challenge the existing relational database vendors that have come to dominate the market. The case outlines four potential solutions and asks students to make a choice or suggest a different alternative.

    Data Profiling in Cloud Migration: Data Quality Measures while Migrating Data from a Data Warehouse to the Google Cloud Platform

    Internship Report presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics. Nowadays, corporations have gained a vast interest in data. More and more, companies have realized that the key to improving their efficiency and effectiveness, and to understanding their customers' needs and preferences better, lies in mining data. However, as the amount of data grows, so do companies' needs for storage capacity and for ensuring data quality for more accurate insights. As such, new data storage methods must be considered, evolving from old ones while still preserving data integrity. Migrating a company's data from an old platform such as a Data Warehouse to a new one, the Google Cloud Platform, is an elaborate task, even more so when data quality needs to be assured and sensitive data, such as Personally Identifiable Information, needs to be anonymized in a cloud computing environment. To ensure these points, profiling the data, before or after it is migrated, has significant value: a profile is designed for the data available in each data source (e.g., databases, files, and others) based on statistics, metadata information, and pattern rules. This ensures that data quality stays within reasonable standards according to statistical metrics, and that all Personally Identifiable Information is identified and anonymized accordingly. This work describes the process by which profiling Data Warehouse data can improve data quality for a better migration to the Cloud.
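
    A small sketch of the profile-then-anonymize idea under stated assumptions: simple quality statistics per column, one pattern rule (an email regex) to flag Personally Identifiable Information, and a deterministic hash applied before migration. The column name and regex are illustrative, not the report's actual rules.

```python
# Illustrative profiling and anonymization; not the report's implementation.
import hashlib
import re

EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")   # assumed pattern rule

def profile(column, values):
    """Basic quality statistics plus a PII count for one column."""
    non_null = [v for v in values if v not in (None, "")]
    return {
        "column": column,
        "null_rate": 1 - len(non_null) / len(values),
        "distinct": len(set(non_null)),
        "pii_email": sum(bool(EMAIL.match(str(v))) for v in non_null),
    }

def anonymize(value):
    # Deterministic hash keeps joinability while hiding the raw identifier.
    return hashlib.sha256(value.encode()).hexdigest()[:16]

rows = ["ana@example.com", "bob@example.com", None, "n/a"]
print(profile("contact_email", rows))
print([anonymize(v) if v and EMAIL.match(v) else v for v in rows])
```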