43 research outputs found

    A Framework for Developing Real-Time OLAP algorithm using Multi-core processing and GPU: Heterogeneous Computing

    Full text link
The overwhelmingly increasing amount of stored data has spurred researchers to seek methods for exploiting it optimally, most of which face a response-time problem owing to the enormous size of the data. Most solutions propose materialization as the favoured approach; however, materialization alone cannot deliver Real-Time answers. In this paper we propose a framework that illustrates the barriers, and the suggested solutions, on the way to achieving the Real-Time OLAP answers that are widely used in decision support systems and data warehouses.

    Database server workload characterization in an e-commerce environment

    Get PDF
A typical E-commerce system deployed on the Internet has multiple layers: Web users, Web servers, application servers, and a database server. As system use and user request frequency increase, Web/application servers can be scaled up by replication, with a load-balancing proxy routing user requests to individual machines that perform the same functionality. To address the increasing workload while avoiding replication of the database server, various dynamic caching policies have been proposed to reduce the database workload in E-commerce systems. However, the nature of the changes seen by the database server as a result of dynamic caching remains unknown, and a good understanding of this change is fundamental to tuning a database server for better performance. In this study, the TPC-W (a transactional Web E-commerce benchmark) workloads on a database server are characterized under two different dynamic caching mechanisms, generalized and implemented as a query-result cache and a table cache. The characterization focuses on response time, CPU computation, buffer pool references, disk I/O references, and workload classification. This thesis combines a variety of analysis techniques: simulation, real-time measurement, and data mining. The experimental results reveal some interesting effects that dynamic caching has on the database server workload characteristics. The main observations are: (a) a dynamic cache can considerably reduce the CPU usage of the database server and the number of database page references when it is heavily loaded; (b) a dynamic cache can also reduce the database reference locality, but to a smaller degree than that reported for file servers. The data classification results show that with a dynamic cache, the database server sees TPC-W profiles that look more like on-line transaction processing workloads.
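The query-result caching mechanism this abstract describes can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the class, its methods, and the table-based invalidation scheme are assumptions chosen to show the idea of memoizing results by SQL text and dropping them when an underlying table changes.

```python
# Minimal sketch of a query-result cache (illustrative, not from the thesis).
# Results are memoized by the exact SQL text; entries are invalidated when a
# table they depend on is updated, so stale results are never served.

class QueryResultCache:
    def __init__(self, execute):
        self.execute = execute          # backend function: sql -> rows
        self.cache = {}                 # sql text -> cached rows
        self.deps = {}                  # table name -> set of cached sql texts

    def query(self, sql, tables):
        """Return cached rows for `sql`, or run it and cache the result.
        `tables` lists the tables the query reads (for invalidation)."""
        if sql in self.cache:
            return self.cache[sql]      # cache hit: no database work
        rows = self.execute(sql)
        self.cache[sql] = rows
        for t in tables:
            self.deps.setdefault(t, set()).add(sql)
        return rows

    def invalidate(self, table):
        """Drop every cached result that read from `table`."""
        for sql in self.deps.pop(table, set()):
            self.cache.pop(sql, None)

calls = []
def backend(sql):
    calls.append(sql)                   # count real database executions
    return [("row",)]

c = QueryResultCache(backend)
c.query("SELECT * FROM item", ["item"])
c.query("SELECT * FROM item", ["item"])   # served from cache, no backend call
c.invalidate("item")                      # a write to `item` evicts the entry
c.query("SELECT * FROM item", ["item"])   # re-executed after invalidation
print(len(calls))
```

A table cache differs mainly in granularity: it caches whole base tables rather than individual result sets, trading memory for broader hit coverage.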

    Cache-and-query for wide area sensor databases

    Get PDF

    Managing cache for efficient query processing

    Get PDF
Ph.D. (Doctor of Philosophy)

    Adaptive P2P platform for data sharing

    Get PDF
Ph.D. (Doctor of Philosophy)

    SocialTrove: A Self-summarizing Storage Service for Social Sensing

    Get PDF
The increasing availability of smartphones, cameras, and wearables with instant data sharing capabilities, and the exploitation of social networks for information broadcast, heralds a future of real-time information overload. With the growing excess of worldwide streaming data, such as images, geotags, text annotations, and sensory measurements, an increasingly common service will become one of data summarization. The objective of such a service will be to obtain a representative sampling of large data streams at a configurable granularity, in real-time, for subsequent consumption by a range of data-centric applications. This paper describes a general-purpose self-summarizing storage service, called SocialTrove, for social sensing applications. The service summarizes data streams from human sources, or sensors in their possession, by hierarchically clustering received information in accordance with an application-specific distance metric. It then serves a sampling of produced clusters at a configurable granularity in response to application queries. While SocialTrove is a general service, we illustrate its functionality and evaluate it in the specific context of workloads collected from Twitter. Results show that SocialTrove supports a high query throughput, while maintaining a low access latency to the produced real-time application-specific data summaries. As a specific application case-study, we implement a fact-finding service on top of SocialTrove. Army Research Laboratory Cooperative Agreement W911NF-09-2-0053; DTRA grant HDTRA1-10-1-0120; NSF grants CNS 13-29886, CNS 09-58314, CNS 10-35736.
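The core idea of the abstract, clustering a stream under an application-supplied distance metric and answering queries with one representative per cluster at a chosen granularity, can be illustrated with a toy single-level sketch. This is an assumption-laden simplification: SocialTrove clusters hierarchically, while the greedy function below builds only one flat level.

```python
# Toy sketch of metric-based stream summarization (not SocialTrove's actual
# algorithm). Each item joins the first existing representative within
# `radius` under the supplied distance metric, else it starts a new cluster;
# the representatives form the summary served to queries. Smaller radius
# means finer granularity and more representatives.

def summarize(items, distance, radius):
    """Greedy one-pass clustering; returns one representative per cluster."""
    reps = []
    for x in items:
        if not any(distance(x, r) <= radius for r in reps):
            reps.append(x)              # item opens a new cluster
    return reps

# Example: 1-D sensor readings, absolute difference as the distance metric.
stream = [1.0, 1.1, 0.9, 5.0, 5.2, 9.7]
coarse = summarize(stream, lambda a, b: abs(a - b), radius=1.0)
fine   = summarize(stream, lambda a, b: abs(a - b), radius=0.05)
print(coarse)     # coarse granularity: few representatives
print(len(fine))  # fine granularity: nearly every item survives
```

Running the same pass at several radii yields the hierarchy of summaries from which a query at any granularity could be answered.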

    Dynamic data consistency maintenance in peer-to-peer caching system

    Get PDF
Master's (Master of Science)

    Sixth Goddard Conference on Mass Storage Systems and Technologies Held in Cooperation with the Fifteenth IEEE Symposium on Mass Storage Systems

    Get PDF
This document contains copies of those technical papers received in time for publication prior to the Sixth Goddard Conference on Mass Storage Systems and Technologies, which is being held in cooperation with the Fifteenth IEEE Symposium on Mass Storage Systems at the University of Maryland-University College Inn and Conference Center, March 23-26, 1998. As one of an ongoing series, this Conference continues to provide a forum for discussion of issues relevant to the management of large volumes of data. The Conference encourages all interested organizations to discuss long-term mass storage requirements and experiences in fielding solutions. Emphasis is on current and future practical solutions addressing issues in data management, storage systems and media, data acquisition, long-term retention of data, and data distribution. This year's discussion topics include architecture, tape optimization, new technology, performance, standards, site reports, and vendor solutions. Tutorials will be available on shared file systems, file system backups, data mining, and the dynamics of obsolescence.

    A Forensic Web Log Analysis Tool: Techniques and Implementation

    Get PDF
Methodologies presently in use to perform forensic analysis of web applications are decidedly lacking. Although the number of log analysis tools available is exceedingly large, most employ only simple statistical analysis or rudimentary search capabilities; more precisely, these tools were not designed to be forensically capable. The threat of online attack, the ever-growing reliance on necessary services conducted online, and the lack of efficient forensic methods in this area together outline the need for such a tool. The work culminating in this thesis presents not only a forensic log analysis framework, but also an innovative methodology for analyzing log files based on regular expressions, together with solutions to a variety of problems associated with existing tools. The implementation is designed to detect critical web application security flaws gleaned from event data contained within the access log files of the underlying Apache Web Service (AWS). Of utmost importance to a forensic investigator or incident responder is the generation of an event timeline preceding the incident under investigation. Regular expressions power the search capability of our framework by enabling the detection of a variety of injection-based attacks that represent significant timeline interactions. Knowledge of the underlying event structure of each access log entry is essential to efficiently parse log files and determine timeline interactions. Another feature of our tool is the ability to modify, remove, or add regular expressions, addressing the need for investigators to adapt the environment with investigation-specific queries alongside the suggested default signatures. The regular expressions are signature definitions used to detect attacks against both applications whose functionality requires a web service and the service itself. The tool provides a variety of default vulnerability signatures to scan for and outputs the resulting detections.
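The signature-matching approach the abstract describes can be sketched as follows. The patterns and field layout here are simplified assumptions for illustration (a handful of toy injection signatures against the request field of Apache combined-format entries), not the thesis's actual signature set or parser.

```python
import re

# Sketch of regex-signature log scanning (illustrative, not the thesis's tool).
# Each named signature is a regular expression tested against the request
# field of an Apache combined-format access-log entry; matches are emitted in
# log order so an event timeline can be assembled from them.

SIGNATURES = {
    "sql_injection": re.compile(r"(?i)(union\s+select|or\s+1=1|--)"),
    "path_traversal": re.compile(r"\.\./"),
    "xss": re.compile(r"(?i)<script"),
}

# client, (skip ident/user), [timestamp], "request"
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)"')

def scan(lines):
    """Yield (timestamp, client, signature_name) for each matching entry."""
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue                    # malformed entry: skip, don't crash
        client, ts, request = m.group(1), m.group(2), m.group(3)
        for name, pattern in SIGNATURES.items():
            if pattern.search(request):
                yield ts, client, name

log = [
    '10.0.0.1 - - [23/Mar/1998:10:00:01 +0000] "GET /index.html HTTP/1.0" 200 512',
    '10.0.0.2 - - [23/Mar/1998:10:00:02 +0000] "GET /item?id=1 UNION SELECT pass FROM users HTTP/1.0" 200 99',
    '10.0.0.3 - - [23/Mar/1998:10:00:03 +0000] "GET /../../etc/passwd HTTP/1.0" 404 0',
]
hits = list(scan(log))
print(hits)
```

Because `SIGNATURES` is an ordinary dictionary, an investigator can add, modify, or remove signatures at run time, which mirrors the adaptability the abstract calls for.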