
    A Framework for Records Management in Relational Database Systems

    The problem of records retention is often viewed as simply deleting records when they have outlived their purpose. However, in the world of relational databases there is no standardized notion of a business record and its retention obligations. Unlike physical documents such as forms and reports, information in databases is organized such that one item of data may be part of several legal records and consequently subject to multiple (and possibly conflicting) retention policies. This thesis proposes a framework for records retention in relational database systems. It presents a mechanism through which users can specify a broad range of protective and destructive data retention policies for relational records. Compared to naïve solutions for enforcing records management policies, our framework is not only significantly more efficient but also addresses several unanswered questions about how policies can be mapped from given legal requirements to actions on relational data. The novelty in our approach is that we define a record in a relational database as an arbitrary logical view, effectively allowing us to reduce several challenges in enforcing data retention policies to well-studied problems in database theory. We argue that our expression-based approach to tracking records management obligations is not only easier for records managers to use but also far more space- and time-efficient than the traditional metadata approaches discussed in the literature. The thesis concludes with a thorough examination of the limitations of the proposed framework and suggestions for future research in the area of records management for relational database management systems.
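
    A minimal sketch (in Python) of the view-based idea described above; the schema, policy fields, and helper below are invented for illustration and are not the thesis's actual mechanism:

        # Hypothetical sketch: a retention policy attached to a "record" that is
        # defined as a logical view (a SQL expression), not as physical rows.
        from dataclasses import dataclass

        @dataclass
        class RetentionPolicy:
            record_view: str  # SQL view defining the legal record (assumed schema)
            min_years: int    # protective: matching data must be kept this long
            max_years: int    # destructive: matching data must be purged after this

        # Example: an "invoice" record spans several base tables.
        invoice_policy = RetentionPolicy(
            record_view="""
                SELECT o.id, o.placed_at, c.tax_id
                FROM orders o JOIN customers c ON c.id = o.customer_id
            """,
            min_years=7,   # e.g. a tax-law-style keep-at-least obligation
            max_years=10,  # e.g. a privacy-law-style delete-after obligation
        )

        def conflicts(p: RetentionPolicy) -> bool:
            # Data covered by overlapping views can inherit min > max; flag it.
            return p.min_years > p.max_years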

    Towards an automatic data value analysis method for relational databases

    Data is becoming one of the world’s most valuable resources, and it is suggested that those who own the data will own the future. However, despite data being an important asset, data owners struggle to assess its value. Some recent pioneering works have led to an increased awareness of the necessity of measuring data value, and have put forward some simple but engaging survey-based methods to help with first-level data assessment in an organisation. However, these methods are manual and depend on the costly input of domain experts. In this paper, we propose to extend the manual survey-based approaches with additional metrics and dimensions derived from the evolving literature on data value dimensions and tailored specifically for our case study. We also develop an automatic, metric-based data value assessment approach that (i) automatically quantifies the business value of data in relational databases (RDBs), and (ii) provides a scoring method that facilitates the ranking and extraction of the most valuable RDB tables. We evaluate our proposed approach on a real-world relational database from a small online retailer (MyVolts) and show in our experimental study that the data value assessments made by our automated system match those expressed by the domain expert approach.
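
    A minimal sketch (in Python) of what such metric-based table scoring could look like; the metrics, weights, and numbers are hypothetical, not the paper's actual ones:

        # Hypothetical scoring: combine normalised per-table metrics (each in
        # [0, 1]) into a single value score, then rank tables by it.
        def table_value_score(rows, completeness, query_hits,
                              weights=(0.3, 0.3, 0.4)):
            w_rows, w_complete, w_usage = weights
            return w_rows * rows + w_complete * completeness + w_usage * query_hits

        tables = {
            "orders":    table_value_score(0.9, 0.95, 0.8),
            "audit_log": table_value_score(1.0, 0.60, 0.1),
        }
        # Rank tables so the most valuable ones can be extracted first.
        print(sorted(tables, key=tables.get, reverse=True))  # ['orders', 'audit_log']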

    Real-Time Data Processing With Lambda Architecture

    Data has evolved immensely in recent years in type, volume, and velocity, and several frameworks now exist to handle big data applications. This project focuses on the Lambda Architecture proposed by Marz and its application to real-time data processing. The architecture unites the benefits of batch and stream processing techniques: data can be processed historically, with high precision and involved algorithms, without loss of short-term information, alerts, and insights. The Lambda Architecture can serve a wide range of use cases and workloads while tolerating hardware failures and human mistakes. Its layered design promotes loose coupling and flexibility in the system, a major benefit that makes it possible to weigh the trade-offs and apply various tools and technologies across the layers. Improvements in the underlying tools have also advanced the approach to building the LA. The project demonstrates a simplified architecture for the LA that is maintainable.
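
    A minimal sketch (in Python) of the read-time merge at the heart of the pattern; the dictionaries stand in for real batch-layer and speed-layer views:

        # Serving layer: answer queries by merging a precomputed batch view
        # with an incremental real-time view of events since the last batch run.
        batch_view = {"page_a": 1000, "page_b": 250}  # recomputed from all history
        realtime_view = {"page_a": 7, "page_c": 3}    # recent, approximate counts

        def query(key: str) -> int:
            # The batch layer's periodic recomputation absorbs (and can correct)
            # whatever the speed layer approximated, tolerating earlier mistakes.
            return batch_view.get(key, 0) + realtime_view.get(key, 0)

        print(query("page_a"))  # 1007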

    A new model to support the personalised management of a quality e-commerce service

    The paper presents a model to support the management of a high-quality e-commerce service. The approach focuses on the service quality aspects related to customer relationship management (CRM): knowing the individual characteristics of a customer makes it possible to supply a personalised, high-quality service. A segmentation model, based on the "relationship evolution" between users and the Web site, is developed; the method makes it possible to tailor service management to each user segment. Finally, some preliminary experimental results for a sport-clothing industry application are described.
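
    A minimal sketch (in Python) of rule-based segmentation in this spirit; the thresholds and segment names are invented, not the paper's model:

        # Hypothetical segmentation by "relationship evolution": how a user's
        # visits and purchases develop over time determines the segment.
        def segment(visits_last_90d: int, purchases_total: int) -> str:
            if purchases_total == 0:
                return "prospect" if visits_last_90d > 0 else "dormant"
            return "loyal" if visits_last_90d >= 6 else "occasional"

        # Each segment then gets its own service-management policy, e.g.
        # personalised offers for "loyal", re-engagement mail for "dormant".
        print(segment(visits_last_90d=8, purchases_total=12))  # loyal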

    Impliance: A Next Generation Information Management Appliance

    "While the database industry has been remarkably successful in building a large market and adapting to the changes of the last three decades, its impact on the broader market of information management is surprisingly limited. If we were to design an information management system from scratch, based upon today's requirements and hardware capabilities, would it look anything like today's database systems?" In this paper, we introduce Impliance, a next-generation information management system consisting of hardware and software components integrated to form an easy-to-administer appliance that can store, retrieve, and analyze all types of structured, semi-structured, and unstructured information. We first summarize the trends that will shape information management for the foreseeable future. Those trends imply three major requirements for Impliance: (1) to store, manage, and uniformly query all data, not just structured records; (2) to scale out as the volume of this data grows; and (3) to be simple and robust in operation. We then describe four key ideas that are uniquely combined in Impliance to address these requirements: (a) integrating software and off-the-shelf hardware into a generic information appliance; (b) automatically discovering, organizing, and managing all data, unstructured as well as structured, in a uniform way; (c) achieving scale-out by exploiting simple, massively parallel processing; and (d) virtualizing compute and storage resources to unify, simplify, and streamline the management of Impliance. Impliance is an ambitious, long-term effort to define simpler, more robust, and more scalable information systems for tomorrow's enterprises.
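
    A minimal sketch (in Python) of idea (b), a uniform envelope over structured and unstructured data; the types and fields are invented for illustration:

        # Hypothetical uniform item: raw content plus discovered attributes,
        # so structured rows and unstructured documents answer the same query.
        from dataclasses import dataclass, field

        @dataclass
        class Item:
            body: bytes                                # row image, document, email...
            meta: dict = field(default_factory=dict)   # extracted attributes

        store = [
            Item(b"<row>", {"type": "order", "customer": "acme"}),
            Item(b"%PDF-", {"type": "contract", "customer": "acme"}),
        ]

        # One predicate spans both kinds of data.
        acme_items = [i for i in store if i.meta.get("customer") == "acme"]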

    Closing the Gap: How Improving Information Flow Can Help Community-Based Organizations Keep Uninsured Kids From Falling Through the Cracks

    Evaluates how community-based organizations used a tool for systematic, ongoing data exchange with the state to monitor children's enrollment and redetermination status in public health insurance. Explores its potential to boost outreach and enrollment.

    Creating a Relational Distributed Object Store

    In and of itself, data storage has apparent business utility. But when we can convert data to information, the utility of stored data increases dramatically. It is the layering of relation atop the data mass that is the engine for such conversion. Frank relation amongst discrete objects sporadically ingested is rare, making the process of synthesizing such relation all the more challenging, but the challenge must be met if we are ever to see business value for unstructured data equivalent to what we already have with structured data. This paper describes a novel construct, referred to as a relational distributed object store (RDOS), that seeks to solve the twin problems of how to persistently and reliably store petabytes of unstructured data while simultaneously creating and persisting relations amongst billions of objects.
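
    A minimal sketch (in Python) of those twin concerns, persisting objects and persisting synthesized relations among them; all names here are invented, not the RDOS design itself:

        # Hypothetical store: opaque blobs in one map, relations in another,
        # so relation can be layered atop the data mass after ingest.
        from collections import defaultdict

        objects = {}                   # object_id -> opaque blob
        relations = defaultdict(set)   # (object_id, verb) -> related object_ids

        def put(obj_id: str, blob: bytes):
            objects[obj_id] = blob

        def relate(a: str, verb: str, b: str):
            relations[(a, verb)].add(b)

        put("photo-1", b"...")
        put("report-9", b"...")
        relate("report-9", "cites", "photo-1")
        print(relations[("report-9", "cites")])  # {'photo-1'}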

    FISA Reform

    Congress and the Executive Branch are poised to take up the issue of FISA reform in 2014. What has been missing from the discussion is a comprehensive view of the ways in which reform could be given effect, i.e., a taxonomy of potential options. This article seeks to fill that gap. The aim is to deepen the conversation about abeyant approaches to foreign intelligence gathering, to allow fuller discussion of what a comprehensive package could contain, and to place initiatives currently under consideration within a broader, over-arching framework. The article begins by considering the legal underpinnings of, and challenges to, the President's Surveillance Program. It then examines how technology has altered the types of information available, as well as the methods of transmission and storage. The article builds on this to develop a taxonomy for how a statutory approach to foreign intelligence gathering could be given force. It divides foreign intelligence gathering into two categories: front-end collection and back-end analysis and use. Each category contains a counterpoise structured to ensure the appropriate exercise of Congressionally mandated authorities. For the front-end, this means balancing the manner of collection with requirements for approval; for the back-end, it means offsetting implementation with transparency and oversight. The article then considers the constituent parts of each category.