
    An Interactive DSS Tool for Physical Database Design

    The design of efficient physical databases is a complex activity involving the consideration of a large number of factors. Because of this complexity, mathematical programming approaches seeking to optimize the physical database have to make many simplifying assumptions; therefore, their applicability is limited. Further, the database designer may want to experiment with design preferences and features not considered by the mathematical optimization approaches. This article describes an interactive DSS tool that aids the database designer in designing the physical database effectively. The database design is accomplished in the context of a high-level abstract model which is capable of being implemented in a variety of DBMSs and file systems. This generic nature of the abstract model enhances the utility of the DSS tool. The interactive tool not only lets the designer experiment with his own designs, but also provides several heuristic optimization procedures to enable the generation of many good designs. The heuristic designs may be used for final physical database design as well as for further experimentation. The paper also includes examples of how the physical design selected using the abstract model and the interactive tool may be implemented on several DBMSs and file systems.

    Heuristic Optimization of Physical Data Bases: Using a Generic and Abstract Design Model

    Designing efficient physical data bases is a complex activity, involving the consideration of a large number of factors. Mathematical programming-based optimization models for physical design make many simplifying assumptions; thus, their applicability is limited. In this article, we show that heuristic algorithms can be successfully used in the development of very good physical data base designs. Two heuristic optimization algorithms are proposed in the context of a generic and abstract model for physical design. One algorithm is based on generic principles of heuristic optimization. The other is based on capturing and using problem-specific information in the heuristics. The goodness of the algorithms is demonstrated over a wide range of problems and factor values.

    Heuristic Optimization of Physical Databases; Using a Generic & Abstract Design Model

    Abstract: Designing efficient physical data bases is a complex activity, involving the consideration of a large number of factors. Mathematical programming-based optimization models for physical design make many simplifying assumptions; thus, their applicability is limited. In this article, we show that heuristic algorithms can be successfully used in the development of very good physical data base designs. Two heuristic optimization algorithms are proposed in the context of a generic and abstract model for physical design. One algorithm is based on generic principles of heuristic optimization. The other is based on capturing and using problem-specific information in the heuristics. The goodness of the algorithms is demonstrated over a wide range of problems and factor values. Subject Areas: Heuristics, Management Information Systems, and Simulation. Article: INTRODUCTION Data base design is a challenging activity and involves two phases: logical and physical design. Logical design involves the development of a logical data structure (LDS) for the task domain. Physical design is concerned with developing structures for placing data on secondary storage, given a specific LDS. While both phases require significant effort, the physical data base design phase is the focus of this article. The main concern in physical data base design is the efficiency of the physical design, which can be measured in a number of ways, such as storage requirement, response time, and total system cost. Prior works have generally dealt with and developed design/optimization models for specific aspects of physical data base design. These works include index selection While the attention to individual design problems results in elegant solutions, it is quite plausible that those individual solutions will have to be perturbed when the total data base is put together. [20, p. 217] Thus, the need for comprehensive design models (which deal with the entire data base design problem rather than parts of it) cannot be overstated. Comprehensive physical design models fall under two categories: (1) those specific to and (2) those generic and independent of any particular logical data model and/or commercial DBMS. The first category offers the advantage of direct implementation on a particular DBMS, but is difficult to convert to another implementation. The second category allows more flexibility at the expense of added conversion requirements for a particular implementation. Optimization of physical design in the first category is reported i

    IMPLEMENTING A DOMAIN MODEL FOR DATA STRUCTURES

    Complex adaptive systems based data integration : theory and applications

    Data Definition Languages (DDLs) have been created and used to represent data in programming languages and in database dictionaries. This representation includes descriptions in the form of data fields and relations in the form of a hierarchy, with the common exception of relational databases where relations are flat. Network computing created an environment that enables relatively easy and inexpensive exchange of data. What followed was the creation of new DDLs claiming better support for automatic data integration. It is uncertain from the literature whether any real progress has been made toward achieving an ideal state or limit condition of automatic data integration. This research asserts that difficulties in accomplishing integration are indicative of socio-cultural systems in general and are caused by some measurable attributes common in DDLs. This research’s main contributions are: (1) a theory of data integration requirements to fully support automatic data integration from autonomous heterogeneous data sources; (2) the identification of measurable related abstract attributes (Variety, Tension, and Entropy); (3) the development of tools to measure them. The research uses a multi-theoretic lens to define and articulate these attributes and their measurements. The proposed theory is founded on the Law of Requisite Variety, Information Theory, Complex Adaptive Systems (CAS) theory, Sowa’s Meaning Preservation framework and Zipf distributions of words and meanings. Using the theory, the attributes, and their measures, this research proposes a framework for objectively evaluating the suitability of any data definition language with respect to degrees of automatic data integration. This research uses thirteen data structures constructed with various DDLs from the 1960s to date. No DDL examined (and therefore no DDL similar to those examined) is designed to satisfy the Law of Requisite Variety. No DDL examined is designed to support CAS evolutionary processes that could result in fully automated integration of heterogeneous data sources. There is no significant difference in measures of Variety, Tension, and Entropy among the DDLs investigated in this research. A direction to overcome the common limitations discovered in this research is suggested and tested by proposing GlossoMote, a theoretical, mathematically sound description language that satisfies the data integration theory requirements. GlossoMote is not merely a new syntax; it is a drastic departure from existing DDL constructs. The feasibility of the approach is demonstrated with a small-scale experiment and evaluated using the proposed assessment framework and other means. The promising results call for additional research to evaluate the commercial potential of GlossoMote’s approach.

    Workshop on Database Programming Languages

    These are the revised proceedings of the Workshop on Database Programming Languages held at Roscoff, Finistère, France in September of 1987. The last few years have seen enormous activity in the development of new programming languages and new programming environments for databases. The purpose of the workshop was to bring together researchers from both databases and programming languages to discuss recent developments in the two areas in the hope of overcoming some of the obstacles that appear to prevent the construction of a uniform database programming environment. The workshop, which follows a previous workshop held in Appin, Scotland in 1985, was extremely successful. The organizers were delighted with both the quality and volume of the submissions for this meeting, and it was regrettable that more papers could not be accepted. Both the stimulating discussions and the excellent food and scenery of the Brittany coast made the meeting thoroughly enjoyable. There were three main foci for this workshop: the type systems suitable for databases (especially object-oriented and complex-object databases), the representation and manipulation of persistent structures, and extensions to deductive databases that allow for more general and flexible programming. Many of the papers describe recent results, or work in progress, and are indicative of the latest research trends in database programming languages. The organizers are extremely grateful for the financial support given by CRAI (Italy), Altaïr (France) and AT&T (USA). We would also like to acknowledge the organizational help provided by Florence Deshors, Hélène Gans and Pauline Turcaud of Altaïr, and by Karen Carter of the University of Pennsylvania.

    File System Simulation: Hierarchical Performance Measurement and Modeling

    File systems are very important components in a computer system. File system simulation can help to predict the performance of new system designs. It offers flexibility in modeling, and it saves cost and time compared with a full implementation. Being able to predict end-to-end file system performance against a pre-defined workload can help system designers to make decisions that could affect their entire product line, involving several million dollars of investment. This dissertation presents detailed simulation-based performance models of the Linux ext3 file system and the PVFS parallel file system. The models are developed using Colored Petri Nets. A performance study using the models shows that the obtained results are close to the expected behavior of the real file system. The model shows that file system parameters have a significant impact on I/O performance when compared to the parameters of the disk subsystem.

    Flexibility in Data Management

    With the ongoing expansion of information technology, new fields of application requiring data management emerge virtually every day. In our knowledge culture, increasing amounts of data and a work force organized in more creativity-oriented ways also radically change traditional fields of application and question established assumptions about data management. For instance, investigative analytics and agile software development move towards a very agile and flexible handling of data. As the primary facilitators of data management, database systems have to reflect and support these developments. However, traditional database management technology, in particular relational database systems, is built on assumptions of relatively stable application domains. The need to model all data up front in a prescriptive database schema earned relational database management systems the reputation among developers of being inflexible, dated, and cumbersome to work with. Nevertheless, relational systems still dominate the database market. They are a proven, standardized, and interoperable technology, well-known in IT departments with a work force of experienced and trained developers and administrators. This thesis aims at resolving the growing contradiction between the popularity and omnipresence of relational systems in companies and their increasingly bad reputation among developers. It adapts relational database technology towards more agility and flexibility. We envision a descriptive schema-comes-second relational database system, which is entity-oriented instead of schema-oriented; descriptive rather than prescriptive. The thesis provides four main contributions: (1) a flexible relational data model, which frees relational data management from having a prescriptive schema; (2) autonomous physical entity domains, which partition self-descriptive data according to their schema properties for better query performance; (3) a freely adjustable storage engine, which allows adapting the physical data layout to properties of the data and of the workload; and (4) a self-managed indexing infrastructure, which autonomously collects and adapts index information in the presence of dynamic workloads and evolving schemas. The flexible relational data model is the thesis's central contribution. It describes the functional appearance of the descriptive schema-comes-second relational database system. The other three contributions improve components in the architecture of database management systems to increase the query performance and the manageability of descriptive schema-comes-second relational database systems. We are confident that these four contributions can help pave the way to a more flexible future for relational database management technology.