Search CORE

19 research outputs found

Database interfaces on NASA's heterogeneous distributed database system

Author: Huang Shou-Hsuan Stephen
Publication venue
Publication date
Field of study

The purpose of Distributed Access View Integrated Database (DAVID) interface module (Module 9: Resident Primitive Processing Package) is to provide data transfer between local DAVID systems and resident Data Base Management Systems (DBMSs). The result of current research is summarized. A detailed description of the interface module is provided. Several Pascal templates were constructed. The Resident Processor program was also developed. Even though it is designed for the Pascal templates, it can be modified for templates in other languages, such as C, without much difficulty. The Resident Processor itself can be written in any programming language. Since Module 5 routines are not ready yet, there is no way to test the interface module. However, simulation shows that the data base access programs produced by the Resident Processor do work according to the specifications

NASA Technical Reports Server

IDEF3 formalization report

Author: Edwards Douglas D.
Mayer Richard J.
Menzel Christopher
Publication venue
Publication date
Field of study

The Process Description Capture Method (IDEF3) is one of several Integrated Computer-Aided Manufacturing (ICAM) DEFinition methods developed by the Air Force to support systems engineering activities, and in particular, to support information systems development. These methods have evolved as a distillation of 'good practice' experience by information system developers and are designed to raise the performance level of the novice practitioner to one comparable with that of an expert. IDEF3 is meant to serve as a knowledge acquisition and requirements definition tool that structures the user's understanding of how a given process, event, or system works around process descriptions. A special purpose graphical language accompanying the method serves to highlight temporal precedence and causality relationships relative to the process or event being described

NASA Technical Reports Server

A comparison between algebraic query languages for flat and nested databases

Author: Gyssens Marc
Van Gucht Dirk
Publication venue: Published by Elsevier B.V.
Publication date: 23/09/1991
Field of study

AbstractRecently, much attention has been paid to query languages for nested relations. In the present paper, we consider the nested algebra and the powerset algebra, and compare them both mutually as well as to the traditional flat algebra. We show that either nest or difference can be removed as a primitive operator in the powerset algebra. While the redundancy of the nest operator might have been expected, the same cannot be said of the difference. Basically, this result shows that the presence of one nonmonotonic operator suffices in the powerset algebra. As an interesting consequence of this result, the nested algebra without the difference remains complete in the sense of Bancilhon and Paredaens. Finally, we show there are both similarities and fundamental differences between the expressiveness of query languages for nested relations and that of their counterparts for flat relations

Elsevier - Publisher Connector

Development and Applications of Similarity Measures for Spatial-Temporal Event and Setting Sequences

Author: Xu Fuyu
Publication venue: DigitalCommons@UMaine
Publication date: 05/05/2023
Field of study

Similarity or distance measures between data objects are applied frequently in many fields or domains such as geography, environmental science, biology, economics, computer science, linguistics, logic, business analytics, and statistics, among others. One area where similarity measures are particularly important is in the analysis of spatiotemporal event sequences and associated environs or settings. This dissertation focuses on developing a framework of modeling, representation, and new similarity measure construction for sequences of spatiotemporal events and corresponding settings, which can be applied to different event data types and used in different areas of data science. The first core part of this dissertation presents a matrix-based spatiotemporal event sequence representation that unifies punctual and interval-based representation of events. This framework supports different event data types and provides support for data mining and sequence classification and clustering. The similarity measure is based on the modified Jaccard index with temporal order constraints and accommodates different event data types. This approach is demonstrated through simulated data examples and the performance of the similarity measures is evaluated with a k-nearest neighbor algorithm (k-NN) classification test on synthetic datasets. These similarity measures are incorporated into a clustering method and successfully demonstrate the usefulness in a case study analysis of event sequences extracted from space time series of a water quality monitoring system. This dissertation further proposes a new similarity measure for event setting sequences, which involve the space and time in which events occur. While similarity measures for spatiotemporal event sequences have been studied, the settings and setting sequences have not yet been considered. While modeling event setting sequences, spatial and temporal scales are considered to define the bounds of the setting and incorporate dynamic variables along with static variables. Using a matrix-based representation and an extended Jaccard index, new similarity measures are developed to allow for the use of all variable data types. With these similarity measures coupled with other multivariate statistical analysis approaches, results from a case study involving setting sequences and pollution event sequences associated with the same monitoring stations, support the hypothesis that more similar spatial-temporal settings or setting sequences may generate more similar events or event sequences. To test the scalability of STES similarity measure in a larger dataset and an extended application in different fields, this dissertation compares and contrasts the prospective space-time scan statistic with the STES similarity approach for identifying COVID-19 hotspots. The COVID-19 pandemic has highlighted the importance of detecting hotspots or clusters of COVID-19 to provide decision makers at various levels with better information for managing distribution of human and technical resources as the outbreak in the USA continues to grow. The prospective space-time scan statistic has been used to help identify emerging disease clusters yet results from this approach can encounter strategic limitations imposed by the spatial constraints of the scanning window. The STES-based approach adapted for this pandemic context computes the similarity of evolving normalized COVID-19 daily cases by county and clusters these to identify counties with similarly evolving COVID-19 case histories. This dissertation analyzes the spread of COVID-19 within the continental US through four periods beginning from late January 2020 using the COVID-19 datasets maintained by John Hopkins University, Center for Systems Science and Engineering (CSSE). Results of the two approaches can complement with each other and taken together can aid in tracking the progression of the pandemic. Overall, the dissertation highlights the importance of developing similarity measures for analyzing spatiotemporal event sequences and associated settings, which can be applied to different event data types and used for data mining, sequence classification, and clustering

University of Maine

Logical optimization for database uniformization

Author: Grant J.
Publication venue
Publication date
Field of study

Data base uniformization refers to the building of a common user interface facility to support uniform access to any or all of a collection of distributed heterogeneous data bases. Such a system should enable a user, situated anywhere along a set of distributed data bases, to access all of the information in the data bases without having to learn the various data manipulation languages. Furthermore, such a system should leave intact the component data bases, and in particular, their already existing software. A survey of various aspects of the data bases uniformization problem and a proposed solution are presented

NASA Technical Reports Server

Axiomatic Specification of Database Domain Statics

Author: Wieringa Roel
Publication venue: Free University, Department of Mathematics and Computer Science
Publication date: 01/01/1987
Field of study

In the past ten years, much work has been done to add more structure to database models 1 than what is represented by a mere collection of flat relations (Albano & Cardelli [1985], Albano et al. [1986], Borgida eta. [1984], Brodie [1984], Brodie & Ridjanovic [1984], Brodie & Silva (1982], Codd (1979], Hammer & McLeod (1981], King (1984], King & McLeod [1984], [1985], Mylopoulos et al. [1980], Smith & Smith 1977a & b). 2 The informal approach which most of these studies advocate has a number of disadvantages. First, a recent survey of some of the pro posed models by Urban & Delcambre [1986] reveals a wide divergence in terminology and con cepts, making comparison of the expressive power of these models difficult. Second, undefined or even ill-defined concepts are a hindrance, not an aid, for the analysis of the Universe of Discourse (UoD). Third, informal treatment 9f such complex structures as set hierarchies, gen eralization hierarchies and aggregation hierarchies all in one model, with some dynamics thrown in for good measure, bodes ill for the consistency of these theories. The first goal of the research reported on is to integrate the static structures which these models propose in one coherent, axiomatic framework. It will be shown in chapter 7 that the theory presented here provides the needed conceptual foundations for these models. A second aim is to provide a possible worlds framework onto which to graft theories of the dynamics of the UoD. The third aim is to provide clear concepts which can aid the database model designer in his or her thinking about the UoD. In this report we concentrate on the first goal only, leav ing the formulation of theories of domain dynamics and the application to system development as research goals for the near future

University of Twente Research Information

Recommended from our members

Supporting Scientific Analytics under Data Uncertainty and Query Uncertainty

Author: Peng Liping
Publication venue: ScholarWorks@UMass Amherst
Publication date: 23/03/2018
Field of study

Data management is becoming increasingly important in many applications, in particular, in large scientific databases where (1) data can be naturally modeled by continuous random variables, and (2) queries can involve complex predicates and/or be difficult for users to express explicitly. My thesis work aims to provide efficient support to both the data uncertainty and the query uncertainty . When data is uncertain, an important class of queries requires query answers to be returned if their existence probabilities pass a threshold. I start with optimizing such threshold query processing for continuous uncertain data in the relational model by (i) expediting selections by reducing dimensionality of integration and using faster filters, (ii) expediting joins using new indexes on uncertain data, and (iii) optimizing a query plan using a dynamic, per-tuple based approach. Evaluation results using real-world data and benchmark queries show the accuracy and efficiency of my techniques and the dynamic query planning has over 50% performance gains in most cases over a state-of-the-art threshold query optimizer and is very close to the optimal planning in all cases. Next I address uncertain data management in the array model, which has gained popu- larity for scientific data processing recently due to performance benefits. I define the formal semantics of array operations on uncertain data involving both value uncertainty within individual tuples and position uncertainty regarding where a tuple should belong in an array given uncertain dimension attributes, and propose a suite of storage and evaluation strategies for array operators, with a focus on a novel scheme that bounds the overhead of querying by strategically placing a few replicas of the tuples with large variances. Evaluation results show that for common workloads, my best-performing techniques outperform baselines up to 1 to 2 orders of magnitude while incurring only small storage overhead. Finally, to bridge the increasing gap between the fast growth of data and the limited human ability to comprehend data and help the user retrieve high-value content from data more effectively, I propose to build interactive data exploration as a new database service, using an approach called “explore-by-example”. To build an effective system, my work is grounded in a rigorous SVM-based active learning framework and focuses on the following three problems: (i) accuracy-based and convergence-based stopping criteria, (ii) expediting example acquisition in each iteration, and (iii) expediting the final result retrieval. Evaluation results using real-world data and query patterns show that my system significantly outperforms state-of-the-art systems in accuracy (18x accuracy improvement for 4-dimensional workloads) while achieving desired efficiency for interactive exploration (2 to 5 seconds per iteration)

ScholarWorks@UMass Amherst

Database system architecture supporting coexisting query languages and data models

Author: Hepp Pedro E.
Publication venue: The University of Edinburgh
Publication date: 01/01/1983
Field of study

SIGLELD:D48239/84 / BLDSC - British Library Document Supply CentreGBUnited Kingdo

Edinburgh Research Archive

OpenGrey Repository

Interactive Exploration of Temporal Event Sequences

Author: Wongsuphasawat Krist
Publication venue
Publication date: 01/01/2012
Field of study

Life can often be described as a series of events. These events contain rich information that, when put together, can reveal history, expose facts, or lead to discoveries. Therefore, many leading organizations are increasingly collecting databases of event sequences: Electronic Medical Records (EMRs), transportation incident logs, student progress reports, web logs, sports logs, etc. Heavy investments were made in data collection and storage, but difficulties still arise when it comes to making use of the collected data. Analyzing millions of event sequences is a non-trivial task that is gaining more attention and requires better support due to its complex nature. Therefore, I aimed to use information visualization techniques to support exploratory data analysis---an approach to analyzing data to formulate hypotheses worth testing---for event sequences. By working with the domain experts who were analyzing event sequences, I identified two important scenarios that guided my dissertation: First, I explored how to provide an overview of multiple event sequences? Lengthy reports often have an executive summary to provide an overview of the report. Unfortunately, there was no executive summary to provide an overview for event sequences. Therefore, I designed LifeFlow, a compact overview visualization that summarizes multiple event sequences, and interaction techniques that supports users' exploration. Second, I examined how to support users in querying for event sequences when they are uncertain about what they are looking for. To support this task, I developed similarity measures (the M&M measure 1-2) and user interfaces (Similan 1-2) for querying event sequences based on similarity, allowing users to search for event sequences that are similar to the query. After that, I ran a controlled experiment comparing exact match and similarity search interfaces, and learned the advantages and disadvantages of both interfaces. These lessons learned inspired me to develop Flexible Temporal Search (FTS) that combines the benefits of both interfaces. FTS gives confident and countable results, and also ranks results by similarity. I continued to work with domain experts as partners, getting them involved in the iterative design, and constantly using their feedback to guide my research directions. As the research progressed, several short-term user studies were conducted to evaluate particular features of the user interfaces. Both quantitative and qualitative results were reported. To address the limitations of short-term evaluations, I included several multi-dimensional in-depth long-term case studies with domain experts in various fields to evaluate deeper benefits, validate generalizability of the ideas, and demonstrate practicability of this research in non-laboratory environments. The experience from these long-term studies was combined into a set of design guidelines for temporal event sequence exploration. My contributions from this research are LifeFlow, a visualization that compactly displays summaries of multiple event sequences, along with interaction techniques for users' explorations; similarity measures (the M&M measure 1-2) and similarity search interfaces (Similan 1-2) for querying event sequences; Flexible Temporal Search (FTS), a hybrid query approach that combines the benefits of exact match and similarity search; and case study evaluations that results in a process model and a set of design guidelines for temporal event sequence exploration. Finally, this research has revealed new directions for exploring event sequences

Digital Repository at the University of Maryland