92 research outputs found

    The representation and management of evolving features in geospatial databases

    Geographic features change over time, and such change is the result of some kind of event or occurrence. It has been a research challenge to represent these data in a manner that reflects human perception. Most database systems used in geographic information systems (GIS) are relational, and change is either captured by exhaustively storing all versions of the data, or updates simply replace previous versions. This stems from the inherent difficulty of modelling geographic objects in relational tables, a difficulty that is compounded when the time dimension needed to model how those objects evolve is introduced. There is little doubt that the object-oriented (OO) paradigm holds significant advantages over the relational model when it comes to modelling real-world entities and spatial data, and it is argued that this contention is particularly true for spatio-temporal data. This thesis describes an object-oriented approach to the design of a conceptual model for representing spatio-temporal geographic data, called the Feature Evolution Model (FEM), based on states and events. The model was used to implement a spatio-temporal database management system in Oracle Spatial, and an interface prototype is described that was used to evaluate the system by enabling querying and visualisation.
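
    The abstract summarises FEM only at the level of states and events; as a minimal sketch of what a state/event feature model might look like (all class and method names here are hypothetical illustrations, not FEM's actual design):

```python
# Hypothetical sketch of a state/event model for evolving features;
# not the thesis's actual FEM classes.
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

@dataclass
class State:
    """One version of a feature: geometry and attributes, valid over [valid_from, valid_to)."""
    geometry: object            # e.g. a spatial geometry object
    attributes: dict
    valid_from: datetime
    valid_to: Optional[datetime] = None   # None = current state

@dataclass
class Event:
    """The occurrence that ended one state and produced the next."""
    kind: str                   # e.g. "split", "merge", "boundary-change"
    at: datetime
    description: str = ""

@dataclass
class Feature:
    """A geographic feature as a chain of states linked by events."""
    feature_id: str
    states: List[State] = field(default_factory=list)
    events: List[Event] = field(default_factory=list)

    def evolve(self, event: Event, new_state: State) -> None:
        """Close the current state, then record the event and its resulting state."""
        if self.states:
            self.states[-1].valid_to = event.at
        self.events.append(event)
        self.states.append(new_state)

    def state_at(self, t: datetime) -> Optional[State]:
        """Return the state valid at time t, if any."""
        for s in self.states:
            if s.valid_from <= t and (s.valid_to is None or t < s.valid_to):
                return s
        return None
```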

    Efficient Management of Short-Lived Data

    Motivated by the increasing prominence of loosely coupled systems, such as mobile and sensor networks, which are characterised by intermittent connectivity and volatile data, we study the tagging of data with so-called expiration times. More specifically, when data are inserted into a database, they may be tagged with time values indicating when they expire, i.e., when they are regarded as stale or invalid and thus are no longer considered part of the database. In a number of applications, expiration times are known and can be assigned at insertion time. We present data structures and algorithms for online management of data tagged with expiration times. The algorithms are based on fully functional, persistent treaps, which are a combination of binary search trees with respect to a primary attribute and heaps with respect to a secondary attribute. The primary attribute implements primary keys, and the secondary attribute stores expiration times in a minimum heap, thus keeping a priority queue of tuples to expire. A detailed and comprehensive experimental study demonstrates the well-behavedness and scalability of the approach as well as its efficiency with respect to a number of competitors.
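
    The underlying structure can be illustrated with a small, ephemeral (non-persistent) treap in which the search key is the primary attribute and the min-heap priority is the expiration time; the paper's treaps are fully functional and persistent, so this is only a sketch of the idea, with illustrative names:

```python
# Ephemeral sketch of the paper's idea; the real structure is a
# fully functional (persistent) treap.

class Node:
    __slots__ = ("key", "value", "exp", "left", "right")

    def __init__(self, key, value, exp):
        self.key, self.value, self.exp = key, value, exp
        self.left = self.right = None

def _rotate_right(n):
    l = n.left
    n.left, l.right = l.right, n
    return l

def _rotate_left(n):
    r = n.right
    n.right, r.left = r.left, n
    return r

def insert(root, key, value, exp):
    """BST on key; min-heap on expiration time exp."""
    if root is None:
        return Node(key, value, exp)
    if key < root.key:
        root.left = insert(root.left, key, value, exp)
        if root.left.exp < root.exp:
            root = _rotate_right(root)
    else:
        root.right = insert(root.right, key, value, exp)
        if root.right.exp < root.exp:
            root = _rotate_left(root)
    return root

def _delete_root(root):
    """Standard treap root deletion: rotate the smaller-priority child up."""
    if root.left is None:
        return root.right
    if root.right is None:
        return root.left
    if root.left.exp < root.right.exp:
        root = _rotate_right(root)
        root.right = _delete_root(root.right)
    else:
        root = _rotate_left(root)
        root.left = _delete_root(root.left)
    return root

def expire(root, now):
    """The root always holds the minimum expiration time, so all expired
    tuples can be removed by repeatedly deleting the root."""
    while root is not None and root.exp <= now:
        root = _delete_root(root)
    return root

if __name__ == "__main__":
    root = None
    for key, exp in [("a", 5.0), ("b", 2.0), ("c", 9.0)]:
        root = insert(root, key, "value-" + key, exp)
    root = expire(root, 3.0)   # drops "b", which expired at t = 2
```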

    Attribute-Level Versioning: A Relational Mechanism for Version Storage and Retrieval

    Data analysts today have at their disposal a seemingly endless supply of data repositories and, hence, datasets from which to draw. New datasets become available daily, making the choice of which dataset to use difficult. Furthermore, traditional data analysis has been conducted using structured data repositories such as relational database management systems (RDBMS). These systems, by their nature and design, prohibit duplication within indexed collections, forcing analysts to choose one value for each of the available attributes for an item in the collection. Analysts often discover two or more datasets with information about the same entity. When combining these data and transforming them into a form usable in an RDBMS, analysts are forced to deconflict the collisions and choose a single value for each duplicated attribute containing differing values. In the absence of professional intuition, this deconfliction is the source of a considerable amount of guesswork and speculation on the part of the analyst. One must consider what is lost by discarding those alternative values. Are there relationships between the conflicting datasets that have meaning? Is each dataset presenting a different and valid view of the entity, or are the alternate values erroneous? If so, which values are erroneous? Is there historical significance in the variances? The analysis of modern datasets requires specialized algorithms and storage and retrieval mechanisms to identify, deconflict, and assimilate variances of attributes for each entity encountered. These variances, or versions of attribute values, contribute meaning to the evolution and analysis of the entity and its relationship to other entities. A new, distinct storage and retrieval mechanism will enable analysts to efficiently store, analyze, and retrieve attribute versions without unnecessary complexity or additional alterations of the original or derived dataset schemas. This paper presents technologies and innovations that assist data analysts in discovering meaning within their data while preserving all of the original data for every entity in the RDBMS.
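
    The abstract does not spell out the storage schema; one plausible relational sketch of attribute-level versioning keeps a narrow version table alongside the entity table, so that conflicting values survive ingestion instead of being collapsed (table and column names here are hypothetical):

```python
# Hypothetical sketch of an attribute-version schema; not the paper's actual design.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity (
    entity_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
-- Every (entity, attribute) pair may hold many versions, one per
-- source/observation, instead of a single deconflicted value.
CREATE TABLE attribute_version (
    entity_id  INTEGER NOT NULL REFERENCES entity(entity_id),
    attribute  TEXT    NOT NULL,
    value      TEXT    NOT NULL,
    source     TEXT    NOT NULL,
    observed   TEXT    NOT NULL,          -- ISO-8601 timestamp
    PRIMARY KEY (entity_id, attribute, source, observed)
);
""")

conn.execute("INSERT INTO entity VALUES (1, 'Acme Corp')")
# Two datasets disagree about the same attribute; both versions are kept.
rows = [
    (1, "headquarters", "Boston",  "dataset_A", "2021-03-01"),
    (1, "headquarters", "Chicago", "dataset_B", "2022-07-15"),
]
conn.executemany("INSERT INTO attribute_version VALUES (?,?,?,?,?)", rows)

# Retrieve every recorded version rather than one collapsed value.
for row in conn.execute(
        """SELECT value, source, observed FROM attribute_version
           WHERE entity_id = 1 AND attribute = 'headquarters'
           ORDER BY observed"""):
    print(row)
```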

    Accessing multiversion data in database transactions

    Many important database applications need to access previous versions of the data set, thus requiring that the data are stored in a multiversion database and indexed with a multiversion index, such as the multiversion B+-tree (MVBT) of Becker et al. The MVBT is optimal, so that any version of the database can be accessed as efficiently as with a single-version B+-tree that indexes only the data items of that version, but it cannot be used in a full-fledged database system because it follows a single-update model in which updates cannot be rolled back. We have redesigned the MVBT index so that a single multi-action updating transaction can operate on the index structure concurrently with multiple read-only transactions. Data items created by the transaction become part of the same version, and the transaction can roll back. We call this structure the transactional MVBT (TMVBT). The TMVBT index remains optimal even in the presence of logical key deletions. Even though deletions in a multiversion index must not physically delete the history of the data items, queries and range scans can become more efficient if the leaf pages of the index structure are merged to retain optimality. For the general transactional setting with multiple updating transactions, we propose a multiversion database structure called the concurrent MVBT (CMVBT), which stores the updates of active transactions in a separate main-memory-resident versioned B+-tree index. A system maintenance transaction is periodically run to apply the updates of committed transactions into the TMVBT index. We show how multiple updating transactions can operate on the CMVBT index concurrently, and our recovery algorithm is based on the standard ARIES recovery algorithm. We prove that the TMVBT index is asymptotically optimal, and show that the performance of the CMVBT index in general transaction processing is on par with the performance of the time-split B+-tree (TSB-tree) of Lomet and Salzberg. The TSB-tree does not merge leaf pages and is therefore not optimal if logical data-item deletions are allowed. Our experiments show that the CMVBT outperforms the TSB-tree with range queries in the presence of deletions.
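
    The MVBT algorithms themselves are beyond an abstract-sized sketch, but the CMVBT's two-level organisation can be caricatured with dictionaries standing in for the B+-trees: active transactions write to a small in-memory buffer, and a maintenance step moves committed updates into the main multiversion store. All names below are hypothetical stand-ins, not the actual index structures:

```python
# Toy illustration of the two-level CMVBT organisation; dicts stand in
# for the TMVBT and the main-memory versioned B+-tree.
from dataclasses import dataclass, field

INF = float("inf")

@dataclass
class MultiversionStore:
    """Stand-in for the TMVBT: entries live over a version range [start, end)."""
    entries: dict = field(default_factory=dict)   # key -> [[start, end, value], ...]

    def insert(self, key, value, version):
        self.entries.setdefault(key, []).append([version, INF, value])

    def delete(self, key, version):
        """Logical deletion: close the open span, keeping the history intact."""
        for span in self.entries.get(key, []):
            if span[1] == INF:
                span[1] = version
                break

    def query(self, key, version):
        for start, end, value in self.entries.get(key, []):
            if start <= version < end:
                return value
        return None

@dataclass
class VersionedBuffer:
    """Stand-in for the main-memory index holding active transactions' updates."""
    updates: dict = field(default_factory=dict)   # txn id -> [(op, key, value)]

    def write(self, txn, op, key, value=None):
        self.updates.setdefault(txn, []).append((op, key, value))

    def apply_committed(self, txn, store, commit_version):
        """The maintenance step: move a committed txn's updates into the store,
        all stamped with the same commit version."""
        for op, key, value in self.updates.pop(txn, []):
            if op == "insert":
                store.insert(key, value, commit_version)
            else:
                store.delete(key, commit_version)
```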

    Towards a big data reference architecture


    Rover-II: A Context-Aware Middleware for Pervasive Computing Environment

    It is well recognized that context plays a significant role in all human endeavors. All decisions are based on information that has to be interpreted in context. By making information systems context-aware, we can have systems that significantly enhance human capabilities to make critical decisions. A major challenge of context-aware systems is to balance usability with generality and extensibility. The relevant context changes depending on the particular application, so the model used to represent context and its relationship to entities must be general enough to allow the addition of context categories without redesign while remaining usable across many applications. Also, while application designers and developers put effort into making applications context-aware, these efforts are customized to the specific needs of the target application, and only certain common contexts, such as location and time, are taken into account. Therefore, a general framework is called for that can (i) efficiently maintain, represent, and integrate contextual information, (ii) act as an integration platform where different applications can share contexts, and (iii) provide relevant services to make efficient use of the contextual information. This dissertation presents:
    * a generic and effective context model, the Rover Context Model (RoCoM), structured around four primitives: entities, events, relationships, and activities; and made practically usable through the concept of templates;
    * a flexible, extensible, and generic ontology, the Rover Context Model Ontology (RoCoMO), supporting the model and addressing the shortcomings of existing ontologies;
    * an effective mechanism for modeling the context of a situation, through the concept of relevant context and with the help of a situation graph, that efficiently handles and makes the best use of context information;
    * a context middleware, Rover-II, which serves as a framework for contextual information integration and can be used not just to store and compile contextual information but also to integrate relevant services that enhance it and, more importantly, to enable sharing of context among the applications subscribed to it;
    * the initial design and implementation of a distributed architecture for Rover-II, following a P2P arrangement inspired by Tapestry.
    The above concepts are illustrated through M-Urgency, a context-aware public safety system that has been deployed at the University of Maryland Police Department.
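
    As a rough illustration of the four RoCoM primitives and a situation graph over them (a hypothetical sketch; the dissertation's actual model and ontology are richer):

```python
# Hypothetical sketch of the four RoCoM primitives and a situation graph;
# names and fields are illustrative, not the dissertation's actual API.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Entity:
    entity_id: str
    attributes: Dict[str, str] = field(default_factory=dict)

@dataclass
class Event:
    event_id: str
    timestamp: float

@dataclass
class Relationship:
    kind: str                 # e.g. "located-at", "reported-by"
    source: str               # id of one primitive
    target: str               # id of another

@dataclass
class Activity:
    activity_id: str
    participants: List[str] = field(default_factory=list)

@dataclass
class SituationGraph:
    """Relevant context for a situation: primitives as nodes, relationships as edges."""
    nodes: Dict[str, object] = field(default_factory=dict)
    edges: List[Relationship] = field(default_factory=list)

    def add(self, node_id: str, primitive: object) -> None:
        self.nodes[node_id] = primitive

    def relate(self, kind: str, source: str, target: str) -> None:
        self.edges.append(Relationship(kind, source, target))

    def neighbours(self, node_id: str) -> List[str]:
        """Ids of primitives directly relevant to node_id in this situation."""
        return [e.target for e in self.edges if e.source == node_id]
```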

    Query processing in temporal object-oriented databases

    This PhD thesis is concerned with historical data management in the context of object-oriented databases. An extensible approach has been explored to processing temporal object queries within a uniform query framework. By a uniform framework, we mean that temporal queries can be processed within the existing object-oriented framework, itself extended from the relational framework, by extending the existing query processing techniques and strategies developed for OODBs and RDBs. The unified model of OODBs and RDBs in UniSQL/X has been adopted as a basis for this purpose. A temporal object data model is thereby defined by incorporating a time dimension into this unified model of OODBs and RDBs, forming temporal relational-like cubes extended with aggregation and inheritance hierarchies. A query algebra that accesses objects through these associations of aggregation, inheritance, and time-reference is then defined as a general query model/language. Owing to the extensive features of our data model and the reducibility of the algebra, a layered query processor is presented that provides a uniform framework for processing temporal object queries. Within this framework, query transformation is carried out based on a set of identified transformation rules that includes the known relational and object rules plus those pertaining to the time dimension. To evaluate a temporal query involving a path with time-reference, a strategy of decomposition is proposed. That is, the evaluation of an enhanced path, defined as a path extended with time-reference, is decomposed by initially dividing the path into two sub-paths: one containing the time-stamped class, which can be optimized by making use of the ordering information of temporal data, and another, ordinary sub-path (without time-stamped classes), which can be further decomposed and evaluated using different algorithms. The intermediate results of traversing the two sub-paths are then joined together to create the query output, as sketched below. Algorithms for processing the decomposed query components, i.e., time-related operation algorithms and four join algorithms (nested-loop forward join, sort-merge forward join, nested-loop reverse join, and sort-merge reverse join) with their modifications, are presented with cost analyses and implemented with stream processing techniques using C++; simulation results are also provided. Both the cost analysis and the simulation show the effect of time on the query processing algorithms: the join time cost increases linearly with the number of time-epochs (the time dimension in the case of a regular TS). It is also shown that heuristics that make use of time information can lead to significant time cost savings. Query processing with incomplete temporal data is also discussed.
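
    As an illustration of joining the intermediate results of the two sub-paths, the following sketches a sort-merge join on object identifier with a time-interval overlap test; it is a simplified stand-in for the thesis's forward-join algorithms, with illustrative names:

```python
def sort_merge_temporal_join(left, right):
    """Simplified sketch of a sort-merge join of two sub-path results;
    not the thesis's actual forward-join algorithm.

    Each input is a list of (oid, (t_start, t_end), payload) tuples sorted
    by oid; matching tuples are joined when their time intervals overlap.
    """
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        loid, (ls, le), lpay = left[i]
        roid, _, _ = right[j]
        if loid < roid:
            i += 1
        elif loid > roid:
            j += 1
        else:
            # Same object: scan the run of matching oids on the right.
            k = j
            while k < len(right) and right[k][0] == loid:
                _, (rs, re), rpay = right[k]
                start, end = max(ls, rs), min(le, re)
                if start < end:                      # intervals overlap
                    out.append((loid, (start, end), lpay, rpay))
                k += 1
            i += 1
    return out

if __name__ == "__main__":
    a = [(1, (0, 10), "x"), (2, (5, 8), "y")]
    b = [(1, (4, 12), "p"), (2, (9, 11), "q")]
    print(sort_merge_temporal_join(a, b))   # joins oid 1 over (4, 10); oid 2 has no overlap
```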