2 research outputs found

    A User-driven Annotation Framework for Scientific Data

    Annotations play an increasingly crucial role in scientific exploration and discovery as the amount of data and the level of collaboration among scientists increase. Many systems today focus on annotation management, querying, and propagation. Although all such systems are implemented to take user input (i.e., the annotations themselves), very few are user-driven, taking into account user preferences on how annotations should be propagated and applied over data. In this thesis, we propose to treat annotations as first-class citizens for scientific data by introducing a user-driven, view-based annotation framework. Under this framework, we try to resolve two critical questions. First, how do we support annotations that are scalable from both a system point of view and a user point of view? Second, how do we support annotation queries from both an annotator point of view and a user point of view, in an efficient and accurate way? To address these challenges, we propose the VIew-based annotation Propagation (ViP) framework, which empowers users to express their preferences over the time semantics and the network semantics of annotations, and defines three query types for annotations. To efficiently support this novel functionality, ViP utilizes database views and introduces new annotation caching techniques. The use of views also yields a more compact representation of annotations, making our system easier to scale. Through an extensive experimental study on a real system (with both synthetic and real data), we show that the ViP framework can seamlessly introduce user-driven annotation propagation semantics while significantly improving performance (in terms of query execution time) over the current state of the art.

    Supporting Real-world Activities in Database Management Systems

    The data processing cycle in many application domains is complex and may involve real-world activities that are external to the database, e.g., wet-lab experiments, instrument readings, and manual measurements. These real-world activities may take a long time to prepare for and to perform, and hence introduce inherently long delays between updates in the database. These long delays, along with the need for intermediate results to be instantly available, make supporting real-world activities in the database engine a challenging task. In this paper, we address these challenges through a system that enables users to reflect their updates immediately in the database while keeping track of dependent and potentially invalid data items until they are revalidated. The proposed system includes: (1) semantics and syntax for interfaces through which users can express the dependencies among data items, (2) new operators to alert users when the returned query results contain potentially invalid or out-of-date data, and to enable evaluating queries on either valid data only, or both valid and potentially invalid data, and (3) mechanisms for data invalidation and revalidation. The proposed system is being realized via extensions to PostgreSQL.
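
The three components listed in the abstract above (declared dependencies, queries over valid-only or all data, and invalidation with revalidation) can be illustrated with a minimal in-memory sketch. This is not the authors' PostgreSQL extension: the class `DependencyTracker` and its methods `add_dependency`, `update`, `revalidate`, and `query` are hypothetical names chosen for illustration, and the transitive "mark everything downstream" rule is an assumption about how invalidation might propagate.

```python
from collections import defaultdict, deque


class DependencyTracker:
    """Hypothetical sketch: items, their dependencies, and a set of
    potentially invalid items. Not the paper's actual interface."""

    def __init__(self):
        self.dependents = defaultdict(set)  # source item -> items derived from it
        self.values = {}                    # item -> current value
        self.invalid = set()                # items awaiting revalidation

    def add_dependency(self, source, derived):
        """Declare that `derived` depends on `source`."""
        self.dependents[source].add(derived)

    def update(self, item, value):
        """Apply the update immediately, then mark every transitive
        dependent as potentially invalid (assumed propagation rule)."""
        self.values[item] = value
        self.invalid.discard(item)
        seen = {item}
        queue = deque(self.dependents[item])
        while queue:
            dep = queue.popleft()
            if dep in seen:
                continue
            seen.add(dep)
            self.invalid.add(dep)
            queue.extend(self.dependents[dep])

    def revalidate(self, item, value=None):
        """Mark `item` valid again, optionally storing a re-measured value.
        Unlike `update`, this does not invalidate dependents."""
        if value is not None:
            self.values[item] = value
        self.invalid.discard(item)

    def query(self, include_invalid=False):
        """Return valid items only, or all items when include_invalid is True."""
        return {
            k: v for k, v in self.values.items()
            if include_invalid or k not in self.invalid
        }


tracker = DependencyTracker()
tracker.add_dependency("instrument_reading", "concentration")
tracker.add_dependency("concentration", "dosage")
tracker.update("instrument_reading", 10.0)
tracker.update("concentration", 1.0)
tracker.update("dosage", 2.0)

# A new reading arrives; derived items stay queryable but are flagged.
tracker.update("instrument_reading", 12.0)
valid_only = tracker.query()
everything = tracker.query(include_invalid=True)
```

Here `valid_only` contains only `instrument_reading`, while `everything` still exposes the stale `concentration` and `dosage` values, mirroring the abstract's distinction between evaluating queries on valid data only and on both valid and potentially invalid data.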