578 research outputs found

    Towards Exascale Scientific Metadata Management

    Full text link
    Advances in technology and computing hardware are enabling scientists from all areas of science to produce massive amounts of data using large-scale simulations or observational facilities. In this era of data deluge, effective coordination between the data production and the analysis phases hinges on the availability of metadata that describe the scientific datasets. Existing workflow engines have been capturing a limited form of metadata to provide provenance information about the identity and lineage of the data. However, much of the data produced by simulations, experiments, and analyses still need to be annotated manually in an ad hoc manner by domain scientists. Systematic and transparent acquisition of rich metadata becomes a crucial prerequisite to sustain and accelerate the pace of scientific innovation. Yet, ubiquitous and domain-agnostic metadata management infrastructure that can meet the demands of extreme-scale science is notable by its absence. To address this gap in scientific data management research and practice, we present our vision for an integrated approach that (1) automatically captures and manipulates information-rich metadata while the data is being produced or analyzed and (2) stores metadata within each dataset to permeate metadata-oblivious processes and to query metadata through established and standardized data access interfaces. We motivate the need for the proposed integrated approach using applications from plasma physics, climate modeling and neuroscience, and then discuss research challenges and possible solutions

    Consistency in 3D

    Get PDF
    Comparisons of different consistency models often try to place them in a linear strong-to-weak order. However this view is clearly inadequate, since it is well known, for instance, that Snapshot Isolation and Serialisability are incomparable. In the interest of a better understanding, we propose a new classification, along three dimensions, related to: a total order of writes, a causal order of reads, and transactional composition of multiple operations. A model may be stronger than another on one dimension and weaker on another. We believe that this new classification scheme is both scientifically sound and has good explicative value. The current paper presents the three-dimensional design space intuitively.Les comparaisons entre modèles de la cohérence tentent souvent de les classer dans un ordre linéaire, de faible à forte. Cette vue est clairement inadéquate, puisque il est bien connu que, par exemple, les modèles Snapshot Isolation et Serialisability sont incomparables. Dans l'intérêt d'une meilleure compréhension du domaine, nous proposons une nouvelle classification, en trois dimensions~: les garanties liées à un ordre total des écritures~; celles liées à un ordre causal des lectures~; et celles liées à la composition transactionelle d'opérations multiples. Un modèle peut être plus fort qu'un autre dans une dimension, et moins dans une autre. Nous pensons que ce nouveau schéma de classification, à la fois est scientifiquement valide, et a une bonne valeur explicative. Le présent rapport présente l'espace de conception en trois dimensions de façon intuitive

    Functionally Specified Distributed Transactions in Co-operative Scenarios

    Get PDF
    Addresses the problem of specifying co-operative, distributed transactions in a manner that can be subject to verification and testing. Our approach combines the process-algebraic language LOTOS and the object-oriented database modelling language TM to obtain a clear and formal protocol for distributed database transactions meant to describe co-operation scenarios. We argue that a separation of concerns, namely the interaction of database applications on the one hand and data modelling on the other, results in a practical, modular approach that is formally well-founded. An advantage of this is that we may vary over transaction models to support the language combinatio

    A Transactional Model and Platform for Designing and Implementing Reactive Systems

    Get PDF
    A reactive program is one that has ongoing interactions with its environment. Reactive programs include those for embedded systems, operating systems, network clients and servers, databases, and smart phone apps. Reactive programs are already a core part of our computational and physical infrastructure and will continue to proliferate within our society as new form factors, e.g. wireless sensors, and inexpensive (wireless) networking are applied to new problems. Asynchronous concurrency is a fundamental characteristic of reactive systems that makes them difficult to develop. Threads are commonly used for implementing reactive systems, but they may magnify problems associated with asynchronous concurrency, as there is a gap between the semantics of thread-based computation and the semantics of reactive systems: reactive software developed with threads often has subtle timing bugs and tends to be brittle and non-reusable as a holistic understanding of the software becomes necessary to avoid concurrency hazards such as data races, deadlock, and livelock. Based on these problems with the state of the art, we believe a new model for developing and implementing reactive systems is necessary. This dissertation makes four contributions to the state of the art in reactive systems. First, we propose a formal yet practical model for (asynchronous) reactive systems called reactive components. A reactive component is a set of state variables and atomic transitions that can be composed with other reactive components to yield another reactive component. The transitions in a system of reactive components are executed by a scheduler. The reactive component model is based on concepts from temporal logic and models like UNITY and I/O Automata. The major contribution of the reactive component model is a formal method for principled composition, which ensures that 1) the result of composition is always another reactive component, for consistency of reasoning; 2) systems may be decomposed to an arbitrary degree and depth, to foster divide-and-conquer approaches when designing and re-use when implementing; 3)~the behavior of a reactive component can be stated in terms of its interface, which is necessary for abstraction; and 4) properties of reactive components that are derived from transitions protected by encapsulation are preserved through composition and can never be violated, which permits assume-guarantee reasoning. Second, we develop a prototypical programming language for reactive components called rcgo that is based on the syntax and semantics of the Go programming language. The semantics of the rcgo language enforce various aspects of the reactive component model, e.g., the isolation of state between components and safety of concurrency properties, while permitting a number of useful programming techniques, e.g., reference and move semantics for efficient communication among reactive components. For tractability, we assume that each system contains a fixed set of components in a fixed configuration. Third, we provide an interpreter for the rcgo language to test the practicality of the assumptions upon which the reactive component model are founded. The interpreter contains an algorithm that checks for composition hazards like recursively defined transitions and non-deterministic transitions. Transitions are executed using a novel calling convention that can be implemented efficiently on existing architectures. The run-time system also contains two schedulers that use the results of composition analysis to execute non-interfering transitions concurrently. Fourth, we compare the performance of each scheduler in the interpreter to the performance of a custom compiled multi-threaded program, for two reactive systems. For one system, the combination of the implementation and hardware biases it toward an event-based solution, which was confirmed when the reactive component implementation outperformed the custom implementation due to reduced context switching. For the other system, the custom implementation is not prone to excessive context switches and outperformed the reactive component implementations. These results demonstrate that reactive components may be a viable alternative to threads in practice, but that additional work is necessary to generalize this claim

    State-of-the-art on evolution and reactivity

    Get PDF
    This report starts by, in Chapter 1, outlining aspects of querying and updating resources on the Web and on the Semantic Web, including the development of query and update languages to be carried out within the Rewerse project. From this outline, it becomes clear that several existing research areas and topics are of interest for this work in Rewerse. In the remainder of this report we further present state of the art surveys in a selection of such areas and topics. More precisely: in Chapter 2 we give an overview of logics for reasoning about state change and updates; Chapter 3 is devoted to briefly describing existing update languages for the Web, and also for updating logic programs; in Chapter 4 event-condition-action rules, both in the context of active database systems and in the context of semistructured data, are surveyed; in Chapter 5 we give an overview of some relevant rule-based agents frameworks

    Universes for Race Safety

    No full text
    Race conditions occur when two incorrectly synchronised threads simultaneously access the same object. Static type systems have been suggested to prevent them. Typically, they use annotations to determine the relationship between an object and its “guard ” (another object), and to guarantee that the guard has been locked before the object is accessed. The object-guard relationship thus forms a tree similar to an ownership type hierarchy. Universe types are a simple form of ownership types. We explore the use of universe types for static identification of race conditions. We use a small, Java-like language with universe types and concurrency primitives. We give a type system that enforces synchronisation for all object accesses, and prove that race conditions cannot occur during execution of a type correct program. We support references to objects whose ownership domain is unknown. Unlike previous work, we do so without compromising the synchronisation strategy used where the ownership domain of such objects is fully known. We develop a novel technique for dealing with non-final (i.e. mutable) paths to objects of unknown ownership domain using effects

    Towards Implicit Parallel Programming for Systems

    Get PDF
    Multi-core processors require a program to be decomposable into independent parts that can execute in parallel in order to scale performance with the number of cores. But parallel programming is hard especially when the program requires state, which many system programs use for optimization, such as for example a cache to reduce disk I/O. Most prevalent parallel programming models do not support a notion of state and require the programmer to synchronize state access manually, i.e., outside the realms of an associated optimizing compiler. This prevents the compiler to introduce parallelism automatically and requires the programmer to optimize the program manually. In this dissertation, we propose a programming language/compiler co-design to provide a new programming model for implicit parallel programming with state and a compiler that can optimize the program for a parallel execution. We define the notion of a stateful function along with their composition and control structures. An example implementation of a highly scalable server shows that stateful functions smoothly integrate into existing programming language concepts, such as object-oriented programming and programming with structs. Our programming model is also highly practical and allows to gradually adapt existing code bases. As a case study, we implemented a new data processing core for the Hadoop Map/Reduce system to overcome existing performance bottlenecks. Our lambda-calculus-based compiler automatically extracts parallelism without changing the program's semantics. We added further domain-specific semantic-preserving transformations that reduce I/O calls for microservice programs. The runtime format of a program is a dataflow graph that can be executed in parallel, performs concurrent I/O and allows for non-blocking live updates
    corecore