38,126 research outputs found
Recommended from our members
Modular and Safe Event-Driven Programming
Asynchronous event-driven systems are ubiquitous across domains such as device drivers, distributed systems, and robotics. These systems are notoriously hard to get right as the programmer needs to reason about numerous control paths resulting from the complex interleaving of events (or messages) and failures. Unsurprisingly, it is easy to introduce subtle errors while attempting to fill in gaps between high-level system specifications and their concrete implementations.This dissertation proposes new methods for programming safe event-driven asynchronous systems.In the first part of the thesis, we present ModP, a modular programming framework for compositional programming and testing of event-driven asynchronous systems.The ModP module system supports a novel theory of compositional refinement for assume-guarantee reasoning of dynamic event-driven asynchronous systems. We build a complex distributed systems software stack using ModP.Our results demonstrate that compositional reasoning can help scale model-checking (both explicit and symbolic) to large distributed systems.ModP is transforming the way asynchronous software is built at Microsoft and Amazon Web Services (AWS). Microsoft uses ModP for implementing safe device drivers and other software in the Windows kernel.AWS uses ModP for compositional model checking of complex distributed systems. While ModP simplifies analysis of such systems, the state space of industrial-scale systems remains extremely large.In the second part of this thesis, we present scalable verification and systematic testing approaches to further mitigate this state-space explosion problem.First, we introduce the concept of a delaying explorer to perform prioritized exploration of the behaviors of an asynchronous reactive program. A delaying explorer stratifies the search space using a custom strategy (tailored towards finding bugs faster), and a delay operation that allows deviation from that strategy. We show that prioritized search with a delaying explorer performs significantly better than existing approaches for finding bugs in asynchronous programs.Next, we consider the challenge of verifying time-synchronized systems; these are almost-synchronous systems as they are neither completely asynchronous nor synchronous.We introduce approximate synchrony, a sound and tunable abstraction for verification of almost-synchronous systems. We show how approximate synchrony can be used for verification of both time-synchronization protocols and applications running on top of them.Moreover, we show how approximate synchrony also provides a useful strategy to guide state-space exploration during model-checking.Using approximate synchrony and implementing it as a delaying explorer, we were able to verify the correctness of the IEEE 1588 distributed time-synchronization protocol and, in the process, uncovered a bug in the protocol that was well appreciated by the standards committee.In the final part of this thesis, we consider the challenge of programming a special class of event-driven asynchronous systems -- safe autonomous robotics systems.Our approach towards achieving assured autonomy for robotics systems consists of two parts: (1) a high-level programming language for implementing and validating the reactive robotics software stack; and (2) an integrated runtime assurance system to ensure that the assumptions used during design-time validation of the high-level software hold at runtime.Combining high-level programming language and model-checking with runtime assurance helps us bridge the gap between design-time software validation that makes assumptions about the untrusted components (e.g., low-level controllers), and the physical world, and the actual execution of the software on a real robotic platform in the physical world. We implemented our approach as DRONA, a programming framework for building safe robotics systems.We used DRONA for building a distributed mobile robotics system and deployed it on real drone platforms. Our results demonstrate that DRONA (with the runtime-assurance capabilities) enables programmers to build an autonomous robotics software stack with formal safety guarantees.To summarize, this thesis contributes new theory and tools to the areas of programming languages, verification, systematic testing, and runtime assurance for programming safe asynchronous event-driven across the domains of fault-tolerant distributed systems and safe autonomous robotics systems
Replica determinism and flexible scheduling in hard real-time dependable systems
Fault-tolerant real-time systems are typically based on active replication where replicated entities are required to deliver their outputs in an identical order within a given time interval. Distributed scheduling of replicated tasks, however, violates this requirement if on-line scheduling, preemptive scheduling, or scheduling of dissimilar replicated task sets is employed. This problem of inconsistent task outputs has been solved previously by coordinating the decisions of the local schedulers such that replicated tasks are executed in an identical order. Global coordination results either in an extremely high communication effort to agree on each schedule decision or in an overly restrictive execution model where on-line scheduling, arbitrary preemptions, and nonidentically replicated task sets are not allowed. To overcome these restrictions, a new method, called timed messages, is introduced. Timed messages guarantee deterministic operation by presenting consistent message versions to the replicated tasks. This approach is based on simulated common knowledge and a sparse time base. Timed messages are very effective since they neither require communication between the local scheduler nor do they restrict usage of on-line flexible scheduling, preemptions and nonidentically replicated task sets
An Analysis of Composability and Composition Anomalies
The separation of concerns principle aims at decomposing a given design problem into concerns that are mapped to multiple independent software modules. The application of this principle eases the composition of the concerns and as such supports composability. Unfortunately, a clean separation (and composition of concerns) at the design level does not always imply the composability of the concerns at the implementation level. The composability might be reduced due to limitations of the implementation abstractions and composition mechanisms. The paper introduces the notion of composition anomaly to describe a general set of unexpected composition problems that arise when mapping design concerns to implementation concerns. To distinguish composition anomalies from other composition problems the requirements for composability at the design level is provided. The ideas are illustrated for a distributed newsgroup system
A Concurrent Perspective on Smart Contracts
In this paper, we explore remarkable similarities between multi-transactional
behaviors of smart contracts in cryptocurrencies such as Ethereum and classical
problems of shared-memory concurrency. We examine two real-world examples from
the Ethereum blockchain and analyzing how they are vulnerable to bugs that are
closely reminiscent to those that often occur in traditional concurrent
programs. We then elaborate on the relation between observable contract
behaviors and well-studied concurrency topics, such as atomicity, interference,
synchronization, and resource ownership. The described
contracts-as-concurrent-objects analogy provides deeper understanding of
potential threats for smart contracts, indicate better engineering practices,
and enable applications of existing state-of-the-art formal verification
techniques.Comment: 15 page
Teaching Concurrent Software Design: A Case Study Using Android
In this article, we explore various parallel and distributed computing topics
from a user-centric software engineering perspective. Specifically, in the
context of mobile application development, we study the basic building blocks
of interactive applications in the form of events, timers, and asynchronous
activities, along with related software modeling, architecture, and design
topics.Comment: Submitted to CDER NSF/IEEE-TCPP Curriculum Initiative on Parallel and
Distributed Computing - Core Topics for Undergraduate
Programming with process groups: Group and multicast semantics
Process groups are a natural tool for distributed programming and are increasingly important in distributed computing environments. Discussed here is a new architecture that arose from an effort to simplify Isis process group semantics. The findings include a refined notion of how the clients of a group should be treated, what the properties of a multicast primitive should be when systems contain large numbers of overlapping groups, and a new construct called the causality domain. A system based on this architecture is now being implemented in collaboration with the Chorus and Mach projects
GoFFish: A Sub-Graph Centric Framework for Large-Scale Graph Analytics
Large scale graph processing is a major research area for Big Data
exploration. Vertex centric programming models like Pregel are gaining traction
due to their simple abstraction that allows for scalable execution on
distributed systems naturally. However, there are limitations to this approach
which cause vertex centric algorithms to under-perform due to poor compute to
communication overhead ratio and slow convergence of iterative superstep. In
this paper we introduce GoFFish a scalable sub-graph centric framework
co-designed with a distributed persistent graph storage for large scale graph
analytics on commodity clusters. We introduce a sub-graph centric programming
abstraction that combines the scalability of a vertex centric approach with the
flexibility of shared memory sub-graph computation. We map Connected
Components, SSSP and PageRank algorithms to this model to illustrate its
flexibility. Further, we empirically analyze GoFFish using several real world
graphs and demonstrate its significant performance improvement, orders of
magnitude in some cases, compared to Apache Giraph, the leading open source
vertex centric implementation.Comment: Under review by a conference, 201
A distributed Real-Time Java system based on CSP
CSP is a fundamental concept for developing software for distributed real time systems. The CSP paradigm constitutes a natural addition to object orientation and offers higher order multithreading constructs. The CSP channel concept that has been implemented in Java deals with single- and multi-processor environments and also takes care of the real time priority scheduling requirements. For this, the notion of priority and scheduling has been carefully examined and as a result it was reasoned that priority scheduling should be attached to the communicating channels rather than to the processes. In association with channels, a priority based parallel construct is developed for composing processes: hiding threads and priority indexing from the user. This approach simplifies the use of priorities for the object oriented paradigm. Moreover, in the proposed system, the notion of scheduling is no longer connected to the operating system but has become part of the application instead
- …