
    A Survey on Transactional Stream Processing

    Transactional stream processing (TSP) strives to create a cohesive model that merges the advantages of both transactional and stream-oriented guarantees. Over the past decade, numerous endeavors have contributed to the evolution of TSP solutions, uncovering similarities and distinctions among them. Despite these advances, a universally accepted standard approach for integrating transactional functionality with stream processing remains to be established. Existing TSP solutions predominantly concentrate on specific application characteristics and involve complex design trade-offs. This survey introduces TSP and presents our perspective on its future progression. Our primary goals are twofold: to provide insights into the diverse TSP requirements and methodologies, and to inspire the design and development of groundbreaking TSP systems.
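
    To make the core idea concrete, below is a toy sketch (ours, not any surveyed system's design): stream events are applied to shared state in per-batch ACID transactions, so a failure mid-batch rolls back cleanly and the batch can be replayed, which is the coupling of stream-oriented and transactional guarantees that TSP pursues. Table and event names are illustrative.

```python
import sqlite3

def apply_batch(conn, batch):
    """Apply one batch of (account, delta) events in a single transaction."""
    with conn:  # one transaction per batch: commits on success, rolls back on error
        for account, delta in batch:
            conn.execute(
                "INSERT INTO balances VALUES (?, ?) "
                "ON CONFLICT(account) DO UPDATE SET amount = amount + ?",
                (account, delta, delta))

def process_stream(events, batch_size=2):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE balances (account TEXT PRIMARY KEY, amount INTEGER)")
    batch = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            apply_batch(conn, batch)   # atomic: all-or-nothing per batch
            batch = []
    if batch:
        apply_batch(conn, batch)
    return conn

conn = process_stream([("alice", 10), ("bob", 5), ("alice", -3)])
print(conn.execute("SELECT * FROM balances ORDER BY account").fetchall())
# -> [('alice', 7), ('bob', 5)]
```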

    Self-stabilizing sorting algorithms

    A distributed system consists of a set of machines which do not share a global memory. Depending on the connectivity of the network, each machine gets a partial view of the global state. Transient failures in one area of the network may go unnoticed in other areas and may cause the system to reach an illegal global state. However, if the system were self-stabilizing, it would be guaranteed that, regardless of the current state, the system would recover to a legal configuration in a finite number of moves. The traditional way of building reliable systems is to add redundant components; self-stabilization allows systems to be made fault tolerant through software as well. This is an evolving paradigm in the design of robust distributed systems. The ability to recover spontaneously from an arbitrary state makes self-stabilizing systems immune to transient failures and perturbations of the system state, such as changes in network topology. This thesis presents an O(nh) fault-tolerant distributed sorting algorithm for a tree network, where n is the number of nodes in the system and h is the height of the tree. Fault tolerance is achieved using Dijkstra's paradigm of self-stabilization, a method of non-masking fault tolerance that embeds the fault tolerance within the algorithm. Varghese's counter-flushing method is used to achieve synchronization among processes in the system. In the distributed sorting problem, each node is given a value and an id, both of which are non-corruptible. The idea is to have each node take a specific value based on its id. The algorithm handles transient faults by weeding out false information in the system: nodes can start with completely false information about the values and ids in the system, yet the intended behavior is still achieved. Nodes may also crash and re-enter the system later, and new nodes may enter the system.
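
    As a toy illustration of the self-stabilization idea (not the thesis's O(nh) tree algorithm), consider sorting on a line network where each node sees only itself and its right neighbour. A purely local rule, "if my value and my neighbour's value are out of order, swap", drives the system from any configuration, including an arbitrarily corrupted one, to the sorted (legal) configuration in a finite number of moves:

```python
import random

def self_stabilizing_sort(values, rng=random.Random(0)):
    """Distributed bubble sort under an arbitrary scheduler: converges to the
    sorted configuration from ANY initial state (self-stabilization)."""
    values = list(values)
    moves = 0
    while True:
        # nodes whose local guard "values[i] > values[i+1]" is enabled
        enabled = [i for i in range(len(values) - 1) if values[i] > values[i + 1]]
        if not enabled:
            return values, moves       # legal configuration reached
        i = rng.choice(enabled)        # scheduler picks any enabled node
        values[i], values[i + 1] = values[i + 1], values[i]
        moves += 1

print(self_stabilizing_sort([9, 3, 7, 1, 5]))  # -> ([1, 3, 5, 7, 9], 7)
```

    Each swap fixes exactly one inversion, so the move count equals the number of inversions in the initial (possibly fault-induced) state, no matter which enabled node the scheduler picks.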

    An occam Style Communications System for UNIX Networks

    This document describes the design of a communications system which provides occam style communications primitives under a Unix environment, using TCP/IP protocols and any number of other protocols deemed suitable as underlying transport layers. The system will integrate with a low-overhead scheduler/kernel without incurring significant costs to the execution of processes within the run-time environment. A survey of relevant occam and occam3 features and related research is followed by a look at the Unix and TCP/IP facilities which determine our working constraints, and a description of the T9000 transputer's Virtual Channel Processor, which was instrumental in our formulation. Drawing from the information presented here, a design for the communications system is subsequently proposed. Finally, a preliminary investigation is made of methods for lightweight access control to shared resources in an environment which provides no support for critical sections, semaphores, or busy waiting; this is presented with relevance to the mutual exclusion problems which arise within the proposed design. Future directions for the evolution of this project are discussed in conclusion.
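
    The key property such a system must preserve on top of TCP/IP's buffered byte streams is occam's synchronous, unbuffered channel semantics: a sender blocks until a receiver is ready (rendezvous). A minimal sketch of those semantics between local threads (our illustration, not the proposed design) might look like this:

```python
import threading

class Channel:
    """Unbuffered, point-to-point rendezvous channel (occam's '!' and '?')."""
    def __init__(self):
        self._send_lock = threading.Lock()    # serialize senders
        self._item_ready = threading.Semaphore(0)
        self._item_taken = threading.Semaphore(0)
        self._item = None

    def send(self, item):                     # occam: chan ! item
        with self._send_lock:
            self._item = item
            self._item_ready.release()
            self._item_taken.acquire()        # block until the receiver took it

    def recv(self):                           # occam: chan ? x
        self._item_ready.acquire()            # block until a sender is committed
        item = self._item
        self._item_taken.release()
        return item

chan = Channel()
threading.Thread(target=lambda: chan.send("hello")).start()
print(chan.recv())                            # -> hello
```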

    Virtualized Network Graph Design and Embedding Model to Minimize Provisioning Cost

    The provisioning cost of a virtualized network (VN) depends on several factors, including the numbers of virtual routers (VRs) and virtual links (VLs), their mapping onto a substrate infrastructure, and the routing of data traffic. An existing model, known as the virtual network embedding (VNE) model, determines the embedding of given VN graphs into the substrate infrastructure. When the resource allocation model of the VNE problem is applied to a single-entity scenario, where a single entity fulfills the roles of both a service provider and an infrastructure provider, an issue of increased costs of VNs and access paths arises. This paper proposes a model for virtualized network graph design and embedding (VNDE) for the single-entity scenario. The VNDE model determines the number of VRs and a VN graph for each request in conjunction with embedding. The VNDE model also determines access paths that connect customer premises and VRs. We formulate the VNDE model as an integer linear programming (ILP) problem and develop heuristic algorithms for cases where the ILP problem cannot be solved in practical time. We evaluate the performance of the VNDE model on several networks, including an actual Japanese academic backbone network. Numerical results show that the proposed model designs suitable VN graphs and embeds them according to the volume of traffic demands and the access path cost. Compared with the benchmark model, which is based on a classic VNE approach, the proposed model reduces the provisioning cost by up to 28.7% in our examined scenarios.
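
    To give a flavour of the ILP side, here is a deliberately tiny embedding model in the VNE spirit (hypothetical costs and capacities; the paper's actual VNDE formulation additionally chooses the number of VRs, the VN graph, and the access paths). It assumes the PuLP library:

```python
import pulp

vrs = ["vr1", "vr2"]                        # virtual routers to embed
nodes = ["s1", "s2", "s3"]                  # substrate nodes
cost = {("vr1", "s1"): 4, ("vr1", "s2"): 2, ("vr1", "s3"): 5,
        ("vr2", "s1"): 3, ("vr2", "s2"): 6, ("vr2", "s3"): 1}
capacity = {"s1": 1, "s2": 1, "s3": 1}      # at most one VR per substrate node

prob = pulp.LpProblem("toy_vne", pulp.LpMinimize)
x = pulp.LpVariable.dicts("x", (vrs, nodes), cat="Binary")  # x[v][s]=1: v on s

prob += pulp.lpSum(cost[v, s] * x[v][s] for v in vrs for s in nodes)
for v in vrs:                               # every VR is embedded exactly once
    prob += pulp.lpSum(x[v][s] for s in nodes) == 1
for s in nodes:                             # substrate capacity is respected
    prob += pulp.lpSum(x[v][s] for v in vrs) <= capacity[s]

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print({v: next(s for s in nodes if x[v][s].value() == 1) for v in vrs})
# -> {'vr1': 's2', 'vr2': 's3'}, total cost 3
```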

    Uniform deployment of mobile agents in asynchronous rings

    In this paper, we consider the uniform deployment problem of mobile agents in asynchronous unidirectional rings, which requires the agents to spread uniformly in the ring. The uniform deployment problem is in striking contrast to the rendezvous problem, which requires the agents to meet at the same node: while rendezvous aims to break symmetry, uniform deployment aims to attain symmetry. It is well known that symmetry breaking is difficult in distributed systems, and the rendezvous problem cannot be solved from some initial configurations. Hence, we are interested in clarifying how the uniform deployment problem differs from the rendezvous problem in terms of solvability and the number of agent moves. We consider two problem settings: with knowledge of k (or n), and without knowledge of k or n, where k is the number of agents and n is the number of nodes. First, we consider agents with knowledge of k (or n, since each can be easily obtained if the other is given). In this case, we propose two algorithms. The first algorithm solves the uniform deployment problem with termination detection; it requires O(k log n) memory space per agent, O(n) time, and O(kn) total moves. The second algorithm also solves the uniform deployment problem with termination detection; it reduces the memory space per agent to O(log n), but uses O(n log k) time, and requires O(kn) total moves. Both algorithms are asymptotically optimal in terms of total moves, since there are some initial configurations from which agents require Ω(kn) total moves to solve the problem. Next, we consider agents with no knowledge of k or n. In this case, we show that, when termination detection is required, there exists no algorithm to solve the uniform deployment problem. For this reason, we consider the relaxed uniform deployment problem, which does not require termination detection, and we propose an algorithm to solve it. This algorithm requires O((k/l) log(n/l)) memory space per agent, O(n/l) time, and O(kn/l) total moves when the initial configuration has symmetry degree l. This means that the algorithm solves the problem more efficiently when the initial configuration has a higher symmetry degree (i.e., is closer to uniform deployment). Note that all the proposed algorithms achieve uniform deployment from any initial configuration, which is a striking difference from the rendezvous problem, since rendezvous is not solvable from some initial configurations.
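
    The flavour of the problem can be seen with a small centralized calculation (ours; the paper's algorithms are distributed and each agent has only a local view). With k agents on an n-node unidirectional ring, uniform deployment means consecutive agents end up n/k nodes apart, and even just moving every agent forward to its evenly spaced target can cost Θ(kn) moves:

```python
def uniform_targets(n, k, positions):
    """positions: sorted current nodes of k agents on an n-node ring.
    Returns the evenly spaced targets (anchored at the first agent) and the
    total forward moves needed on a unidirectional ring."""
    base = positions[0]
    targets = [(base + i * n // k) % n for i in range(k)]
    total_moves = sum((t - p) % n for p, t in zip(positions, targets))
    return targets, total_moves

print(uniform_targets(n=12, k=4, positions=[0, 1, 2, 3]))
# -> ([0, 3, 6, 9], 12): gaps of n/k = 3, total moves 0 + 2 + 4 + 6
```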

    DBKnot: A Transparent and Seamless, Pluggable Tamper Evident Database

    Database integrity is crucial to organizations that rely on databases of important data, and such organizations are vulnerable to internal fraud. Database tampering, whether by malicious insiders with high technical authorization over the infrastructure or by external attackers who compromise it, is an important attack vector. This thesis addresses this challenge for a class of problems where data is append-only and immutable. Examples of operations where data does not change are a) financial institutions (banks, accounting systems, stock markets, etc.), b) registries and notary systems where important data is kept but is never subject to change, and c) system logs that must be kept intact for performance and forensic inspection if needed. The target of the approach is implementation seamlessness, with little or no changes required in existing systems. Transaction tracking for tamper detection is done by utilizing a common hashtable that serially and cumulatively hashes transactions together, while using an external time-stamper and signer to sign the resulting linkages. This allows transactions to be tracked without any of the organization's data leaving its premises for a third party, which also reduces the performance impact of tracking. This is done by adding a tracking layer and embedding it inside the data workflow while keeping it as non-invasive as possible. DBKnot implements these features a) natively inside databases, and b) embedded inside Object Relational Mapping (ORM) frameworks, and finally c) outlines a direction for implementing it as a stand-alone microservice reverse proxy. A prototype ORM and database layer has been developed and tested for seamlessness of integration and ease of use. Additionally, different optimizations that introduce pipelining parallelism into the hashing/signing process have been tested to check their impact on performance. Stock-market information was used for experimentation with DBKnot; the initial results show slightly less than a 100% increase in transaction time for the most basic, sequential, synchronous version of DBKnot. The signing and hashing overhead per record does not increase significantly with the amount of data, and a number of alternative optimizations to the design, validated by testing, resulted in significant performance improvements.
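
    The heart of the tracking layer is the serial, cumulative hashing of appended records. A minimal sketch (names are ours, not DBKnot's API; the external time-stamper/signer is reduced to a comment):

```python
import hashlib

def chain_records(records, prev_digest=b"\x00" * 32):
    """Hash each appended record together with the running digest, so altering
    or removing any earlier record changes every later digest."""
    digests = []
    for record in records:
        h = hashlib.sha256()
        h.update(prev_digest)      # cumulative link to all prior records
        h.update(record)
        prev_digest = h.digest()
        digests.append(prev_digest)
    # Periodically the latest digest would be sent to an external, trusted
    # time-stamper/signer, anchoring the whole chain without any of the
    # organization's data leaving its premises.
    return digests

log = [b"txn1:alice->bob:10", b"txn2:bob->carol:5"]
tampered = [b"txn1:alice->bob:99", b"txn2:bob->carol:5"]
assert chain_records(log)[-1] != chain_records(tampered)[-1]  # tamper evident
```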

    Network-Compute Co-Design for Distributed In-Memory Computing

    The booming popularity of online services is rapidly raising the demands on modern datacenters. In order to cope with the data deluge, growing user bases, and tight quality-of-service constraints, service providers deploy massive datacenters with tens to hundreds of thousands of servers, keeping petabytes of latency-critical data memory resident. This data distribution and the multi-tiered nature of the software used by feature-rich services result in frequent inter-server communication and remote memory access over the network. Hence, networking takes center stage in datacenters. In response to growing internal datacenter network traffic, networking technology is rapidly evolving. Lean user-level protocols, like RDMA, and high-performance fabrics have started making their appearance, dramatically reducing datacenter-wide network latency and offering unprecedented per-server bandwidth. At the same time, the end of Dennard scaling is grinding processor performance improvements to a halt. The net result is a growing mismatch between per-server network and compute capabilities: it will soon be difficult for a server processor to utilize all of its available network bandwidth. Restoring balance between network and compute capabilities requires tighter co-design of the two. The network interface (NI) is of particular interest, as it lies on the boundary of network and compute. In this thesis, we focus on the design of an NI for a lightweight RDMA-like protocol and its full integration with modern manycore server processors. The NI capabilities scale with both the increasing network bandwidth and the growing number of cores on modern server processors. Leveraging our architecture's integrated NI logic, we introduce new functionality at the network endpoints that yields performance improvements for distributed systems. Such additions include new network operations with stronger semantics tailored to common application requirements, and integrated logic for balancing network load across a modern processor's multiple cores. We make the case that exposing richer, end-to-end semantics to the NI is a unique enabler for optimizations that can reduce software complexity and remove significant load from the processor, contributing towards maintaining balance between the two valuable resources of network and compute. Overall, network-compute co-design addresses the challenges associated with the emerging technological mismatch of compute and networking capabilities, yielding significant performance improvements for distributed memory systems.
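
    As one concrete example of NI-integrated load balancing (our toy sketch under assumed policies, not the thesis's actual NI logic), an NI can steer incoming messages by flow hash to preserve cache affinity, spilling to the least-loaded core only when the flow's home core is overloaded:

```python
import zlib

NUM_CORES = 8
queue_depth = [0] * NUM_CORES             # outstanding messages per core

def steer(flow_id: bytes) -> int:
    """Pick a core for an incoming message: keep flow-to-core affinity while
    the home core is lightly loaded, otherwise choose the least-loaded core."""
    home = zlib.crc32(flow_id) % NUM_CORES
    if queue_depth[home] <= 2 + min(queue_depth):   # home core not overloaded
        return home
    return min(range(NUM_CORES), key=queue_depth.__getitem__)

core = steer(b"client-42:conn-7")
queue_depth[core] += 1                    # message enqueued to the chosen core
print(core)
```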