
161 research outputs found

Packet Transactions: High-level Programming for Line-Rate Switches

Author: Alizadeh Mohammad
Balakrishnan Hari
Budiu Mihai
Cheung Alvin
Kim Changhoon
Licking Steve
McKeown Nick
Sivaraman Anirudh
Varghese George
Publication date: 29/01/2016

Many algorithms for congestion control, scheduling, network measurement, active queue management, security, and load balancing require custom processing of packets as they traverse the data plane of a network switch. To run at line rate, these data-plane algorithms must be in hardware. With today's switch hardware, algorithms cannot be changed, nor new algorithms installed, after a switch has been built. This paper shows how to program data-plane algorithms in a high-level language and compile those programs into low-level microcode that can run on emerging programmable line-rate switching chipsets. The key challenge is that these algorithms create and modify algorithmic state. The key idea to achieve line-rate programmability for stateful algorithms is the notion of a packet transaction: a sequential code block that is atomic and isolated from other such code blocks. We have developed this idea in Domino, a C-like imperative language to express data-plane algorithms. We show with many examples that Domino provides a convenient and natural way to express sophisticated data-plane algorithms, and show that these algorithms can be run at line rate with modest estimated die-area overhead. Comment: 16 pages
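The packet-transaction idea in the abstract above can be illustrated with a minimal sketch. This is plain Python, not Domino, and it does not reflect how the paper's compiler works: the `Packet` fields, the `ByteBudgetTransaction` class, and the byte-budget rule are all hypothetical, and the lock only models in software the atomicity and isolation that the paper obtains in switch hardware.

```python
import threading
from dataclasses import dataclass


@dataclass
class Packet:
    # Hypothetical packet fields, chosen only for this illustration.
    flow_id: int
    size: int
    mark: bool = False


class ByteBudgetTransaction:
    """Hypothetical stateful block: mark packets once a flow exceeds a byte budget."""

    def __init__(self, budget: int) -> None:
        self.budget = budget
        self.bytes_seen: dict[int, int] = {}  # persistent per-flow state
        self._lock = threading.Lock()

    def process(self, pkt: Packet) -> Packet:
        # The whole body is one sequential block. Holding the lock makes it
        # atomic and isolated from the processing of any other packet.
        with self._lock:
            total = self.bytes_seen.get(pkt.flow_id, 0) + pkt.size
            self.bytes_seen[pkt.flow_id] = total
            pkt.mark = total > self.budget
        return pkt


tx = ByteBudgetTransaction(budget=100)
first = tx.process(Packet(flow_id=1, size=60))
second = tx.process(Packet(flow_id=1, size=60))
```

Here the read-modify-write of `bytes_seen` is the "algorithmic state" the abstract refers to: if two packets of the same flow could interleave inside `process`, one update could be lost.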

arXiv.org e-Print Archive

DSpace@MIT

Crossref


Building Distributed Systems with Non-Volatile Main Memories and RDMA Networks

Author: Yang Jian
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

High-performance, byte-addressable non-volatile main memories (NVMMs) allow application developers to combine storage and memory into a single layer. These high-performance storage systems would be especially useful in large-scale data center environments where data is distributed and replicated across multiple servers. Unfortunately, existing approaches to providing remote storage access rest on the assumption that storage is slow, so the cost of the software and protocols is acceptable. Such an assumption no longer holds for fast NVMM. As a result, taking full advantage of NVMMs' potential will require changes in system software and networking protocols. This thesis focuses on accessing remote NVMM efficiently using remote direct memory access (RDMA) networks. RDMA enables a client to directly access memory on a remote machine without involving its local CPU. This thesis first presents Mojim, a system that provides replicated, reliable, and highly-available NVMM as an operating system service. Applications can access data in Mojim using normal load and store instructions while controlling when and how updates propagate to replicas using system calls. Our evaluation shows Mojim adds little overhead to the un-replicated system and provides 0.4x to 2.7x the throughput of the un-replicated system. This thesis then presents Orion, a distributed file system designed for NVMM and RDMA networks. Traditional distributed file systems are designed for slower hard drives. These slower media incentivize complex optimizations (e.g., queuing, striping, and batching) around disk accesses. Orion combines file system functions and network operations into a single layer. It provides low-latency metadata accesses and outperforms existing distributed file systems by a large margin. Finally, an NVMM application can map files backed by an NVMM file system into its address space and access them using CPU instructions. In this case, RDMA and NVMM file systems introduce duplication of effort on permissions, naming, and address translation. We introduce two changes to the existing RDMA protocol: the file memory region (FileMR) and range-based address translation. By eliminating redundant translations, FileMR minimizes the number of translations done at the NIC, reducing the load on the NIC's translation cache and resulting in an application performance improvement of 1.8x - 2.0x.

eScholarship - University of California

A novel program synthesis approach in test driven software development

Author: Ferencz Endre
Goldschmidt Balázs
Publication venue: 'Akademiai Kiado Zrt.'
Publication date: 01/01/2017

Automatically generating Java source code from the specification provided by the associated unit tests is a viable alternative. This possibility may seem far-fetched in the general case, but after considering the most common restrictions, which are applied nowadays as best practice, it turns out that a significant part of the production code can be generated automatically. The goal is to generate viable implementations that fulfill the requirements imposed by unit tests. According to the presented vision, modern test frameworks, development guidelines, and computational capacities make it possible to reach this goal.

Crossref

Repository of the Academy's Library

Avionics graphics hardware performance prediction with machine learning

Author: Bois Guy
Boland Jean-François
Girard Simon R.
Legault Vincent
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2019

Within the strongly regulated avionic engineering field, conventional graphical desktop hardware and software application programming interfaces (APIs) cannot be used because they do not conform to the avionic certification standards. We observe the need for better avionic graphical hardware, but system engineers lack system design tools related to graphical hardware. The endorsement of an optimal hardware architecture by estimating the performance of graphical software, when a stable rendering engine does not yet exist, represents a major challenge. As proven by previous hardware emulation tools, there is also a potential for development cost reduction, by enabling developers to have a first estimation of the performance of their graphical engine early in the development cycle. In this paper, we propose to replace expensive development platforms with predictive software running on a desktop computer. More precisely, we present a system design tool that helps predict the rendering performance of graphical hardware based on the OpenGL Safety Critical API. First, we create nonparametric models of the underlying hardware, with machine learning, by analyzing the instantaneous frames per second (FPS) of the rendering of a synthetic 3D scene and by drawing multiple times with various characteristics that are typically found in synthetic vision applications. The number of characteristic combinations used during this supervised training phase is a subset of all possible combinations, but performance predictions can be arbitrarily extrapolated. To validate our models, we render an industrial scene with characteristic combinations not used during the training phase and we compare the predictions to the real values. We find a median prediction error of less than 4 FPS.

Directory of Open Access Journals

PolyPublie

Transactions with isolation and cooperation

Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2007

Crossref

Accurate and efficient processor performance prediction via regression tree based modeling

Author: Balachandran Ramadass
Bin Li
Lu Peng
Publication date: 03/04/2020

Computer architects usually evaluate new designs using cycle-accurate processor simulation. This approach provides a detailed insight into processor performance, power consumption and complexity. However, only configurations in a subspace can be simulated in practice due to long simulation time and limited resources, leading to suboptimal conclusions which might not apply to a larger design space. In this paper, we propose a performance prediction approach which employs state-of-the-art techniques from experiment design, machine learning and data mining. According to our experiments on single and multi-core processors, our prediction model generates highly accurate estimations for unsampled points in the design space and shows robustness for the worst-case prediction. Moreover, the model provides quantitative interpretation tools that help investigators to efficiently tune design parameters and remove performance bottlenecks.

CiteSeerX

Transactions with isolation and cooperation

Author: Anthony Kay
Carlstrom Brian D.
Chung JaeWoong
Gosling James
Gray Jim
Hammond Lance
Hicks Michael
Hoare CAR.
Michal Young
Reimer Behrends
Yannis Smaragdakis
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Towards Porting Operating Systems with Program Synthesis

Author: Chong Stephen
Holland David A.
Hu Jingmei
Kawaguchi Ming
Lu Eric
Seltzer Margo I.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 22/09/2022

The end of Moore's Law has ushered in a diversity of hardware not seen in decades. Operating system (and system software) portability is accordingly becoming increasingly critical. Simultaneously, there has been tremendous progress in program synthesis. We set out to explore the feasibility of using modern program synthesis to generate the machine-dependent parts of an operating system. Our ultimate goal is to generate new ports automatically from descriptions of new machines. One of the issues involved is writing specifications, both for machine-dependent operating system functionality and for instruction set architectures. We designed two domain-specific languages: Alewife for machine-independent specifications of machine-dependent operating system functionality and Cassiopea for describing instruction set architecture semantics. Automated porting also requires an implementation. We developed a toolchain that, given an Alewife specification and a Cassiopea machine description, specializes the machine-independent specification to the target instruction set architecture and synthesizes an implementation in assembly language with a customized symbolic execution engine. Using this approach, we demonstrate successful synthesis of a total of 140 OS components from two pre-existing OSes for four real hardware platforms. We also developed several optimization methods for OS-related assembly synthesis to improve scalability. The effectiveness of our languages and ability to synthesize code for all 140 specifications is evidence of the feasibility of program synthesis for machine-dependent OS code. However, many research challenges remain; we also discuss the benefits and limitations of our synthesis-based approach to automated OS porting. Comment: ACM Transactions on Programming Languages and Systems. Accepted on August 202

arXiv.org e-Print Archive

(No)Compromis: Paging Virtualization Is Not a Fatality

Author: Hagimont Daniel
Hermenier Fabien
Muller Gilles
Tchana Alain
Teabe Djomgwe Boris
Yuhala Peterson
Publication venue: HAL CCSD
Publication date: 16/04/2021

Nested/Extended Page Table (EPT) is the current hardware solution for virtualizing memory in virtualized systems. It induces a significant performance overhead due to the 2D page walk it requires, thus 24 memory accesses on a TLB miss (instead of 4 memory accesses in a native system). This 2D page walk constraint comes from the utilization of paging for managing virtual machine (VM) memory. This paper shows that paging is not necessary in the hypervisor. Our solution Compromis, a novel Memory Management Unit, uses direct segments for VM memory management combined with paging for VM's processes. This is the first time that a direct segment based solution is shown to be applicable to the entire VM memory while keeping applications unchanged. Relying on the 310 studied datacenter traces, the paper shows that it is possible to provision up to 99.99% of the VMs using a single memory segment. The paper presents a systematic methodology for implementing Compromis in the hardware, the hypervisor and the datacenter scheduler. Evaluation results show that Compromis outperforms the two popular memory virtualization solutions: shadow paging and EPT by up to 30% and 370% respectively.
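The 24-versus-4 figure quoted in the abstract above follows from the usual accounting for a two-dimensional page walk. The sketch below reproduces that arithmetic under the assumption of 4-level page tables in both guest and host; that assumption is not stated in the abstract, but it is consistent with its native figure of 4 accesses. The function names are hypothetical.

```python
def native_walk_refs(levels: int) -> int:
    """Memory references for one TLB miss with a single page table."""
    return levels


def nested_walk_refs(guest_levels: int, host_levels: int) -> int:
    """Memory references for one TLB miss under nested paging.

    Every guest-physical address met during the walk needs its own host
    walk: one per guest page-table level plus one for the final data
    address. The guest page-table entries themselves are read as well.
    """
    host_translations = (guest_levels + 1) * host_levels
    guest_entry_reads = guest_levels
    return host_translations + guest_entry_reads


# Assumed configuration: 4-level tables in both dimensions.
native = native_walk_refs(4)
nested = nested_walk_refs(4, 4)
```

With 4 levels in each dimension this gives (4 + 1) × 4 + 4 = 24 references, against 4 for the native walk.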

INRIA a CCSD electronic archive server