223 research outputs found

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 251, ITCS 2023, Complete Volum

    Resilient and Scalable Forwarding for Software-Defined Networks with P4-Programmable Switches

    Get PDF
    Traditional networking devices support only fixed features and limited configurability. Network softwarization leverages programmable software and hardware platforms to remove those limitations. In this context the concept of programmable data planes allows directly to program the packet processing pipeline of networking devices and create custom control plane algorithms. This flexibility enables the design of novel networking mechanisms where the status quo struggles to meet high demands of next-generation networks like 5G, Internet of Things, cloud computing, and industry 4.0. P4 is the most popular technology to implement programmable data planes. However, programmable data planes, and in particular, the P4 technology, emerged only recently. Thus, P4 support for some well-established networking concepts is still lacking and several issues remain unsolved due to the different characteristics of programmable data planes in comparison to traditional networking. The research of this thesis focuses on two open issues of programmable data planes. First, it develops resilient and efficient forwarding mechanisms for the P4 data plane as there are no satisfying state of the art best practices yet. Second, it enables BIER in high-performance P4 data planes. BIER is a novel, scalable, and efficient transport mechanism for IP multicast traffic which has only very limited support of high-performance forwarding platforms yet. The main results of this thesis are published as 8 peer-reviewed and one post-publication peer-reviewed publication. The results cover the development of suitable resilience mechanisms for P4 data planes, the development and implementation of resilient BIER forwarding in P4, and the extensive evaluations of all developed and implemented mechanisms. Furthermore, the results contain a comprehensive P4 literature study. Two more peer-reviewed papers contain additional content that is not directly related to the main results. They implement congestion avoidance mechanisms in P4 and develop a scheduling concept to find cost-optimized load schedules based on day-ahead forecasts

    p\ell_p-Regression in the Arbitrary Partition Model of Communication

    Full text link
    We consider the randomized communication complexity of the distributed p\ell_p-regression problem in the coordinator model, for p(0,2]p\in (0,2]. In this problem, there is a coordinator and ss servers. The ii-th server receives Ai{M,M+1,,M}n×dA^i\in\{-M, -M+1, \ldots, M\}^{n\times d} and bi{M,M+1,,M}nb^i\in\{-M, -M+1, \ldots, M\}^n and the coordinator would like to find a (1+ϵ)(1+\epsilon)-approximate solution to minxRn(iAi)x(ibi)p\min_{x\in\mathbb{R}^n} \|(\sum_i A^i)x - (\sum_i b^i)\|_p. Here Mpoly(nd)M \leq \mathrm{poly}(nd) for convenience. This model, where the data is additively shared across servers, is commonly referred to as the arbitrary partition model. We obtain significantly improved bounds for this problem. For p=2p = 2, i.e., least squares regression, we give the first optimal bound of Θ~(sd2+sd/ϵ)\tilde{\Theta}(sd^2 + sd/\epsilon) bits. For p(1,2)p \in (1,2),we obtain an O~(sd2/ϵ+sd/poly(ϵ))\tilde{O}(sd^2/\epsilon + sd/\mathrm{poly}(\epsilon)) upper bound. Notably, for dd sufficiently large, our leading order term only depends linearly on 1/ϵ1/\epsilon rather than quadratically. We also show communication lower bounds of Ω(sd2+sd/ϵ2)\Omega(sd^2 + sd/\epsilon^2) for p(0,1]p\in (0,1] and Ω(sd2+sd/ϵ)\Omega(sd^2 + sd/\epsilon) for p(1,2]p\in (1,2]. Our bounds considerably improve previous bounds due to (Woodruff et al. COLT, 2013) and (Vempala et al., SODA, 2020)

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 261, ICALP 2023, Complete Volum

    Lifting Theorems Meet Information Complexity: Known and New Lower Bounds of Set-disjointness

    Full text link
    Set-disjointness problems are one of the most fundamental problems in communication complexity and have been extensively studied in past decades. Given its importance, many lower bound techniques were introduced to prove communication lower bounds of set-disjointness. Combining ideas from information complexity and query-to-communication lifting theorems, we introduce a density increment argument to prove communication lower bounds for set-disjointness: We give a simple proof showing that a large rectangle cannot be 00-monochromatic for multi-party unique-disjointness. We interpret the direct-sum argument as a density increment process and give an alternative proof of randomized communication lower bounds for multi-party unique-disjointness. Avoiding full simulations in lifting theorems, we simplify and improve communication lower bounds for sparse unique-disjointness. Potential applications to be unified and improved by our density increment argument are also discussed.Comment: Working Pape

    LIPIcs, Volume 274, ESA 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 274, ESA 2023, Complete Volum

    Design and Code Optimization for Systems with Next-generation Racetrack Memories

    Get PDF
    With the rise of computationally expensive application domains such as machine learning, genomics, and fluids simulation, the quest for performance and energy-efficient computing has gained unprecedented momentum. The significant increase in computing and memory devices in modern systems has resulted in an unsustainable surge in energy consumption, a substantial portion of which is attributed to the memory system. The scaling of conventional memory technologies and their suitability for the next-generation system is also questionable. This has led to the emergence and rise of nonvolatile memory ( NVM ) technologies. Today, in different development stages, several NVM technologies are competing for their rapid access to the market. Racetrack memory ( RTM ) is one such nonvolatile memory technology that promises SRAM -comparable latency, reduced energy consumption, and unprecedented density compared to other technologies. However, racetrack memory ( RTM ) is sequential in nature, i.e., data in an RTM cell needs to be shifted to an access port before it can be accessed. These shift operations incur performance and energy penalties. An ideal RTM , requiring at most one shift per access, can easily outperform SRAM . However, in the worst-cast shifting scenario, RTM can be an order of magnitude slower than SRAM . This thesis presents an overview of the RTM device physics, its evolution, strengths and challenges, and its application in the memory subsystem. We develop tools that allow the programmability and modeling of RTM -based systems. For shifts minimization, we propose a set of techniques including optimal, near-optimal, and evolutionary algorithms for efficient scalar and instruction placement in RTMs . For array accesses, we explore schedule and layout transformations that eliminate the longer overhead shifts in RTMs . We present an automatic compilation framework that analyzes static control flow programs and transforms the loop traversal order and memory layout to maximize accesses to consecutive RTM locations and minimize shifts. We develop a simulation framework called RTSim that models various RTM parameters and enables accurate architectural level simulation. Finally, to demonstrate the RTM potential in non-Von-Neumann in-memory computing paradigms, we exploit its device attributes to implement logic and arithmetic operations. As a concrete use-case, we implement an entire hyperdimensional computing framework in RTM to accelerate the language recognition problem. Our evaluation shows considerable performance and energy improvements compared to conventional Von-Neumann models and state-of-the-art accelerators

    The White-Box Adversarial Data Stream Model

    Full text link
    We study streaming algorithms in the white-box adversarial model, where the stream is chosen adaptively by an adversary who observes the entire internal state of the algorithm at each time step. We show that nontrivial algorithms are still possible. We first give a randomized algorithm for the L1L_1-heavy hitters problem that outperforms the optimal deterministic Misra-Gries algorithm on long streams. If the white-box adversary is computationally bounded, we use cryptographic techniques to reduce the memory of our L1L_1-heavy hitters algorithm even further and to design a number of additional algorithms for graph, string, and linear algebra problems. The existence of such algorithms is surprising, as the streaming algorithm does not even have a secret key in this model, i.e., its state is entirely known to the adversary. One algorithm we design is for estimating the number of distinct elements in a stream with insertions and deletions achieving a multiplicative approximation and sublinear space; such an algorithm is impossible for deterministic algorithms. We also give a general technique that translates any two-player deterministic communication lower bound to a lower bound for {\it randomized} algorithms robust to a white-box adversary. In particular, our results show that for all p0p\ge 0, there exists a constant Cp>1C_p>1 such that any CpC_p-approximation algorithm for FpF_p moment estimation in insertion-only streams with a white-box adversary requires Ω(n)\Omega(n) space for a universe of size nn. Similarly, there is a constant C>1C>1 such that any CC-approximation algorithm in an insertion-only stream for matrix rank requires Ω(n)\Omega(n) space with a white-box adversary. Our algorithmic results based on cryptography thus show a separation between computationally bounded and unbounded adversaries. (Abstract shortened to meet arXiv limits.)Comment: PODS 202

    LIPIcs, Volume 244, ESA 2022, Complete Volume

    Get PDF
    LIPIcs, Volume 244, ESA 2022, Complete Volum
    corecore