192 research outputs found

    Improving low latency applications for reconfigurable devices

    Get PDF
    This thesis seeks to improve low latency application performance via architectural improvements in reconfigurable devices. This is achieved by improving resource utilisation and access, and by exploiting the different environments within which reconfigurable devices are deployed. Our first contribution leverages devices deployed at the network level to enable the low latency processing of financial market data feeds. Financial exchanges transmit messages via two identical data feeds to reduce the chance of message loss. We present an approach to arbitrate these redundant feeds at the network level using a Field-Programmable Gate Array (FPGA). With support for any messaging protocol, we evaluate our design using the NASDAQ TotalView-ITCH, OPRA, and ARCA data feed protocols, and provide two simultaneous outputs: one prioritising low latency, and one prioritising high reliability with three dynamically configurable windowing methods. Our second contribution is a new ring-based architecture for low latency, parallel access to FPGA memory. Traditional FPGA memory is formed by grouping block memories (BRAMs) together and accessing them as a single device. Our architecture accesses these BRAMs independently and in parallel. Targeting memory-based computing, which stores pre-computed function results in memory, we benefit low latency applications that rely on: highly-complex functions; iterative computation; or many parallel accesses to a shared resource. We assess square root, power, trigonometric, and hyperbolic functions within the FPGA, and provide a tool to convert Python functions to our new architecture. Our third contribution extends the ring-based architecture to support any FPGA processing element. We unify E heterogeneous processing elements within compute pools, with each element implementing the same function, and the pool serving D parallel function calls. Our implementation-agnostic approach supports processing elements with different latencies, implementations, and pipeline lengths, as well as non-deterministic latencies. Compute pools evenly balance access to processing elements across the entire application, and are evaluated by implementing eight different neural network activation functions within an FPGA.Open Acces

    Modeling Data-Plane Power Consumption of Future Internet Architectures

    Full text link
    With current efforts to design Future Internet Architectures (FIAs), the evaluation and comparison of different proposals is an interesting research challenge. Previously, metrics such as bandwidth or latency have commonly been used to compare FIAs to IP networks. We suggest the use of power consumption as a metric to compare FIAs. While low power consumption is an important goal in its own right (as lower energy use translates to smaller environmental impact as well as lower operating costs), power consumption can also serve as a proxy for other metrics such as bandwidth and processor load. Lacking power consumption statistics about either commodity FIA routers or widely deployed FIA testbeds, we propose models for power consumption of FIA routers. Based on our models, we simulate scenarios for measuring power consumption of content delivery in different FIAs. Specifically, we address two questions: 1) which of the proposed FIA candidates achieves the lowest energy footprint; and 2) which set of design choices yields a power-efficient network architecture? Although the lack of real-world data makes numerous assumptions necessary for our analysis, we explore the uncertainty of our calculations through sensitivity analysis of input parameters

    NetFPGA: status, uses, developments, challenges, and evaluation

    Get PDF
    The constant growth of the Internet, driven by the demand for timely access to data center networks; has meant that the technological platforms necessary to achieve this purpose are outside the current budgets. In this order to make and validate relevant, timely and relevant contributions; it is necessary that a wider community, access to evaluation, experimentation and demonstration environments with specifications that can be compared with existing networking solutions. This article introduces the NetFPGA, which is a platform to develop network hardware for reconfigurable and rapid prototyping. It’s introduces the application areas in high-performance networks, advantages for traffic analysis, packet flow, hardware acceleration, power consumption and parallel processing in real time. Likewise, it presents the advantages of the platform for research, education, innovation, and future trends of this platform. Finally, we present a performance evaluation of the tool called OSNT (Open-Source Network Tester) and shows that OSNT has 95% accuracy of timestamp with resolution of 10ns for the generation of TCP traffic, and 90% efficiency capturing packets at 10Gbps of full line-rate

    Techniques for Processing TCP/IP Flow Content in Network Switches at Gigabit Line Rates

    Get PDF
    The growth of the Internet has enabled it to become a critical component used by businesses, governments and individuals. While most of the traffic on the Internet is legitimate, a proportion of the traffic includes worms, computer viruses, network intrusions, computer espionage, security breaches and illegal behavior. This rogue traffic causes computer and network outages, reduces network throughput, and costs governments and companies billions of dollars each year. This dissertation investigates the problems associated with TCP stream processing in high-speed networks. It describes an architecture that simplifies the processing of TCP data streams in these environments and presents a hardware circuit capable of TCP stream processing on multi-gigabit networks for millions of simultaneous network connections. Live Internet traffic is analyzed using this new TCP processing circuit

    A High-Throughput Hardware Implementation of NAT Traversal For IPSEC VPN

    Get PDF
    In this paper, we present a high-throughput FPGA implementation of IPSec core. The core supports both NAT and non-NAT mode and can be used in high speed security gateway devices. Although IPSec ESP is very computing intensive for its cryptography process, our implementation shows that it can achieve high throughput and low lantency. The system is realized on the Zynq XC7Z045 from Xilinx and was verified and tested in practice. Results show that the design can gives a peak throughput of 5.721 Gbps for the IPSec ESP tunnel mode in NAT mode and 7.753 Gbps in non-NAT mode using one single AES encrypt core. We also compare the performance of the core when running in other mode of encryption
    corecore