593 research outputs found

    Extensible FlexRay communication controller for FPGA-based automotive systems

    Modern vehicles incorporate an increasing number of distributed compute nodes, resulting in the need for faster and more reliable in-vehicle networks. Time-triggered protocols such as FlexRay have been gaining ground as the standard for high-speed reliable communications in the automotive industry, marking a shift away from the event-triggered medium access used in controller area networks (CANs). These new standards enable the higher levels of determinism and reliability demanded from next-generation safety-critical applications. Advanced applications can benefit from tight coupling of the embedded computing units with the communication interface, thereby providing functionality beyond the FlexRay standard. Such an approach is highly suited to implementation on reconfigurable architectures. This paper describes a field-programmable gate array (FPGA)-based communication controller (CC) that features configurable extensions to provide functionality that is unavailable with standard implementations or off-the-shelf devices. It is implemented and verified on a Xilinx Spartan 6 FPGA, integrated with both a logic-based hardware electronic control unit (ECU) and a fully fledged processor-based ECU. Results show that the platform-centric implementation generates a highly efficient core in terms of power, performance, and resource utilization. We demonstrate that the flexible extensions help enable advanced applications that integrate features such as fault tolerance, timeliness, and security, with practical case studies. This tight integration between the controller, computational functions, and flexible extensions on the controller enables enhancements that open the door for exciting applications in future vehicles.

    A Scalable Parallel Architecture with FPGA-Based Network Processor for Scientific Computing

    This thesis discusses the design and implementation of an FPGA-based Network Processor for scientific computing, such as Lattice Quantum Chromodynamics (LQCD) and fluid-dynamics applications based on Lattice Boltzmann Methods (LBM). State-of-the-art programs in these (and other similar) applications have a large degree of available parallelism that can be easily exploited on massively parallel systems, provided the underlying communication network has not only high bandwidth but also low latency. I have designed in detail, built and tested in hardware, firmware and software an implementation of a Network Processor (NWP), tailored for the most recent families of multi-core processors. The implementation has been developed on an FPGA device to easily interface the logic of the NWP with the CPU I/O sub-system. In this work I have assessed several ways to move data between the main memory of the CPU and the I/O sub-system to exploit high data throughput and low latency, enabling the use of “Programmed Input Output” (PIO), “Direct Memory Access” (DMA) and “Write Combining” memory settings. On the software side, I developed and tested a device driver for the Linux operating system to access the NWP device, as well as a system library to efficiently access the network device from user applications. This thesis demonstrates the feasibility of a network infrastructure that saturates the maximum bandwidth of the I/O sub-systems available on recent CPUs, and reduces communication latencies to values very close to those needed by the processor to move data across the chip boundary.

    Real Time Control on Firewire

    The goal of this project is to get insight into the use of Firewire as a field bus for real-time control. A characterization of Firewire's asynchronous transmission has been made by testing the point-to-point roundtrip in a 3-node Firewire network. The results show that Firewire's asynchronous transmission between 2 PC/104 stacks, implemented using FCP (Functional Control Protocol), can transfer data with an average latency between 100 μs and 140 μs, depending on the data payload. The maximum variation in this latency is 20 μs. During this project, it was also found that the roundtrip time does not increase significantly as the payload grows: it rises only around 40 μs from a 1-byte payload to a 449-byte payload. So the bus is used most efficiently when the packet is fully loaded. To get a complete insight into Firewire, it is recommended to implement Firewire's isochronous transmission and examine its applicability for real-time control as well. Using Firewire as a field bus in a real distributed control system with a plant is also advised to get a better understanding of Firewire.

    Reproducible Host Networking Evaluation with End-to-End Simulation

    Networking researchers are facing growing challenges in evaluating and reproducing results for modern network systems. As systems rely on closer integration of system components and cross-layer optimizations in the pursuit of performance and efficiency, they are also increasingly tied to specific hardware and testbed properties. Combined with a trend towards heterogeneous hardware, such as protocol offloads, SmartNICs, and in-network accelerators, researchers face the choice of either investing more and more time and resources into comparisons to prior work or, alternatively, lowering the standards for evaluation. We aim to address this challenge by introducing SimBricks, a simulation framework that decouples networked systems from the physical testbed and enables reproducible end-to-end evaluation in simulation. Instead of reinventing the wheel, SimBricks is a modular framework for combining existing tried-and-true simulators for individual components (processor and memory, NIC, and network) into complete testbeds capable of running unmodified systems. In our evaluation, we reproduce key findings from prior work, including DCTCP congestion control, NOPaxos in-network consensus acceleration, and the Corundum FPGA NIC. Comment: 15 pages, 10 figures, under submission

    Evaluation of an FPGA and PCI Bus based Readout Buffer for the Atlas Experiment

    This dissertation evaluates a readout buffer system for the ATLAS detector trigger and data acquisition system. ATLAS is a high energy physics experiment at the Large Hadron Collider (LHC) with the aim to reach new frontiers in the investigation of the structure of matter. The high precision ATLAS detector produces a huge amount of data, 40 TByte/s, which is reduced by a three-level trigger system for online event data selection. The readout buffer system acts as a data buffer while the second trigger level computes the trigger decision. ATLAS uses a sequential selection in the level 2 trigger, which means that all event data required for the trigger decision is requested from the readout buffer component step by step. This increases the complexity of the readout buffer device and its output event rate. Furthermore, a region-of-interest (RoI) concept limits the amount of data necessary for the processing of one event inside the level 2 processor by defining the detector region with interesting data. Thus, an output rate of approximately 10 kHz has to be provided while ~1 kByte data packets arrive at 100 kHz at the input. The evaluated implementation of this readout buffer should be based on commercial "off-the-shelf" hardware. Thus a conventional Linux server PC with four PCI Bus segments has been used. This approach leads to uniformity in the ATLAS data acquisition system because all hardware beginning with the second trigger level is built of similar PCs. But a standard PC is not able to meet the previously mentioned requirements. Therefore it is extended (or accelerated) by a number of PCI based FPGA co-processor boards. Considering the above mentioned sequential selection and RoI concept, such a complex buffer component based on standard server PCs and FPGA co-processors has never been investigated before in high energy physics. The FPGA co-processor is a simple component extending the PC for the time critical receiving and buffering of data. It is able to process data from four ATLAS detector links, which allows the grouping of 12 to 16 links in one PC. Measurements show that this system is able to sustain the ATLAS requirements. Currently the Linux OS, running on the PC system and handling the Gigabit Ethernet network I/O with the rest of the data acquisition system, is the main bottleneck. Improving this could be the subject of future investigations.

    Integrating mobile and cloud resources management using the cloud personal assistant

    The mobile cloud computing model promises to address the resource limitations of mobile devices, but effectively implementing this model is difficult. Previous work on mobile cloud computing has required the user to have a continuous, high-quality connection to the cloud infrastructure. This is undesirable and possibly infeasible, as the energy required on the mobile device to maintain a connection and transfer sizeable amounts of data is large, and the bandwidth tends to be quite variable, and low on cellular networks. The cloud deployment itself needs to efficiently allocate scalable resources to the user as well. In this paper, we formulate the best practices for efficiently managing the resources required for the mobile cloud model, namely energy, bandwidth and cloud computing resources. These practices can be realised with our mobile cloud middleware project, featuring the Cloud Personal Assistant (CPA). We compare this with the other approaches in the area, to highlight the importance of minimising the usage of these resources, and therefore ensure successful adoption of the model by end users. Based on results from experiments performed with mobile devices, we develop a no-overhead decision model for task and data offloading to the CPA of a user, which provides efficient management of mobile cloud resources.

    Quarc: an architecture for efficient on-chip communication

    The exponential downscaling of the feature size has enforced a paradigm shift from computation-based design to communication-based design in system on chip development. Buses, the traditional communication architecture in systems on chip, are incapable of addressing the increasing bandwidth requirements of future large systems. Networks on chip have emerged as an interconnection architecture offering unique solutions to the technological and design issues related to communication in future systems on chip. The transition from buses as a shared medium to networks on chip as a segmented medium has given rise to new challenges in the system on chip realm. By leveraging the shared nature of the communication medium, buses have been highly efficient in delivering multicast communication. The segmented nature of networks, however, prevents multicast messages from being delivered as efficiently by networks on chip. Relying on extensive research on multicast communication in parallel computers, several network on chip architectures have offered mechanisms to perform the operation, while conforming to the resource constraints of the network on chip paradigm. Multicast communication in the majority of these networks on chip is implemented by establishing a connection between the source and all multicast destinations before the message transmission commences. Establishing the connections incurs an overhead and, therefore, is not desirable, in particular in latency-sensitive services such as cache coherence. To address high performance multicast communication, this research presents Quarc, a novel network on chip architecture. The Quarc architecture targets an area-efficient, low power, high performance implementation. The thesis covers a detailed representation of the building blocks of the architecture, including topology, router and network interface. The cost and performance comparison of the Quarc architecture against other network on chip architectures reveals that the Quarc architecture is a highly efficient architecture. Moreover, the thesis introduces novel performance models of complex traffic patterns, including multicast and quality of service-aware communication.

    The MANGO clockless network-on-chip: Concepts and implementation
