1,039 research outputs found

On Multicast in Asynchronous Networks-on-Chip: Techniques, Architectures, and FPGA Implementation

Author: Bhardwaj Kshitij
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2018

In this era of exascale computing, conventional synchronous design techniques are facing unprecedented challenges. The consumer electronics market is replete with many-core systems in the range of 16 cores to thousands of cores on chip, integrating billions of transistors. However, with this ever-increasing complexity, the traditional design approaches are facing key issues such as increasing chip power, process variability, aging, thermal problems, and scalability. An alternative paradigm that has gained significant interest in the last decade is asynchronous design. Asynchronous designs have several potential advantages: they are naturally energy proportional, burning power only when active, do not require complex clock distribution, are robust to different forms of variability, and provide ease of composability for heterogeneous platforms. Networks-on-chip (NoCs) are an interconnect paradigm introduced to deal with the ever-increasing system complexity. NoCs provide a distributed, scalable, and efficient interconnect solution for today’s many-core systems. Moreover, NoCs are a natural match with asynchronous design techniques, as they separate communication infrastructure and timing from the computational elements. To this end, globally-asynchronous locally-synchronous (GALS) systems that interconnect multiple processing cores, operating at different clock speeds, using an asynchronous NoC, have gained significant interest. While asynchronous NoCs have several advantages, they also face a key challenge of supporting new types of traffic patterns. One such pattern is multicast communication, where a source sends packets to an arbitrary number of destinations. Multicast is common not only in parallel computing, such as for cache coherency, but also in emerging areas such as neuromorphic computing. This important capability has been largely missing from asynchronous NoCs.
This thesis introduces several efficient multicast solutions for these interconnects. In particular, techniques and network architectures are introduced to support high-performance and low-power multicast. Two leading network topologies are the focus: a variant mesh-of-trees (MoT) and a 2D mesh. In addition, for a more realistic implementation and analysis, as well as significantly advancing the field of asynchronous NoCs, this thesis also targets synthesis of these NoCs on commercial FPGAs. While there have been significant advances in FPGA technologies, there has been only limited research on implementing asynchronous NoCs on FPGAs. To this end, a systematic computer-aided design (CAD) methodology has been introduced to efficiently and safely map asynchronous NoCs on FPGAs. Overall, this thesis makes the following three contributions. The first contribution is a multicast solution for a variant MoT network topology. This topology consists of simple low-radix switches, and has been used in high-performance computing platforms. A novel local speculation technique is introduced, where a subset of the network’s switches are speculative and always broadcast every packet. These switches are very simple and have high performance. Speculative switches are surrounded by non-speculative ones that route packets based on their destinations and also throttle any redundant copies created by the former. This hybrid network architecture achieved significant performance and power benefits over other multicast approaches. The second contribution is a multicast solution for a 2D-mesh topology, which is more complex with higher-radix switches and also is more commonly used. A novel continuous-time replication strategy is introduced to optimize the critical multi-way forking operation of a multicast transmission.
In this technique, a multicast packet is first stored in an input port of a switch, from where it is sent through distinct output ports towards different destinations concurrently, at each output’s own rate and in continuous time. This strategy is shown to have significant latency and energy benefits over an approach that performs multicast using multiple distinct serial unicasts to each destination. Finally, a systematic CAD methodology is introduced to synthesize asynchronous NoCs on commercial FPGAs. A two-fold goal is targeted: correctness and high performance. For ease of implementation, only existing FPGA synthesis tools are used. Moreover, since asynchronous NoCs involve special asynchronous components, a comprehensive guide is introduced to map these elements correctly and efficiently. Two asynchronous NoC switches are synthesized using the proposed approach on a leading Xilinx FPGA in 28 nm: one that only handles unicast, and the other that also supports multicast. Both showed significant energy benefits with some performance gains over a state-of-the-art synchronous switch.
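The local speculation idea in the abstract above can be illustrated with a small behavioral model. This is only a sketch under simplifying assumptions, not the thesis's architecture: it uses a single binary fanout tree rather than the variant MoT topology, ignores timing and handshaking, and the function name `multicast` and its parameters are hypothetical. Switches at the levels listed in `speculative_levels` broadcast every packet; all other switches route by destination and drop copies that no destination below them needs.

```python
def multicast(dests, depth, speculative_levels=frozenset()):
    """Send one packet from the root of a binary fanout tree to `dests`.

    Switches sit at levels 0..depth-1; leaves 0..2**depth-1 are destinations.
    Returns (set of leaves that received the packet, number of link hops).
    """
    if depth - 1 in speculative_levels:
        # Assumption of this sketch: the last switch level must filter,
        # otherwise redundant copies would reach wrong destinations.
        raise ValueError("last switch level must be non-speculative")
    dests = set(dests)
    delivered, hops = set(), 0
    pending = [(0, 0)]  # (level, index) of switches currently holding a copy
    while pending:
        level, idx = pending.pop()
        child_span = 1 << (depth - level - 1)  # leaves under each child
        for child in (2 * idx, 2 * idx + 1):
            lo = child * child_span
            wanted = any(lo <= d < lo + child_span for d in dests)
            # Speculative switches broadcast; the others route by destination
            # and silently drop (throttle) copies nobody below them needs.
            if level in speculative_levels or wanted:
                hops += 1
                if level + 1 == depth:
                    delivered.add(child)
                else:
                    pending.append((level + 1, child))
    return delivered, hops
```

For destinations {1, 6} in a depth-3 tree, the model uses 6 link hops with no speculation and 8 hops when level 1 is speculative. The two extra hops are the redundant copies that the non-speculative switches at level 2 throttle, so the delivered set is the same in both cases.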

Columbia University Academic Commons

Methods of Congestion Control for Adaptive Continuous Media

Author: Tater Shalini
Publication venue: Oxford Brookes University
Publication date: 01/01/2002

Since the first exchange of data between machines in different locations in the early 1960s, computer networks have grown exponentially with millions of people now using the Internet. With this, there has also been a rapid increase in different kinds of services offered over the World Wide Web from simple e-mails to streaming video. It is generally accepted that the commonly used protocol suite TCP/IP alone is not adequate for a number of modern applications with high bandwidth and minimal delay requirements. Many technologies are emerging, such as IPv6, DiffServ, and IntServ, which aim to replace the one-size-fits-all approach of the current IPv4. There is a consensus that the networks will have to be capable of multi-service and will have to isolate different classes of traffic through bandwidth partitioning such that, for example, low priority best-effort traffic does not cause delay for high priority video traffic. However, this research identifies that even within a class there may be delays or losses due to congestion and the problem will require different solutions in different classes. The focus of this research is on the requirements of the adaptive continuous media class. These are traffic flows that require a good Quality of Service but are also able to adapt to the network conditions by accepting some degradation in quality. It is potentially the most flexible traffic class and therefore, one of the most useful types for an increasing number of applications. This thesis discusses the QoS requirements of adaptive continuous media and identifies an ideal feedback-based control system that would be suitable for this class. A number of current methods of congestion control have been investigated and two methods that have been shown to be successful with data traffic have been evaluated to ascertain if they could be adapted for adaptive continuous media. A novel method of control based on percentile monitoring of the queue occupancy is then proposed and developed.
Simulation results demonstrate that the percentile-monitoring-based method is more appropriate to this type of flow. The problem of congestion control at aggregating nodes of the network hierarchy, where thousands of adaptive flows may be aggregated to a single flow, is then considered. A unique method of pricing mean and variance is developed such that each individual flow is charged fairly for its contribution to the congestion.
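The abstract does not give the details of the percentile-monitoring controller, so the following is only a generic sketch of the idea: sample the queue occupancy over a sliding window and raise a congestion signal when a chosen percentile of the samples exceeds a threshold. The class name `PercentileMonitor`, the nearest-rank percentile rule, and all parameter values are assumptions made for illustration.

```python
from collections import deque


class PercentileMonitor:
    """Sliding-window percentile of queue occupancy (hypothetical sketch)."""

    def __init__(self, window=100, percentile=90, threshold=50):
        self.samples = deque(maxlen=window)  # oldest samples fall out
        self.percentile = percentile         # integer in 1..100
        self.threshold = threshold           # occupancy, e.g. in packets

    def observe(self, occupancy):
        self.samples.append(occupancy)

    def value(self):
        """Nearest-rank percentile of the samples in the window."""
        if not self.samples:
            raise ValueError("no samples observed yet")
        ordered = sorted(self.samples)
        # Integer ceiling of percentile/100 * n avoids float rounding.
        rank = (self.percentile * len(ordered) + 99) // 100
        return ordered[max(rank, 1) - 1]

    def congested(self):
        """True when the monitored percentile exceeds the threshold."""
        return bool(self.samples) and self.value() > self.threshold
```

A feedback loop built on this would ask adaptive sources to reduce their rate while `congested()` is true. Compared with monitoring the mean, a high percentile reacts to how often the queue is nearly full rather than to its average level.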

Oxford Brookes University: RADAR

On-board B-ISDN fast packet switching architectures. Phase 2: Development. Proof-of-concept architecture definition report

Author: Redman Wayne, Shyy Dong-Jye

For the next-generation packet switched communications satellite system with onboard processing and spot-beam operation, a reliable onboard fast packet switch is essential to route packets from different uplink beams to different downlink beams. The rapid emergence of point-to-point services such as video distribution, and the large demand for video conference, distributed data processing, and network management make the multicast function essential to a fast packet switch (FPS). The satellite's inherent broadcast features give the satellite network an advantage over the terrestrial network in providing multicast services. This report evaluates alternate multicast FPS architectures for onboard baseband switching applications and selects a candidate for subsequent breadboard development. Architecture evaluation and selection will be based on the study performed in phase 1, 'Onboard B-ISDN Fast Packet Switching Architectures', and other switch architectures which have become commercially available as large scale integration (LSI) devices.

NASA Technical Reports Server

Thermal-Aware Networked Many-Core Systems

Author: Vaddina Kameswar Rao
Publication venue: Turku Centre for Computer Science
Publication date: 23/05/2014

Advancements in IC processing technology have led to the innovation and growth happening in the consumer electronics sector and the evolution of the IT infrastructure supporting this exponential growth. One of the most difficult obstacles to this growth is the removal of the large amount of heat generated by the processing and communicating nodes on the system. The scaling down of technology and the increase in power density have a direct and consequential effect on the rise in temperature. This has resulted in an increase in cooling budgets, and affects both the life-time reliability and performance of the system. Hence, reducing on-chip temperatures has become a major design concern for modern microprocessors. This dissertation addresses the thermal challenges at different levels for both 2D planar and 3D stacked systems. It proposes a self-timed thermal monitoring strategy based on the liberal use of on-chip thermal sensors. This makes use of noise variation tolerant and leakage current based thermal sensing for monitoring purposes. In order to study thermal management issues from early design stages, accurate thermal modeling and analysis at design time is essential. In this regard, the spatial temperature profile of the global Cu nanowire for on-chip interconnects has been analyzed. The dissertation presents a 3D thermal model of a multicore system in order to investigate the effects of hotspots and the placement of silicon die layers on the thermal performance of a modern flip-chip package. For a 3D stacked system, the primary design goal is to maximise the performance within the given power and thermal envelopes. Hence, a thermally efficient routing strategy for 3D NoC-Bus hybrid architectures has been proposed to mitigate on-chip temperatures by herding most of the switching activity to the die which is closer to the heat sink. Finally, an exploration of various thermal-aware placement approaches for both the 2D and 3D stacked systems has been presented.
Various thermal models have been developed and thermal control metrics have been extracted. An efficient thermal-aware application mapping algorithm for a 2D NoC has been presented. It has been shown that the proposed mapping algorithm reduces the effective area reeling under high temperatures when compared to the state of the art.

UTUPub

Leveraging Conventional Internet Routing Protocol Behavior to Defeat DDoS and Adverse Networking Conditions

Author: Smith Jared M
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/08/2020

The Internet is a cornerstone of modern society. Yet increasingly devastating attacks against the Internet threaten to undermine the Internet’s success at connecting the unconnected. Of all the adversarial campaigns waged against the Internet and the organizations that rely on it, distributed denial of service, or DDoS, tops the list of the most volatile attacks. In recent years, DDoS attacks have been responsible for large swaths of the Internet blacking out, while other attacks have completely overwhelmed key Internet services and websites. Core to the Internet’s functionality is the way in which traffic on the Internet gets from one destination to another. The set of rules, or protocol, that defines the way traffic travels the Internet is known as the Border Gateway Protocol, or BGP, the de facto routing protocol on the Internet. Advanced adversaries often target the most used portions of the Internet by flooding the routes benign traffic takes with malicious traffic designed to cause widespread traffic loss to targeted end users and regions. This dissertation focuses on examining the following thesis statement. Rather than seek to redefine the way the Internet works to combat advanced DDoS attacks, we can leverage conventional Internet routing behavior to mitigate modern distributed denial of service attacks. The research in this work breaks down into a single arc with three independent, but connected thrusts, which demonstrate that the aforementioned thesis is possible, practical, and useful. The first thrust demonstrates that this thesis is possible by building and evaluating Nyx, a system that can protect Internet networks from DDoS using BGP, without an Internet redesign and without cooperation from other networks. This work reveals that Nyx is effective in simulation for protecting Internet networks and end users from the impact of devastating DDoS.
The second thrust examines the real-world practicality of Nyx, as well as other systems which rely on real-world BGP behavior. Through a comprehensive set of real-world Internet routing experiments, this second thrust confirms that Nyx works effectively in practice beyond simulation and reveals novel insights about the effectiveness of other Internet security defensive and offensive systems. We then follow these experiments by re-evaluating Nyx under the real-world routing constraints we discovered. The third thrust explores the usefulness of Nyx for mitigating DDoS against a crucial industry sector, power generation, by exposing the latent vulnerability of the U.S. power grid to DDoS and how a system such as Nyx can protect electric power utilities. This final thrust finds that the current set of exposed U.S. power facilities are widely vulnerable to DDoS that could induce blackouts, and that Nyx can be leveraged to reduce the impact of these targeted DDoS attacks.

University of Tennessee, Knoxville: Trace

Performance of data aggregation for wireless sensor networks

Author: Feng Jie
Publication venue: University of Saskatchewan Library
Publication date: 01/01/2010

This thesis focuses on three fundamental issues that concern data aggregation protocols for periodic data collection in sensor networks: which sensor nodes should report their data, when should they report it, and should they use unicast or broadcast-based protocols for this purpose. The issue of when nodes should report their data is considered in the context of real-time monitoring applications. The first part of this thesis shows that asynchronous aggregation, in which the time of each node’s transmission is determined adaptively based on its local history of past packet receptions from its children, outperforms synchronous aggregation by providing lower delay for a given end-to-end loss rate. Second, new broadcast-based aggregation protocols that minimize the number of packet transmissions, relying on multipath delivery rather than automatic repeat request for reliability, are designed and evaluated. The performance of broadcast-based aggregation is compared to that of unicast-based aggregation, in the context of both real-time and delay-tolerant data collection. Finally, this thesis investigates the potential benefits of dynamically, rather than semi-statically, determining the set of nodes reporting their data, in the context of applications in which coverage of some monitored region is to be maintained. Unicast and broadcast-based coverage-preserving data aggregation protocols are designed and evaluated. The performance of the proposed protocols is compared to that of data collection protocols relying on node scheduling.
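The asynchronous aggregation described above lets each node pick its own transmission time from what it has recently observed of its children. The abstract does not state the exact rule, so the sketch below uses a stand-in policy that is purely an assumption: wait until the latest arrival seen from any child in the recent history, plus a safety margin. The class name `AdaptiveAggregator` and all parameters are hypothetical.

```python
from collections import deque


class AdaptiveAggregator:
    """Per-node adaptive wait time before forwarding an aggregate (sketch)."""

    def __init__(self, children, history=10, margin_ms=50, default_wait_ms=1000):
        # Recent arrival delays per child, in ms since the start of a round.
        self.arrivals = {c: deque(maxlen=history) for c in children}
        self.margin_ms = margin_ms
        self.default_wait_ms = default_wait_ms

    def record(self, child, delay_ms):
        """Note when a child's packet arrived in the round just finished."""
        self.arrivals[child].append(delay_ms)

    def wait_ms(self):
        """How long to wait for children in the next round before sending."""
        if not self.arrivals:
            return 0  # leaf node: nothing to wait for
        if any(len(h) == 0 for h in self.arrivals.values()):
            return self.default_wait_ms  # no history yet for some child
        latest = max(max(h) for h in self.arrivals.values())
        return latest + self.margin_ms
```

A synchronous scheme would instead give every node at the same tree depth a fixed timeout. Here the wait shrinks automatically for nodes whose children report quickly, which is the kind of delay saving the abstract attributes to the asynchronous approach.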

eCommons@USASK

University of Saskatchewan Research Archive