31 research outputs found

    New Fault Tolerant Multicast Routing Techniques to Enhance Distributed-Memory Systems Performance

    Get PDF
    Distributed-memory systems are a key to achieve high performance computing and the most favorable architectures used in advanced research problems. Mesh connected multicomputer are one of the most popular architectures that have been implemented in many distributed-memory systems. These systems must support communication operations efficiently to achieve good performance. The wormhole switching technique has been widely used in design of distributed-memory systems in which the packet is divided into small flits. Also, the multicast communication has been widely used in distributed-memory systems which is one source node sends the same message to several destination nodes. Fault tolerance refers to the ability of the system to operate correctly in the presence of faults. Development of fault tolerant multicast routing algorithms in 2D mesh networks is an important issue. This dissertation presents, new fault tolerant multicast routing algorithms for distributed-memory systems performance using wormhole routed 2D mesh. These algorithms are described for fault tolerant routing in 2D mesh networks, but it can also be extended to other topologies. These algorithms are a combination of a unicast-based multicast algorithm and tree-based multicast algorithms. These algorithms works effectively for the most commonly encountered faults in mesh networks, f-rings, f-chains and concave fault regions. It is shown that the proposed routing algorithms are effective even in the presence of a large number of fault regions and large size of fault region. These algorithms are proved to be deadlock-free. Also, the problem of fault regions overlap is solved. Four essential performance metrics in mesh networks will be considered and calculated; also these algorithms are a limited-global-information-based multicasting which is a compromise of local-information-based approach and global-information-based approach. Data mining is used to validate the results and to enlarge the sample. The proposed new multicast routing techniques are used to enhance the performance of distributed-memory systems. Simulation results are presented to demonstrate the efficiency of the proposed algorithms

    Quarc: an architecture for efficient on-chip communication

    Get PDF
    The exponential downscaling of the feature size has enforced a paradigm shift from computation-based design to communication-based design in system on chip development. Buses, the traditional communication architecture in systems on chip, are incapable of addressing the increasing bandwidth requirements of future large systems. Networks on chip have emerged as an interconnection architecture offering unique solutions to the technological and design issues related to communication in future systems on chip. The transition from buses as a shared medium to networks on chip as a segmented medium has given rise to new challenges in system on chip realm. By leveraging the shared nature of the communication medium, buses have been highly efficient in delivering multicast communication. The segmented nature of networks, however, inhibits the multicast messages to be delivered as efficiently by networks on chip. Relying on extensive research on multicast communication in parallel computers, several network on chip architectures have offered mechanisms to perform the operation, while conforming to resource constraints of the network on chip paradigm. Multicast communication in majority of these networks on chip is implemented by establishing a connection between source and all multicast destinations before the message transmission commences. Establishing the connections incurs an overhead and, therefore, is not desirable; in particular in latency sensitive services such as cache coherence. To address high performance multicast communication, this research presents Quarc, a novel network on chip architecture. The Quarc architecture targets an area-efficient, low power, high performance implementation. The thesis covers a detailed representation of the building blocks of the architecture, including topology, router and network interface. The cost and performance comparison of the Quarc architecture against other network on chip architectures reveals that the Quarc architecture is a highly efficient architecture. Moreover, the thesis introduces novel performance models of complex traffic patterns, including multicast and quality of service-aware communication

    Architectural Support for Efficient Communication in Future Microprocessors

    Get PDF
    Traditionally, the microprocessor design has focused on the computational aspects of the problem at hand. However, as the number of components on a single chip continues to increase, the design of communication architecture has become a crucial and dominating factor in defining performance models of the overall system. On-chip networks, also known as Networks-on-Chip (NoC), emerged recently as a promising architecture to coordinate chip-wide communication. Although there are numerous interconnection network studies in an inter-chip environment, an intra-chip network design poses a number of substantial challenges to this well-established interconnection network field. This research investigates designs and applications of on-chip interconnection network in next-generation microprocessors for optimizing performance, power consumption, and area cost. First, we present domain-specific NoC designs targeted to large-scale and wire-delay dominated L2 cache systems. The domain-specifically designed interconnect shows 38% performance improvement and uses only 12% of the mesh-based interconnect. Then, we present a methodology of communication characterization in parallel programs and application of characterization results to long-channel reconfiguration. Reconfigured long channels suited to communication patterns enhance the latency of the mesh network by 16% and 14% in 16-core and 64-core systems, respectively. Finally, we discuss an adaptive data compression technique that builds a network-wide frequent value pattern map and reduces the packet size. In two examined multi-core systems, cache traffic has 69% compressibility and shows high value sharing among flows. Compression-enabled NoC improves the latency by up to 63% and saves energy consumption by up to 12%

    Adaptive Hybrid Switching Technique for Parallel Computing System

    Get PDF
    Parallel processing accelerates computations by solving a single problem using multiple compute nodes interconnected by a network. The scalability of a parallel system is limited byits ability to communicate and coordinate processing. Circuit switching, packet switchingand wormhole routing are dominant switching techniques. Our simulation results show that wormhole routing and circuit switching each excel under different types of traffic.This dissertation presents a hybrid switching technique that combines wormhole routing with circuit switching in a single switch using vrtual channels and time division multiplexing. The performance of this hybrid switch is significantly impacted by the effciency of traffic scheduling and thus, this dissertation also explores the design and scalability of hardware scheduling for the hybrid switch. In particular, we introduce two schedulers for crossbar networks: a greedy scheduler and an optimal scheduler that improves upon the resultsprovided by the greedy scheduler. For the time division multiplexing portion of the hybrid switch, this dissertation presents three allocation methods that combine wormhole switching with predictive circuit switching. We further extend this research from crossbar networks to fat tree interconnected networks with virtual channels. The global "level-wise" scheduling algorithm is presented and improves network utilization by 30% when compared to a switch-level algorithm. The performance of the hybrid switching is evaluated on a cycle accurate simulation framework that is also part of this dissertation research. Our experimental results demonstrate that the hybrid switch is capable of transferring both predictable traffics and unpredictable traffics successfully. By dynamically selecting the proper switching technique based on the type of communication traffic, the hybrid switch improves communication for most types of traffic

    Efficient Interconnection Network Design for Heterogeneous Architectures

    Get PDF
    The onset of big data and deep learning applications, mixed with conventional general-purpose programs, have driven computer architecture to embrace heterogeneity with specialization. With the ever-increasing interconnected chip components, future architectures are required to operate under a stricter power budget and process emerging big data applications efficiently. Interconnection network as the communication backbone thus is facing the grand challenges of limited power envelope, data movement and performance scaling. This dissertation provides interconnect solutions that are specialized to application requirements towards power-/energy-efficient and high-performance computing for heterogeneous architectures. This dissertation examines the challenges of network-on-chip router power-gating techniques for general-purpose workloads to save static power. A voting approach is proposed as an adaptive power-gating policy that considers both local and global traffic status through router voting. In addition, low-latency routing algorithms are designed to guarantee performance in irregular power-gating networks. This holistic solution not only saves power but also avoids performance overhead. This research also introduces emerging computation paradigms to interconnects for big data applications to mitigate the pressure of data movement. Approximate network-on-chip is proposed to achieve high-throughput communication by means of lossy compression. Then, near-data processing is combined with in-network computing to further improve performance while reducing data movement. The two schemes are general to play as plug-ins for different network topologies and routing algorithms. To tackle the challenging computational requirements of deep learning workloads, this dissertation investigates the compelling opportunities of communication algorithm-architecture co-design to accelerate distributed deep learning. MultiTree allreduce algorithm is proposed to bond with message scheduling with network topology to achieve faster and contention-free communication. In addition, the interconnect hardware and flow control are also specialized to exploit deep learning communication characteristics and fulfill the algorithm needs, thereby effectively improving the performance and scalability. By considering application and algorithm characteristics, this research shows that interconnection network can be tailored accordingly to improve the power-/energy-efficiency and performance to satisfy heterogeneous computation and communication requirements

    On Collaborative Intrusion Detection

    Get PDF
    Cyber-attacks have nowadays become more frightening than ever before. The growing dependency of our society on networked systems aggravates these threats; from interconnected corporate networks and Industrial Control Systems (ICSs) to smart households, the attack surface for the adversaries is increasing. At the same time, it is becoming evident that the utilization of classic fields of security research alone, e.g., cryptography, or the usage of isolated traditional defense mechanisms, e.g., firewalls and Intrusion Detection Systems ( IDSs ), is not enough to cope with the imminent security challenges. To move beyond monolithic approaches and concepts that follow a “cat and mouse” paradigm between the defender and the attacker, cyber-security research requires novel schemes. One such promis- ing approach is collaborative intrusion detection. Driven by the lessons learned from cyber-security research over the years, the aforesaid notion attempts to connect two instinctive questions: “if we acknowledge the fact that no security mechanism can detect all attacks, can we beneficially combine multiple approaches to operate together?” and “as the adversaries increasingly collaborate (e.g., Distributed Denial of Service (DDoS) attacks from whichever larger botnets) to achieve their goals, can the defenders beneficially collude too?”. Collabora- tive intrusion detection attempts to address the emerging security challenges by providing methods for IDSs and other security mech- anisms (e.g., firewalls and honeypots) to combine their knowledge towards generating a more holistic view of the monitored network. This thesis improves the state of the art in collaborative intrusion detection in several areas. In particular, the dissertation proposes methods for the detection of complex attacks and the generation of the corresponding intrusion detection signatures. Moreover, a novel approach for the generation of alert datasets is given, which can assist researchers in evaluating intrusion detection algorithms and systems. Furthermore, a method for the construction of communities of collab- orative monitoring sensors is given, along with a domain-awareness approach that incorporates an efficient data correlation mechanism. With regard to attacks and countermeasures, a detailed methodology is presented that is focusing on sensor-disclosure attacks in the con- text of collaborative intrusion detection. The scientific contributions can be structured into the following categories: Alert data generation: This thesis deals with the topic of alert data generation in a twofold manner: first it presents novel approaches for detecting complex attacks towards generating alert signatures for IDSs ; second a method for the synthetic generation of alert data is pro- posed. In particular, a novel security mechanism for mobile devices is proposed that is able to support users in assessing the security status of their networks. The system can detect sophisticated attacks and generate signatures to be utilized by IDSs . The dissertation also touches the topic of synthetic, yet realistic, dataset generation for the evaluation of intrusion detection algorithms and systems; it proposes a novel dynamic dataset generation concept that overcomes the short- comings of the related work. Collaborative intrusion detection: As a first step, the the- sis proposes a novel taxonomy for collaborative intrusion detection ac- companied with building blocks for Collaborative IDSs ( CIDSs ). More- over, the dissertation deals with the topics of (alert) data correlation and aggregation in the context of CIDSs . For this, a number of novel methods are proposed that aim at improving the clustering of mon- itoring sensors that exhibit similar traffic patterns. Furthermore, a novel alert correlation approach is presented that can minimize the messaging overhead of a CIDS. Attacks on CIDSs: It is common for research on cyber-defense to switch its perspective, taking on the viewpoint of attackers, trying to anticipate their remedies against novel defense approaches. The the- sis follows such an approach by focusing on a certain class of attacks on CIDSs that aim at identifying the network location of the monitor- ing sensors. In particular, the state of the art is advanced by proposing a novel scheme for the improvement of such attacks. Furthermore, the dissertation proposes novel mitigation techniques to overcome both the state of art and the proposed improved attacks. Evaluation: All the proposals and methods introduced in the dis- sertation were evaluated qualitatively, quantitatively and empirically. A comprehensive study of the state of the art in collaborative intru- sion detection was conducted via a qualitative approach, identifying research gaps and surveying the related work. To study the effective- ness of the proposed algorithms and systems extensive simulations were utilized. Moreover, the applicability and usability of some of the contributions in the area of alert data generation was additionally supported via Proof of Concepts (PoCs) and prototypes. The majority of the contributions were published in peer-reviewed journal articles, in book chapters, and in the proceedings of interna- tional conferences and workshops

    WSN based sensing model for smart crowd movement with identification: a conceptual model

    Get PDF
    With the advancement of IT and increase in world population rate, Crowd Management (CM) has become a subject undergoing intense study among researchers. Technology provides fast and easily available means of transport and, up-to-date information access to the people that causes crowd at public places. This imposes a big challenge for crowd safety and security at public places such as airports, railway stations and check points. For example, the crowd of pilgrims during Hajj and Ummrah while crossing the borders of Makkah, Kingdom of Saudi Arabia. To minimize the risk of such crowd safety and security identification and verification of people is necessary which causes unwanted increment in processing time. It is observed that managing crowd during specific time period (Hajj and Ummrah) with identification and verification is a challenge. At present, many advanced technologies such as Internet of Things (IoT) are being used to solve the crowed management problem with minimal processing time. In this paper, we have presented a Wireless Sensor Network (WSN) based conceptual model for smart crowd movement with minimal processing time for people identification. This handles the crowd by forming groups and provides proactive support to handle them in organized manner. As a result, crowd can be managed to move safely from one place to another with group identification. The group identification minimizes the processing time and move the crowd in smart way
    corecore