Automatic synthesis and optimization of chip multiprocessors
Microprocessor technology has experienced enormous growth during the last decades. Rapid downscaling of CMOS technology has led to higher operating frequencies and performance densities, while running into the fundamental issue of power dissipation. Chip Multiprocessors (CMPs) have become the latest paradigm to improve the power-performance efficiency of computing systems by exploiting the parallelism inherent in applications. Industrial and prototype implementations have already demonstrated the benefits achieved by CMPs with hundreds of cores.
CMP architects are challenged to make many complex design decisions. Only a few of them are:
- What should be the ratio between the core and cache areas on a chip?
- Which core architectures to select?
- How many cache levels should the memory subsystem have?
- Which interconnect topologies provide efficient on-chip communication?
These and many other aspects create a complex multidimensional space for architectural exploration. Design Automation tools become essential to make the architectural exploration feasible under hard time-to-market constraints. The exploration methods have to be efficient and scalable to handle future-generation on-chip architectures with hundreds or thousands of cores.
Furthermore, once a CMP has been fabricated, the need for efficient deployment of the many-core processor arises. Intelligent techniques for task mapping and scheduling onto CMPs are necessary to guarantee full use of the benefits brought by many-core technology. These techniques have to consider the peculiarities of modern architectures, such as the availability of enhanced power saving techniques and the presence of complex memory hierarchies.
This thesis has several objectives. The first objective is to develop methods for efficient analytical modeling and architectural design space exploration of CMPs. The efficiency is achieved by using analytical models instead of simulation, and by replacing exhaustive exploration with an intelligent search strategy. Additionally, these methods incorporate high-level models for physical planning. The related contributions are described in Chapters 3, 4 and 5 of the document.
The second objective of this work is to propose a scalable algorithm for task mapping onto general-purpose CMPs with power management techniques, for efficient deployment of many-core systems. This contribution is explained in Chapter 6 of this document.
Finally, the third objective of this thesis is to address the issues of on-chip interconnect design and exploration, by developing a model for simultaneous topology customization and deadlock-free routing in Networks-on-Chip. The developed methodology can be applied to various classes of on-chip systems, ranging from general-purpose chip multiprocessors to application-specific solutions. Chapter 7 describes the proposed model.
The presented methods have been thoroughly tested experimentally and the results are described in this dissertation. At the end of the document several possible directions for future research are proposed.
Effects of Communication Protocol Stack Offload on Parallel Performance in Clusters
The primary research objective of this dissertation is to demonstrate that the effects of communication protocol stack offload (CPSO) on application execution time can be attributed to the following two complementary sources. First, the application-specific computation may be executed concurrently with the asynchronous communication performed by the communication protocol stack offload engine. Second, the protocol stack processing can be accelerated or decelerated by the offload engine. These two types of performance effects can be quantified with the degree of overlapping Do and the degree of acceleration Daccs. The composite communication speedup metric Scomm(Do, Daccs) can be used to quantify the combined effects of the protocol stack offload. The thesis of this dissertation is validated empirically. The degree of overlapping Do, the degree of acceleration Daccs, and the communication speedup Scomm are derived from experiments performed on the system configurations of interest. It is shown that the proposed metrics adequately describe the effects of the protocol stack offload on the application execution time. Additionally, a set of analytical models of the networking subsystem of a PC-based cluster node is developed. As a result of the modeling, the metrics Do, Daccs, and Scomm are obtained. The models are evaluated as to their complexity and precision by comparing the modeling results with the measured values of Do, Daccs, and Scomm. The primary contributions of this dissertation research are as follows. First, the metrics Daccs and Scomm are introduced in order to complement the Do metric in its use for evaluating the effects of optimizations in the networking subsystem on parallel performance in clusters. The metrics are shown to adequately describe CPSO performance effects.
Second, a method for assessing the effects of CPSO scenarios on application performance is developed and presented. Third, a set of analytical models of cluster node networking subsystems with CPSO capability is developed and characterised as to their complexity and the precision of their prediction of the Do and Daccs metrics.
Getting routers out of the core: Building an optical wide area network with "multipaths"
We propose an all-optical networking solution for a wide area network (WAN)
based on the notion of multipoint-to-multipoint lightpaths that, for short, we
call "multipaths". A multipath concentrates the traffic of a group of source
nodes on a wavelength channel using an adapted MAC protocol and multicasts this
traffic to a group of destination nodes that extract their own data from the
confluent stream. The proposed network can be built using existing components
and appears less complex and more efficient in terms of energy consumption than
alternatives like OPS and OBS. The paper presents the multipath architecture
and compares its energy consumption to that of a classical router-based ISP
network. A flow-aware dynamic bandwidth allocation algorithm is proposed and
shown to have excellent performance in terms of throughput and delay.
Clustering Algorithms for Scale-free Networks and Applications to Cloud Resource Management
In this paper we introduce algorithms for the construction of scale-free
networks and for clustering around the nerve centers, nodes with a high
connectivity in a scale-free network. We argue that such overlay networks
could support self-organization in a complex system like a cloud computing
infrastructure and allow the implementation of optimal resource management
policies. Comment: 14 pages, 8 figures, Journa
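The abstract above does not spell out its algorithms, so the following is only a minimal sketch of the general idea under stated assumptions: it grows a graph by preferential attachment (the standard Barabási-Albert construction, which may differ from the paper's own method), picks the highest-degree nodes as "nerve centers", and assigns every other node to the hub that reaches it first in a breadth-first search. The function names `build_scale_free` and `cluster_around_hubs` and all parameters are hypothetical.

```python
import random
from collections import deque


def build_scale_free(n, m, seed=0):
    """Grow an n-node graph; each new node links to m existing nodes,
    chosen with probability proportional to their current degree."""
    if not (n > m >= 1):
        raise ValueError("need n > m >= 1")
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    # 'ends' lists every edge endpoint, so a uniform draw from it
    # selects a node with probability proportional to its degree.
    ends = []
    # Seed graph: a complete graph on the first m + 1 nodes.
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            adj[i].add(j)
            adj[j].add(i)
            ends += [i, j]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(ends))
        for t in chosen:
            adj[new].add(t)
            adj[t].add(new)
            ends += [new, t]
    return adj


def cluster_around_hubs(adj, k):
    """Pick the k highest-degree nodes as hubs and assign each node to
    the hub whose breadth-first wave reaches it first (assumed rule)."""
    hubs = sorted(adj, key=lambda v: (-len(adj[v]), v))[:k]
    owner = {h: h for h in hubs}
    queue = deque(hubs)
    while queue:
        v = queue.popleft()
        for w in sorted(adj[v]):
            if w not in owner:
                owner[w] = owner[v]
                queue.append(w)
    return hubs, owner


if __name__ == "__main__":
    graph = build_scale_free(200, 2, seed=1)
    hubs, owner = cluster_around_hubs(graph, 5)
    for h in hubs:
        size = sum(1 for v in owner if owner[v] == h)
        print(f"hub {h}: degree {len(graph[h])}, cluster size {size}")
```

In a cloud setting the hubs would play the role of coordinating servers, but the mapping from clusters to resource management policies is not something this sketch attempts.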
A router architecture for QoS capable clusters
Interconnection networks have been used as a high-performance communication fabric in parallel processor architectures. Parallel processors built using off-the-shelf components, called clusters, are becoming increasingly attractive because of their high performance-to-cost ratio compared with parallel computers.
Many web servers and database servers make efficient use of clustering from cost, scalability and availability standpoints. The design of high-performance cluster networks with QoS guarantees is becoming increasingly important to support a variety of multimedia applications, many of which have real-time constraints. Most commercial routers, which are based on the wormhole switching paradigm, can deliver high performance, but lack QoS provisioning. A new router architecture with support for QoS provisioning was introduced in [1]. In this project we present a detailed analysis of the hardware complexity of the router in [1] and propose some architectural modifications to reduce the hardware complexity of the router. We have also developed a simulator to compare and analyze the performance characteristics of the proposed router architecture.
Performance analysis and improvement of InfiniBand networks. Modelling and effective Quality-of-Service mechanisms for interconnection networks in cluster computing systems.
The InfiniBand Architecture (IBA) network has been proposed as a new
industrial standard with high bandwidth and low latency, suitable for constructing
high-performance interconnected cluster computing systems. This architecture
replaces the traditional bus-based interconnection with a switch-based network for
the server Input-Output (I/O) and inter-processor communications. An efficient
Quality-of-Service (QoS) mechanism is fundamental to ensure the important QoS
metrics, such as maximum throughput and minimum latency, leaving aside other
aspects such as guarantees to reduce the delay, blocking probability, and mean queue
length.
Performance modelling and analysis has been and continues to be of great
theoretical and practical importance in the design and development of
communication networks. This thesis aims to investigate efficient and cost-effective
QoS mechanisms for performance analysis and improvement of InfiniBand
networks in cluster-based computing systems.
Firstly, a rate-based source-response link-by-link admission and congestion
control function with an improved Explicit Congestion Notification (ECN) packet
marking scheme is developed. This function adopts the rate control to reduce
congestion of multiple-class traffic. Secondly, a credit-based flow control scheme is
presented to reduce the mean queue length, throughput and response time of the system. In order to evaluate the performance of this scheme, a new queueing
network model is developed. Theoretical analysis and simulation experiments show
that these two schemes are quite effective and suitable for InfiniBand networks.
Finally, to obtain a thorough and deep understanding of the performance attributes
of the InfiniBand Architecture network, two efficient threshold function flow control
mechanisms are proposed to enhance the QoS of InfiniBand networks; one is Entry
Threshold that sets the threshold for each entry in the arbitration table, and the other is
Arrival Job Threshold that sets the threshold based on the number of jobs in each
Virtual Lane. Furthermore, the principle of Maximum Entropy is adopted to analyse
these two new mechanisms with the Generalized Exponential (GE)-Type
distribution for modelling the inter-arrival times and service times of the input traffic.
Extensive simulation experiments are conducted to validate the accuracy of the
analytical models.
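The abstract above names the Generalized Exponential (GE) distribution but gives no formula. As a hedged illustration, the sketch below assumes the form commonly used in maximum-entropy queueing analysis: F(t) = 1 - tau * exp(-tau * nu * t) for t >= 0, with tau = 2 / (C^2 + 1), where 1/nu is the mean and C^2 >= 1 is the squared coefficient of variation. Under that assumption a sample is zero with probability 1 - tau (an arrival in the same batch) and otherwise exponential with rate tau * nu. The names `ge_sample`, `rate`, and `scv` are hypothetical and not taken from the thesis.

```python
import random


def ge_sample(rng, rate, scv):
    """Draw one GE-distributed time with mean 1/rate and squared
    coefficient of variation scv (assumed parameterisation, scv >= 1)."""
    if rate <= 0 or scv < 1:
        raise ValueError("need rate > 0 and scv >= 1")
    tau = 2.0 / (scv + 1.0)
    # With probability 1 - tau the time is zero, which models bursts
    # of arrivals (or services) that occur back to back.
    if rng.random() >= tau:
        return 0.0
    # Otherwise the time is exponential with the stretched rate tau * rate.
    return rng.expovariate(tau * rate)


if __name__ == "__main__":
    rng = random.Random(42)
    samples = [ge_sample(rng, rate=2.0, scv=4.0) for _ in range(100_000)]
    mean = sum(samples) / len(samples)
    var = sum((x - mean) ** 2 for x in samples) / len(samples)
    print("empirical mean:", mean)
    print("empirical SCV :", var / mean ** 2)
```

With scv = 1 the mixture weight tau equals 1 and the sampler reduces to a plain exponential, which is a quick sanity check on the parameterisation.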
An Overview on Application of Machine Learning Techniques in Optical Networks
Today's telecommunication networks have become sources of enormous amounts of
widely heterogeneous data. This information can be retrieved from network
traffic traces, network alarms, signal quality indicators, users' behavioral
data, etc. Advanced mathematical tools are required to extract meaningful
information from these data and take decisions pertaining to the proper
functioning of the networks from the network-generated data. Among these
mathematical tools, Machine Learning (ML) is regarded as one of the most
promising methodological approaches to perform network-data analysis and enable
automated network self-configuration and fault management. The adoption of ML
techniques in the field of optical communication networks is motivated by the
unprecedented growth of network complexity faced by optical networks in the
last few years. Such complexity increase is due to the introduction of a huge
number of adjustable and interdependent system parameters (e.g., routing
configurations, modulation format, symbol rate, coding schemes, etc.) that are
enabled by the usage of coherent transmission/reception technologies, advanced
digital signal processing and compensation of nonlinear effects in optical
fiber propagation. In this paper we provide an overview of the application of
ML to optical communications and networking. We classify and survey relevant
literature dealing with the topic, and we also provide an introductory tutorial
on ML for researchers and practitioners interested in this field. Although a
good number of research papers have recently appeared, the application of ML to
optical networks is still in its infancy: to stimulate further work in this
area, we conclude the paper by proposing new possible research directions.
10431 Abstracts Collection -- Software Engineering for Self-Adaptive Systems
From 24.10. to 29.10.2010, the Dagstuhl Seminar 10431 "Software Engineering for Self-Adaptive Systems" was held in Schloss Dagstuhl -- Leibniz Center for Informatics.
During the seminar, several participants presented their current
research, and ongoing work and open problems were discussed. Abstracts of
the presentations given during the seminar as well as abstracts of
seminar results and ideas are put together in this paper. The first section
describes the seminar topics and goals in general.
Links to extended abstracts or full papers are provided, if available.
Towards a software defined network based multi-domain architecture for the internet of things
The current communication networks are heterogeneous, with a diversity of devices and services that challenge traditional networks, making it difficult to meet quality of service (QoS) requirements. With the advent of software-defined networks (SDN), new tools have emerged to design more flexible networks. SDN offers centralized management for data streams in distributed sensor networks.
Thus, the main goal of this dissertation is to investigate a solution that meets the QoS requirements of traffic originating from Internet of Things (IoT) devices. This traffic is transmitted to the Internet in a distributed system with multiple SDN controllers.
To achieve this goal, we designed a multi-domain network topology, with each domain managed by its own controller. Communication between the domains is done via an SDN transit domain with the SDN-IP application of the Open Network Operating System (ONOS) controller. We also emulated a network to test QoS through OpenvSwitch queues. The goal is to create traffic priorities in a network with traditional and simulated IoT devices.
According to our tests, we have been able to ensure SDN inter-domain communication and have shown that our proposal reacts to a topology failure. In the QoS scenario we have shown that, through the insertion of OpenFlow rules, we are able to prioritize traffic and provide quality of service guarantees. This shows that our proposal is promising for use in scenarios with multiple administrative domains.