15 research outputs found

    Handling of Congestion in Cluster Computing Environment Using Mobile Agent Approach

    Computer networks have experienced explosive growth over the past few years, and with that growth have come severe congestion problems. Congestion must be prevented in order to maintain good network performance. In this paper, we propose a cluster-based framework that uses mobile agents to control congestion over the network. The cluster implementation involves designing a server that manages the configuration and resetting of the cluster. Our framework handles the generation of application mobile code, its distribution to the appropriate clients, the efficient handling of the results generated and communicated by a number of client nodes, and the recording of the application's execution time. A client node receives and executes the mobile code that defines the distributed job submitted by the server and sends the results back. We have also analyzed the performance of the developed system, emphasizing the trade-off between communication and computation overhead. The effectiveness of the proposed framework is evaluated using JDK 1.5.
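The server/client flow described above (a server ships mobile code to clients, each client executes it and reports the result and its execution time) can be sketched in miniature. The job, its partitioning, and all names below are illustrative assumptions, not the paper's actual framework:

```python
# Sketch only: a "server" compiles a small mobile-code job and hands it to
# simulated client nodes, which execute it and report result + elapsed time.
import time

def make_job():
    # Hypothetical job: sum of squares over a client-specific range.
    src = "result = sum(i * i for i in range(lo, hi))"
    return compile(src, "<mobile-code>", "exec")

def client_execute(code, lo, hi):
    # Client side: run the received code and report the result and timing.
    scope = {"lo": lo, "hi": hi}
    start = time.perf_counter()
    exec(code, scope)
    return scope["result"], time.perf_counter() - start

code = make_job()
# Server side: partition the work across two simulated client nodes,
# then combine the partial results.
r1, t1 = client_execute(code, 0, 500)
r2, t2 = client_execute(code, 500, 1000)
total = r1 + r2
```

In the real framework the two `client_execute` calls would run on separate cluster nodes, with the code and results crossing the network.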

    Predictive models for bandwidth sharing in high performance clusters

    Using MPI as the communication interface, one or several applications may introduce complex communication behaviors over the cluster network. This effect increases when cluster nodes are multi-processors, where communications can enter or leave the same node within a common time interval. Our goal is to understand these behaviors in order to build a class of predictive models of bandwidth sharing, knowing, on the one hand, the flow-control mechanisms and, on the other hand, a set of experimental results. This paper presents experiments that show how bandwidth is shared on Gigabit Ethernet, Myrinet 2000, and InfiniBand networks, before introducing the models for the Gigabit Ethernet and Myrinet 2000 networks.
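As a point of reference for what a bandwidth-sharing model can look like, here is a toy max-min fair-share calculation. The fairness discipline, the link capacity, and the demand figures are assumptions chosen for illustration; they are not the models derived in the paper:

```python
# Toy max-min fairness: flows sharing a link of given capacity each get an
# equal share unless their own demand is smaller, in which case the surplus
# is redistributed among the remaining flows.
def max_min_share(capacity, demands):
    """Return the per-flow bandwidth allocation under max-min fairness."""
    alloc = [0.0] * len(demands)
    active = list(range(len(demands)))
    remaining = float(capacity)
    while active:
        share = remaining / len(active)
        # Flows demanding less than the equal share are fully satisfied.
        capped = [i for i in active if demands[i] <= share]
        if not capped:
            for i in active:
                alloc[i] = share
            break
        for i in capped:
            alloc[i] = demands[i]
            remaining -= demands[i]
        active = [i for i in active if i not in capped]
    return alloc

# Example: a 1000 Mb/s link shared by three flows (figures are invented).
shares = max_min_share(1000, [200, 600, 600])
```

Real cluster interconnects deviate from this ideal, which is precisely why measurement-driven models like those in the paper are needed.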

    Implementación paralela del algoritmo backpropagation en un cluster de computadoras

    Today there is a wide variety of parallel architectures, for which many implementation models of the backpropagation neural-network training algorithm have been defined. Obtaining good performance depends essentially on the implementation model and on the parallel architecture available. This work defines an implementation model for the backpropagation algorithm in a parallel programming environment and implements it on a cluster of interconnected computers. The implementation is not restricted to a specific parallel architecture in which communication introduces no overhead, as happens when a cluster of workstations is used. The resulting implementation not only improves the performance of the training algorithm but also uses a parameter ε to decide whether a weight change should be taken into account: a change greater than ε is considered, otherwise it is discarded.
    VII Workshop de Procesamiento Distribuido y Paralelo (WPDP)
    Red de Universidades con Carreras en Informática (RedUNCI)
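The ε mechanism described in the abstract (consider a weight change only when its magnitude exceeds ε) can be sketched as a filter applied before updates are communicated between nodes. The threshold value, data layout, and function names below are illustrative assumptions:

```python
# Sketch of an epsilon filter for backpropagation weight updates: only
# deltas with magnitude above EPSILON are kept, which in a cluster setting
# reduces the volume of updates exchanged between nodes during training.
EPSILON = 0.01  # assumed threshold value

def significant_updates(deltas, eps=EPSILON):
    """Keep only the weight changes whose magnitude exceeds eps."""
    return {idx: d for idx, d in deltas.items() if abs(d) > eps}

def apply_updates(weights, deltas):
    """Apply the surviving weight changes in place."""
    for idx, d in deltas.items():
        weights[idx] += d
    return weights

# Invented example deltas: only indices 0 and 2 pass the filter.
deltas = {0: 0.5, 1: 0.004, 2: -0.03, 3: -0.0001}
sent = significant_updates(deltas)
weights = apply_updates([1.0, 1.0, 1.0, 1.0], sent)
```

The trade-off is the usual one: a larger ε means less communication but a coarser approximation of the true gradient updates.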


    High-Performance Message Passing over generic Ethernet Hardware with Open-MX

    In the last decade, cluster computing has become the most popular high-performance computing architecture. Although numerous technological innovations have been proposed to improve the interconnection of nodes, many clusters still rely on commodity Ethernet hardware to implement message passing within parallel applications. We present Open-MX, an open-source message-passing stack over generic Ethernet. It offers the same abilities as the specialized Myrinet Express stack without requiring dedicated support from the networking hardware. Open-MX works transparently with the most popular MPI implementations through its MX interface compatibility. It also enables interoperability between hosts running the specialized MX stack and generic Ethernet hosts. We detail how Open-MX copes with the inherent limitations of Ethernet hardware to satisfy the requirements of message passing by applying an innovative copy-offload model. Combined with careful tuning of the fabric and of the MX wire protocol, Open-MX achieves better performance than TCP implementations, especially on 10 gigabit/s hardware.

    Architecture des données pour les PME dans le contexte de l’Industrie 4.0 : une étude exploratoire

    The fourth industrial revolution is driving innovation across all industry sectors, for large and small companies alike, through technological convergence. In manufacturing, Industry 4.0 (I4.0) stands for the changes accelerated by emerging technologies. Researchers and practitioners have designed frameworks to assess companies' readiness for I4.0 in order to support digital transformation. However, a knowledge gap remains between the I4.0 vision and the technologies required to reach operational objectives. Moreover, these issues are even more pronounced for small and medium-sized enterprises (SMEs), given the limited resources available to them. This study aims to fill this gap by presenting an exploratory tool to support the definition of I4.0 architecture design models for manufacturing SMEs. A design science research methodology guided this work, which began with a literature review to identify the different contexts for data use in I4.0 and for data architecture. Based on previous research and on data collected in the field, the exploratory tool provides recommendations on data architecture options in the I4.0 context. The findings offer researchers an original perspective that bridges architectural requirements and data use. The recommendations take into account organizational and technological capabilities for data valorization and value creation in the context of manufacturing SMEs.
    Furthermore, practitioners will be able to easily apply the architecture models and their transition stages to reduce risk while accelerating the technological transformation of manufacturing companies.

    Improving the performance of parallel scientific applications using cache injection

    Cache injection is a viable technique to improve the performance of data-intensive parallel applications. This dissertation characterizes cache injection of incoming network data in terms of parallel application performance. My results show that the benefit of this technique depends on the ratio of processor speed to memory speed, the cache injection policy, and the application's communication characteristics. Cache injection addresses the memory wall for I/O by writing data into a processor's cache directly from the I/O bus. Unlike data prefetching, this technique reduces the number of reads served by the memory unit. This reduction is significant for data-intensive applications whose performance is dominated by compulsory cache misses and cannot be alleviated by traditional caching systems. Unlike previous work on cache injection, which focused on reducing the host network-stack overhead incurred by memory copies, I show that applications can directly benefit from this technique based on their temporal and spatial locality in accessing incoming network data. I also show that the performance of cache injection is directly proportional to the ratio of processor speed to memory speed. In other words, systems with a memory wall can provide significantly better performance with cache injection and an appropriate injection policy. This result implies that multi-core and many-core architectures would benefit from this technique. Finally, my results show that the application's communication characteristics are key to cache injection performance. For example, cache injection can improve the performance of certain collective communication operations by up to 20%, depending on message size.

    Parallel Processes in HPX: Designing an Infrastructure for Adaptive Resource Management

    Advancements in cutting-edge technologies have enabled better energy efficiency as well as greater computational power for the latest High Performance Computing (HPC) systems. However, complexity due to hybrid architectures, as well as emerging classes of applications, has led to poor computational scalability under conventional execution models. Alternative means of computation that address these bottlenecks are therefore warranted. More precisely, dynamic adaptive resource management, from both the system's and the application's perspective, is essential for better computational scalability and efficiency. This research presents and expands the notion of Parallel Processes as a placeholder for procedure definitions targeted at one or more synchronous domains, metadata for computation and resource management, and an infrastructure for dynamic policy deployment. In addition, the research presents guidelines for a resource-management framework in the HPX runtime system. It also lists design principles for the scalability of the Active Global Address Space (AGAS), a necessary feature for Parallel Processes. To verify the usefulness of Parallel Processes, a preliminary performance evaluation of different task scheduling policies is carried out using two applications: Unbalanced Tree Search, a reference dynamic graph application implemented in HPX as part of this research, and MiniGhost, a reference stencil-based application using the bulk-synchronous parallel model.
    The results show that different scheduling policies provide better performance for different classes of applications; for the same application class, in certain instances one policy fared better than the others, and vice versa in other instances. This supports the hypothesis that a dynamic adaptive resource-management infrastructure, capable of deploying different policies and task granularities, is needed for scalable distributed computing.
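Why scheduling policy matters for unbalanced workloads can be illustrated with a toy model (plain Python, not HPX): a static round-robin assignment versus a dynamic least-loaded assignment of tasks to workers. The task durations are invented for illustration:

```python
# Toy comparison of two scheduling policies on an unbalanced task set:
# makespan = finish time of the most loaded worker.
def makespan_static(tasks, workers):
    """Round-robin: task i always goes to worker i % workers."""
    loads = [0.0] * workers
    for i, t in enumerate(tasks):
        loads[i % workers] += t  # fixed assignment, ignores imbalance
    return max(loads)

def makespan_dynamic(tasks, workers):
    """Dynamic: each task goes to the currently least-loaded worker."""
    loads = [0.0] * workers
    for t in tasks:
        loads[loads.index(min(loads))] += t
    return max(loads)

# Unbalanced workload (as in Unbalanced Tree Search): alternating
# long and short tasks, which round-robin assigns pathologically.
tasks = [10, 1, 10, 1, 10, 1]
static = makespan_static(tasks, 2)
dynamic = makespan_dynamic(tasks, 2)
```

Here round-robin piles every long task onto one worker, while the dynamic policy spreads them out; on a regular stencil workload such as MiniGhost, the two policies would behave much more similarly, which mirrors the paper's observation that the best policy depends on the application class.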