    Checkpointing of parallel applications in a Grid environment

    The Grid environment is generic, heterogeneous, and dynamic, with many unreliable resources, making it highly exposed to failures. The environment is unreliable because it is geographically dispersed, spans multiple autonomous administrative domains, and is composed of a large number of components. Examples of failures in the Grid environment include application crashes, Grid node crashes, network failures, and Grid system component failures. These failures can affect the execution of parallel/distributed applications in the Grid environment, so protection against them is crucial. It is therefore essential to develop efficient fault tolerance mechanisms that allow users to execute Grid applications successfully. One of the research challenges in Grid computing is to develop a fault tolerance solution that ensures Grid applications are executed reliably with minimal overhead. While checkpointing is the most common method of achieving fault tolerance, there is still much work to be done to improve the efficiency of the mechanism. This thesis provides an in-depth description of a novel solution for checkpointing parallel applications executed on a Grid. The checkpointing mechanism implemented takes checkpoints at regions where no interprocess communication is involved, thereby reducing both the checkpointing overhead and the checkpoint size.
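    A minimal sketch of the idea, assuming an iterative mpi4py program: the checkpoint is taken only at the top of the loop, once all sends and receives of the previous iteration have completed, so no in-flight messages or channel state need to be saved. The helper names, file layout, and checkpoint interval below are illustrative assumptions, not the mechanism implemented in the thesis.

        # Sketch: application-level checkpointing at a communication-free region.
        # File names, interval, and the local computation are illustrative only.
        import os
        import pickle
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()
        CKPT_FILE = f"ckpt_rank{rank}.pkl"   # hypothetical per-process checkpoint file
        CKPT_EVERY = 10                      # checkpoint every 10 iterations (assumption)

        def compute_local(data):
            # Placeholder for the real local computation of one iteration.
            return (data or 0) + rank

        def save_checkpoint(state):
            # Taken where no messages are in flight, so only local state is recorded.
            with open(CKPT_FILE, "wb") as f:
                pickle.dump(state, f)

        def restore_checkpoint():
            if os.path.exists(CKPT_FILE):
                with open(CKPT_FILE, "rb") as f:
                    return pickle.load(f)
            return {"iteration": 0, "data": None}

        state = restore_checkpoint()
        for it in range(state["iteration"], 100):
            partial = comm.allreduce(compute_local(state["data"]))  # interprocess communication
            state["data"] = partial
            state["iteration"] = it + 1
            # Communication-free region: all messages for this iteration are complete.
            if state["iteration"] % CKPT_EVERY == 0:
                save_checkpoint(state)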

    Sintonización dinámica de aplicaciones MPI

    At the present time, high performance computing is used in a multitude of scientific fields, where the problems studied are solved using parallel/distributed applications. These applications require enormous computing capacity, due both to the complexity of the problems and to the need to solve them in real time. Therefore, the computational capacities and resources of the parallel systems on which these applications are executed must be exploited in order to obtain good performance. However, achieving high performance in an application executing on such a system is a hard task that requires a high degree of experience, especially when dealing with applications that exhibit dynamic behaviour or that run on heterogeneous systems. In these cases, automatic and dynamic performance improvement of the applications is currently proposed as the better approach to performance analysis. The research presented here falls within this field of study, and its principal objective is to dynamically tune, using MATE (Monitoring, Analysis and Tuning Environment), an MPI application used in high performance computing that follows the Master/Worker paradigm. The tuning techniques integrated in MATE have been developed from the study of a performance model that reflects the bottlenecks specific to the Master/Worker paradigm: load balancing and the number of workers. Executing the chosen application under the dynamic control of MATE and the tuning strategy implemented has made it possible to observe the application adapting its behaviour to the current conditions of the system on which it runs, thereby improving its performance.
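    The tuning idea can be illustrated with a small sketch (not MATE itself, and not the exact model developed in the thesis): measurements of per-task compute time and per-worker communication overhead feed a simple Master/Worker performance model, and the master switches to the number of workers that the model predicts to be fastest. All names and constants below are illustrative assumptions.

        # Sketch: choose the number of workers that minimises predicted iteration time
        # under a simple Master/Worker performance model (illustrative, not MATE's model).

        def predicted_time(n_workers, total_tasks, per_task_time, per_worker_overhead):
            # Computation shrinks with more workers; coordination overhead grows with them.
            return (total_tasks * per_task_time) / n_workers + per_worker_overhead * n_workers

        def tune_workers(current, max_workers, total_tasks, per_task_time, per_worker_overhead):
            best = min(range(1, max_workers + 1),
                       key=lambda n: predicted_time(n, total_tasks, per_task_time,
                                                    per_worker_overhead))
            return best if best != current else current

        # Example: monitoring measured 0.08 s per task and 0.02 s of overhead per worker.
        n = tune_workers(current=4, max_workers=32,
                         total_tasks=1000, per_task_time=0.08, per_worker_overhead=0.02)
        print(f"suggested number of workers: {n}")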

    The 2nd Conference of PhD Students in Computer Science

    Technology 2003: The Fourth National Technology Transfer Conference and Exposition, volume 2

    Proceedings from symposia of the Technology 2003 Conference and Exposition, Dec. 7-9, 1993, Anaheim, CA, are presented. Volume 2 features papers on artificial intelligence, CAD&E, computer hardware, computer software, information management, photonics, robotics, test and measurement, video and imaging, and virtual reality/simulation.

    A web-based approach to engineering adaptive collaborative applications

    Current methods employed to develop collaborative applications have to make decisions and speculate about the environment in which the application will operate, the network infrastructure that will be used, and the type of device the application will run on. These decisions and assumptions about the environment in which collaborative applications were designed to work are not ideal. These methods produce collaborative applications that are characterised as inflexible, working only on homogeneous networks and single platforms, requiring pre-existing knowledge of the data and information types they need to use, and having a rigid choice of architecture. Future collaborative applications, on the other hand, are required to be flexible, to work in highly heterogeneous environments, and to be adaptable to different networks and a range of device types. This research investigates the role that the Web and its various pervasive technologies, along with a component-based Grid middleware, can play in addressing these concerns. The aim is to develop an approach to building adaptive collaborative applications that can operate in heterogeneous and changing environments. This work proposes a four-layer model that developers can use to build adaptive collaborative applications. The four-layer model is populated with Web technologies such as Scalable Vector Graphics (SVG), the Resource Description Framework (RDF), the SPARQL Protocol and RDF Query Language (SPARQL), and Gridkit, a middleware infrastructure based on the Open Overlays concept. The Middleware layer (the first layer of the four-layer model) addresses network and operating system heterogeneity, the Group Communication layer enables collaboration and data sharing, and the Knowledge Representation layer proposes an interoperable RDF data modelling language and a flexible storage facility with an adaptive architecture for heterogeneous data storage. Finally, the Presentation and Interaction layer proposes a framework (Oea) for scalable and adaptive user interfaces. The four-layer model has been successfully used to build a collaborative application, called Wildfurt, that overcomes the challenges facing collaborative applications. This research has demonstrated new applications for cutting-edge Web technologies in the area of building collaborative applications, as the sketch below illustrates. SVG has been used for developing superior adaptive and scalable user interfaces that can operate on different device types. RDF and RDFS have also been used to design and model collaborative applications, providing a mechanism to define classes and properties and the relationships between them. A flexible and adaptable storage facility that is able to change its architecture based on the surrounding environment and requirements has also been achieved by combining the RDF technology with the Open Overlays middleware, Gridkit.
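    The role of the Knowledge Representation layer can be illustrated with a small, self-contained sketch using the rdflib library (an assumption made here for brevity; the thesis builds its storage facility on Gridkit). The vocabulary, class, and property names below are hypothetical, not those defined in the work.

        # Sketch: modelling shared collaborative data as RDF and querying it with SPARQL.
        # The namespace and terms are illustrative, not from the thesis.
        from rdflib import Graph, Literal, Namespace, RDF

        COLLAB = Namespace("http://example.org/collab#")  # hypothetical vocabulary

        g = Graph()
        g.bind("collab", COLLAB)

        # Describe a shared document and a participant editing it.
        g.add((COLLAB.doc1, RDF.type, COLLAB.SharedDocument))
        g.add((COLLAB.doc1, COLLAB.title, Literal("Design notes")))
        g.add((COLLAB.alice, RDF.type, COLLAB.Participant))
        g.add((COLLAB.alice, COLLAB.edits, COLLAB.doc1))

        # SPARQL: who edits which shared document?
        results = g.query("""
            PREFIX collab: <http://example.org/collab#>
            SELECT ?who ?doc WHERE {
                ?who collab:edits ?doc .
                ?doc a collab:SharedDocument .
            }
        """)
        for who, doc in results:
            print(who, "edits", doc)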

    High-Performance Modelling and Simulation for Big Data Applications

    This open access book was prepared as the Final Publication of the COST Action IC1406 "High-Performance Modelling and Simulation for Big Data Applications" (cHiPSet) project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predicting and analysing natural and complex systems in science and engineering. As their level of abstraction rises to give a better discernment of the domain at hand, their representation becomes increasingly demanding of computational and data resources. High Performance Computing, in turn, typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication, and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. A seamless interaction of High Performance Computing with Modelling and Simulation is therefore arguably required in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for its members and distinguished guests to openly discuss novel perspectives and topics of interest for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications.

    Director's Discretionary Fund Report For Fiscal Year 1995

    This technical memorandum contains brief technical papers describing research and technology development programs sponsored by the Ames Research Center Director's Discretionary Fund during fiscal year 1995 (October 1994 through September 1995). An appendix provides administrative information for each of the sponsored research programs.

    RODOS: decision support for nuclear emergencies
