221 research outputs found

    Failure analysis and reliability -aware resource allocation of parallel applications in High Performance Computing systems

    The demand for more computational power to solve complex scientific problems has been driving the physical size of High Performance Computing (HPC) systems to hundreds and thousands of nodes. Uninterrupted execution of large-scale parallel applications naturally becomes a major challenge because a single node failure interrupts the entire application, and the reliability of job completion decreases as the number of nodes increases. Accurate reliability knowledge of an HPC system enables runtime systems such as resource management and applications to minimize performance loss due to random failures while also providing better Quality of Service (QoS) for computational users. This dissertation makes three major contributions to reliability evaluation and resource management in HPC systems. First, we study the failure properties of HPC systems and observe that the Times To Failure (TTFs) of individual compute nodes follow a distribution with a time-varying failure rate, such as the Weibull distribution. We then propose a model for the TTF distribution of a system of k independent nodes when individual nodes exhibit time-varying failure rates. Based on the reliability of the proposed TTF model, we develop reliability-aware resource allocation algorithms and evaluate them on actual parallel workloads and failure data of an HPC system. Our observations indicate that applying a reliability function based on time-varying failure rates, combined with some heuristics, reduces the performance loss due to unexpected failures by as much as 30 to 53 percent. Finally, we also study the effect of reliability with respect to the number of nodes and propose a reliability-aware optimal k-node allocation algorithm for large-scale parallel applications. Our simulation results for the optimal k-node algorithm indicate that choosing the number of nodes for large-scale parallel applications based on the reliability of compute nodes can reduce the overall completion time and wasted time, where k may be smaller than the total number of nodes in the system.
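    The idea that job reliability falls as more nodes are used can be illustrated with a minimal sketch. It assumes the textbook series-system model, in which a job survives only if every one of its independent nodes survives, and a two-parameter Weibull TTF per node. The function names and the shape and scale values are hypothetical and are not taken from the dissertation, whose actual model and heuristics are not given in the abstract.

```python
import math


def weibull_reliability(t, shape, scale):
    """Probability that one node with a Weibull TTF is still up at time t."""
    return math.exp(-((t / scale) ** shape))


def system_reliability(t, nodes):
    """Reliability of a job running on independent nodes.

    `nodes` is a list of (shape, scale) pairs, one per node. The job is
    assumed to fail as soon as any single node fails (series system), so
    the per-node survival probabilities multiply.
    """
    r = 1.0
    for shape, scale in nodes:
        r *= weibull_reliability(t, shape, scale)
    return r


if __name__ == "__main__":
    # Hypothetical parameters: shape < 1 gives a decreasing failure rate,
    # and the scale is expressed in hours.
    node = (0.7, 500.0)
    for k in (1, 8, 64):
        print(k, system_reliability(24.0, [node] * k))
```

    With identical nodes the product collapses to a single exponential whose exponent grows linearly in k, which is why a reliability-aware allocator may prefer fewer nodes than the system offers.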

    A TIME-AND-SPACE PARALLELIZED ALGORITHM FOR THE CABLE EQUATION

    Electrical propagation in excitable tissue, such as nerve fibers and heart muscle, is described by a nonlinear diffusion-reaction parabolic partial differential equation for the transmembrane voltage V(x,t), known as the cable equation. This equation involves a highly nonlinear source term, representing the total ionic current across the membrane, governed by a Hodgkin-Huxley type ionic model, and requires the solution of a system of ordinary differential equations. Thus, the model consists of a PDE (in 1, 2 or 3 dimensions) coupled to a system of ODEs, and it is very expensive to solve, especially in 2 and 3 dimensions. In order to solve this equation numerically, we develop an algorithm, extended from the Parareal algorithm, to efficiently incorporate space-parallelized solvers into the framework of the Parareal algorithm, to achieve time-and-space parallelization. Numerical results and a comparison of the performance of several serial, space-parallelized and time-and-space-parallelized time-stepping numerical schemes in one dimension and in two dimensions are also presented.
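    For readers unfamiliar with Parareal, the basic time-parallel iteration can be sketched on a scalar test ODE. This is a minimal illustration under stated assumptions, not the authors' time-and-space algorithm: it uses forward Euler for both the coarse and fine propagators, solves dy/dt = -y instead of the cable equation, and runs the fine solves in a plain loop where a real implementation would distribute them across processors. All function and parameter names are hypothetical.

```python
def euler(f, y, t0, t1, steps):
    """Advance y from t0 to t1 with `steps` forward-Euler steps."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        y = y + h * f(t, y)
        t += h
    return y


def parareal(f, y0, t0, t1, slices, iterations, fine_steps=100):
    """Parareal iteration over `slices` time slices of [t0, t1].

    Coarse propagator G: one Euler step per slice (cheap, run serially).
    Fine propagator F: `fine_steps` Euler steps per slice (expensive, but
    the slices are independent of each other within one iteration).
    """
    ts = [t0 + (t1 - t0) * n / slices for n in range(slices + 1)]

    def coarse(y, a, b):
        return euler(f, y, a, b, 1)

    def fine(y, a, b):
        return euler(f, y, a, b, fine_steps)

    # Initial guess: a serial sweep with the coarse propagator only.
    u = [y0]
    for n in range(slices):
        u.append(coarse(u[n], ts[n], ts[n + 1]))

    for _ in range(iterations):
        # These fine solves do not depend on each other, so they are the
        # part that would run in parallel.
        f_old = [fine(u[n], ts[n], ts[n + 1]) for n in range(slices)]
        g_old = [coarse(u[n], ts[n], ts[n + 1]) for n in range(slices)]
        # Serial correction sweep: U_{n+1} = G(U_n_new) + F(U_n_old) - G(U_n_old)
        new = [y0]
        for n in range(slices):
            new.append(coarse(new[n], ts[n], ts[n + 1]) + f_old[n] - g_old[n])
        u = new
    return ts, u


if __name__ == "__main__":
    ts, u = parareal(lambda t, y: -y, 1.0, 0.0, 2.0, slices=10, iterations=3)
    print(u[-1])
```

    After as many iterations as there are slices, the result coincides with the serial fine solution up to rounding, so the method pays off only when far fewer iterations are enough.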

    Concurrent processing simulation of the space station

    The development of a new capability for the time-domain simulation of multibody dynamic systems and its application to the study of large-angle rotational maneuvers of the Space Station is described. The effort was divided into three sequential tasks, which required significant advancements of the state of the art to accomplish. These were: (1) the development of an explicit mathematical model via symbol manipulation of a flexible, multibody dynamic system; (2) the development of a methodology for balancing the computational load of an explicit mathematical model for concurrent processing; and (3) the implementation and successful simulation of the above on a prototype Custom Architectured Parallel Processing System (CAPPS) containing eight processors. The throughput rate achieved by the CAPPS, operating at only 70 percent efficiency, was 3.9 times greater than that obtained sequentially by the IBM 3090 supercomputer simulating the same problem. More significantly, analysis of the results leads to the conclusion that the relative cost effectiveness of concurrent vs. sequential digital computation will grow substantially as the computational load is increased. This is a welcome development in an era when very complex and cumbersome mathematical models of large space vehicles must be used as substitutes for full-scale testing, which has become impractical.

    Seedling survival responses to conspecific density, soil nutrients, and irradiance vary with age in a tropical forest

    Predicting long-term trends in forest growth requires accurate characterisation of how the relationship between forest productivity and climatic stress varies across climatic regimes. Using a network of over two million tree-ring observations spanning North America and a space-for-time substitution methodology, we forecast climate impacts on future forest growth. We explored differing scenarios of increased water-use efficiency (WUE) due to CO2-fertilisation, which we simulated as increased effective precipitation. In our forecasts: (1) climate change negatively impacted forest growth rates in the interior west and positively impacted forest growth along the western, southeastern and northeastern coasts; (2) shifting climate sensitivities offset positive effects of warming on high-latitude forests, leaving no evidence for continued ‘boreal greening’; and (3) it took a 72% WUE enhancement to compensate for continentally averaged growth declines under RCP 8.5. Our results highlight the importance of locally adapted forest management strategies to handle regional differences in growth responses to climate change

    Speeding up ecological and evolutionary computations in R; essentials of high performance computing for biologists

    Computation has become a critical component of research in biology. A risk has emerged that computational and programming challenges may limit research scope, depth, and quality. We review various solutions to common computational efficiency problems in ecological and evolutionary research. Our review pulls together material that is currently scattered across many sources and emphasizes those techniques that are especially effective for typical ecological and environmental problems. We demonstrate how straightforward it can be to write efficient code and implement techniques such as profiling or parallel computing. We supply a newly developed R package (aprof) that helps to identify computational bottlenecks in R code and determine whether optimization can be effective. Our review is complemented by a practical set of examples and detailed Supporting Information material (S1–S3 Texts) that demonstrate large improvements in computational speed (ranging from 10.5 times to 14,000 times faster). By improving computational efficiency, biologists can feasibly solve more complex tasks, ask more ambitious questions, and include more sophisticated analyses in their research

    The dark side of network functions virtualization: A perspective on the technological sustainability

    The Network Functions Virtualization (NFV) paradigm is undoubtedly a key technological advancement in the Information and Communication Technology (ICT) community, especially for the upcoming 5G network design. While most of its promise is quite straightforward, the implied reduction of the power consumption/carbon footprint is still debatable, and not in line with the energy efficiency perspective forecasted by the ETSI NFV working group (WG). In this paper, we provide an estimate of the possible future requirements of this upcoming technology when deployed according to the virtual Evolved Packet Core (vEPC) use case specified by the ETSI NFV WG. Our estimation is based on real performance levels, certified by independent third-party laboratories, and datasheet values provided by existing commercial products for both the legacy and NFV network architectures, under different deployment scenarios. Obtained results show that a massive deployment of the current NFV technologies in the EPC may lead to a minimum increase of 106% in the carbon footprint/energy consumption with respect to the Business As Usual (BAU) network solutions. Moreover, these values tend to increase at a very high pace when the most suitable software/hardware combination is not applied, or when packet processing latency is taken into account.

    Energy Awareness and Scheduling in Mobile Devices and High End Computing

    In the big picture, as energy demands rise due to growing economies and growing populations, there will be greater emphasis on sustainable supply, conservation, and efficient usage of this vital resource. Even at a smaller level, the need for minimizing energy consumption continues to be compelling in embedded, mobile, and server systems such as handheld devices, robots, spaceships, laptops, cluster servers, sensors, etc. This is due to the direct impact of constrained energy sources such as battery size and weight, as well as cooling expenses in cluster-based systems to reduce heat dissipation. Energy management therefore plays a paramount role not only in hardware design but also in user-application, middleware and operating system design. At a higher level, datacenters are sprouting everywhere due to the exponential growth of Big Data in every aspect of human life, and the buzzword these days is cloud computing. This dissertation focuses on techniques, specifically algorithmic ones, to scale down energy needs whenever the system performance can be relaxed. We examine the significance and relevance of this research and develop a methodology to study this phenomenon. Specifically, the research will study energy-aware resource reservation algorithms to satisfy both performance needs and energy constraints. Many energy management schemes focus on a single resource that is dedicated to real-time or non-real-time processing. Unfortunately, in many practical systems a combination of hard and soft real-time periodic tasks, aperiodic real-time tasks, interactive tasks and batch tasks must be supported. Each task may also require access to multiple resources. Therefore, this research will tackle the NP-hard problem of providing timely and simultaneous access to multiple resources by the use of practical abstractions and near-optimal heuristics aided by cooperative scheduling. We provide an elegant EAS model that works across the spectrum and uses a run-profile-based approach to scheduling. We apply this model to significant applications such as BLAT and assembly of gene sequences in the bioinformatics domain. We also provide a simulation for extending this model to cloud computing to answer "what if" scenario questions for consumers and operators of cloud resources, covering deadlines, single versus distributed cluster use, and impact analysis of energy-index and availability against revenue and ROI.