150 research outputs found

    File Fragmentation over an Unreliable Channel

    Get PDF
    It has been recently discovered that heavy-tailed file completion time can result from protocol interaction even when file sizes are light-tailed. A key to this phenomenon is the RESTART feature where if a file transfer is interrupted before it is completed, the transfer needs to restart from the beginning. In this paper, we show that independent or bounded fragmentation guarantees light-tailed file completion time as long as the file size is light-tailed, i.e., in this case, heavy-tailed file completion time can only originate from heavy-tailed file sizes. If the file size is heavy-tailed, then the file completion time is necessarily heavy-tailed. For this case, we show that when the file size distribution is regularly varying, then under independent or bounded fragmentation, the completion time tail distribution function is asymptotically upper bounded by that of the original file size stretched by a constant factor. We then prove that if the failure distribution has non-decreasing failure rate, the expected completion time is minimized by dividing the file into equal sized fragments; this optimal fragment size is unique but depends on the file size. We also present a simple blind fragmentation policy where the fragment sizes are constant and independent of the file size and prove that it is asymptotically optimal. Finally, we bound the error in expected completion time due to error in modeling of the failure process

    Near-optimal scheduling and decision-making models for reactive and proactive fault tolerance mechanisms

    Get PDF
    As High Performance Computing (HPC) systems increase in size to fulfill computational power demand, the chance of failure occurrences dramatically increases, resulting in potentially large amounts of lost computing time. Fault Tolerance (FT) mechanisms aim to mitigate the impact of failure occurrences to the running applications. However, the overhead of FT mechanisms increases proportionally to the HPC systems\u27 size. Therefore, challenges arise in handling the expensive overhead of FT mechanisms while minimizing the large amount of lost computing time due to failure occurrences. In this dissertation, a near-optimal scheduling model is built to determine when to invoke a hybrid checkpoint mechanism, by means of stochastic processes and calculus of variations. The obtained schedule minimizes the waste time caused by checkpoint mechanism and failure occurrences. Generally, the checkpoint/restart mechanisms periodically save application states and load the saved state, upon failure occurrences. Furthermore, to handle various FT mechanisms, an adaptive decision-making model has been developed to determine the best FT strategy to invoke at each decision point. The best mechanism at each decision point is selected among considered FT mechanisms to globally minimize the total waste time for an application execution by means of a dynamic programming approach. In addition, the model is adaptive to deal with changes in failure rate over time

    Scheduling Computational Workflows on Failure-Prone Platforms

    Get PDF
    International audienceWe study the scheduling of computational workflows on compute resources that experience exponentially distributed failures. When a failure occurs, roll-back and recovery is used to resume the execution from the last checkpointed state. The scheduling problem is to minimize the expected execution time by deciding in which order to execute the tasks in the workflow and whether to checkpoint or not checkpoint a task after it completes. We give a polynomial-time algorithm for fork graphs and show that the problem is NP-complete with join graphs. Our main result is a polynomial-time algorithm to compute the execution time of a workflow with specified to-be-checkpointed tasks. Using this algorithm as a basis, we propose efficient heuristics for solving the scheduling problem. We evaluate these heuristics for representative workflow configurations

    Automated derivation of the adjoint of high-level transient finite element programs

    Full text link
    In this paper we demonstrate a new technique for deriving discrete adjoint and tangent linear models of finite element models. The technique is significantly more efficient and automatic than standard algorithmic differentiation techniques. The approach relies on a high-level symbolic representation of the forward problem. In contrast to developing a model directly in Fortran or C++, high-level systems allow the developer to express the variational problems to be solved in near-mathematical notation. As such, these systems have a key advantage: since the mathematical structure of the problem is preserved, they are more amenable to automated analysis and manipulation. The framework introduced here is implemented in a freely available software package named dolfin-adjoint, based on the FEniCS Project. Our approach to automated adjoint derivation relies on run-time annotation of the temporal structure of the model, and employs the FEniCS finite element form compiler to automatically generate the low-level code for the derived models. The approach requires only trivial changes to a large class of forward models, including complicated time-dependent nonlinear models. The adjoint model automatically employs optimal checkpointing schemes to mitigate storage requirements for nonlinear models, without any user management or intervention. Furthermore, both the tangent linear and adjoint models naturally work in parallel, without any need to differentiate through calls to MPI or to parse OpenMP directives. The generality, applicability and efficiency of the approach are demonstrated with examples from a wide range of scientific applications

    High Performance Computing Systems with Various Checkpointing Schemes

    Get PDF
    Finding the failure rate of a system is a crucial step in high performance computing systems analysis. To deal with this problem, a fault tolerant mechanism, called checkpoint/ restart technique, was introduced. However, there are additional costs to perform this mechanism. Thus, we propose two models for different schemes (full and incremental checkpoint schemes). The models which are based on the reliability of the system are used to determine the checkpoint placements. Both proposed models consider a balance of between checkpoint overhead and the re-computing time. Due to the extra costs from each incremental checkpoint during the recovery period, a method to find the number of incremental checkpoints between two consecutive full checkpoints is given. Our simulation suggests that in most cases our incremental checkpoint model can reduce the waste time more than it is reduced by the full checkpoint model. The waste times produced by both models are in the range of 2% to 28% of the application completion time depending on the checkpoint overheads

    A failure index for high performance computing applications

    Get PDF
    This dissertation introduces a new metric in the area of High Performance Computing (HPC) application reliability and performance modeling. Derived via the time-dependent implementation of an existing inequality measure, the Failure index (FI) generates a coefficient representing the level of volatility for the failures incurred by an application running on a given HPC system in a given time interval. This coefficient presents a normalized cross-system representation of the failure volatility of applications running on failure-rich HPC platforms. Further, the origin and ramifications of application failures are investigated, from which certain mathematical conclusions yield greater insight into the behavior of these applications in failure-rich system environments. This work also includes background information on the problems facing HPC applications at the highest scale, the lack of standardized application-specific metrics within this arena, and a means of generating such metrics in a low latency manner. A case study containing detailed analysis showcasing the benefits of the FI is also included
    • 

    corecore