157 research outputs found

    Survey and Analysis of Production Distributed Computing Infrastructures

    Full text link
    This report has two objectives. First, we describe a set of the production distributed infrastructures currently available, so that the reader has a basic understanding of them. This includes explaining why each infrastructure was created and made available and how it has succeeded and failed. The set is not complete, but we believe it is representative. Second, we describe the infrastructures in terms of their use, which is a combination of how they were designed to be used and how users have found ways to use them. Applications are often designed and created with specific infrastructures in mind, with both an appreciation of the existing capabilities provided by those infrastructures and an anticipation of their future capabilities. Here, the infrastructures we discuss were often designed and created with specific applications in mind, or at least specific types of applications. The reader should understand how the interplay between the infrastructure providers and the users leads to such usages, which we call usage modalities. These usage modalities are really abstractions that exist between the infrastructures and the applications; they influence the infrastructures by representing the applications, and they influence the ap- plications by representing the infrastructures

    Quality of service based data-aware scheduling

    Get PDF
    Distributed supercomputers have been widely used for solving complex computational problems and modeling complex phenomena such as black holes, the environment, supply-chain economics, etc. In this work we analyze the use of these distributed supercomputers for time sensitive data-driven applications. We present the scheduling challenges involved in running deadline sensitive applications on shared distributed supercomputers running large parallel jobs and introduce a ``data-aware\u27\u27 scheduling paradigm that overcomes these challenges by making use of Quality of Service classes for running applications on shared resources. We evaluate the new data-aware scheduling paradigm using an event-driven hurricane simulation framework which attempts to run various simulations modeling storm surge, wave height, etc. in a timely fashion to be used by first responders and emergency officials. We further generalize the work and demonstrate with examples how data-aware computing can be used in other applications with similar requirements

    Choosing between remote I/O versus staging in distributed environments

    Get PDF
    Today, scientifi_x000C_c applications and experiments have become increasingly complex and more demanding in terms of their computational and data requirements. The amount of data generated and used has grown at a very rapid rate. As tens or hundreds of terabytes of data for a single application is very common today; petabytes and even exabytes of data will be very common in a few years. One of the major challenges in distributed computing environments is how to access these large datasets remotely over the network. Data staging and remote I/O are the most widely used data access methods for distributed applications. Application developers generally chose one over the other intuitively without making any scienti_x000C_fic comparison specifi_x000C_c to their applications since there is no generic model available that they can use. In this thesis, we develop generic models and set guidelines for the application developers which would help them to choose the most appropriate data access method for their application. We de_x000C_fine the parameters that potentially aff_x000B_ect the end-to-end performance of the distributed applications which need to access remote data. To achieve our goal, we implement a series of synthetic benchmark applications to simulate di_x000B_fferent data access patterns. We run these benchmark applications on diff_x000B_erent distributed computing settings with di_x000B_fferent parameters, such as network bandwidth, server and client capabilities, and data access ratio. We also use di_x000B_fferent remote I/O protocols to show the importance of the protocol in making a decision. We use regression analysis to develop applicable generic models for comparing diff_x000B_erent data access methods, and test our models in a real life application. The main contribution of our thesis is generic models that can be applied to most data-intensive distributed applications to decide the best data access technique for those applications. Our models provide the scientists and application developers an opportunity to choose the best data access method before actually running the application

    Advancing Intra-operative Precision: Dynamic Data-Driven Non-Rigid Registration for Enhanced Brain Tumor Resection in Image-Guided Neurosurgery

    Full text link
    During neurosurgery, medical images of the brain are used to locate tumors and critical structures, but brain tissue shifts make pre-operative images unreliable for accurate removal of tumors. Intra-operative imaging can track these deformations but is not a substitute for pre-operative data. To address this, we use Dynamic Data-Driven Non-Rigid Registration (NRR), a complex and time-consuming image processing operation that adjusts the pre-operative image data to account for intra-operative brain shift. Our review explores a specific NRR method for registering brain MRI during image-guided neurosurgery and examines various strategies for improving the accuracy and speed of the NRR method. We demonstrate that our implementation enables NRR results to be delivered within clinical time constraints while leveraging Distributed Computing and Machine Learning to enhance registration accuracy by identifying optimal parameters for the NRR method. Additionally, we highlight challenges associated with its use in the operating room

    Performance and quality of service of data and video movement over a 100 Gbps testbed

    Get PDF
    AbstractDigital instruments and simulations are creating an ever-increasing amount of data. The need for institutions to acquire these data and transfer them for analysis, visualization, and archiving is growing as well. In parallel, networking technology is evolving, but at a much slower rate than our ability to create and store data. Single fiber 100 Gbps networking solutions have recently been deployed as national infrastructure. This article describes our experiences with data movement and video conferencing across a networking testbed, using the first commercially available single fiber 100 Gbps technology. The testbed is unique in its ability to be configured for a total length of 60, 200, or 400 km, allowing for tests with varying network latency. We performed low-level TCP tests and were able to use more than 99.9% of the theoretical available bandwidth with minimal tuning efforts. We used the Lustre file system to simulate how end users would interact with a remote file system over such a high performance link. We were able to use 94.4% of the theoretical available bandwidth with a standard file system benchmark, essentially saturating the wide area network. Finally, we performed tests with H.323 video conferencing hardware and quality of service (QoS) settings, showing that the link can reliably carry a full high-definition stream. Overall, we demonstrated the practicality of 100 Gbps networking and Lustre as excellent tools for data management

    Application benchmark results for Big Red, an IBM e1350 BladeCenter Cluster

    Get PDF
    The purpose of this report is to present the results of benchmark tests with Big Red, an IBM e1350 BladeCenter Cluster. This report is particularly focused on providing details of system architecture and test run results in detail to allow for analysis in other reports and comparison with other systems, rather than presenting such analysis here

    Development, verification, and maintenance of computational software in geodynamics

    Get PDF
    Research on dynamical processes within the Earth and planets increasingly relies upon sophisticated, large-scale computational models. Improved understanding of fundamental physical processes such as mantle convection and the geodynamo, magma dynamics, crustal and lithospheric deformation, earthquake nucleation, and seismic wave propagation, are heavily dependent upon better numerical modeling. Surprisingly, the rate-limiting factor for progress in these areas is not just computing hardware, as was once the case. Rather, advances in software are not keeping pace with the recent improvements in hardware. Modeling tools in geophysics are usually developed and maintained by individual scientists, or by small groups. But it is difficult for any individual, or even a small group, to keep up with sweeping advances in computing hardware, parallel processing software, and numerical modeling methodology
    • …
    corecore