
    Global Grids and Software Toolkits: A Study of Four Grid Middleware Technologies

    Grid is an infrastructure that involves the integrated and collaborative use of computers, networks, databases and scientific instruments owned and managed by multiple organizations. Grid applications often involve large amounts of data and/or computing resources that require secure resource sharing across organizational boundaries, which makes Grid application management and deployment a complex undertaking. Grid middleware provides users with seamless computing ability and uniform access to resources in the heterogeneous Grid environment. Several software toolkits and systems have been developed around the world, most of them the results of academic research projects. This chapter focuses on four of these middleware systems: UNICORE, Globus, Legion and Gridbus. It also presents our implementation of a resource broker for UNICORE, since UNICORE itself did not provide this functionality. A comparison of these systems on the basis of architecture, implementation model and several other features is included. Comment: 19 pages, 10 figures.
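    The UNICORE resource broker itself is not described in detail in the abstract; purely as a hedged illustration of the kind of decision a Grid resource broker makes, the sketch below ranks candidate sites by an estimated completion time. The site attributes and the cost model are invented for illustration and are not taken from the UNICORE broker.

```python
# Hypothetical sketch of the core decision a Grid resource broker makes:
# pick the site expected to finish a job soonest. Site attributes and the
# cost model below are illustrative, not the UNICORE broker's actual logic.
from dataclasses import dataclass

@dataclass
class Site:
    name: str
    queued_jobs: int          # jobs ahead of ours in the site's queue
    avg_job_runtime_s: float  # mean runtime of the queued jobs
    cpu_speed_factor: float   # relative CPU speed (1.0 = baseline)

def estimated_completion_s(site: Site, job_cpu_s: float) -> float:
    """Estimate queue wait time plus our job's runtime scaled by CPU speed."""
    wait = site.queued_jobs * site.avg_job_runtime_s
    run = job_cpu_s / site.cpu_speed_factor
    return wait + run

def select_site(sites: list[Site], job_cpu_s: float) -> Site:
    """Broker decision: choose the site with the lowest estimated completion time."""
    return min(sites, key=lambda s: estimated_completion_s(s, job_cpu_s))

if __name__ == "__main__":
    sites = [
        Site("siteA", queued_jobs=4, avg_job_runtime_s=600, cpu_speed_factor=1.0),
        Site("siteB", queued_jobs=1, avg_job_runtime_s=900, cpu_speed_factor=1.5),
    ]
    best = select_site(sites, job_cpu_s=1800)
    print(f"Submit to {best.name}")
```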

    A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing

    Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication, and resource allocation and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems, both to validate the taxonomy and to identify areas for future exploration. Through this taxonomy, we aim to categorise existing systems so as to better understand their goals and methodology, which helps evaluate their applicability to similar problems. The taxonomy also provides a "gap analysis" of the area through which researchers can identify new issues for investigation. We hope that the proposed taxonomy and mapping also give new practitioners an accessible way into this complex area of research. Comment: 46 pages, 16 figures, Technical Report.
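    Data replication and replica selection are among the mechanisms such taxonomies cover. As a hedged, generic illustration (not drawn from any particular system in the survey), the sketch below chooses the replica location with the shortest estimated transfer time; the replica catalogue entries, URLs and network figures are hypothetical.

```python
# Generic replica-selection sketch: given several locations holding a copy of
# the same file, choose the one with the shortest estimated transfer time.
# The catalogue and the measurements are hypothetical, for illustration only.

def estimated_transfer_s(size_bytes: int, bandwidth_bps: float, rtt_s: float) -> float:
    """Crude estimate: round-trip latency plus size (in bits) over bandwidth."""
    return rtt_s + (8 * size_bytes) / bandwidth_bps

def select_replica(replicas: dict[str, dict], size_bytes: int) -> str:
    """Return the replica URL with the lowest estimated transfer time."""
    return min(
        replicas,
        key=lambda url: estimated_transfer_s(
            size_bytes, replicas[url]["bandwidth_bps"], replicas[url]["rtt_s"]
        ),
    )

if __name__ == "__main__":
    replicas = {
        "gsiftp://storage.site-a.example/data/run42.dat": {"bandwidth_bps": 1e9, "rtt_s": 0.080},
        "gsiftp://storage.site-b.example/data/run42.dat": {"bandwidth_bps": 1e8, "rtt_s": 0.005},
    }
    print(select_replica(replicas, size_bytes=10 * 1024**3))
```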

    Distributed D3: A web-based distributed data visualisation framework for Big Data

    The influx of Big Data has created an ever-growing need for analytic tools aimed at extracting insights and knowledge from large datasets. Visual perception, the fundamental means by which humans take in information about the world around us, has the unique ability to distinguish patterns pre-attentively. Visual analytics via data visualisation is therefore a very powerful tool and has become ever more important in this era. Data-Driven Documents (D3.js) is a versatile and popular web-based data visualisation library that has become the de facto standard toolkit for visualising data in recent years. However, the library is inherently limited by the single-threaded model of a single browser window on a single machine, and is therefore unable to deal with large datasets. The main objective of this thesis is to overcome this limitation and address the associated challenges by developing the Distributed D3 framework, which employs a distributed mechanism to deliver web-based visualisations of large-scale data and to make effective use of the graphical computing resources of modern visualisation environments. The first contribution is an integrated version of the Distributed D3 framework developed for the Data Observatory. This work proves that the concept of Distributed D3 is feasible in practice and enables developers to collaborate on large-scale data visualisations by using it on the Data Observatory. The second contribution is an optimisation of Distributed D3 based on an investigation of the potential bottlenecks in large-scale data visualisation applications. This work identifies the framework's key performance bottlenecks and shows a 35.7% improvement in overall performance after optimisation, improving the scalability and usability of Distributed D3 for large-scale data visualisation applications. The third contribution is a generic version of the Distributed D3 framework developed for customised environments, which improves the usability and flexibility of the framework and makes it ready for release to the open-source community for further improvement and use.
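    The framework itself is not reproduced here; the sketch below is a minimal, hypothetical illustration of the coordination idea the abstract describes: a controller splits a large dataset into partitions and assigns each partition to a separate browser window, which then renders only its share with D3 against a shared scale. The partitioning scheme and message format are assumptions, not the actual Distributed D3 protocol.

```python
# Hypothetical coordinator-side sketch: partition a large dataset across N
# browser windows so that each renders only its slice with D3. The message
# format and partitioning scheme are illustrative, not Distributed D3's own.
import json

def partition(records: list[dict], n_displays: int) -> list[list[dict]]:
    """Split records into n_displays roughly equal, contiguous chunks."""
    size = -(-len(records) // n_displays)  # ceiling division
    return [records[i * size:(i + 1) * size] for i in range(n_displays)]

def render_messages(records: list[dict], n_displays: int) -> list[str]:
    """Build one JSON message per display: its data slice plus the shared
    scale domain, so every window draws against the same axes."""
    domain = [min(r["value"] for r in records), max(r["value"] for r in records)]
    return [
        json.dumps({"display": i, "domain": domain, "data": chunk})
        for i, chunk in enumerate(partition(records, n_displays))
    ]

if __name__ == "__main__":
    data = [{"id": i, "value": i % 97} for i in range(1_000_000)]
    msgs = render_messages(data, n_displays=8)
    print(len(msgs), "messages;", len(json.loads(msgs[0])["data"]), "records each")
```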

    eScience and Informatics for international science programs

    Space communications responsive to events across missions (SCREAM): an investigation of network solutions for transient science space systems

    The National Academies have prioritized the pursuit of new scientific discoveries using diverse and temporally coordinated measurements from multiple ground- and space-based observatories. Networked communications can enable such measurements by connecting individual observatories and allowing them to operate as a cohesive and purposefully designed system. Timely data flows across terrestrial and space communications networks are required to observe transient scientific events and processes. Currently, communications to space-based observatories experience large latencies due to manual service reservation and scheduling procedures, intermittent signal coverage, and network capacity constraints. If space communications network latencies could be reduced, new discoveries about dynamic scientific processes could be realized. However, science mission and network planners lack a systematic framework for defining, quantifying and evaluating timely space data flow implementation options for transient scientific observation scenarios involving multiple ground- and space-based observatories. This dissertation presents a model-based systems engineering approach to investigating and developing network solutions that meet the needs of transient science space systems. First, a systematic investigation of the current transient science operations of the National Aeronautics and Space Administration's (NASA) Tracking and Data Relay Satellite (TDRS) space data network and the Neil Gehrels Swift Observatory resulted in a formal architectural model for transient science space systems. Two methods that individual missions may use to achieve timely network services were defined, quantitatively modeled, and experimentally compared. Next, the architectural model was extended to describe two alternative ways to achieve timely and autonomous space data flows to multiple space-based observatories within the context of a purposefully designed transient science observation scenario. A quantitative multipoint space data flow modeling method based on queueing theory was defined. General system suitability metrics for timeliness, throughput, and capacity were specified to support the evaluation of alternative network data flow implementations. A hypothetical design study was performed to demonstrate the multipoint data flow modeling method and to evaluate alternative data flow implementations using TDRS. The merits of a proposed future TDRS broadcast service to implement multipoint data flows were quantified and compared with expected outcomes using the as-built TDRS network. Then, the architectural model was extended to incorporate commercial network service providers. Quantitative models for Globalstar and Iridium short messaging data services were developed from publicly available sources. Financial cost was added to the set of system suitability metrics. The hypothetical design study was extended to compare the relative suitability of the as-built TDRS network with the commercial Globalstar and Iridium networks. Finally, results from this research are being applied by NASA missions and network planners. In 2020, Swift implemented the first automated command pipeline, increasing its expected gravitational wave follow-up detection rate by more than 400%. Current NASA technology initiatives informed by this research will enable future space-based observatories to become interoperable sensing devices connected by a diverse ecosystem of network service providers.
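    The dissertation's multipoint data flow model is not reproduced in the abstract; as a hedged reference point for the kind of queueing-theory calculation it describes, the sketch below evaluates a standard single-server M/M/1 approximation of a relay link and reports utilisation and mean delay, in the spirit of the timeliness and capacity metrics mentioned above. The arrival and service rates are invented for illustration and are not the dissertation's numbers.

```python
# Standard M/M/1 queueing results, used here only as a generic stand-in for a
# space data-flow latency model. Arrival and service rates are hypothetical.
from dataclasses import dataclass

@dataclass
class MM1Metrics:
    utilisation: float        # rho = lambda / mu
    mean_delay_s: float       # W   = 1 / (mu - lambda), waiting plus service
    mean_in_system: float     # L   = lambda * W (Little's law)

def mm1(arrival_rate_hz: float, service_rate_hz: float) -> MM1Metrics:
    """Compute steady-state M/M/1 metrics; requires arrival rate < service rate."""
    if arrival_rate_hz >= service_rate_hz:
        raise ValueError("Queue is unstable: arrival rate must be below service rate")
    rho = arrival_rate_hz / service_rate_hz
    mean_delay = 1.0 / (service_rate_hz - arrival_rate_hz)
    return MM1Metrics(rho, mean_delay, arrival_rate_hz * mean_delay)

if __name__ == "__main__":
    # e.g. alert messages arriving at 0.5/s on a link that can forward 2/s
    m = mm1(arrival_rate_hz=0.5, service_rate_hz=2.0)
    print(f"utilisation={m.utilisation:.2f}, mean delay={m.mean_delay_s:.2f} s, "
          f"mean in system={m.mean_in_system:.2f}")
```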

    The Semantic Web in Federated Information Systems: A Space Physics Case Study

    This paper presents a new theoretical contribution that provides a middle-of-the-road approach to formal ontologies in federated information systems. NASA’s space physics domain, like many other domains, is relatively unfamiliar with the emerging Semantic Web. This work offers a new framework that garners the benefits of formal logic yet shields participants and users from the details of the technology. Moreover, the results of a case study involving the utilization of the Semantic Web within NASA’s space physics domain are presented. A real-world search and retrieval system, relying on relational database technology, is compared against a near-identical system that incorporates a formal ontology. The efficiency, efficacy, and implementation details of the Semantic Web are compared against the established relational database technology.
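    The case study's systems are not described in enough detail here to reproduce; as a hedged illustration of the difference such a comparison measures, the sketch below contrasts an exact-match query with an ontology-aware query over a toy class hierarchy, where searching for a general instrument class also retrieves items typed with more specific classes. The tiny vocabulary and the rdflib usage are assumptions for illustration, not the paper's actual ontology or system.

```python
# Toy contrast between exact-match retrieval and ontology-aware retrieval.
# The vocabulary below is invented for illustration; it is not the NASA
# space physics ontology used in the paper. Requires: pip install rdflib
from rdflib import Graph

TTL = """
@prefix ex:   <http://example.org/space#> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

ex:MassSpectrometer rdfs:subClassOf ex:Spectrometer .
ex:instrumentA rdf:type ex:Spectrometer .
ex:instrumentB rdf:type ex:MassSpectrometer .
"""

g = Graph()
g.parse(data=TTL, format="turtle")

# Exact-match style query: only instruments typed directly as ex:Spectrometer.
exact = g.query("""
PREFIX ex: <http://example.org/space#>
SELECT ?i WHERE { ?i a ex:Spectrometer . }
""")

# Ontology-aware query: follow rdfs:subClassOf*, so subclass instances match too.
inferred = g.query("""
PREFIX ex:   <http://example.org/space#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?i WHERE { ?i a ?c . ?c rdfs:subClassOf* ex:Spectrometer . }
""")

print("exact match:   ", sorted(str(r.i) for r in exact))     # instrumentA only
print("with hierarchy:", sorted(str(r.i) for r in inferred))  # A and B
```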