Global Grids and Software Toolkits: A Study of Four Grid Middleware Technologies
Grid is an infrastructure that involves the integrated and collaborative use
of computers, networks, databases and scientific instruments owned and managed
by multiple organizations. Grid applications often involve large amounts of
data and/or computing resources that require secure resource sharing across
organizational boundaries. This makes Grid application management and
deployment a complex undertaking. Grid middlewares provide users with seamless
computing ability and uniform access to resources in the heterogeneous Grid
environment. Several software toolkits and systems have been developed all over
the world, most of them as results of academic research projects. This
chapter focuses on four of these middlewares: UNICORE, Globus, Legion and
Gridbus. It also presents our implementation of a resource broker for UNICORE,
as UNICORE did not support this functionality. A comparison of these systems on
the basis of the architecture, implementation model and several other features
is included. Comment: 19 pages, 10 figures
A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing
Data Grids have been adopted as the platform for scientific communities that
need to share, access, transport, process and manage large data collections
distributed worldwide. They combine high-end computing technologies with
high-performance networking and wide-area storage management techniques. In
this paper, we discuss the key concepts behind Data Grids and compare them with
other data sharing and distribution paradigms such as content delivery
networks, peer-to-peer networks and distributed databases. We then provide
comprehensive taxonomies that cover various aspects of architecture, data
transportation, data replication and resource allocation and scheduling.
Finally, we map the proposed taxonomy to various Data Grid systems not only to
validate the taxonomy but also to identify areas for future exploration.
Through this taxonomy, we aim to categorise existing systems to better
understand their goals and their methodology. This would help evaluate their
applicability for solving similar problems. This taxonomy also provides a "gap
analysis" of this area through which researchers can potentially identify new
issues for investigation. Finally, we hope that the proposed taxonomy and
mapping also help to provide an easy way for new practitioners to understand
this complex area of research. Comment: 46 pages, 16 figures, Technical Report
Distributed D3: A web-based distributed data visualisation framework for Big Data
The influx of Big Data has created an ever-growing need for analytic tools that extract insights and knowledge from large datasets. Visual perception, a fundamental tool humans use to retrieve information from the world around us, has a unique ability to distinguish patterns pre-attentively. Visual analytics via data visualisations is therefore a very powerful tool and has become ever more important in this era. Data-Driven Documents (D3.js) is a versatile and popular web-based data visualisation library that has become a standard toolkit for visualising data in recent years. However, the library is inherently limited by the single-thread model of a single browser window on a single machine, and is therefore unable to deal with large datasets. The main objective of this thesis is to overcome this limitation and address the associated challenges by developing the Distributed D3 framework, which employs a distributed mechanism to deliver web-based visualisations of large-scale data and to make effective use of the graphical computational resources of modern visualisation environments. The first contribution is the integrated version of the Distributed D3 framework, developed for the Data Observatory. This work shows that the concept of Distributed D3 is feasible in practice and enables developers to collaborate on large-scale data visualisations on the Data Observatory. The second contribution is the optimisation of Distributed D3 through an investigation of potential bottlenecks for large-scale data visualisation applications. This work identifies the key performance bottlenecks of the framework and shows an overall performance improvement of 35.7% after optimisation, which improves the scalability and usability of Distributed D3 for large-scale data visualisation applications. 
The third contribution is the generic version of the Distributed D3 framework, developed for customised environments. This work improves the usability and flexibility of the framework and makes it ready to be published in the open-source community for further improvement and use. Open Access
Space communications responsive to events across missions (SCREAM): an investigation of network solutions for transient science space systems
2022 Spring. Includes bibliographical references. The National Academies have prioritized the pursuit of new scientific discoveries using diverse and temporally coordinated measurements from multiple ground and space-based observatories. Networked communications can enable such measurements by connecting individual observatories and allowing them to operate as a cohesive and purposefully designed system. Timely data flows across terrestrial and space communications networks are required to observe transient scientific events and processes. Currently, communications to space-based observatories experience large latencies due to manual service reservation and scheduling procedures, intermittent signal coverage, and network capacity constraints. If space communications network latencies could be reduced, new discoveries about dynamic scientific processes could be realized. However, science mission and network planners lack a systematic framework for defining, quantifying and evaluating timely space data flow implementation options for transient scientific observation scenarios involving multiple ground and space-based observatories. This dissertation presents a model-based systems engineering approach to investigate and develop network solutions to meet the needs of transient science space systems. First, a systematic investigation of the current transient science operations of the National Aeronautics and Space Administration's (NASA) Tracking and Data Relay Satellite (TDRS) space data network and the Neil Gehrels Swift Observatory resulted in a formal architectural model for transient science space systems. Two methods individual missions may use to achieve timely network services were defined, quantitatively modeled, and experimentally compared. 
Next, the architectural model was extended to describe two alternative ways to achieve timely and autonomous space data flows to multiple space-based observatories within the context of a purposefully designed transient science observation scenario. A quantitative multipoint space data flow modeling method based on queueing theory was defined. General system suitability metrics for timeliness, throughput, and capacity were specified to support the evaluation of alternative network data flow implementations. A hypothetical design study was performed to demonstrate the multipoint data flow modeling method and to evaluate alternative data flow implementations using TDRS. The merits of a proposed future TDRS broadcast service to implement multipoint data flows were quantified and compared to expected outcomes using the as-built TDRS network. Then, the architectural model was extended to incorporate commercial network service providers. Quantitative models for Globalstar and Iridium short messaging data services were developed based on publicly available sources. Financial cost was added to the set of system suitability metrics. The hypothetical design study was extended to compare the relative suitability of the as-built TDRS network with the commercial Globalstar and Iridium networks. Finally, results from this research are being applied by NASA missions and network planners. In 2020, Swift implemented the first automated command pipeline, increasing its expected gravitational wave follow-up detection rate by greater than 400%. Current NASA technology initiatives informed by this research will enable future space-based observatories to become interoperable sensing devices connected by a diverse ecosystem of network service providers.
The Semantic Web in Federated Information Systems: A Space Physics Case Study
This paper presents a new theoretical contribution that provides a middle-of-the-road approach to formal ontologies in federated information systems. NASA’s space physics domain, like many other domains, is relatively unfamiliar with the emerging Semantic Web. This work offers a new framework that garners the benefits of formal logic yet shields participants and users from the details of the technology. Moreover, the results of a case study involving the use of the Semantic Web within NASA’s space physics domain are presented. A real-world search and retrieval system, relying on relational database technology, is compared against a nearly identical system that incorporates a formal ontology. The efficiency, efficacy, and implementation details of the Semantic Web are compared against the established relational database technology.