164 research outputs found
Workflow level interoperation of grid data resources
The lack of widely accepted standards and the use of different middleware solutions divide today’s Grid resources into non-interoperable production Grid islands. On the other hand, more and more experiments require such a large number of resources that the interoperation of existing production Grids becomes inevitable. This paper, based on the current results of grid interoperation studies, defines generic requirements towards the workflow level interoperation of grid solutions. It concentrates on intra-workflow interoperation of grid data resources, as one of the key areas of generic interoperation, and describes through an example how existing tools can be extended to achieve the required level of interoperation
Role-Based Access Control for the Open Grid Services Architecture - Data Access and Integration (OGSA-DAI)
Grid has emerged recently as an integration infrastructure for the sharing and coordinated use of diverse resources in dynamic, distributed virtual organizations (VOs). A Data Grid is an architecture for the access, exchange, and sharing of data in the Grid environment. In this dissertation, role-based access control (RBAC) systems for heterogeneous data resources in Data Grid systems are proposed. The Open Grid Services Architecture - Data Access and Integration (OGSA-DAI) is a widely used framework for the integration of heterogeneous data resources in Grid systems.
However, in the OGSA-DAI system, access control causes substantial administration overhead for resource providers in VOs because each of them has to manage the authorization information for individual Grid users. Its identity-based access control mechanisms are severely inefficient and too complicated to manage because the direct mapping between users and privileges is transitory. To solve this problem, (1) the Community Authorization Service (CAS), provided by the Globus toolkit, and (2) the Shibboleth, an attribute authorization service, are used to support RBAC in the OGSA-DAI system. The Globus Toolkit is widely used software for building Grid systems.
Access control policies need to be specified and managed across multiple VOs. For this purpose, the Core and Hierarchical RBAC profile of the eXtensible Access Control Markup Language (XACML) is used; and for distributed administration of those policies, the Object, Metadata and Artifacts Registry (OMAR) is used. OMAR is based on the e-business eXtensible Markup Language (ebXML) registry specifications developed to achieve interoperable registries and repositories.
The RBAC systems allow quick and easy deployments, privacy protection, and the centralized and distributed management of privileges. They support scalable, interoperable and fine-grain access control services; dynamic delegation of rights; and user-role assignments. They also reduce the administration overheads for resource providers because they need to maintain only the mapping information from VO roles to local database roles. Resource providers maintain the ultimate authority over their resources. Moreover, unnecessary mapping and connections can be avoided by denying invalid requests at the VO level. Performance analysis shows that our RBAC systems add only a small overhead to the existing security infrastructure of OGSA-DAI
Accessing RDF(S) data resources in service-based Grid infrastructures
We describe the results of the RDF(S) activity within the Open Grid Forum (http://www.ogf.org) (OGF) Database Access and Integration Services (DAIS) Working Group (http://forge.gridforum.org/projects/dais-wg) whose objective is to develop standard service-based grid access mechanisms for data expressed in RDF and RDF Schema. We produce two specifications, focused on the provision of SPARQL querying capabilities for accessing RDF data and a set of RDF Schema ontology handling primitives for creating, retrieving, updating, and deleting RDF data. In this paper we present a set of use cases that justify this work and an overview of these specifications, which will enter in editorial process at OGF25. We conclude by outlining the future work that will be made in the context of this standardization process
From access and integration to mining of secure genomic data sets across the grid
The UK Department of Trade and Industry (DTI) funded BRIDGES project (Biomedical Research Informatics Delivered by Grid Enabled Services) has developed a Grid infrastructure to support cardiovascular research. This includes the provision of a compute Grid and a data Grid infrastructure with security at its heart. In this paper we focus on the BRIDGES data Grid. A primary aim of the BRIDGES data Grid is to help control the complexity in access to and integration of a myriad of genomic data sets through simple Grid based tools. We outline these tools, how they are delivered to the end user scientists. We also describe how these tools are to be extended in the BBSRC funded Grid Enabled Microarray Expression Profile Search (GEMEPS) to support a richer vocabulary of search capabilities to support mining of microarray data sets. As with BRIDGES, fine grain Grid security underpins GEMEPS
Enabling quantitative data analysis through e-infrastructures
This paper discusses how quantitative data analysis in the social sciences can engage with and exploit an e-Infrastructure. We highlight how a number of activities which are central to quantitative data analysis, referred to as ‘data management’, can benefit from e-infrastructure support. We conclude by discussing how these issues are relevant to the DAMES (Data Management through e-Social Science) research Node, an ongoing project that aims to develop e-Infrastructural resources for quantitative data analysis in the social sciences
Supporting UK-wide e-clinical trials and studies
As clinical trials and epidemiological studies become increasingly large, covering wider (national) geographical areas and involving ever broader populations, the need to provide an information management infrastructure that can support such endeavours is essential. A wealth of clinical data now exists at varying levels of care (primary care, secondary care, etc.). Simple, secure access to such data would greatly benefit the key processes involved in clinical trials and epidemiological studies: patient recruitment, data collection and study management. The Grid paradigm provides one model for seamless access to such data and support of these processes.
The VOTES project (Virtual Organisations for Trials and Epidemiological Studies) is a collaboration between several UK institutions to implement a generic framework that effectively leverages the available health-care information across the UK to support more efficient gathering and processing of trial information. The structure of the information available in the health-care domain in the UK itself varies broadly in-line with the national boundaries of the constituent states (England, Scotland, Wales and Northern Ireland). Technologies must address these political boundaries and the impact these boundaries have in terms of for example, information governance, policies, and of course large-scale heterogeneous distribution of the data sets themselves.
This paper outlines the methodology in implementing the framework between three specific data sources that serve as useful case studies: Scottish data from the Scottish Care Information (SCI) Store data repository, data on the General Practice Research Database (GPRD) diabetes trial at Imperial College London, and benign prostate hypoplasia (BPH) data from the University of Nottingham. The design, implementation and wider research issues are discussed along with the technological challenges encountered in the project in the application of Grid technologies
Data access and integration in the ISPIDER proteomics grid
Grid computing has great potential for supporting the integration of complex, fast changing biological data repositories to enable distributed data analysis. One scenario where Grid computing has such potential is provided by proteomics resources which are rapidly being developed with the emergence of affordable, reliable methods to study the proteome. The protein identifications arising from these methods derive from multiple repositories which need to be integrated to enable uniform access to them. A number of technologies exist which enable these resources to be accessed in a Grid environment, but the independent development of these resources means that significant data integration challenges, such as heterogeneity and schema evolution, have to be met. This paper presents an architecture which supports the combined use of Grid data access (OGSA-DAI), Grid distributed querying (OGSA-DQP) and data integration (AutoMed) software tools to support distributed data analysis. We discuss the application of this architecture for the integration of several autonomous proteomics data resources
The OMII Software Distribution
This paper describes the work carried out at the Open Middleware Infrastructure Institute (OMII) and the key elements of the OMII software distribution that have been developed in collaboration with members of the Managed Programme Initiative. The main objective of the OMII is to preserve and consolidate the achievements of the UK e-Science Programme by collecting, maintaining and improving the software modules that form the key components of a generic Grid middleware. Recently, the activity at Southampton has been extended beyond 2009 through a new project, OMII-UK, that forms a partnership that now includes the OGSA-DAI activities at Edinburgh and the myGrid project at Manchester
- …