
    On distributed data processing in data grid architecture for a virtual repository

    The article describes the problem of integrating distributed, heterogeneous and fragmented collections of data using a virtual repository and the data grid concept. The technology involves wrappers enveloping external resources, a virtual network (based on peer-to-peer technology) responsible for integrating data into one global schema, and a distributed index for speeding up data retrieval. The authors present a method for obtaining data from heterogeneously structured external databases and then a procedure for integrating the data into one commonly available global schema. The core of the described solution is based on the Stack-Based Query Language (SBQL) and virtual updatable SBQL views. The system's transport and indexing layer is based on the P2P architecture.

    Transparent Persistence with Java Data Objects

    A flexible and performant Persistency Service is a necessary component of any HEP Software Framework. Building a modular, non-intrusive and performant persistency component has been shown to be a very difficult task. In the past, it was very often necessary to sacrifice modularity to achieve acceptable performance. This resulted in a strong dependency of the overall Frameworks on their Persistency subsystems. Recent developments in software technology have made it possible to build a Persistency Service which can be used transparently from other Frameworks. Such a Service doesn't impose strong architectural constraints on the overall Framework Architecture, while satisfying high performance requirements. The Java Data Objects (JDO) standard has already been implemented for almost all major databases. It provides truly transparent persistency for any Java object (both internal and external). Objects in other languages can be handled via transparent proxies. Being only a thin layer on top of the underlying database, JDO doesn't introduce any significant performance degradation. Aspect-Oriented Programming (AOP) also makes it possible to treat persistency as an orthogonal Aspect of the Application Framework, without polluting it with persistence-specific concepts. All these techniques have been developed primarily (or only) for the Java environment. It is, however, possible to interface them transparently to Frameworks built in other languages, for example C++. Fully functional prototypes of flexible and non-intrusive persistency modules have been built for several other packages, for example FreeHEP AIDA and LCG Pool AttributeSet (package Indicium). Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics (CHEP03), La Jolla, CA, USA, March 2003. PSN TUKT00

    Data access and integration in the ISPIDER proteomics grid

    Grid computing has great potential for supporting the integration of complex, fast-changing biological data repositories to enable distributed data analysis. One scenario where Grid computing has such potential is provided by proteomics resources, which are being developed rapidly with the emergence of affordable, reliable methods to study the proteome. The protein identifications arising from these methods derive from multiple repositories, which need to be integrated to enable uniform access to them. A number of technologies exist which enable these resources to be accessed in a Grid environment, but the independent development of these resources means that significant data integration challenges, such as heterogeneity and schema evolution, have to be met. This paper presents an architecture which supports the combined use of Grid data access (OGSA-DAI), Grid distributed querying (OGSA-DQP) and data integration (AutoMed) software tools to support distributed data analysis. We discuss the application of this architecture to the integration of several autonomous proteomics data resources.

    Distributed databases

    Module 3 of the book Database Architecture. UOC, 20122022/202

    Grid Database - Management, OGSA and Integration

    The description of data models and database types has given rise to extensive controversy, owing to their complexity and the many factors involved in the actual process of implementation. Grids encourage and promote the publication, sharing and integration of scientific data distributed across Virtual Organizations. Scientists and researchers work on huge, complex and growing datasets. The complexity of data management within a grid environment comes from the distribution, heterogeneity and number of data sources. Early Grid applications focused principally on the storage, replication and movement of file-based data. Many Grid applications already use databases for managing metadata, but increasingly many are associated with large databases of domain-specific information. In this paper we discuss the fundamental concepts related to grid-database access, management, OGSA and integration.

    Towards a service-oriented e-infrastructure for multidisciplinary environmental research

    Research e-infrastructures are considered to have generic and thematic parts. The generic part provides high-speed networks, grid (large-scale distributed computing) and database systems (digital repositories and data transfer systems) applicable to all research communities irrespective of discipline. Thematic parts are specific deployments of e-infrastructures to support diverse virtual research communities. The needs of a virtual community of multidisciplinary environmental researchers are yet to be investigated. We envisage and argue for an e-infrastructure that will enable environmental researchers to develop environmental models and software entirely out of existing components through loose coupling of diverse digital resources based on the service-oriented architecture. We discuss four specific aspects for consideration for a future e-infrastructure: 1) provision of digital resources (data, models & tools) as web services, 2) dealing with the stateless and non-transactional nature of web services using workflow management systems, 3) enabling web service discovery, composition and orchestration through semantic registries, and 4) creating synergy with existing grid infrastructures.

    Federating distributed and heterogeneous information sources in neuroimaging: the NeuroBase Project.

    The NeuroBase project aims at studying the requirements for federating, through the Internet, information sources in neuroimaging. These sources are distributed across different experimental sites, hospitals or research centers in cognitive neurosciences, and contain heterogeneous data and image processing programs. More precisely, this project consists of creating a shared ontology, suitable for supporting various neuroimaging applications, and a computer architecture for accessing and sharing relevant distributed information. We briefly describe the semantic model and report in more detail the architecture we chose, based on a mediator/wrapper approach. To give a flavor of the future deployment of our architecture, we describe a demonstrator that implements the comparison of distributed image processing tools applied to distributed neuroimaging data.