88,026 research outputs found

    Towards a service-oriented e-infrastructure for multidisciplinary environmental research

    Research e-infrastructures are considered to have generic and thematic parts. The generic part provides high-speed networks, grid (large-scale distributed computing) and database systems (digital repositories and data transfer systems) applicable to all research communities irrespective of discipline. Thematic parts are specific deployments of e-infrastructures to support diverse virtual research communities. The needs of a virtual community of multidisciplinary environmental researchers are yet to be investigated. We envisage and argue for an e-infrastructure that will enable environmental researchers to develop environmental models and software entirely out of existing components through loose coupling of diverse digital resources based on the service-oriented architecture. We discuss four specific aspects for consideration for a future e-infrastructure: 1) provision of digital resources (data, models & tools) as web services, 2) dealing with the stateless and non-transactional nature of web services using workflow management systems, 3) enabling web service discovery, composition and orchestration through semantic registries, and 4) creating synergy with existing grid infrastructures.

    Supporting Complex Scientific Database Schemas in a Grid Middleware

    The volume of digital scientific data has increased considerably with advancing technologies of computing devices and scientific instruments. We are exploring the use of emerging Grid technologies for the management and manipulation of very large distributed scientific datasets. Taking as an example a terabyte-size scientific database with complex database schema, this paper focuses on the potential of a well-known Grid middleware - OGSA-DQP - for distributing such datasets. In particular, we investigate and extend the data type support in this system to handle a complex schema of a real scientific database - the Sloan Digital Sky Survey database. Copyright IEEE. DOI: 10.1109/AINA.2009.129

    A Grid-Enabled Infrastructure for Resource Sharing, E-Learning, Searching and Distributed Repository Among Universities

    In recent years, service-based approaches to sharing data among repositories and to online learning have risen to prominence because of their potential to meet requirements in the area of high performance computing. Developing education-based grid services while assuring high availability, reliability and scalability is demanding in web service architectures. Grid computing, on the other hand, offers flexibility in aggregating distributed CPU, memory, storage and data, and supports sharing of a large number of distributed resources, so that education-like applications can share knowledge that would otherwise be confined to a single system. However, the literature shows that the potential of grid resources for educational purposes is not yet being utilized. In this paper, an education-based grid framework architecture that provides a promising platform for sharing geographically dispersed learning content among universities is developed. It allows students, faculty and researchers to share and gain knowledge in their area of interest by using e-learning, searching and distributed repository services among universities from anywhere, at any time. Globus Toolkit 5.2.5 (GTK) is used as the grid middleware, providing resource access, discovery and management, data movement, security, and so forth. Furthermore, this work uses OGSA-DAI, which provides database access and operations. The resulting infrastructure enables users to discover education services and interact with them through the grid portal.

    Performance of R-GMA for monitoring grid jobs for CMS data production

    High energy physics experiments, such as the Compact Muon Solenoid (CMS) at the CERN laboratory in Geneva, have large-scale data processing requirements, with data accumulating at a rate of 1 Gbyte/s. This load comfortably exceeds any previous processing requirements and we believe it may be most efficiently satisfied through grid computing. Furthermore the production of large quantities of Monte Carlo simulated data provides an ideal test bed for grid technologies and will drive their development. One important challenge when using the grid for data analysis is the ability to monitor transparently the large number of jobs that are being executed simultaneously at multiple remote sites. R-GMA is a monitoring and information management service for distributed resources based on the grid monitoring architecture of the Global Grid Forum. We have previously developed a system allowing us to test its performance under a heavy load while using few real grid resources. We present the latest results on this system running on the LCG 2 grid test bed using the LCG 2.6.0 middleware release. For a sustained load equivalent to 7 generations of 1000 simultaneous jobs, R-GMA was able to transfer all published messages and store them in a database for 98% of the individual jobs. The failures experienced were at the remote sites, rather than at the archiver's MON box as had been expected

    Analysis of Subquery Processing on a Grid Database with the VO (Virtual Organization) Scheme [Analisis Subquery Processing pada Grid Database dengan Skema VO (Virtual Organization)]

    ABSTRACT: As business needs and data volumes grow, a reliable database management technique is required, one that operates effectively and efficiently under all conditions. This need gave rise to the grid database, which is inspired by the concept of grid computing. Unlike conventional distributed techniques, a grid emphasizes flexible, large-scale resource sharing: compute and storage resources can be added to or removed from the system without complicated changes to the overall configuration. In such a dynamic environment, some database parameters needed for query-subquery processing may change during processing and become inaccurate, so a query-subquery processing scheme that adapts to the grid database environment is needed. This final-year project examines one such scheme, VO (Virtual Organization). Processing in the VO scheme decomposes a query (the global query) into local queries (called subqueries) according to the information needs of the business process, then allocates each subquery to the appropriate virtual storage node of the VO. The scheme is tested experimentally to see whether it suits a dynamic grid database environment. The test parameters are data availability, reliability, and tolerance in query-subquery execution time under each grid node failure scenario. Keywords: grid database, grid, VO, virtual organization, subquery, virtualization
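
The decomposition and allocation step described in this abstract can be sketched roughly as follows. This is an illustrative Python sketch only, not the thesis's implementation: the names `Subquery`, `decompose` and `allocate`, the query format, the table-to-node catalogue, and the failover rule (use the first live node that holds the table) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subquery:
    """A local query against one table (hypothetical representation)."""
    table: str
    columns: tuple


def decompose(global_query):
    """Split a global query into one subquery per table.

    `global_query` is an assumed format: a dict mapping table name
    to the list of columns wanted from that table.
    """
    return [Subquery(table, tuple(cols)) for table, cols in global_query.items()]


def allocate(subqueries, catalogue, live_nodes):
    """Assign each subquery to the first live node that stores its table.

    Returns (plan, unserved): `plan` maps subquery -> node name, and
    `unserved` lists subqueries whose table has no live node.
    """
    plan, unserved = {}, []
    for sq in subqueries:
        candidates = [n for n in catalogue.get(sq.table, []) if n in live_nodes]
        if candidates:
            plan[sq] = candidates[0]
        else:
            unserved.append(sq)
    return plan, unserved


# Hypothetical data: two tables, with "orders" replicated on two nodes.
global_query = {"orders": ["id", "total"], "customers": ["id", "name"]}
catalogue = {"orders": ["node-a", "node-b"], "customers": ["node-c"]}

subqueries = decompose(global_query)
# Failure scenario: node-a is down, so "orders" falls back to node-b.
plan, unserved = allocate(subqueries, catalogue, live_nodes={"node-b", "node-c"})
for sq, node in plan.items():
    print(f"{sq.table} -> {node}")
```

In this toy setup, availability under a node failure depends on whether the catalogue lists another live node for the same table, which is the kind of property the abstract's failure scenarios would exercise.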

    Prototyping Virtual Data Technologies in ATLAS Data Challenge 1 Production

    For the efficiency of large production tasks distributed worldwide, it is essential to provide shared production management tools composed of integratable and interoperable services. To enhance the ATLAS DC1 production toolkit, we introduced and tested a Virtual Data services component. For each major data transformation step identified in the ATLAS data processing pipeline (event generation, detector simulation, background pile-up and digitization, etc.), the Virtual Data Cookbook (VDC) catalogue encapsulates the specific data transformation knowledge and the validated parameter settings that must be provided before the data transformation is invoked. To provide local-remote transparency during DC1 production, the VDC database server delivered, in a controlled way, both the validated production parameters and the templated production recipes for thousands of event generation and detector simulation jobs around the world, simplifying the production management solutions. Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics (CHEP03), La Jolla, CA, USA, March 2003, 5 pages, 3 figures, pdf. PSN TUCP01
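
The idea of a catalogue that pairs a templated recipe with validated parameters for each transformation step might look something like the sketch below. It is a loose illustration under stated assumptions, not the ATLAS VDC: the `COOKBOOK` layout, the `render_job` helper, the command strings, and every parameter value are hypothetical.

```python
from string import Template

# Hypothetical catalogue: one entry per transformation step, holding a
# templated recipe and the parameters validated for that step.
COOKBOOK = {
    "event_generation": {
        "recipe": Template("generate --events $n_events --seed $seed --out $output"),
        "validated": {"n_events": 1000, "seed": 42},
    },
    "detector_simulation": {
        "recipe": Template("simulate --in $input --geometry $geometry --out $output"),
        "validated": {"geometry": "example-geometry"},
    },
}


def render_job(step, **job_params):
    """Fill a step's recipe with its validated parameters plus per-job values.

    Per-job values may not override validated parameters; a missing
    placeholder raises KeyError from Template.substitute.
    """
    entry = COOKBOOK[step]
    overlap = entry["validated"].keys() & job_params.keys()
    if overlap:
        raise ValueError(f"cannot override validated parameters: {sorted(overlap)}")
    return entry["recipe"].substitute({**entry["validated"], **job_params})


print(render_job("event_generation", output="evgen.0001.out"))
```

Keeping the validated values in the catalogue and refusing overrides means a remote site only supplies per-job details such as file names, which is one way to read the "controlled" delivery the abstract describes.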

    A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing

    Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication and resource allocation and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems not only to validate the taxonomy but also to identify areas for future exploration. Through this taxonomy, we aim to categorise existing systems to better understand their goals and their methodology. This would help evaluate their applicability for solving similar problems. This taxonomy also provides a "gap analysis" of this area through which researchers can potentially identify new issues for investigation. Finally, we hope that the proposed taxonomy and mapping also helps to provide an easy way for new practitioners to understand this complex area of research. Comment: 46 pages, 16 figures, Technical Repor