234 research outputs found

D.2.1.1 Consolidated Requirements Report, Roadmap and SIMDAT Infrastructure Design

Author: Boniface M. J.
Ferris J.
Kruger M.
Surridge M.
Tsasakou S. K.
Publication venue: s.n.
Publication date: 01/04/2005

Proof-of-Concept Application - Annual Report Year 2

Author: Ardaiz Oscar
Chacin Pablo
Chao Isaac
Cruellas Juan Carlos
Eymann Torsten
Freitag Felix
Joita Liviu
Medina Manuel
Navarro Leandro
Rana Omer F.
Valero Miguel

This document first gives an introduction to Application Layer Networks and subsequently presents the catallactic resource allocation model and its integration into the middleware architecture of the developed prototype. Furthermore, use cases for the service models employed in such scenarios are presented, both as general application scenarios and as two detailed cases: Query services and Data Mining services. This work concludes by describing the middleware implementation and evaluation as well as future work in this area. --Grid Computing

Research Papers in Economics

Distributed data mining in grid computing environments

Author: Cannataro
Cannataro
Fernandez-Baca
Kevin Lü
Kwok
Ping Luo
Qing He
Zhongzhi Shi
Publication venue: Elsevier BV
Publication date: 01/01/2007

The computing-intensive data mining for inherently Internet-wide distributed data, referred to as Distributed Data Mining (DDM), calls for the support of a powerful Grid with an effective scheduling framework. DDM often shares the computing paradigm of local processing and global synthesizing. It involves every phase of Data Mining (DM) processes, which makes the workflow of DDM very complex, so that it can be modelled only by a Directed Acyclic Graph (DAG) with multiple data entries. Motivated by the need for a practical solution of the Grid scheduling problem for the DDM workflow, this paper proposes a novel two-phase scheduling framework, including External Scheduling and Internal Scheduling, on a two-level Grid architecture (InterGrid, IntraGrid). Currently a DM IntraGrid, named DMGCE (Data Mining Grid Computing Environment), has been developed with a dynamic scheduling framework for competitive DAGs in a heterogeneous computing environment. This system is implemented in an established Multi-Agent System (MAS) environment, in which the reuse of existing DM algorithms is achieved by encapsulating them into agents. Practical classification problems from oil well logging analysis are used to measure the system performance. The detailed experiment procedure and result analysis are also discussed in this paper.

Crossref

Brunel University Research Archive

Peer-to-Peer Metadata Management for Knowledge Discovery Applications in Grids

Author: Antoniu Gabriel
Congiusta Antonio
Monnet Sébastien
Talia Domenico
Trunfio Paolo
Publication venue: Springer Science and Business Media LLC
Publication date: 01/06/2007

Computational Grids are powerful platforms gathering computational power and storage space from thousands of geographically distributed resources. The applications running on such platforms need to efficiently and reliably access the various and heterogeneous distributed resources they offer. This can be achieved by using metadata information describing all available resources. It is therefore crucial to provide efficient metadata management architectures and frameworks. In this paper we describe the design of a Grid metadata management service. We focus on a particular use case: the Knowledge Grid architecture which provides high-level Grid services for distributed knowledge discovery applications. Taking advantage of an existing Grid data-sharing service, namely JuxMem, the proposed solution lies at the border between peer-to-peer systems and Web services.

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1

Architecture of the Grid Services Toolkit for Process Data Processing

Author: Gemmeke H.
Hartmann V.
Jejkal T.
Müller T.
Stotzka R.
Sutter M.
Publication date: 02/05/2007

Grid is a rapidly growing new technology that will provide easy access to huge amounts of computer resources, both hardware and software. As these resources will soon become available, more and more scientific users are interested in benefiting from them. At present the main problem in accessing the Grid is that scientific users usually need extensive knowledge of Grid methods and technologies in addition to their own field of research. To fill the gap between high-level scientific Grid users and low-level functions in Grid environments, the Grid Services Toolkit (GST) is being developed at the IPE. Aiming to simplify and accelerate the development of parallelized scientific Grid applications, the GST is based on Web services extended by a rich client API. It is especially designed for the field of process data processing, providing database access and management, common methods of statistical data analysis, and project-specific methods.

MPG.PuRe

Grid-based semantic integration of heterogeneous data resources: Implementation on a HealthGrid

Author: Naseer Aisha
Publication venue: Brunel University, School of Information Systems, Computing and Mathematics
Publication date: 01/01/2007

This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel University. The semantic integration of geographically distributed and heterogeneous data resources still remains a key challenge in Grid infrastructures. Today's mainstream Grid technologies hold the promise to meet this challenge in a systematic manner, making data applications more scalable and manageable. The thesis conducts a thorough investigation of the problem, the state of the art, and the related technologies, and proposes an Architecture for Semantic Integration of Data Sources (ASIDS) addressing the semantic heterogeneity issue. It defines a simple mechanism for the interoperability of heterogeneous data sources in order to extract or discover information regardless of their different semantics. The constituent technologies of this architecture include Globus Toolkit (GT4) and OGSA-DAI (Open Grid Services Architecture Data Access and Integration) alongside other web services technologies such as XML (Extensible Markup Language). To show this, the ASIDS architecture was implemented and tested in a realistic setting by building an exemplar application prototype on a HealthGrid (pilot implementation). The study followed an empirical research methodology and was informed by extensive literature surveys and a critical analysis of the relevant technologies and their synergies. The two literature reviews, together with the analysis of the technology background, have provided a good overview of the current Grid and HealthGrid landscape, produced some valuable taxonomies, explored new paths by integrating technologies, and more importantly illuminated the problem and guided the research process towards a promising solution. Yet the primary contribution of this research is an approach that uses contemporary Grid technologies for integrating heterogeneous data resources that have semantically different data fields (attributes). It has been practically demonstrated (using a prototype HealthGrid) that discovery in semantically integrated distributed data sources is feasible using mainstream Grid technologies, which have been shown to have some significant advantages over non-Grid based approaches.

Brunel University Research Archive

Concurrent software architectures for exploratory data analysis

Author: Demsar Janez
Staric Anze
Zupan Blaz
Publication date: 01/01/2015

Decades ago, the increased volume of data made manual analysis obsolete and prompted the use of computational tools with interactive user interfaces and a rich palette of data visualizations. Yet their classic, desktop-based architectures can no longer cope with the ever-growing size and complexity of data. Next-generation systems for explorative data analysis will be developed on client–server architectures, which already run concurrent software for data analytics but are not tailored for engaged, interactive analysis of data and models. In explorative data analysis, the key is the responsiveness of the system and prompt construction of interactive visualizations that can guide the users to uncover interesting data patterns. In this study, we review the current software architectures for distributed data analysis and propose a list of features to be included in the next generation frameworks for exploratory data analysis. The new generation of tools for explorative data analysis will need to address integrated data storage and processing, fast prototyping of data analysis pipelines supported by machine-proposed analysis workflows, preemptive analysis of data, interactivity, and user interfaces for intelligent data visualizations. The systems will rely on a mixture of concurrent software architectures to meet the challenge of seamless integration of explorative data interfaces on the client side with management of concurrent data mining procedures on the servers.

Repository of the University of Ljubljana

ePrints.FRI