A Taxonomy of Workflow Management Systems for Grid Computing

Author: Buyya Rajkumar
Yu Jia
Publication date: 01/01/2005

With the advent of Grid and application technologies, scientists and engineers are building more and more complex applications to manage and process large data sets, and execute scientific experiments on distributed resources. Such application scenarios require means for composing and executing complex workflows. Therefore, many efforts have been made towards the development of workflow management systems for Grid computing. In this paper, we propose a taxonomy that characterizes and classifies various approaches for building and executing workflows on Grids. We also survey several representative Grid workflow systems developed by various projects world-wide to demonstrate the comprehensiveness of the taxonomy. The taxonomy not only highlights the design and engineering similarities and differences of state-of-the-art Grid workflow systems, but also identifies the areas that need further research. Comment: 29 pages, 15 figures

arXiv.org e-Print Archive

CiteSeerX

Físchlár-DiamondTouch: collaborative video searching on a table

Author: Foley Colum
Gurrin Cathal
Lee Hyowon
McGivney Sinéad
Smeaton Alan F.
Publication venue: 'SPIE-Intl Soc Optical Eng'
Publication date: 01/01/2006

In this paper we present Físchlár-DT, the system we developed for our participation in the interactive search task of TRECVid 2005, part of the annual TRECVid benchmarking activity. Our back-end search engine combines a text search, which operates over text produced by automatic speech recognition, with an image search, which matches low-level image features against video keyframes. The two novel aspects of our work are that we evaluate collaborative, team-based search among groups of users working together, and that we use a novel touch-sensitive tabletop interface and interaction device known as the DiamondTouch to support this collaborative search. The paper summarises the back-end search systems and presents the interface we have developed in detail.

DCU Online Research Access Service

Grid-Brick Event Processing Framework in GEPS

Author: Almeida Nuno
Amorim Antonio
Fei Han
Pedro Luis
Trezentos Paulo
Villate Jaime E.
Publication date: 14/06/2003

Experiments like ATLAS at LHC involve a scale of computing and data management that greatly exceeds the capability of existing systems, making it necessary to resort to Grid-based Parallel Event Processing Systems (GEPS). Traditional Grid systems concentrate the data in central data servers, which have to be accessed by many nodes each time an analysis or processing job starts. These systems require very powerful central data servers and make little use of the distributed disk space that is available in commodity computers. The Grid-Brick system, which is described in this paper, follows a different approach. The data storage is split among all grid nodes, with each node holding a piece of the whole information. Users submit queries, and the system distributes the tasks across all the nodes and retrieves the results, merging them together in the Job Submit Server. The main advantage of this system is the huge scalability it provides, while its biggest disadvantage appears when one of the nodes fails. A workaround for this problem involves data replication or backup. Comment: 6 pages; document for CHEP'03 conference
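The scatter-and-merge pattern described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the Grid-Brick implementation: the node names, the event records, and the functions `run_on_node` and `submit_query` are all hypothetical, and threads stand in for separate grid nodes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each "brick" is the slice of events stored on one node.
BRICKS = {
    "node-a": [{"id": 1, "energy": 12.5}, {"id": 2, "energy": 48.0}],
    "node-b": [{"id": 3, "energy": 77.1}, {"id": 4, "energy": 5.3}],
    "node-c": [{"id": 5, "energy": 51.9}],
}


def run_on_node(node, predicate):
    """Simulate one node evaluating the query against its local brick only."""
    return [event for event in BRICKS[node] if predicate(event)]


def submit_query(predicate):
    """Stand-in for the Job Submit Server: scatter the query, then merge."""
    nodes = sorted(BRICKS)
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        # One task per node; each task touches only that node's data.
        partial_results = pool.map(lambda n: run_on_node(n, predicate), nodes)
        merged = [event for part in partial_results for event in part]
    return sorted(merged, key=lambda event: event["id"])


selected = submit_query(lambda event: event["energy"] > 50.0)
print([event["id"] for event in selected])  # prints [3, 5]
```

In this sketch, removing one entry from `BRICKS` silently drops that node's events from every result, which mirrors the node-failure weakness the abstract mentions and why replication or backup is proposed as a workaround.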

arXiv.org e-Print Archive

CERN Document Server

Experiments on domain adaptation for English-Hindi SMT

Author: Haque Rejwanul
Naskar Sudip Kumar
van Genabith Josef
Way Andy
Publication date: 01/01/2009

Statistical Machine Translation (SMT) systems are usually trained on large amounts of bilingual text and monolingual target language text. If a significant amount of out-of-domain data is added to the training data, the quality of translation can drop. On the other hand, training an SMT system on only a small amount of in-domain material leads to narrow lexical coverage, which again results in low translation quality. In this paper, (i) we explore domain-adaptation techniques to combine large out-of-domain training data with small-scale in-domain training data for English-Hindi statistical machine translation and (ii) we cluster large out-of-domain training data to extract sentences similar to in-domain sentences and apply adaptation techniques to combine clustered sub-corpora with in-domain training data into a unified framework, achieving an absolute improvement of 0.44 BLEU points over the baseline, corresponding to a 4.03% relative improvement.
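The relationship between the absolute and relative BLEU figures quoted in this abstract can be checked with a short calculation. The abstract does not report the baseline score itself; the value below is only what the two quoted figures imply, and it is subject to rounding in the reported numbers. The variable names are illustrative.

```python
# Figures quoted in the abstract.
absolute_gain = 0.44     # BLEU points
relative_gain = 0.0403   # 4.03% expressed as a fraction

# relative_gain = absolute_gain / baseline, so the baseline follows directly.
# This is an inferred value, not one reported by the authors.
implied_baseline = absolute_gain / relative_gain
implied_adapted = implied_baseline + absolute_gain

print(f"implied baseline BLEU: {implied_baseline:.2f}")
print(f"implied adapted BLEU:  {implied_adapted:.2f}")
```

Under these assumptions the baseline works out to roughly 10.9 BLEU, which shows why a gain of under half a point still amounts to about a 4% relative improvement.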

CiteSeerX

Irish Universities

DCU Online Research Access Service

Towards using web-crawled data for domain adaptation in statistical machine translation

Author: Giagkou Maria
Papavassiliou Vassilis
Pecina Pavel
Prokopidis Prokopis
Toral Antonio
Way Andy
Publication date: 30/05/2011

This paper reports on ongoing work on domain adaptation of statistical machine translation using domain-specific data obtained by domain-focused web crawling. We present a strategy for crawling monolingual and parallel data and for exploiting them in testing, language modelling, and system tuning in a phrase-based machine translation framework. The proposed approach is evaluated on the domains of Natural Environment and Labour Legislation and on two language pairs: English–French and English–Greek.

DCU Online Research Access Service

Grid enabling legacy applications for scalability – Experiences of a production application on the UK NGS

Author: Fowler R
Pakhira A
Perring T
Sastry L
Publication date: 01/01/2005

ePubs: the open archive for STFC research publications

The Digital Puglia Project: An Active Digital Library of Remote Sensing Data

Author: Aloisio Giovanni
Cafaro Massimo
Williams Roy
Publication venue: 'California Institute of Technology Library'
Publication date: 01/01/1999

The growing need for software infrastructure able to create, maintain and ease the evolution of scientific data promotes the development of digital libraries that provide the user with fast and reliable access to data. In a world that is rapidly changing, the standard view of a digital library as a data repository specialized to a community of users and provided with some search tools is no longer tenable. To be effective, a digital library should be an active digital library, meaning that users can process available data not just to retrieve a particular piece of information, but to infer new knowledge about the data at hand. Digital Puglia is a new project, conceived to emphasize not only retrieval of data to the client's workstation, but also customized processing of the data. Such processing tasks may include data mining, filtering and knowledge discovery in huge databases, compute-intensive image processing (such as principal component analysis, supervised classification, or pattern matching) and on-demand computing sessions. We describe the issues, the requirements and the underlying technologies of the Digital Puglia Project, whose final goal is to build a high-performance, distributed and active digital library of remote sensing data.

CiteSeerX

Caltech Authors

Archivio Istituzionale della Ricerca- Università del Salento