Experiences with deploying legacy code applications as grid services using GEMLCA
One of the biggest obstacles to the widespread industrial take-up of Grid technology is the existence of a large number of legacy code programs that are not accessible as Grid services. On top of that, Grid technology challenges users to intuitively interconnect and utilize resources in a friendly environment. This paper describes how legacy code applications were transformed into Grid services using GEMLCA, which provides a user-friendly, high-level Grid environment for deployment, and how they were run through the P-GRADE Grid portal. GEMLCA enables the use of legacy code programs as Grid services without modifying the original code. Using the P-GRADE Grid portal with GEMLCA, it is possible to deploy legacy code applications as Grid services and use them in the creation and execution of complex workflows. This environment was tested by deploying and executing several legacy code applications on different sites of the UK e-Science OGSA testbed. © Springer-Verlag Berlin Heidelberg 2005
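The core GEMLCA idea, exposing an unmodified binary behind a service-style interface, can be sketched as follows. This is only an illustration: the class and method names are invented for this sketch, not GEMLCA's actual API, and a plain subprocess call stands in for Grid job submission.

```python
import subprocess

class LegacyCodeService:
    """Expose a legacy command-line program through a service-style
    interface without modifying the original executable (the core
    GEMLCA idea; names here are illustrative, not GEMLCA's API)."""

    def __init__(self, executable, static_args=None):
        self.executable = executable
        self.static_args = list(static_args or [])

    def describe(self):
        # A real wrapper would publish an interface description
        # (inputs, outputs, environment) for the Grid service registry.
        return {"executable": self.executable, "args": self.static_args}

    def invoke(self, *job_args):
        # Each invocation runs the unmodified binary as a job and
        # returns its output as the service response.
        result = subprocess.run(
            [self.executable, *self.static_args, *job_args],
            capture_output=True, text=True, check=True,
        )
        return result.stdout

# Example: wrap the system's `echo` as a stand-in for a legacy binary.
svc = LegacyCodeService("echo")
output = svc.invoke("hello", "grid")  # "hello grid\n"
```

In the deployed system, the equivalent of `invoke` would submit the binary as a Grid job and the workflow engine would consume its outputs; the point of the pattern is that the legacy executable itself never changes.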
Legacy code support for production grids
In order to improve reliability and to cope with the high complexity of existing middleware solutions, today's production Grid systems restrict the services that may be deployed on their resources. On the other hand, end-users require a wide range of value-added services to fully utilize these resources. This paper describes how legacy code support can be offered as a third-party service for production Grids. The introduced solution, based on the Grid Execution Management for Legacy Code Architecture (GEMLCA), does not require the deployment of additional applications on the Grid resources, or any extra effort from Grid system administrators. The implemented solution was successfully connected to, and demonstrated on, the UK National Grid Service. © 2005 IEEE
Making distributed computing infrastructures interoperable and accessible for e-scientists at the level of computational workflows
As distributed computing infrastructures evolve, and as their take-up by user communities grows, making different types of infrastructures, based on a heterogeneous set of middleware, interoperable is becoming crucial. This PhD submission, based on twenty scientific publications, presents a unique solution to the challenge of the seamless interoperation of distributed computing infrastructures at the level of workflows.
The submission investigates workflow-level interoperation inside a particular workflow system (intra-workflow interoperation), and also between different workflow solutions (inter-workflow interoperation). In both cases the interoperation of workflow component execution and the feeding of data into these workflow components are considered.
The invented and developed framework enables the execution of legacy applications, grid jobs and services on multiple grid systems, the feeding of data from heterogeneous file and data storage solutions to these workflow components, and the embedding of non-native workflows into a hosting meta-workflow. Moreover, the solution provides a high-level user interface that enables e-scientist end-users to conveniently access the interoperable grid solutions without requiring them to study or understand the technical details of the underlying infrastructure. The candidate has also developed an application porting methodology that enables the systematic porting of applications to interoperable and interconnected grid infrastructures, and facilitates the exploitation of the above technical framework.
A Novel Mechanism for Gridification of Compiled Java Applications
Exploiting Grids intuitively requires developers to alter their applications, which calls for expertise in Grid programming. Gridification tools address this problem by semi-automatically making user applications Grid-aware. However, most of these tools produce monolithic Grid applications in which common tuning mechanisms (e.g. parallelism) are not applicable, and they do not reuse existing Grid middleware services. We propose BYG (BYtecode Gridifier), a gridification tool that relies on novel bytecode rewriting techniques to parallelize and easily execute existing applications via Grid middleware. Experiments performed with several compute-intensive applications on a cluster and a simulated wide-area Grid suggest that our techniques are effective while staying competitive with programmatically using such services for gridifying applications.
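The gridification idea, rewriting call sites so that an unmodified function is dispatched to parallel executors, can be illustrated at a high level. Note the hedge: BYG actually operates on compiled Java bytecode and dispatches to Grid middleware, whereas this Python sketch only shows the conceptual shape of the transformation, with invented names and a thread pool standing in for Grid resources.

```python
from concurrent.futures import ThreadPoolExecutor

def gridify(func, workers=4):
    """Return a drop-in parallel dispatcher for `func` without touching
    its source: a high-level analogue of BYG's bytecode rewriting
    (illustrative only; BYG rewrites Java bytecode and targets Grid
    middleware, not local threads)."""
    pool = ThreadPoolExecutor(max_workers=workers)

    def parallel(inputs):
        # Each element is still processed by the unmodified original
        # function; only the call site has been "rewritten" to fan
        # independent inputs out to parallel workers.
        return list(pool.map(func, inputs))

    return parallel

# Original, unmodified "application" code:
def heavy_compute(n):
    return sum(i * i for i in range(n))

grid_heavy = gridify(heavy_compute)
results = grid_heavy([10, 100, 1000])
```

The design point this mirrors is that the developer's code (`heavy_compute`) needs no Grid expertise; the tool inserts the parallel dispatch around it.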
Grid application meta-repository system
As one of the most popular forms of distributed computing technology, Grid brings together different scientific communities that are able to deploy, access, and run complex applications with the help of the enormous computational and storage power offered by the Grid infrastructure. However, as the number of Grid applications has been growing steadily in recent years, they are now stored on a multitude of different repositories, which remain specific to each Grid. At the time this research was carried out, no two well-known Grid application repositories shared the same structure, implementation, access technology and methods, communication protocols, security system, or application description language. This remained a great limitation for Grid users, who were bound to work on only one specific repository, and it also presented a significant limitation in terms of interoperability and inter-repository access. The research presented in this thesis provides a solution to this problem, as well as to several other related issues that were identified while investigating these areas of Grid.
Following a comprehensive review of existing Grid repository capabilities, I defined the main challenges that need to be addressed in order to make Grid repositories more versatile, and I proposed a solution that addresses these challenges. To this end, I designed a new Grid repository (GAMRS – Grid Application Meta-Repository System), which includes a novel model and architecture, an improved application description language and a matchmaking system. After implementing and testing this solution, I demonstrated that GAMRS marks an improvement in Grid repository systems. Its new features allow for the inter-connection of different Grid repositories; make applications stored on these repositories visible on the web; allow for the discovery of similar or identical applications stored in different Grid repositories; permit the exchange and re-use of applications and application-related objects between different repositories; and extend the use of applications stored on Grid repositories to other distributed environments, such as virtualized cluster-on-demand and cloud computing.
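A meta-repository matchmaker of the kind described above can be sketched in a few lines: applications from heterogeneous repositories are first normalised into a common description, and requests are then matched against those descriptions. The field names and matching rule below are invented for illustration and are not the GAMRS description language.

```python
def match_applications(request, repositories):
    """Toy matchmaker in the spirit of a meta-repository: given a
    uniform description of the needed application, find entries across
    heterogeneous repositories whose (normalised) descriptions satisfy
    the requested attributes. Field names are illustrative only."""
    matches = []
    for repo_name, entries in repositories.items():
        for entry in entries:
            # An entry matches when every requested attribute is
            # present with the requested value in its description.
            if all(entry.get(k) == v for k, v in request.items()):
                matches.append((repo_name, entry["name"]))
    return matches

# Two repositories already normalised into the common description format.
repos = {
    "repoA": [{"name": "blast", "domain": "bioinformatics", "arch": "x86_64"}],
    "repoB": [{"name": "blast-mpi", "domain": "bioinformatics", "arch": "x86_64"},
              {"name": "povray", "domain": "rendering", "arch": "x86_64"}],
}
found = match_applications({"domain": "bioinformatics"}, repos)
# found == [("repoA", "blast"), ("repoB", "blast-mpi")]
```

The same normalised descriptions are what make cross-repository discovery of similar or identical applications possible: once both `blast` entries share a description format, they can be compared at all.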
Interoperability of heterogeneous large-scale scientific workflows and data resources
Workflows allow e-Scientists to express their experimental processes in a structured way and provide the glue to integrate remote applications. Since the Grid provides an enormously large amount of data and computational resources, executing workflows on the Grid results in significant performance improvement. Several workflow management systems, widely used by different scientific communities, were developed for various purposes, and they therefore differ in several aspects.
This thesis outlines two major problems of existing workflow systems: workflow interoperability and data access. On the one hand, existing workflow systems are based on different technologies, so achieving interoperability between their workflows at any level is a challenging task. Despite a clear demand for interoperable workflows (for example, to enable scientists to share workflows, to leverage the existing work of others, and to create multi-disciplinary workflows), only limited, ad-hoc workflow interoperability solutions are currently available to scientists. Existing solutions realise workflow interoperability between only a small set of workflow systems, and do not consider the performance issues that arise in the case of large-scale (computational and/or data-intensive) scientific workflows. Scientific workflows are typically computation- and/or data-intensive and are executed in a distributed environment to speed up their execution, so their performance is a key issue; in most scenarios, existing interoperability solutions create a communication bottleneck between workflows, dramatically increasing execution time.
On the other hand, many scientific computational experiments are based on data that reside in data resources of different types and from different vendors. Many workflow systems support access to only limited subsets of such data resources, preventing data-level workflow interoperation between different systems. There is therefore a demand for a general solution that provides access to a wide range of data resources of different types and vendors. If such a solution is general, in the sense that it can be adopted by several workflow systems, then it also enables workflows of different systems to access the same data resources and therefore to interoperate at the data level. Note that data semantics are out of the scope of this work. For the same reasons as described above, the performance characteristics of such a solution are inevitably important. Although there are solutions which, in terms of functionality, could be adopted by workflow systems for this purpose, they provide poor performance; for that reason, they have not gained wide acceptance in the scientific workflow community.
Addressing these issues, a set of architectures is proposed to realise heterogeneous data access and heterogeneous workflow execution solutions. The primary goal was to investigate how such solutions can be implemented and integrated with workflow systems. The secondary aim was to analyse how such solutions can be implemented and utilised by single applications.
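The shape of a general data-access solution, a single interface that hides the type and vendor of the underlying data resource from the workflow component, can be sketched as follows. The class and method names are invented for this sketch; real resource types would be databases, file stores and similar, not the in-memory stand-ins used here.

```python
class DataResource:
    """Common interface a workflow engine programs against, hiding
    resource-specific access details (the general idea behind a shared
    data-access layer; names here are illustrative)."""
    def read(self, key):
        raise NotImplementedError

class TableResource(DataResource):
    # Stand-in for one resource type (e.g. a relational database).
    def __init__(self, rows):
        self.rows = rows
    def read(self, key):
        return self.rows[key]

class LineResource(DataResource):
    # Stand-in for another type (e.g. a flat-file store): a different
    # backend behind the same interface, so workflows of different
    # systems can feed from the same resources.
    def __init__(self, lines):
        self.lines = lines
    def read(self, key):
        return self.lines[int(key)]

def feed_component(resource: DataResource, key):
    # A workflow component only ever sees the uniform interface.
    return resource.read(key)
```

Because every workflow system that adopts the interface sees the same `read` contract, two different systems can consume the same resource, which is exactly the data-level interoperation argued for above; the thesis's point is that such a layer must also perform well, which this sketch does not address.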
Foundations of efficient virtual appliance based service deployments
The use of virtual appliances could provide a flexible solution to service deployment. However, such solutions suffer from several disadvantages: (i) the slow deployment time of services in virtual machines, and (ii) virtual appliances crafted by developers tend to be inefficient for deployment purposes. Researchers target problem (i) by advancing virtualization technologies or by introducing virtual appliance caches on the virtual machine monitor hosts. Others aim at problem (ii) by providing solutions for virtual appliance construction; however, these solutions require deep knowledge of the service's dependencies and its deployment process.
This dissertation addresses problem (i) with a virtual appliance distribution technique that first identifies appliance parts and their internal dependencies, and then, based on service demand, efficiently distributes the identified parts to virtual appliance repositories. Problem (ii) is targeted with the Automated Virtual appliance creation Service (AVS), which can extract and publish a service already deployed by the developer. The resulting virtual appliance is optimized for service deployment time with the proposed virtual appliance optimization facility, which utilizes active fault injection to remove the non-functional parts of the appliance. Finally, the investigation of appliance distribution and optimization techniques resulted in the definition of the minimal manageable virtual appliance, which is capable of updating and configuring its executor virtual machine.
The deployment-time reduction capabilities of the proposed techniques were measured with several services provided in virtual appliances on three cloud infrastructures. The appliance creation capabilities of the AVS were compared to the virtual appliances already offered by various online appliance repositories. The results reveal that the introduced techniques significantly decrease the deployment time of virtual appliance based deployment systems, alleviating one of the major obstacles to their adoption.
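The reduction loop behind optimisation by active fault injection can be sketched as a greedy removal procedure: tentatively remove a part (inject a "fault"), validate the service, and keep the removal whenever the service still functions. This is only the loop's shape; the actual AVS facility operates on real appliance contents and service validators, and the names below are invented for illustration.

```python
def minimise_appliance(parts, service_works):
    """Greedy sketch of appliance optimisation by active fault
    injection: try removing each part and keep the removal whenever
    the validator reports the service still functions. Illustrative
    only; the real facility works on virtual appliance contents."""
    kept = set(parts)
    for part in list(parts):
        candidate = kept - {part}
        if service_works(candidate):
            # The service survives without this part, so it is
            # non-functional baggage and can be dropped.
            kept = candidate
    return kept

# Toy validator: the service needs its binary, library and config.
REQUIRED = {"service-binary", "libc", "config"}
appliance = {"service-binary", "libc", "config", "docs", "compiler", "test-data"}
minimal = minimise_appliance(appliance, lambda parts: REQUIRED <= parts)
# minimal == {"service-binary", "libc", "config"}
```

In practice the validator is the expensive step (deploying and exercising the service after each injected fault), which is why the resulting minimal appliance is computed once and then reused for fast deployments.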