    A Web services data analysis Grid

    The trend in large-scale scientific data analysis is to exploit compute, storage and other resources located at multiple sites, and to make those resources accessible to the scientist as if they were a single, coherent system. Web technologies driven by the huge and rapidly growing electronic commerce industry provide valuable components to speed the deployment of such sophisticated systems. Jefferson Lab, where several hundred terabytes of experimental data are acquired each year, is in the process of developing a web-based distributed system for data analysis and management. The essential aspects of this system are a distributed data grid (site-independent access to experiment, simulation and model data) and a distributed batch system, augmented with various supervisory and management capabilities, and integrated using Java and XML-based web services.
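
    A minimal sketch of the kind of site-independent lookup such a data grid exposes: a client posts an XML query naming a logical dataset and receives replica locations in return. The endpoint URL, the XML vocabulary and the dataset name below are invented for illustration and are not Jefferson Lab's published interface.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/**
 * Hypothetical client for a data-grid catalog web service: the caller names a
 * logical dataset and the service replies with site-independent replica locations.
 * Endpoint and XML vocabulary are illustrative only.
 */
public class DataGridLookup {
    private static final String CATALOG_URL = "https://datagrid.example.org/catalog"; // placeholder endpoint

    public static void main(String[] args) throws Exception {
        String query = """
                <replicaQuery>
                  <logicalName>exp/clas12/run-010234/raw.evio</logicalName>
                  <preferredSite>ANY</preferredSite>
                </replicaQuery>""";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(CATALOG_URL))
                .header("Content-Type", "text/xml")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();

        // The service would answer with an XML document listing physical replicas;
        // here we simply print whatever comes back.
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("Catalog answered: " + response.body());
    }
}
```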

    Resource Management Services for a Grid Analysis Environment

    Selecting optimal resources for submitting jobs to a computational Grid, or for accessing data on a data grid, is one of the most important tasks of any Grid middleware. Most modern Grid software discharges this responsibility automatically with best-effort performance: almost all scheduling and data-access decisions are made by the software, giving users little or no control over the process. A more interactive set of services and middleware is therefore desirable, one that provides users with more information about Grid weather and gives them more control over the decision-making process. This paper presents a set of services that have been developed to provide more interactive resource management capabilities within the Grid Analysis Environment (GAE), being developed collaboratively by Caltech, NUST and several other institutes. These include a steering service, a job monitoring service and an estimator service, designed and written using a common Grid-enabled Web services framework named Clarens. The paper also presents a performance analysis of the developed services, showing that they have indeed resulted in a more interactive and powerful system for user-centric, Grid-enabled physics analysis.

    Comment: 8 pages, 7 figures. Workshop on Web and Grid Services for Scientific Data Analysis at the Int. Conf. on Parallel Processing (ICPP05), Norway, June 2005.
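
    A small sketch of the interaction pattern these three services enable, with plain Java interfaces standing in for the remote Clarens-hosted services; the interface names, stub behaviour and runtime threshold are assumptions, not the GAE API. The point is that the user, informed by the monitor and the estimator, steers the job instead of leaving placement entirely to the middleware.

```java
import java.time.Duration;

/** Hypothetical shapes of the three GAE-style services; names are illustrative,
 *  not the Clarens interfaces described in the paper. */
interface EstimatorService  { Duration estimateRuntime(String jobId, String site); }
interface MonitoringService { String status(String jobId); }
interface SteeringService   { void moveJob(String jobId, String targetSite); }

public class InteractiveAnalysisSession {
    public static void main(String[] args) {
        // Stub implementations stand in for remote web-service calls.
        EstimatorService estimator = (job, site) -> site.equals("siteA")
                ? Duration.ofHours(6) : Duration.ofHours(2);
        MonitoringService monitor = job -> "QUEUED";
        SteeringService steering = (job, site) ->
                System.out.println("steering " + job + " to " + site);

        String jobId = "higgs-analysis-42";
        // The user inspects "Grid weather" and intervenes instead of trusting
        // the middleware's automatic placement.
        if (monitor.status(jobId).equals("QUEUED")
                && estimator.estimateRuntime(jobId, "siteA")
                            .compareTo(Duration.ofHours(3)) > 0) {
            steering.moveJob(jobId, "siteB");
        }
    }
}
```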

    Provenance-based trust for grid computing: Position Paper

    Current evolutions of Internet technology such as Web Services, ebXML, peer-to-peer and Grid computing all point to the development of large-scale open networks of diverse computing systems interacting with one another to perform tasks. Grid systems (and Web Services) are exemplary in this respect and are perhaps among the first large-scale open computing systems to see widespread use, making them an important testing ground for the trust-management problems that are likely to arise. From this perspective, today's grid architectures suffer from limitations, such as the lack of a mechanism to trace results and the lack of infrastructure for building up trust networks. These are important concerns in open grids, in which "community resources" are owned and managed by multiple stakeholders and are dynamically organised in virtual organisations. Provenance enables users to trace how a particular result was arrived at by identifying the individual services, and the aggregation of services, that produced it. Against this background, we present a research agenda to design, conceive and implement an industrial-strength open provenance architecture for grid systems. We motivate its use with three complex grid applications, namely aerospace engineering, organ transplant management and bioinformatics. Industrial-strength provenance support includes a scalable and secure architecture, an open proposal for standardising the protocols and data structures, a set of tools for configuring and using the provenance architecture, an open-source reference implementation, and a deployment and validation in an industrial context. The provision of such facilities will enrich grid capabilities with new functionality required for solving complex problems, such as provenance data that provides complete audit trails of process execution and supports third-party analysis and auditing. As a result, we anticipate a larger uptake of grid technology, since unprecedented possibilities will be offered to users and will give them a competitive edge.
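
    A minimal sketch of the idea of provenance-based tracing: each service invocation is recorded as an assertion linking inputs to an output, and an audit walks those links backwards from a result. The record shape and the organ-transplant-flavoured example data are assumptions for illustration, not the proposed open provenance architecture.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;

/** Minimal sketch of provenance records and a backward trace; the record fields
 *  and linkage are assumptions, not the paper's architecture. */
public class ProvenanceTrace {
    /** One assertion: a service produced an output from some inputs. */
    record Assertion(String service, List<String> inputs, String output) {}

    public static void main(String[] args) {
        Map<String, Assertion> byOutput = Map.of(
                "matchList", new Assertion("donorMatcher", List.of("donorRecord", "recipientRecord"), "matchList"),
                "donorRecord", new Assertion("hospitalRegistry", List.of("rawDonorForm"), "donorRecord"),
                "recipientRecord", new Assertion("hospitalRegistry", List.of("rawRecipientForm"), "recipientRecord"));

        // Walk backwards from a result to every assertion that contributed to it:
        // this is the audit trail that lets a third party check how "matchList" arose.
        Deque<String> pending = new ArrayDeque<>(List.of("matchList"));
        while (!pending.isEmpty()) {
            Assertion a = byOutput.get(pending.pop());
            if (a == null) continue;                 // raw input, nothing recorded upstream
            System.out.println(a.output() + " <- " + a.service() + " " + a.inputs());
            pending.addAll(a.inputs());
        }
    }
}
```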

    Architecture of the Grid Services Toolkit for Process Data Processing

    The Grid is a rapidly growing technology that promises easy access to huge amounts of computing resources, both hardware and software. As these resources become available, more and more scientific users are interested in benefiting from them. At present, the main obstacle to accessing the Grid is that scientific users typically need substantial knowledge of Grid methods and technologies in addition to their own field of research. To fill the gap between high-level scientific Grid users and low-level functions in Grid environments, the Grid Services Toolkit (GST) is being developed at the IPE. Aimed at simplifying and accelerating the development of parallelized scientific Grid applications, the GST is based on Web services extended by a rich client API. It is designed especially for the field of process data processing, providing database access and management, common methods of statistical data analysis, and project-specific methods.
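
    A sketch of the sort of high-level client API such a toolkit could offer, so that a scientist requests a statistic by dataset name without touching low-level Grid functions. The class name, dataset identifier and the local stub that replaces the remote Web service call are all invented for illustration.

```java
import java.util.List;

/** Sketch of a high-level client API in the spirit of the GST: the scientist asks
 *  for a statistic by dataset name and never sees the grid plumbing. Names are
 *  invented; this is not the toolkit's actual API. */
public class ProcessDataClient {

    /** In the real toolkit this would delegate to a remote Web service;
     *  here a local stub returns a few sample values directly. */
    List<Double> fetchSeries(String datasetId) {
        return List.of(4.1, 3.9, 4.4, 4.0, 4.2);
    }

    double mean(String datasetId) {
        return fetchSeries(datasetId).stream()
                .mapToDouble(Double::doubleValue).average().orElse(Double.NaN);
    }

    double standardDeviation(String datasetId) {
        List<Double> xs = fetchSeries(datasetId);
        double m = mean(datasetId);
        double var = xs.stream().mapToDouble(x -> (x - m) * (x - m)).sum() / (xs.size() - 1);
        return Math.sqrt(var);
    }

    public static void main(String[] args) {
        ProcessDataClient client = new ProcessDataClient();
        System.out.printf("mean=%.3f stddev=%.3f%n",
                client.mean("reactor/loop-7/temperature"),
                client.standardDeviation("reactor/loop-7/temperature"));
    }
}
```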

    GEODE – Sharing Occupational Data Through The Grid

    The ESRC-funded Grid Enabled Occupational Data Environment (GEODE) project is conceived to facilitate and virtualise occupational data access through a grid environment. Through GEODE it is planned that occupational data users from the social sciences can access curated datasets, share micro datasets, and perform statistical analysis within a secure virtual community. The Michigan Data Documentation Initiative (DDI) is used to annotate the datasets with social-science-specific metadata to provide better semantics and indexes. GEODE uses the Globus Toolkit and the Open Grid Services Architecture – Data Access and Integration (OGSA-DAI) software as the Grid middleware to provide data access and integration. Users access and use occupational datasets through a GEODE web portal, which interfaces with the Grid infrastructure and provides user-oriented data searches and services. The processing of CAMSIS (Cambridge Social Interaction and Stratification) measures is used as an illustrative example of how GEODE provides services for linking occupational information. This paper provides an overview of the GEODE work and lessons learned in applying Grid technologies in this domain.
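
    A toy sketch of the occupational-linking step described above: survey micro-data carrying occupation codes are joined against a CAMSIS-style lookup table. The codes and scale values are made up and the join is local, whereas GEODE performs this kind of linking through its portal and OGSA-DAI services.

```java
import java.util.List;
import java.util.Map;

/** Sketch of occupational linking: respondents with occupation codes are joined
 *  against a CAMSIS-style scale. Codes and scores are invented for illustration. */
public class OccupationLinker {
    record Respondent(String id, String occupationCode) {}

    public static void main(String[] args) {
        Map<String, Double> camsisScale = Map.of(
                "2211", 78.4,   // hypothetical score for one occupational group
                "9112", 31.2);  // hypothetical score for another

        List<Respondent> survey = List.of(
                new Respondent("r001", "2211"),
                new Respondent("r002", "9112"),
                new Respondent("r003", "9999"));   // unmatched code

        for (Respondent r : survey) {
            Double score = camsisScale.get(r.occupationCode());
            System.out.println(r.id() + " -> "
                    + (score == null ? "no CAMSIS score (flag for curation)" : score));
        }
    }
}
```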

    A catallactic market for data mining services.

    We describe a Grid market for exchanging data mining services based on the Catallactic market mechanism proposed by von Hayek. This market mechanism allows selection between multiple instances of services based on the operations required in a data mining task (such as data migration, data pre-processing and, subsequently, data analysis). Catallaxy is a decentralized, "free market" approach that is particularly useful when the number of market participants is large or when conditions within the market change often; it is therefore particularly suitable for Grid and peer-to-peer systems. The approach assumes that the service provider and user are not co-located and that multiple message exchanges are required to carry out a data mining task. A market of J48-based decision tree algorithm instances, each implemented as a Web service, is used to demonstrate our approach. We have validated the feasibility of building catallactic data mining grid applications, and implemented a proof-of-concept application (Cat-COVITE) mapped to a Catallactic Grid Middleware.
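
    A toy round of catallactic selection: several instances of the same data mining service quote prices and a consumer with a budget accepts the best affordable offer, with no central scheduler involved. The provider names, prices and budget are invented; this shows the bargaining idea only, not the Cat-COVITE middleware.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Toy selection round in a catallactic market: providers of the same service
 *  quote prices, the consumer picks the cheapest offer within its budget. */
public class CatallacticRound {
    record Offer(String provider, double price) {}

    public static void main(String[] args) {
        double budget = 5.0;
        List<Offer> offers = List.of(
                new Offer("j48-instance-A", 6.5),
                new Offer("j48-instance-B", 4.2),
                new Offer("j48-instance-C", 4.9));

        // Decentralised selection: no central scheduler, just the cheapest offer
        // that the consumer's budget can meet.
        Optional<Offer> accepted = offers.stream()
                .filter(o -> o.price() <= budget)
                .min(Comparator.comparingDouble(Offer::price));

        System.out.println(accepted
                .map(o -> "accepted " + o.provider() + " at " + o.price())
                .orElse("no agreement this round; consumer raises its bid next round"));
    }
}
```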

    Image processing methods and architectures in diagnostic pathology.

    Grid technology has enabled the clustering of, and efficient and secure access to and interaction among, a wide variety of geographically distributed resources, such as supercomputers, storage systems, data sources, instruments, and special devices and services. Its main applications include large-scale computational and data-intensive problems in science and engineering. This paper considers general grid structures and methodologies, for both software and hardware, for image analysis in virtual tissue-based diagnosis, focusing on the user-level middleware. The article describes the distributed programming system developed by the authors for virtual slide analysis in diagnostic pathology. The system supports various image analysis operations commonly performed in anatomical pathology, and it takes into account security aspects and specialized infrastructures with high-level services designed to meet application requirements. Grids are likely to have a deep impact on health-related applications, and therefore appear suitable for tissue-based diagnosis as well. The implemented system is a joint application that combines Web and Grid service architectures around a distributed architecture for image processing. It has proven to be a successful solution for analysing a large and heterogeneous set of histological images on a massively parallel processor architecture using message passing and non-shared memory.
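
    A sketch of the tile-parallel pattern behind virtual slide analysis: the very large image is split into tiles and each tile is analysed independently, with results reduced at the end. A local thread pool stands in for the paper's message-passing workers on non-shared memory, and the tile count and per-tile "analysis" are placeholders.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Sketch of tile-parallel virtual-slide analysis; a thread pool stands in for
 *  distributed message-passing workers, and the per-tile work is a placeholder. */
public class SlideTileFarm {
    record TileResult(int row, int col, long stainedPixels) {}

    public static void main(String[] args) throws Exception {
        int rows = 4, cols = 4;                       // a real slide would yield thousands of tiles
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<TileResult>> pending = new ArrayList<>();

        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                final int row = r, col = c;
                Callable<TileResult> task = () ->
                        new TileResult(row, col, (long) (Math.random() * 1000)); // placeholder analysis
                pending.add(pool.submit(task));
            }
        }

        long total = 0;
        for (Future<TileResult> f : pending) total += f.get().stainedPixels();
        System.out.println("stained pixels over all tiles: " + total);
        pool.shutdown();
    }
}
```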

    Duomenų tyrybos sistemos, pagrįstos saityno paslaugomis (Data Mining Systems Based on Web Services)

    The paper analyses data mining systems based on web services. The main notions related to web services are defined, and the possibilities of distributed data mining and the tools for implementing it, Grid and Hadoop, are introduced. An analytical review of data mining systems based on web services is given, and criteria for comparing such systems are selected. According to these criteria, a comparative analysis of the most popular web-service-based data mining systems is carried out, identifying which systems are rated best and which fail to satisfy most of the criteria. (Olga Kurasova, Virginijus Marcinkevičius, Viktor Medvedev, Aurimas Rapečka)
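
    A small sketch of a criteria-based comparison like the one the survey performs: each system receives a score per criterion and a weighted sum determines the ranking. The system names, criteria, weights and scores below are placeholders, not the paper's actual evaluation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of a weighted criteria comparison; all names, weights and scores are
 *  placeholders for illustration. */
public class CriteriaComparison {
    public static void main(String[] args) {
        Map<String, Double> weights = Map.of(
                "distributed execution", 0.4,
                "web-service interface", 0.3,
                "algorithm coverage", 0.3);

        Map<String, Map<String, Double>> scores = new LinkedHashMap<>();
        scores.put("SystemA", Map.of("distributed execution", 0.9, "web-service interface", 0.7, "algorithm coverage", 0.6));
        scores.put("SystemB", Map.of("distributed execution", 0.4, "web-service interface", 0.9, "algorithm coverage", 0.8));

        // Weighted total per system decides the ranking.
        scores.forEach((system, perCriterion) -> {
            double total = weights.entrySet().stream()
                    .mapToDouble(w -> w.getValue() * perCriterion.getOrDefault(w.getKey(), 0.0))
                    .sum();
            System.out.printf("%s: %.2f%n", system, total);
        });
    }
}
```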