2,254 research outputs found

    Designing an automated prototype tool for preservation quality metadata extraction for ingest into digital repository

    Get PDF
    We present a viable framework for the automated extraction of preservation quality metadata, which is adjusted to meet the needs of, ingest to digital repositories. It has three distinctive features: wide coverage, specialisation and emphasis on quality. Wide coverage is achieved through the use of a distributed system of tool repositories, which helps to implement it over a broad range of document object types. Specialisation is maintained through the selection of the most appropriate metadata extraction tool for each case based on the identification of the digital object genre. And quality is sustained by introducing control points at selected stages of the workflow of the system. The integration of these three features as components in the ingest of material into digital repositories is a defining step ahead in the current quest for improved management of digital resources

    Developing a comprehensive framework for multimodal feature extraction

    Full text link
    Feature extraction is a critical component of many applied data science workflows. In recent years, rapid advances in artificial intelligence and machine learning have led to an explosion of feature extraction tools and services that allow data scientists to cheaply and effectively annotate their data along a vast array of dimensions---ranging from detecting faces in images to analyzing the sentiment expressed in coherent text. Unfortunately, the proliferation of powerful feature extraction services has been mirrored by a corresponding expansion in the number of distinct interfaces to feature extraction services. In a world where nearly every new service has its own API, documentation, and/or client library, data scientists who need to combine diverse features obtained from multiple sources are often forced to write and maintain ever more elaborate feature extraction pipelines. To address this challenge, we introduce a new open-source framework for comprehensive multimodal feature extraction. Pliers is an open-source Python package that supports standardized annotation of diverse data types (video, images, audio, and text), and is expressly with both ease-of-use and extensibility in mind. Users can apply a wide range of pre-existing feature extraction tools to their data in just a few lines of Python code, and can also easily add their own custom extractors by writing modular classes. A graph-based API enables rapid development of complex feature extraction pipelines that output results in a single, standardized format. We describe the package's architecture, detail its major advantages over previous feature extraction toolboxes, and use a sample application to a large functional MRI dataset to illustrate how pliers can significantly reduce the time and effort required to construct sophisticated feature extraction workflows while increasing code clarity and maintainability

    3rd EGEE User Forum

    Get PDF
    We have organized this book in a sequence of chapters, each chapter associated with an application or technical theme introduced by an overview of the contents, and a summary of the main conclusions coming from the Forum for the chapter topic. The first chapter gathers all the plenary session keynote addresses, and following this there is a sequence of chapters covering the application flavoured sessions. These are followed by chapters with the flavour of Computer Science and Grid Technology. The final chapter covers the important number of practical demonstrations and posters exhibited at the Forum. Much of the work presented has a direct link to specific areas of Science, and so we have created a Science Index, presented below. In addition, at the end of this book, we provide a complete list of the institutes and countries involved in the User Forum

    Software Citation Implementation Challenges

    Get PDF
    The main output of the FORCE11 Software Citation working group (https://www.force11.org/group/software-citation-working-group) was a paper on software citation principles (https://doi.org/10.7717/peerj-cs.86) published in September 2016. This paper laid out a set of six high-level principles for software citation (importance, credit and attribution, unique identification, persistence, accessibility, and specificity) and discussed how they could be used to implement software citation in the scholarly community. In a series of talks and other activities, we have promoted software citation using these increasingly accepted principles. At the time the initial paper was published, we also provided guidance and examples on how to make software citable, though we now realize there are unresolved problems with that guidance. The purpose of this document is to provide an explanation of current issues impacting scholarly attribution of research software, organize updated implementation guidance, and identify where best practices and solutions are still needed

    Semantic Services Grid in Flood-forecasting Simulations

    Get PDF
    Flooding in the major river basins of Central Europe is a recurrent event affecting many countries. Almost every year, it takes away lives and causes damage to infrastructure, agricultural and industrial production, and severely affects socio-economic development. Recurring floods of the magnitude and frequency observed in this region is a significant impediment, which requires rapid development of more flexible and effective flood-forecasting systems. In this paper we present design and development of the flood-forecasting system based on the Semantic Grid services. We will highlight the corresponding architecture, discovery and composition of services into workflows and semantic tools supporting the users in evaluating the results of the flood simulations. We will describe in detail the challenges of the flood-forecasting application and corresponding design and development of the service-oriented model, which is based on the well known Web Service Resource Framework (WSRF). Semantic descriptions of the WSRF services will be presented as well as the architecture, which exploits semantics in the discovery and composition of services. Further, we will demonstrate how experience management solutions can help in the process of service discovery and user support. The system provides a unique bottom-up approach in the Semantic Grids by combining the advances of semantic web services and grid architectures

    Web Engineering for Workflow-based Applications: Models, Systems and Methodologies

    Get PDF
    This dissertation presents novel solutions for the construction of Workflow-based Web applications: The Web Engineering DSL Framework, a stakeholder-oriented Web Engineering methodology based on Domain-Specific Languages; the Workflow DSL for the efficient engineering of Web-based Workflows with strong stakeholder involvement; the Dialog DSL for the usability-oriented development of advanced Web-based dialogs; the Web Engineering Reuse Sphere enabling holistic, stakeholder-oriented reuse
    • …
    corecore