2,254 research outputs found
Designing an automated prototype tool for preservation quality metadata extraction for ingest into digital repository
We present a viable framework for the automated extraction of preservation quality metadata, adjusted to meet the needs of ingest into digital repositories. It has three distinctive features: wide coverage, specialisation and emphasis on quality. Wide coverage is achieved through the use of a distributed system of tool repositories, which helps to implement it over a broad range of document object types. Specialisation is maintained by selecting the most appropriate metadata extraction tool for each case, based on the identified genre of the digital object. Quality is sustained by introducing control points at selected stages of the system's workflow. The integration of these three features as components in the ingest of material into digital repositories is a defining step forward in the current quest for improved management of digital resources
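The genre-based tool selection and workflow control points described in this abstract can be pictured with a minimal sketch. Everything below is a hypothetical illustration, not the authors' prototype: the genre labels, the toy extractor functions, the `TOOLS` registry and the quality rule are all assumptions made for the example.

```python
from typing import Callable, Dict, Optional

Metadata = Dict[str, str]


def article_tool(text: str) -> Metadata:
    """Toy extractor (assumption): treat the first line as the title."""
    lines = text.strip().splitlines()
    return {"title": lines[0].strip() if lines else ""}


def letter_tool(text: str) -> Metadata:
    """Toy extractor (assumption): treat the first line as the salutation."""
    lines = text.strip().splitlines()
    return {"salutation": lines[0].strip() if lines else ""}


# Hypothetical registry mapping a genre label to its specialised tool.
TOOLS: Dict[str, Callable[[str], Metadata]] = {
    "article": article_tool,
    "letter": letter_tool,
}


def identify_genre(text: str) -> str:
    """Stand-in genre classifier: a real system would use a trained model."""
    return "letter" if text.lstrip().lower().startswith("dear") else "article"


def passes_quality_check(metadata: Metadata) -> bool:
    """Control point (assumed rule): reject empty records or empty fields."""
    return bool(metadata) and all(metadata.values())


def ingest(text: str) -> Optional[Metadata]:
    """Identify the genre, run the matching tool, then apply the control point."""
    genre = identify_genre(text)
    metadata = TOOLS[genre](text)
    metadata["genre"] = genre
    return metadata if passes_quality_check(metadata) else None
```

A call such as `ingest("Dear Sir,\nThank you for your letter.")` selects the letter tool, while an empty document fails the control point and returns `None` rather than entering the repository.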
Developing a comprehensive framework for multimodal feature extraction
Feature extraction is a critical component of many applied data science
workflows. In recent years, rapid advances in artificial intelligence and
machine learning have led to an explosion of feature extraction tools and
services that allow data scientists to cheaply and effectively annotate their
data along a vast array of dimensions, ranging from detecting faces in images
to analyzing the sentiment expressed in coherent text. Unfortunately, the
proliferation of powerful feature extraction services has been mirrored by a
corresponding expansion in the number of distinct interfaces to feature
extraction services. In a world where nearly every new service has its own API,
documentation, and/or client library, data scientists who need to combine
diverse features obtained from multiple sources are often forced to write and
maintain ever more elaborate feature extraction pipelines. To address this
challenge, we introduce a new open-source framework for comprehensive
multimodal feature extraction. Pliers is an open-source Python package that
supports standardized annotation of diverse data types (video, images, audio,
and text), and is expressly designed with both ease-of-use and extensibility in mind.
Users can apply a wide range of pre-existing feature extraction tools to their
data in just a few lines of Python code, and can also easily add their own
custom extractors by writing modular classes. A graph-based API enables rapid
development of complex feature extraction pipelines that output results in a
single, standardized format. We describe the package's architecture, detail its
major advantages over previous feature extraction toolboxes, and use a sample
application to a large functional MRI dataset to illustrate how pliers can
significantly reduce the time and effort required to construct sophisticated
feature extraction workflows while increasing code clarity and maintainability
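As a rough illustration of the two ideas this abstract emphasises, custom extractors written as modular classes and results returned in a single standardized format, here is a minimal self-contained sketch. It does not use or reproduce the pliers API: the `Extractor` base class, the `FeatureRow` schema and `run_pipeline` are hypothetical names invented for this example, and the pipeline is a flat loop rather than a graph.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List


@dataclass
class FeatureRow:
    """One result row in a shared, standardized shape (hypothetical schema)."""
    source: str
    extractor: str
    feature: str
    value: Any


class Extractor:
    """Base class: a custom extractor only needs to implement `_extract`."""

    def _extract(self, item: str) -> Dict[str, Any]:
        raise NotImplementedError

    def transform(self, item: str) -> List[FeatureRow]:
        # Wrap the raw feature dict in the standardized row format.
        name = type(self).__name__
        return [FeatureRow(item, name, k, v) for k, v in self._extract(item).items()]


class LengthExtractor(Extractor):
    def _extract(self, item: str) -> Dict[str, Any]:
        return {"n_chars": len(item)}


class WordCountExtractor(Extractor):
    def _extract(self, item: str) -> Dict[str, Any]:
        return {"n_words": len(item.split())}


def run_pipeline(items: Iterable[str], extractors: Iterable[Extractor]) -> List[FeatureRow]:
    """Apply every extractor to every item and collect rows in one list."""
    extractors = list(extractors)
    rows: List[FeatureRow] = []
    for item in items:
        for extractor in extractors:
            rows.extend(extractor.transform(item))
    return rows
```

Because every extractor returns the same row type, `run_pipeline(["hello world"], [LengthExtractor(), WordCountExtractor()])` yields one uniform list that downstream code can handle without knowing which extractor produced each row.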
3rd EGEE User Forum
We have organized this book as a sequence of chapters, each associated with an application or technical theme and introduced by an overview of its contents and a summary of the main conclusions from the Forum on that topic. The first chapter gathers all the plenary session keynote addresses, followed by a sequence of chapters covering the application-flavoured sessions. These are followed by chapters with the flavour of Computer Science and Grid Technology. The final chapter covers the large number of practical demonstrations and posters exhibited at the Forum. Much of the work presented has a direct link to specific areas of Science, and so we have created a Science Index, presented below. In addition, at the end of this book, we provide a complete list of the institutes and countries involved in the User Forum
Software Citation Implementation Challenges
The main output of the FORCE11 Software Citation working group
(https://www.force11.org/group/software-citation-working-group) was a paper on
software citation principles (https://doi.org/10.7717/peerj-cs.86) published in
September 2016. This paper laid out a set of six high-level principles for
software citation (importance, credit and attribution, unique identification,
persistence, accessibility, and specificity) and discussed how they could be
used to implement software citation in the scholarly community. In a series of
talks and other activities, we have promoted software citation using these
increasingly accepted principles. At the time the initial paper was published,
we also provided guidance and examples on how to make software citable, though
we now realize there are unresolved problems with that guidance. The purpose of
this document is to provide an explanation of current issues impacting
scholarly attribution of research software, organize updated implementation
guidance, and identify where best practices and solutions are still needed
Semantic Services Grid in Flood-forecasting Simulations
Flooding in the major river basins of Central Europe is a recurrent event affecting many countries. Almost every year, it takes lives, damages infrastructure and agricultural and industrial production, and severely affects socio-economic development. Recurring floods of the magnitude and frequency observed in this region are a significant impediment, which requires rapid development of more flexible and effective flood-forecasting systems. In this paper we present the design and development of a flood-forecasting system based on Semantic Grid services. We highlight the corresponding architecture, the discovery and composition of services into workflows, and the semantic tools supporting users in evaluating the results of the flood simulations. We describe in detail the challenges of the flood-forecasting application and the corresponding design and development of the service-oriented model, which is based on the well-known Web Service Resource Framework (WSRF). Semantic descriptions of the WSRF services are presented, as well as the architecture, which exploits semantics in the discovery and composition of services. Further, we demonstrate how experience management solutions can help in the process of service discovery and user support. The system provides a unique bottom-up approach to Semantic Grids by combining the advances of semantic web services and grid architectures
Web Engineering for Workflow-based Applications: Models, Systems and Methodologies
This dissertation presents novel solutions for the construction of Workflow-based Web applications: The Web Engineering DSL Framework, a stakeholder-oriented Web Engineering methodology based on Domain-Specific Languages; the Workflow DSL for the efficient engineering of Web-based Workflows with strong stakeholder involvement; the Dialog DSL for the usability-oriented development of advanced Web-based dialogs; the Web Engineering Reuse Sphere enabling holistic, stakeholder-oriented reuse
- …