894 research outputs found
Proposal for an IMLS Collection Registry and Metadata Repository
The University of Illinois at Urbana-Champaign proposes to design, implement, and research a collection-level registry and item-level metadata repository service that will aggregate information about digital collections and items of digital content created using funds from Institute of Museum and Library Services (IMLS) National Leadership Grants. This work will be a collaboration between the University Library and the Graduate School of Library and Information Science. All extant digital collections initiated or augmented under IMLS aegis from 1998 through September 30, 2005 will be included in the proposed collection registry. Item-level metadata will be harvested from collections making such content available using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). As part of this work, project personnel, in cooperation with IMLS staff and grantees, will define and document appropriate metadata schemas, help create and maintain collection-level metadata records, assist in implementing OAI-compliant metadata provider services for dissemination of item-level metadata records, and research potential benefits and issues associated with these activities. The immediate outcomes of this work will be the practical demonstration of technologies that have the potential to enhance the visibility of IMLS-funded online exhibits and digital library collections and improve discoverability of items contained in these resources. Experience gained and research conducted during this project will make clearer both the costs and the potential benefits associated with such services. Metadata provider and harvesting service implementations will be appropriately instrumented (e.g., customized anonymous transaction logs, online questionnaires for targeted user groups, performance monitors).
At the conclusion of this project we will submit a final report that discusses tasks performed and lessons learned, presents business plans for sustaining registry and repository services, enumerates and summarizes potential benefits of these services, and makes recommendations regarding future implementations of these and related intermediary and end user interoperability services by IMLS projects.
A Multidisciplinary Approach to the Reuse of Open Learning Resources
Educational standards are having a significant impact on e-Learning. They allow for better exchange of information among different organizations and institutions. They simplify reusing and repurposing learning materials. They give teachers the possibility of personalizing them according to the student’s background and learning speed. Thanks to these standards, off-the-shelf content can be adapted to a particular student cohort’s context and learning needs. The same course content can be presented in different languages. Overall, all the parties involved in the learning-teaching process (students, teachers and institutions) can benefit from these standards and so online education can be improved. To materialize the benefits of standards, learning resources should be structured according to these standards. Unfortunately, a large number of existing e-Learning materials lack the intrinsic logical structure required and, even when they have the structure, are not encoded as required. These problems make it virtually impossible to share these materials. This thesis addresses the following research question: How to make the best use of existing open learning resources available on the Internet by taking advantage of educational standards and specifications and thus improving content reusability? In order to answer this question, I combine different technologies, techniques and standards that make the sharing of publicly available learning resources possible in innovative ways. I developed and implemented a three-stage tool to tackle the above problem. By applying information extraction techniques and open e-Learning standards to legacy learning resources, the tool has proven to improve content reusability. In so doing, it contributes to the understanding of how these technologies can be used in real scenarios and shows how online education can benefit from them.
In particular, three main components were created which enable the conversion process from unstructured educational content into a standard compliant form in a systematic and automatic way. An increasing number of repositories with educational resources are available, including Wikiversity and the Massachusetts Institute of Technology OpenCourseWare. Wikiversity is an open repository containing over 6,000 learning resources in several disciplines and for all age groups [1]. I used the OpenCourseWare repository to evaluate the effectiveness of my software components and ideas. The results show that it is possible to create standard compliant learning objects from the publicly available web pages, improving their searchability, interoperability and reusability.
The CorDis Corpus Mark-up and Related Issues
CorDis is a large, XML, TEI-conformant, POS-tagged, multimodal, multigenre corpus representing a significant portion of the political and media discourse on the 2003 Iraqi conflict. It was generated from different sub-corpora which had been assembled by various research groups, ranging from official transcripts of Parliamentary sessions, both in the US and the UK, to the transcripts of the Hutton Inquiry, from American and British newspaper coverage of the conflict to White House press briefings and to transcriptions of American and British TV news programmes. The heterogeneity of the data, the specificity of the genres and the diverse discourse analytical purposes of different groups had led to a wide range of coding strategies being employed to make textual and meta-textual information retrievable.
The main purpose of this paper is to show the process of harmonisation and integration whereby a loose collection of texts has become a stable architecture. The TEI proved a valid instrument to achieve standardisation of mark-up. The guidelines provide for a hierarchical organisation which gives the corpus a sound structure favouring replicability and enhancing the reliability of research. In discussing some examples of the problems encountered in the annotation, we will deal with issues like consistency and re-usability, and will examine the constraints imposed on data handling by specific research objectives. Examples include the choice to code the same speakers in different ways depending on the various (institutional) roles they may assume throughout the corpus, the distinction between quotations of spoken or written discourse and quotations read aloud in the course of a spoken text, and the segmentation of portions of news according to participants' interaction and use of camera/voiceover.
Semantic technologies: from niche to the mainstream of Web 3? A comprehensive framework for web Information modelling and semantic annotation
Context: Web information technologies developed and applied in the last decade have considerably changed the way web applications operate and have revolutionised information management and knowledge discovery. Social technologies, user-generated classification schemes and formal semantics have a far-reaching sphere of influence. They promote collective intelligence, support interoperability, enhance sustainability and instigate innovation.
Contribution: The research carried out and consequent publications follow the various paradigms of semantic technologies, assess each approach, evaluate its efficiency, identify the challenges involved and propose a comprehensive framework for web information modelling and semantic annotation, which is the thesis’ original contribution to knowledge. The proposed framework assists web information modelling, facilitates semantic annotation and information retrieval, enables system interoperability and enhances information quality.
Implications: Semantic technologies coupled with social media and end-user involvement can instigate innovative influence with wide organisational implications that can benefit a considerable range of industries. The scalable and sustainable business models of social computing and the collective intelligence of organisational social media can be resourcefully paired with internal research and knowledge from interoperable information repositories, back-end databases and legacy systems. Semantified information assets can free human resources so that they can be used to better serve business development, support innovation and increase productivity.
Intermediary XML schemas
The methodology of intermediary XML schemas is introduced and its application to complex metadata environments is explored. Intermediary schemas are designed to mediate to other ‘referent’ schemas: instances conforming to these are not generally intended for dissemination but must usually be realized by XSLT transformations for delivery. In some cases, these schemas may also generate instances conforming to themselves. Three subsidiary methods of this methodology are introduced. The first is application-specific schemas that act as intermediaries to established schemas which are problematic by virtue of their over-complexity or flexibility. The second employs the METS packaging standard as a template for navigating instances of a complex schema by defining an abstract map of its instances. The third employs the METS structural map to define templates or conceptual models from which instances of metadata for complex applications may be realized by XSLT transformations. The first method is placed in the context of earlier approaches to semantic interoperability such as crosswalks, switching across, derivation and application profiles. The second is discussed in the context of such methods for mapping complex objects as OAI-ORE and the Fedora Content Model Architecture. The third is examined in relation to earlier approaches to templating within XML architectures. The relevance of these methods to contemporary research is discussed in three areas: digital ecosystems, archival description and Linked Open Data in digital asset management and preservation. Their relevance to future research is discussed in the form of suggested enhancements to each, a possible synthesis of the second and third to overcome possible problems of interoperability presented by the first, and their potential role in future developments in digital preservation. 
This methodology offers an original approach to resolving issues of interoperability and the management of complex metadata environments; it significantly extends earlier techniques and does so entirely within XML architectures
A Description Driven Approach for Flexible Metadata Tracking
Evolving user requirements present a considerable software engineering
challenge, all the more so in an environment where data will be stored for a
very long time, and must remain usable as the system specification evolves
around it. Capturing the description of the system addresses this issue since a
description-driven approach enables new versions of data structures and
processes to be created alongside the old, thereby providing a history of
changes to the underlying data models and enabling the capture of provenance
data. This description-driven approach is advocated in this paper in which a
system called CRISTAL is presented. CRISTAL is based on description-driven
principles; it can use previous versions of stored descriptions to define
various versions of data which can be stored in various forms. To demonstrate
the efficacy of this approach the history of the project at CERN is presented
where CRISTAL was used to track data and process definitions and their
associated provenance data in the construction of the CMS ECAL detector, how it
was applied to handle analysis tracking and data index provenance in the
neuGRID and N4U projects, and how it will be matured further in the CRISTAL-ISE
project. We believe that the CRISTAL approach could be invaluable in handling
the evolution, indexing and tracking of large datasets, and are keen to apply
it further in this direction.
Comment: 10 pages and 3 figures. arXiv admin note: text overlap with
arXiv:1402.5753, arXiv:1402.576
Interoperability and FAIRness through a novel combination of Web technologies
Data in the life sciences are extremely diverse and are stored in a broad spectrum of repositories ranging from those designed for particular data types (such as KEGG for pathway data or UniProt for protein data) to those that are general-purpose (such as FigShare, Zenodo, Dataverse or EUDAT). These data have widely different levels of sensitivity and security considerations. For example, clinical observations about genetic mutations in patients are highly sensitive, while observations of species diversity are generally not. The lack of uniformity in data models from one repository to another, and in the richness and availability of metadata descriptions, makes integration and analysis of these data a manual, time-consuming task with no scalability. Here we explore a set of resource-oriented Web design patterns for data discovery, accessibility, transformation, and integration that can be implemented by any general- or special-purpose repository as a means to assist users in finding and reusing their data holdings. We show that by using off-the-shelf technologies, interoperability can be achieved at the level of an individual spreadsheet cell. We note that the behaviours of this architecture compare favourably to the desiderata defined by the FAIR Data Principles, and can therefore represent an exemplar implementation of those principles. The proposed interoperability design patterns may be used to improve discovery and integration of both new and legacy data, maximizing the utility of all scholarly outputs.
Cooperation Patterns and Adaptation Patterns for Service-Based Inter-Organizational Workflows
Modernization is an effective approach to making existing mainframe and distributed systems more responsive to business needs. SOA (service-oriented architecture) is an adequate paradigm that allows companies to tap into the business value in their current systems and position IT for rapid future changes to the business model. In our research, we focus on the use of SOA to implement Inter-Organizational Workflows (IOWF). The goal is to benefit from the advantages offered by the SOA paradigm, such as interoperability, reusability and flexibility, in order to obtain workflow models that are easily adaptable, evolvable and reusable. This paper focuses on two specific architectures of IOWF, namely "chained execution" and "subcontracting"; the first issue of this work is to define Service-Based Cooperation Patterns (SBCP) suitable to the two architectures considered. An SBCP is based on SOA; it is defined through three main dimensions: the distribution of services among the partners' sites, the control of instance execution and the structure of interaction between the workflows involved in the cooperation. The second issue of the paper concerns the adaptation and evolution of IOWF process models conforming to the defined SBCP. We then state the main adaptation operations that can be applied to these models; we focus on adaptation at the process and interaction levels. In line with the three dimensions of SBCP, we define three classes of adaptation patterns: "service adaptation", "control flow adaptation" and "interaction adaptation" patterns. We also distinguish certain adaptation operations, called evolutions of process models, based on two perspectives: the expansion of the global functionality of the process and the expansion of the cooperation; we show that some evolutions are realized by reuse of existing IOWF models. For implementation, we consider IOWF process models specified with BPEL.