
    Incorporating Semantics and Metadata as Part of the Article Authoring Process

    The ongoing shift in the delivery of publications, and in the consumption of content, from print to digital presents an opportunity to streamline the publishing workflow and to optimize the authoring process with digital content as the primary output, including the capture of semantics and metadata during authoring and the preservation of this data through to the archival copy of the document. In addition to the shift in how content is delivered and consumed, a significant development in recent years has been the release of new versions of word processors with native file formats based on XML. The use of XML in the authoring file format, combined with extensibility in its content model, enables a greater level of content semantics and metadata to be expressed directly by authors. The interoperability afforded by XML-based word-processing file formats makes it possible to preserve semantics and metadata as documents go through submission and review, pass through the publishing workflow, and are ultimately archived, likely also in an XML-based format. This article describes the design considerations and possible benefits of the Article Authoring Add-in for Word 2007 for the scholarly publishing community, in particular for workflows focused on the production of documents for digital delivery and consumption, as well as for the XML-based archival of publications. The second beta release of the add-in is available as a free download (http://research.microsoft.com/authoring) and is currently being evaluated by the scholarly publishing community, with the involvement of publishers, archives, information repositories, and early adopters. In addition to facilitating the creation of structured documents and enabling semantics and metadata to be captured more easily during authoring, the add-in can open and save files from Word 2007 in the XML format defined by the National Center for Biotechnology Information of the National Library of Medicine. The add-in extends the file format used by Word 2007, as well as its user interface, to tailor the authoring experience to the different audiences involved in the publishing workflow. As the add-in is adopted across multiple publications, authors will benefit from a consistent baseline experience, simplifying the authoring process and enabling a shift towards emphasising the expression of semantics over presentation.
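
    As a rough illustration of the kind of structured output the add-in targets, the sketch below uses Python's standard library to assemble a minimal article fragment in the spirit of the NLM tag set mentioned above. The element names follow NLM Journal Publishing conventions, but the fragment, the author name, and the overall selection of elements are illustrative assumptions, not the add-in's actual output.

```python
# Minimal sketch: assembling an NLM-style article fragment in which
# semantics and metadata are captured at authoring time. Element names
# follow the NLM Journal Publishing Tag Set; the structure shown is
# illustrative, not the add-in's actual output.
import xml.etree.ElementTree as ET

article = ET.Element("article", {"article-type": "research-article"})
front = ET.SubElement(article, "front")
meta = ET.SubElement(front, "article-meta")

title_group = ET.SubElement(meta, "title-group")
ET.SubElement(title_group, "article-title").text = (
    "Incorporating Semantics and Metadata as Part of the "
    "Article Authoring Process")

contribs = ET.SubElement(meta, "contrib-group")
author = ET.SubElement(contribs, "contrib", {"contrib-type": "author"})
name = ET.SubElement(author, "name")
ET.SubElement(name, "surname").text = "Doe"        # hypothetical author
ET.SubElement(name, "given-names").text = "Jane"

body = ET.SubElement(article, "body")
sec = ET.SubElement(body, "sec")
ET.SubElement(sec, "title").text = "Introduction"
ET.SubElement(sec, "p").text = "Structured content, not presentation."

print(ET.tostring(article, encoding="unicode"))
```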

    Connecting Authors and Repositories Through SWORD

    4th International Conference on Open Repositories. This presentation was part of the session: Conference Presentations. Date: 2009-06-04, 10:30 AM – 12:00 PM. By incorporating SWORD support into an add-in for Microsoft Word, it is now possible for authors to deposit articles to information repositories directly from their word processor. Furthermore, to make the submission process as simple and transparent as possible, the SWORD-related information can be incorporated into template files, so that all that is required from authors is to click a button. Additionally, since templates can incorporate semantic information, articles can be validated against the template as part of the submission process, enabling authors to correct errors prior to submission, which should result in a higher level of metadata quality and compliance for the content submitted to repositories. Also, through the add-in, author metadata can be gathered in a largely automated fashion, reducing duplicated data entry and author aggravation.
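
    For readers unfamiliar with SWORD, the hedged sketch below shows roughly what a one-click deposit reduces to on the wire: an HTTP POST of the manuscript to a repository collection URI. The endpoint, credentials, and packaging value are hypothetical placeholders; the header names follow the SWORD 1.3 profile of AtomPub, though a given repository's requirements may differ.

```python
# Minimal sketch of a SWORD deposit as an add-in might perform it:
# an HTTP POST of the manuscript to a repository collection URI.
# The endpoint, credentials, and packaging value are hypothetical;
# header names follow the SWORD 1.3 profile of AtomPub.
import requests

COLLECTION_URI = "https://repository.example.org/sword/deposit/articles"

with open("paper.docx", "rb") as f:
    response = requests.post(
        COLLECTION_URI,
        data=f,
        headers={
            "Content-Type": ("application/vnd.openxmlformats-officedocument"
                             ".wordprocessingml.document"),
            "Content-Disposition": "filename=paper.docx",
            # Packaging identifier telling the repository how to unpack
            # the deposit; the value here is a placeholder.
            "X-Packaging": "http://purl.org/net/sword-types/example",
            "X-On-Behalf-Of": "jane.doe@example.org",  # mediated deposit
        },
        auth=("depositor", "secret"),  # hypothetical credentials
    )

# A successful deposit returns 201 Created plus an Atom entry
# describing where the item now lives in the repository.
response.raise_for_status()
print(response.status_code, response.headers.get("Location"))
```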

    Mapping and Displaying Structural Transformations between XML and PDF

    Documents are often marked up in XML-based tagsets to delineate major structural components such as headings, paragraphs, figure captions and so on, without much regard to their eventual displayed appearance. And yet these same abstract documents, after many transformations and 'typesetting' processes, often emerge in the popular format of Adobe PDF, either for dissemination or archiving. Until recently PDF has been a totally display-based document representation, relying on its underlying PostScript semantics. Early versions of PDF had no mechanism for retaining any form of abstract document structure, but recent releases have introduced an internal structure tree to create the so-called 'Tagged PDF'. This paper describes the development of a plugin for Adobe Acrobat which creates a two-window display. One window shows the original XML document; the other shows its Tagged PDF counterpart, with an internal structure tree that, in some sense, matches the one seen in the XML. If a component is highlighted in either window then the corresponding structural item, with any attendant text, is also highlighted in the other window. Important applications of correctly Tagged PDF include making PDF documents reflow intelligently on small-screen devices and enabling them to be read out in correct reading order, via speech-synthesiser software, for the visually impaired. By tracing structure transformation from source document to destination, one can implement the repair of damaged PDF structure or the adaptation of an existing structure tree to an incrementally updated document.
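
    The paragraph above describes pairing nodes between the two structure trees; the sketch below gives one plausible shape for that matching step, using simplified dictionary trees and a small, assumed mapping from XML tags to Tagged PDF standard structure types. A real plugin would read the structure tree through Acrobat's plugin API rather than from a dictionary.

```python
# Sketch of the structural matching the plugin might perform: walk an
# XML tree and a (pre-extracted) Tagged PDF structure tree in parallel,
# pairing nodes whose types correspond. The tag mapping and the tree
# representation are simplified assumptions.
XML_TO_PDF_TYPE = {          # XML tag -> Tagged PDF structure type
    "article": "Document",
    "sec": "Sect",
    "title": "H1",
    "p": "P",
    "fig": "Figure",
    "caption": "Caption",
}

def match(xml_node, pdf_node, pairs):
    """Recursively pair corresponding XML and PDF structure nodes."""
    expected = XML_TO_PDF_TYPE.get(xml_node["tag"])
    if expected != pdf_node["type"]:
        return False                      # structures have diverged
    pairs.append((xml_node, pdf_node))    # drives two-window highlighting
    for x_child, p_child in zip(xml_node["children"], pdf_node["children"]):
        if not match(x_child, p_child, pairs):
            return False
    return len(xml_node["children"]) == len(pdf_node["children"])

xml_doc = {"tag": "sec", "children": [{"tag": "p", "children": []}]}
pdf_doc = {"type": "Sect", "children": [{"type": "P", "children": []}]}
pairs = []
print(match(xml_doc, pdf_doc, pairs), len(pairs))  # True 2
```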

    Building Interoperable Vocabulary and Structures for Learning Objects

    The structural, functional, and production views on learning objects influence metadata structure and vocabulary. We drew on these views and conducted a literature review and an in-depth analysis of 14 learning objects, and of over 500 components within them, to model the knowledge framework for a learning object ontology. The learning object ontology reported in this paper consists of 8 top-level classes, 28 classes at the second level, and 34 at the third level. Except for the class Learning object, all classes have three properties: preferred term, related term, and synonym. To validate the ontology, we conducted a query-log analysis focused on discovering what terms users have used at both the conceptual and the word level. The findings show that the main classes in the ontology are either conceptually or linguistically similar to the top terms in the query-log data. As an informal experiment, we built an Exercise Editor to test the ontology's suitability for adoption in authoring tools. The main contributions of this project are the framework for the learning object domain and the methodology used to develop and validate an ontology.
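
    A minimal sketch of the class/property shape described above, in Python: every class except the root carries a preferred term, related terms, and synonyms, and a query-log term can be checked against all three. The class names and terms below are invented stand-ins, not the paper's actual 8/28/34-class hierarchy.

```python
# Illustrative model of the ontology's shape: classes carrying the
# three properties (preferred term, related term, synonym) and a
# simple query-log matching check. Names here are hypothetical.
from dataclasses import dataclass, field

@dataclass
class OntologyClass:
    preferred_term: str
    related_terms: list[str] = field(default_factory=list)
    synonyms: list[str] = field(default_factory=list)
    subclasses: list["OntologyClass"] = field(default_factory=list)

exercise = OntologyClass(
    preferred_term="Exercise",
    related_terms=["Assessment"],
    synonyms=["Practice item"],
)
root = OntologyClass(preferred_term="Learning object",
                     subclasses=[exercise])

# Query-log validation compares user terms against all three properties.
def matches(cls: OntologyClass, term: str) -> bool:
    term = term.lower()
    return (term == cls.preferred_term.lower()
            or term in (s.lower() for s in cls.synonyms)
            or term in (r.lower() for r in cls.related_terms))

print(matches(exercise, "practice item"))  # True
```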

    Multimedia Annotation Interoperability Framework

    Multimedia systems typically contain digital documents of mixed media types, which are indexed on the basis of strongly divergent metadata standards. This severely hampers the interoperation of such systems. Machine understanding of metadata coming from different applications is therefore a basic requirement for the interoperation of distributed multimedia systems. In this document, we present how interoperability among metadata, vocabularies/ontologies, and services is enhanced using Semantic Web technologies. In addition, we provide guidelines for semantic interoperability, illustrated by use cases. Finally, we present an overview of the most commonly used metadata standards and tools, and outline the general research direction for semantic interoperability using Semantic Web technologies.
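
    The sketch below illustrates the general idea of lifting divergent metadata records into a shared RDF vocabulary, here Dublin Core, so that one query spans both source systems. It assumes the third-party rdflib package; the records, URIs, and field names are invented for illustration.

```python
# Sketch of metadata interoperability via Semantic Web technologies:
# lift fields from two divergent metadata records into one RDF graph
# under a shared vocabulary (Dublin Core), so either system can query
# the other's annotations. Requires rdflib; data is invented.
from rdflib import Graph, Literal, Namespace, URIRef

DC = Namespace("http://purl.org/dc/elements/1.1/")

image_record = {"exif_author": "J. Doe", "exif_title": "Sunset"}  # camera metadata
video_record = {"producer": "J. Doe", "name": "Sunset (clip)"}    # archive metadata

g = Graph()
g.bind("dc", DC)

img = URIRef("http://example.org/media/img42")
vid = URIRef("http://example.org/media/vid17")

# Map both application-specific schemas onto the shared vocabulary.
g.add((img, DC.creator, Literal(image_record["exif_author"])))
g.add((img, DC.title, Literal(image_record["exif_title"])))
g.add((vid, DC.creator, Literal(video_record["producer"])))
g.add((vid, DC.title, Literal(video_record["name"])))

# One query now spans both media types and source systems.
for subject in g.subjects(DC.creator, Literal("J. Doe")):
    print(subject)
```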

    Augmenting applications with hypermedia functionality and meta-information

    The Dynamic Hypermedia Engine (DHE) enhances analytical applications by adding relationships, semantics and other metadata to the application's output and user interface. DHE also provides additional hypermedia navigational, structural and annotation functionality. These features allow application developers and users to add guided tours, personal links and sharable annotations, among other features, into applications. DHE runs as middleware between the application user interface and its business logic and processes, in an n-tier architecture, supporting the extra functionality without altering the original systems by means of application wrappers. DHE automatically generates links at run-time for each of those elements having relationships and metadata. Such elements are previously identified using a Relation Navigation Analysis. On top of these links, DHE also constructs more sophisticated navigation techniques not often found on the Web. The metadata, links, navigation and annotation features supplement the application's primary functionality. This research identifies element types, or classes, in the application displays. A mapping rule encodes each relationship found between two elements of interest at the class level. When the user selects a particular element, DHE instantiates the commands included in the rules with the actual instance selected and sends them to the appropriate destination system, which then dynamically generates the resulting virtual (i.e. not previously stored) page. DHE executes concurrently with these applications, providing automated link generation and other hypermedia functionality. DHE uses the eXtensible Markup Language (XML) and related World Wide Web Consortium (W3C) XML recommendations, such as XLink, XML Schema, and RDF, to encode the semantic information required for the operation of the extra hypermedia features and for the transmission of messages between the engine modules and applications. DHE is the only approach we know of that provides automated linking and metadata services in a generic manner, based on the application semantics, without altering the applications. DHE will also work with non-Web systems. The results of this work could also be extended to other research areas, such as link ranking and filtering, automatic link generation as the result of a search query, metadata collection and support, virtual document management, hypermedia functionality on the Web, adaptive and collaborative hypermedia, web engineering, and the Semantic Web.
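
    A minimal sketch of the rule-instantiation step described above: each mapping rule ties an element class and relationship to a command template, and selecting an instance fills the template for dispatch to the destination system. The rule contents, scheme names, and identifiers are hypothetical.

```python
# Sketch of DHE-style mapping rules: each rule links an element class
# to a relationship and a command template; when the user selects an
# element instance, the engine instantiates the template and asks the
# destination system to generate the virtual page. All values here
# are hypothetical.
MAPPING_RULES = [
    # (element class, relationship, command template)
    ("Invoice", "customer-details", "crm://lookup?customer={id}"),
    ("Invoice", "payment-history", "billing://history?invoice={id}"),
    ("Customer", "open-invoices", "billing://open?customer={id}"),
]

def links_for(element_class: str, instance_id: str) -> list[tuple[str, str]]:
    """Generate (relationship, command) links for a selected element."""
    return [
        (relationship, template.format(id=instance_id))
        for cls, relationship, template in MAPPING_RULES
        if cls == element_class
    ]

# Selecting invoice #1042 yields links the engine can dispatch to the
# appropriate back-end system at run-time.
for rel, command in links_for("Invoice", "1042"):
    print(rel, "->", command)
```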

    Taxonomy for Humans or Computers? Cognitive Pragmatics for Big Data

    Criticism of big data has focused on showing that more is not necessarily better, in the sense that data may lose their value when taken out of context and aggregated together. The next step is to incorporate an awareness of pitfalls for aggregation into the design of data infrastructure and institutions. A common strategy minimizes aggregation errors by increasing the precision of our conventions for identifying and classifying data. As a counterpoint, we argue that there are pragmatic trade-offs between precision and ambiguity that are key to designing effective solutions for generating big data about biodiversity. We focus on the importance of theory-dependence as a source of ambiguity in taxonomic nomenclature and hence a persistent challenge for implementing a single, long-term solution to storing and accessing meaningful sets of biological specimens. We argue that ambiguity does have a positive role to play in scientific progress as a tool for efficiently symbolizing multiple aspects of taxa and mediating between conflicting hypotheses about their nature. Pursuing a deeper understanding of the trade-offs between, and the synthesis of, precision and ambiguity as virtues of scientific language and communication systems then offers a productive next step for realizing sound, big biodiversity data services.

    Third international workshop on Authoring of adaptive and adaptable educational hypermedia (A3EH), Amsterdam, 18-22 July, 2005

    The A3EH workshop follows a successful series of workshops on Adaptive and Adaptable Educational Hypermedia. This workshop focuses on models, design and authoring of AEH, on assessment of AEH, on conversion between AEH systems, and on evaluation of AEH. The workshop includes paper presentations, a poster session and panel discussions.