9,232 research outputs found

    Importing Vector Graphics: The grImport Package for R

    Get PDF
    This article describes an approach to importing vector-based graphical images into statistical software as implemented in a package called grImport for the R statistical computing environment. This approach assumes that an original image can be transformed into a PostScript format (i.e., the original image is in a standard vector graphics format such as PostScript, PDF, or SVG). The grImport package consists of three components: a function for converting PostScript files to an R-specific XML format; a function for reading the XML format into special Picture objects in R; and functions for manipulating and drawing Picture objects. Several examples and applications are presented, including annotating a statistical plot with an imported logo and using imported images as plotting symbols.

    PDF/A standard for long term archiving

    Get PDF
    PDF/A is defined by ISO 19005-1 as a file format based on PDF format. The standard provides a mechanism for representing electronic documents in a way that preserves their visual appearance over time, independent of the tools and systems used for creating or storing the files.Comment: 8 pages, exposed on 5th International Conference "Actualities and Perspectives on Hardware and Software" - APHS2009, Timisoara, Romani

    From XML to XML: The why and how of making the biodiversity literature accessible to researchers

    Get PDF
    We present the ABLE document collection, which consists of a set of annotated volumes of the Bulletin of the British Museum (Natural History). These follow our work on automating the markup of scanned copies of the biodiversity literature, for the purpose of supporting working taxonomists. We consider an enhanced TEI XML markup language, which is used as an intermediate stage in translating from the initial XML obtained from Optical Character Recognition to the target taXMLit. The intermediate representation allows additional information from external sources such as a taxonomic thesaurus to be incorporated before the final translation into taXMLit

    Research Articles in Simplified HTML: a Web-first format for HTML-based scholarly articles

    Get PDF
    Purpose. This paper introduces the Research Articles in Simplified HTML (or RASH), which is a Web-first format for writing HTML-based scholarly papers; it is accompanied by the RASH Framework, a set of tools for interacting with RASH-based articles. The paper also presents an evaluation that involved authors and reviewers of RASH articles submitted to the SAVE-SD 2015 and SAVE-SD 2016 workshops. Design. RASH has been developed aiming to: be easy to learn and use; share scholarly documents (and embedded semantic annotations) through the Web; support its adoption within the existing publishing workflow. Findings. The evaluation study confirmed that RASH is ready to be adopted in workshops, conferences, and journals and can be quickly learnt by researchers who are familiar with HTML. Research Limitations. The evaluation study also highlighted some issues in the adoption of RASH, and in general of HTML formats, especially by less technically savvy users. Moreover, additional tools are needed, e.g., for enabling additional conversions from/to existing formats such as OpenXML. Practical Implications. RASH (and its Framework) is another step towards enabling the definition of formal representations of the meaning of the content of an article, facilitating its automatic discovery, enabling its linking to semantically related articles, providing access to data within the article in actionable form, and allowing integration of data between papers. Social Implications. RASH addresses the intrinsic needs related to the various users of a scholarly article: researchers (focussing on its content), readers (experiencing new ways for browsing it), citizen scientists (reusing available data formally defined within it through semantic annotations), publishers (using the advantages of new technologies as envisioned by the Semantic Publishing movement). Value. RASH helps authors to focus on the organisation of their texts, supports them in the task of semantically enriching the content of articles, and leaves all the issues about validation, visualisation, conversion, and semantic data extraction to the various tools developed within its Framework

    Dynamic Web File Format Transformations with Grace

    Full text link
    Web accessible content stored in obscure, unpopular or obsolete formats represents a significant problem for digital preservation. The file formats that encode web content represent the implicit and explicit choices of web site maintainers at a particular point in time. Older file formats that have fallen out of favor are obviously a problem, but so are new file formats that have not yet been fully supported by browsers. Often browsers use plug-in software for displaying old and new formats, but plug-ins can be difficult to find, install and replicate across all environments that one may use. We introduce Grace, an http proxy server that transparently converts browser-incompatible and obsolete web content into web content that a browser is able to display without the use of plug-ins. Grace is configurable on a per user basis and can be expanded to provide an array of conversion services. We illustrate how the Grace prototype transforms several image formats (XBM, PNG with various alpha channels, and JPEG 2000) so they are viewable in Internet Explorer.Comment: 12 pages, 9 figure

    Automatic generation of audio content for open learning resources

    Get PDF
    This paper describes how digital talking books (DTBs) with embedded functionality for learners can be generated from content structured according to the OU OpenLearn schema. It includes examples showing how a software transformation developed from open source components can be used to remix OpenLearn content, and discusses issues concerning the generation of synthesised speech for educational purposes. Factors which may affect the quality of a learner's experience with open educational audio resources are identified, and in conclusion plans for testing the effect of these factors are outlined

    PDF/A-3u as an archival format for Accessible mathematics

    Full text link
    Including LaTeX source of mathematical expressions, within the PDF document of a text-book or research paper, has definite benefits regarding `Accessibility' considerations. Here we describe three ways in which this can be done, fully compatibly with international standards ISO 32000, ISO 19005-3, and the forthcoming ISO 32000-2 (PDF 2.0). Two methods use embedded files, also known as `attachments', holding information in either LaTeX or MathML formats, but use different PDF structures to relate these attachments to regions of the document window. One uses structure, so is applicable to a fully `Tagged PDF' context, while the other uses /AF tagging of the relevant content. The third method requires no tagging at all, instead including the source coding as the /ActualText replacement of a so-called `fake space'. Information provided this way is extracted via simple Select/Copy/Paste actions, and is available to existing screen-reading software and assistive technologies.Comment: This is a post-print version of original in volume: S.M. Watt et al. (Eds.): CICM 2014, LNAI 8543, pp.184-199, 2014; available at http://link.springer.com/search?query=LNAI+8543, along with supplementary PDF. This version, with supplement as attachment, is enriched to validate as PDF/A-3u modulo an error in white-space handling in the pdfTeX version used to generate i

    An integrated approach to preparing, publishing, presenting and preserving theses

    Get PDF
    [Abstract]: This paper describes progress on a project funded by the Australian government to create Free software; the Integrated Content Environment for research and scholarship (ICE-RS). ICE-RS is a multi-faceted project which will add value to finished theses by making them available in both HTML and PDF, as well as providing a mechanism for packaging multimedia theses. The project will also concentrate on providing services for thesis production, with version control, automated backup and collaboration services. The paper begins with the established content management system that is the basis for the project, ICE-RS , originally developed to create courseware packages. ICE includes distributed, version controlled collaboration, using word processing software and works on multiple platforms, with standard document formats. We survey other approaches to content authoring and publishing for ETDs. We showcase exploratory work on integration of the thesis writing process with Institutional Repository software including publishing theses in both PDF and HTML with preservation and descriptive metadata. The presentation will include demonstrations of thesis production at all stages of development from proposal to completion. In a more speculative vein, we will discuss opportunities for institutions to provide new levels of support for candidates via automated thesis “dashboard” progress reports, supervisor and examiner annotation and comment and support for copyright considerations as early as possible in the process

    Visualising computational intelligence through converting data into formal concepts

    Get PDF
    • …
    corecore