Mapping and Displaying Structural Transformations between XML and PDF
Documents are often marked up in XML-based tagsets to delineate major structural components such as headings, paragraphs, figure captions and so on, without much regard to their eventual displayed appearance. And yet these same abstract documents, after many transformations and 'typesetting' processes, often emerge in the popular format of Adobe PDF, either for dissemination or archiving.
Until recently, PDF has been a totally display-based document representation, relying on the underlying PostScript semantics of PDF. Early versions of PDF had no mechanism for retaining any form of abstract document structure, but recent releases have now introduced an internal structure tree to create the so-called 'Tagged PDF'.
This paper describes the development of a plugin for Adobe Acrobat which creates a two-window display. In one window is shown an XML document original and in the other its Tagged PDF counterpart is seen, with an internal structure tree that, in some sense, matches the one seen in XML. If a component is highlighted in either window then the corresponding structured item, with any attendant text, is also highlighted in the other window.
Important applications of correctly Tagged PDF include making PDF documents reflow intelligently on small-screen devices and enabling them to be read out in correct reading order, via speech synthesiser software, for the visually impaired. By tracing structure transformation from source document to destination, one can implement the repair of damaged PDF structure or the adaptation of an existing structure tree to an incrementally updated document.
Development of Use Cases, Part I
For determining requirements and constructs appropriate for a Web query language, or in fact any language, use cases are of the essence. The W3C has published two sets of use cases for XML and RDF query languages. In this article, solutions for these use cases are presented using Xcerpt, a novel Web and Semantic Web query language that combines access to standard Web data such as XML documents with access to Semantic Web metadata such as RDF resource descriptions, and with reasoning abilities and rules familiar from logic programming. To the best knowledge of the authors, this is the first in-depth study of how to solve use cases for accessing XML and RDF in a single language. Integrated access to data and metadata has been recognized by industry and academia as one of the key challenges in data processing for the next decade. This article is a contribution towards addressing this challenge by demonstrating along practical and recognized use cases the usefulness of reasoning abilities, rules, and semistructured query languages for accessing both data (XML) and metadata (RDF).
The NASA Astrophysics Data System: Data Holdings
Since its inception in 1993, the ADS Abstract Service has become an
indispensable research tool for astronomers and astrophysicists worldwide. In
those seven years, much effort has been directed toward improving both the
quantity and the quality of references in the database. From the original
database of approximately 160,000 astronomy abstracts, our dataset has grown
almost tenfold to approximately 1.5 million references covering astronomy,
astrophysics, planetary sciences, physics, optics, and engineering. We collect
and standardize data from approximately 200 journals and present the resulting
information in a uniform, coherent manner. With the cooperation of journal
publishers worldwide, we have been able to place scans of full journal articles
on-line back to the first volumes of many astronomical journals, and we are
able to link to current versions of articles, abstracts, and datasets for
essentially all of the current astronomy literature. The trend toward
electronic publishing in the field, the use of electronic submission of
abstracts for journal articles and conference proceedings, and the increasingly
prominent use of the World Wide Web to disseminate information have enabled the
ADS to build a database unparalleled in other disciplines.
The ADS can be accessed at http://adswww.harvard.edu
Comment: 24 pages, 1 figure, 6 tables, 3 appendices
The LaTeX project: A case study of open-source software
This is a case study of TeX, a typesetting program that was developed by Donald E. Knuth in the late 1970s. Released with an open-source license, it has become a reference in scientific publishing. TeX is now used to typeset and publish much of the world's scientific literature in physics and mathematics. This case study is part of a wider effort by academics to understand the open-source phenomenon. That development model is similar to the organization of the production of knowledge in academia; there is no set organization with a hierarchy, but free collaboration that is coordinated spontaneously and winds up generating complex products that are the property of all who can understand their functioning. The case study was conducted by gathering qualitative data via interviews with TeX developers and quantitative data on the TeX community: the program's code, the software that is part of the TeX distribution, the newsgroups dedicated to the software, and many other indicators of the evolution and activity in that open-source project. The case study is aimed at economists who want to develop models to understand and analyze the open-source phenomenon. It is also geared towards policy-makers who would like to encourage or regulate open source, and towards open-source developers who wonder which strategies are efficient in making an open-source project successful.
Keywords: TeX, LaTeX, case study, open source, software, innovation, organisational structure, economic history, knowledge production, knowledge diffusion.
Special Libraries, Spring 1995
Volume 86, Issue 2
Recommendation for an interface system for product related computer data to enhance the Engineering Change Order/Preliminary Change Order function
The following document will explore product and information integration by demonstrating the potential economic, strategic, and technical benefits attainable in the Engineering Change Order/Preliminary Change Order function. Information is the foundation of today's corporate enterprise. An organization's success can depend on how effectively it identifies, manages and uses its information. As an organization grows or becomes more complex, the infrastructure of information becomes more complex. The management and distribution of information corporation-wide becomes a key element in the strategic position of the organization in its given market.