
    The LaTeX project: A case study of open-source software

    This is a case study of TeX, typesetting software developed by Donald E. Knuth in the late 1970s. Released under an open-source license, it has become a reference in scientific publishing: TeX is now used to typeset and publish much of the world's scientific literature in physics and mathematics. This case study is part of a wider effort by academics to understand the open-source phenomenon. That development model is similar to the organisation of knowledge production in academia: there is no fixed organisation with a hierarchy, but free collaboration, coordinated spontaneously, that ends up generating complex products which are the property of all who can understand how they work. The case study was conducted by gathering qualitative data via interviews with TeX developers and quantitative data on the TeX community -- the program's code, the software that is part of the TeX distribution, the newsgroups dedicated to the software, and many other indicators of the evolution and activity of that open-source project. The case study is aimed at economists who want to develop models to understand and analyse the open-source phenomenon. It is also geared towards policy-makers who would like to encourage or regulate open source, and towards open-source developers who wonder which strategies make an open-source project successful. Keywords: TeX, LaTeX, case study, open source, software, innovation, organisational structure, economic history, knowledge production, knowledge diffusion.

    Algebraic specification of documents

    According to recent research, nearly 95 percent of corporate information is stored in documents. Further studies indicate that companies spend between 6 and 10 percent of their gross revenues printing and distributing documents in several ways: web and CD-ROM publishing, database storage and retrieval, and printing. In this context documents exist in many different formats, from pure ASCII files to internal database or text-processor formats. It is clear that document reusability and low-cost maintenance are two important issues in the near future. The majority of available document processors are purpose-oriented, reducing the necessary flexibility and reusability of documents. Considerable time is wasted adapting the same text to different purposes. For example, you may want to have the same document as an article, as a set of slides, or as a poster; or you may have a dictionary document producing both a book and a list of words for a spell-checker. This conversion could be done automatically from the first version of the document if it complied with some standard requirements. The key idea is to keep a complete separation between syntax and semantics. In this way we produce an abstract description separating conceptual issues from those concerned with use. This note proposes a few guidelines for building a system to solve the above problem. Such a system should be an algebra-based environment and provide facilities for: document type definitions; definition of functions over document types; and document definitions as algebraic terms. This approach (rooted in the tradition of constructive algebraic specification) will allow for a homogeneous environment to deal with operations such as merging documents, converting formats, translating documents, extracting different kinds of information (to set up information repositories, databases, or semantic networks) or portions of documents (as happens, for instance, in literate programming), and some other, less traditional, actions, like mail reply or memo production. We intend to use CAMILA (a specification language and prototyping environment developed at Universidade do Minho by the Computer Science group) to develop the above-mentioned system.
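
    To make the idea concrete, here is a minimal sketch of "a document type as an algebraic term" with two functions defined over it. It uses Haskell rather than CAMILA (the environment the authors actually propose), and the Doc/Section types and both interpretation functions are illustrative assumptions, not the paper's model:

    ```haskell
    -- An abstract document as an algebraic term: structure only,
    -- no formatting. (Illustrative types, not the paper's model.)
    data Doc     = Doc String [Section]     -- title, sections
    data Section = Section String [String]  -- heading, paragraphs

    -- Two interpretations of the same term: an article-like
    -- rendering and a flat word list (e.g. for a spell-checker).
    toArticle :: Doc -> String
    toArticle (Doc t ss) = unlines (t : concatMap render ss)
      where render (Section h ps) = ("* " ++ h) : ps

    toWordList :: Doc -> [String]
    toWordList (Doc t ss) = words t ++ concatMap fromSec ss
      where fromSec (Section h ps) = words h ++ concatMap words ps
    ```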

    Document semantics: Two approaches

    SGML introduced the DTD idea to formally describe document syntax and structure. One of its main characteristics is that it is purely declarative and fully independent of the document's future processing (typesetting, formatting, translation/transformation). In this context, SGML has become the international standard to be followed. Sooner or later, a document has to be processed. In order to do that we need to associate semantics with the document's structure. In a compiler context, we normally separate semantics in two: static and dynamic. Establishing a parallel with document processing, we can think of the document's decorated tree (as recognized by an SGML analyzer) as the static semantics, and of the document tree's transformation and/or reaction as the dynamic semantics. Pursuing this idea, we present and discuss a study of the relationship between SGML, DAST (Decorated Abstract Syntax Tree), and algebraic specification tools, in order to better understand how to formally process documents in general and how to specify and build generic document-processing tools.
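
    As a rough illustration of the static/dynamic split, the sketch below (Haskell; the Tree/DTree types and the depth decoration are assumptions for illustration, not the paper's formalism) first decorates a parsed element tree with attributes, then runs a transformation over the decorated tree:

    ```haskell
    -- Element tree as delivered by an SGML parser (illustrative).
    data Tree  = Elem String [Tree] | Text String

    -- Static semantics: decorate each element with an attribute,
    -- here its nesting depth.
    data DTree = DElem String Int [DTree] | DText String

    decorate :: Int -> Tree -> DTree
    decorate d (Elem name cs) = DElem name d (map (decorate (d + 1)) cs)
    decorate _ (Text s)       = DText s

    -- Dynamic semantics: a transformation over the decorated tree,
    -- e.g. rendering section headings according to their depth.
    render :: DTree -> String
    render (DText s)          = s
    render (DElem "sec" d cs) = replicate (d + 1) '#' ++ " " ++ concatMap render cs ++ "\n"
    render (DElem _ _ cs)     = concatMap render cs
    ```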

    Literate Statistical Practice

    Literate Statistical Practice (LSP; Rossini, 2001) describes an approach for creating self-documenting statistical results. It applies literate programming (Knuth, 1992) and related techniques in a natural fashion to the practice of statistics. In particular, documentation, specification, and descriptions of results are written concurrently with the writing and evaluation of statistical programs. We discuss how and where LSP can be integrated into practice and illustrate this with an example derived from an actual statistical consulting project. The approach is simplified through the use of a comprehensive, open-source toolset incorporating Noweb (Ramsey, 1994), Emacs Speaks Statistics (ESS; Rossini et al., 2002), and Sweave (Leisch, 2002), built on R (Ihaka and Gentleman, 1996). We conclude with an assessment of LSP for the construction of reproducible, auditable, and comprehensible statistical analyses.
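
    For readers unfamiliar with the mechanics: a literate document interleaves prose with named code chunks, and tools like Noweb "tangle" the code out for execution while "weaving" the whole for reading. Below is a minimal, simplified tangle step (Haskell; the chunk syntax is Noweb-like but abbreviated, and this function is an illustrative assumption, not part of the Noweb or Sweave toolchain):

    ```haskell
    import Data.List (isPrefixOf)

    -- Extract the body of one named code chunk from a literate
    -- document: lines between '<<name>>=' and a line starting
    -- with '@' are code; everything else is documentation.
    tangle :: String -> String -> [String]
    tangle name = go False . lines
      where
        header = "<<" ++ name ++ ">>="
        go _ [] = []
        go inChunk (l : ls)
          | header `isPrefixOf` l       = go True ls
          | inChunk, "@" `isPrefixOf` l = go False ls
          | inChunk                     = l : go inChunk ls
          | otherwise                   = go False ls
    ```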

    Conjunctive programming: An interactive approach to software system synthesis

    This report introduces a technique of software documentation called conjunctive programming and discusses its role in the development and maintenance of software systems. The report also describes the conjoin tool, an adjunct to assist practitioners. Aimed at supporting software reuse while conforming to conventional development practices, conjunctive programming is defined as the extraction, integration, and embellishment of pertinent information, obtained directly from an existing database of software artifacts (such as specifications, source code, configuration data, link-edit scripts, utility files, and other relevant information), into a product that achieves desired levels of detail, content, and production quality. Conjunctive programs typically include automatically generated tables of contents, indexes, cross-references, bibliographic citations, tables, and figures (including graphics and illustrations). This report presents an example of conjunctive programming by documenting the use and implementation of the conjoin program.
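
    As a toy illustration of the extraction-and-integration step (not the actual conjoin tool, whose interface the abstract does not describe), the Haskell sketch below pulls specially marked comment lines out of a set of source artifacts and weaves them, with cross-references back to their origins, into one document; the '--| ' marker is an invented convention:

    ```haskell
    import Data.List (isPrefixOf)

    -- Extract marked documentation lines ('--| ') from one
    -- artifact, remembering where each line came from.
    extract :: FilePath -> String -> [(String, FilePath, Int)]
    extract path src =
      [ (drop 4 l, path, n)
      | (n, l) <- zip [1 ..] (lines src)
      , "--| " `isPrefixOf` l ]

    -- Integrate the fragments into a single document, each
    -- followed by a cross-reference to its source and line.
    conjoin :: [(FilePath, String)] -> String
    conjoin artifacts =
      unlines [ text ++ "  [" ++ path ++ ":" ++ show n ++ "]"
              | (text, path, n) <- concatMap (uncurry extract) artifacts ]
    ```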

    An investigation of an implementation of SGML-based publishing of a graduate thesis

    The Standard Generalized Markup Language (SGML) has been the International Organization for Standardization (ISO) standard for text interchange for nearly a decade. Since 1986, SGML-based publishing has been successfully implemented in many fields, notably industries with massive and mission-critical publishing operations such as the military, legal, medical, and heavy industries. SGML-based publishing differs from the WYSIWYG paradigm of desktop publishing in that an SGML document contains descriptive, structural markup rather than specific formatting markup. Specific markup describes the appearance of a document and is usually a proprietary code, which makes the document difficult to reuse or interchange with different systems. The structurally generic markup codes in an SGML document allow the fullest exploitation of the information: an SGML document exhibits more reusability than a document created and stored in a proprietary formatting code. In many cases, workflow and production are greatly improved by the implementation of SGML-based publishing, and historical and anecdotal case studies of many applications clearly delineate the benefits of an SGML-based publishing system. Certainly, the boom in Web publishing has spurred interest in enabling publishing systems with multi-output functionality. However, implementation is associated with high costs: the acquisition of new tools and new skills is a costly investment. A careful cost-benefit analysis must determine whether current publishing needs would be satisfied by moving to SGML; increased productivity is the measure by which SGML adoption is judged. The purpose of this thesis project is to investigate the relative benefits and requirements of a simple SGML-based publishing implementation. The graduate thesis required in most programs of the School of Printing Management and Sciences at the Rochester Institute of Technology was used as an example. The author has expanded the requirements for the publication process of a graduate thesis with factors which do not exist in reality: the required output has been expanded from mere print output to include publishing on the World Wide Web (WWW) in the Hypertext Markup Language (HTML), and to a proprietary electronic browser such as Folio Views for inclusion in a searchable collection of graduate theses on CD-ROM. A proposed set of tools and methods is discussed in order to clarify the requirements of such an SGML implementation.
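
    The practical difference between structural and formatting markup is easiest to see in code. Below is a minimal sketch of the DTD idea: a content model stating which children each element may contain, plus a validator (Haskell; the thesis/front/chapter element names are invented for illustration and are not the DTD developed in the thesis):

    ```haskell
    import qualified Data.Map as M

    -- Purely structural markup: an element tree, no formatting.
    data Node = Node String [Node]

    -- A DTD-like content model: which child elements each element
    -- may contain. (Invented element names, illustration only.)
    contentModel :: M.Map String [String]
    contentModel = M.fromList
      [ ("thesis",  ["front", "chapter", "appendix"])
      , ("front",   ["title", "abstract"])
      , ("chapter", ["title", "para"]) ]

    -- A document is valid when every element's children are
    -- allowed by the model; unlisted elements must be leaves.
    valid :: Node -> Bool
    valid (Node name children) =
      case M.lookup name contentModel of
        Nothing      -> null children
        Just allowed ->
          all (\c@(Node n _) -> n `elem` allowed && valid c) children
    ```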

    TEI and LMF crosswalks

    The present paper explores various arguments in favour of making the Text Encoding Initiative (TEI) guidelines an appropriate serialisation for ISO standard 24613:2008 (LMF, Lexical Markup Framework). It also identifies the issues that would have to be resolved in order to reach an appropriate implementation of these ideas, in particular in terms of informational coverage. We show how the customisation facilities offered by the TEI guidelines can provide an adequate background, not only to cover missing components within the current Dictionary chapter of the TEI guidelines, but also to allow specific lexical projects to deal with local constraints. We expect this proposal to be a basis for a future ISO project in the context of the ongoing revision of LMF.
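
    A crosswalk of this kind amounts to a total mapping from one lexical model's entry structure onto the other's serialisation. The sketch below gives the flavour of such a mapping (Haskell; the LmfEntry shape is a drastic simplification of LMF, and the TEI element choice follows the Dictionary chapter only loosely):

    ```haskell
    -- A drastically simplified LMF-style lexical entry (assumed
    -- shape, not the standard's full model).
    data LmfEntry = LmfEntry
      { lemma  :: String     -- written form of the headword
      , pos    :: String     -- part of speech
      , senses :: [String] } -- sense definitions

    -- Serialise the entry using TEI dictionary elements.
    toTei :: LmfEntry -> String
    toTei e = unlines $
      [ "<entry>"
      , "  <form type=\"lemma\"><orth>" ++ lemma e ++ "</orth></form>"
      , "  <gramGrp><pos>" ++ pos e ++ "</pos></gramGrp>" ]
      ++ [ "  <sense><def>" ++ s ++ "</def></sense>" | s <- senses e ]
      ++ [ "</entry>" ]
    ```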

    TBX goes TEI -- Implementing a TBX-Basic extension for the Text Encoding Initiative guidelines

    This paper presents an attempt to customise the TEI (Text Encoding Initiative) guidelines in order to offer the possibility of incorporating TBX (TermBase eXchange) based terminological entries within any kind of TEI document. After presenting the general historical, conceptual and technical contexts, we describe the various design choices we had to make while creating this customisation, which in turn led us to make various changes in the actual TBX serialisation. Keeping in mind the objective of providing the TEI guidelines with, once again, an onomasiological model, we try to identify the best compromise in maintaining both isomorphism with the existing TBX-Basic standard and the characteristics of the TEI framework.