12 research outputs found

    Numérisation des manuscrits médiévaux : le projet européen BAMBI

    Get PDF
    Colloque: Vers une nouvelle érudition : numérisation et recherche en histoire du livre, Rencontres Jacques Cartier, Lyon, décembre 1999

    Text Line Segmentation of Historical Documents: a Survey

    Full text link
    There is a huge amount of historical documents in libraries and in various National Archives that have not been exploited electronically. Although automatic reading of complete pages remains, in most cases, a long-term objective, tasks such as word spotting, text/image alignment, authentication and extraction of specific fields are in use today. For all these tasks, a major step is document segmentation into text lines. Because of the low quality and the complexity of these documents (background noise, artifacts due to aging, interfering lines),automatic text line segmentation remains an open research field. The objective of this paper is to present a survey of existing methods, developed during the last decade, and dedicated to documents of historical interest.Comment: 25 pages, submitted version, To appear in International Journal on Document Analysis and Recognition, On line version available at http://www.springerlink.com/content/k2813176280456k3

    Vers une nouvelle érudition : numérisation et recherche en histoire du livre

    Get PDF
    En décembre 1999, à l\u27Enssib, s’est déroulé le colloque "Vers une nouvelle érudition : numérisation et recherche en histoire du livre", organisé dans le cadre des 12e Entretiens du Centre Jacques Cartier sous la responsabilité de Dominique Varry (enssib), Annie Charon (école nationale des chartes) et Guylaine Baudry (Université de Montréal)

    The Philological Workstation BAMBI (Better Access to Manuscripts and Browsing of Images)

    No full text
    This paper presents the results of the European project LIB-3114 for Digital Libraries called BAMBI (Better Access to Manuscripts and Browsing of Images). The project has produced a hypermedia system allowing historians, and more particularly codicologists and philologists, to read manuscripts, transcribe manuscripts, write annotations, and navigate between the words of the transcription and the matching piece of image in the numerized picture of the manuscript. After an introduction on the objectives of the project and the related works, the second part is devoted to the description of the functions and the design of the Philological Workstation. The third part describes how the international standard HyTime (Hypermedia/Time-based Structured Language) has been used as a modelling language to describe works on manuscripts (description, transcription, annotations, links, ...). Finally, the architecture of the BAMBI workstation is presented

    Analyse d’images de documents patrimoniaux : une approche structurelle à base de texture

    Get PDF
    Over the last few years, there has been tremendous growth in digitizing collections of cultural heritage documents. Thus, many challenges and open issues have been raised, such as information retrieval in digital libraries or analyzing page content of historical books. Recently, an important need has emerged which consists in designing a computer-aided characterization and categorization tool, able to index or group historical digitized book pages according to several criteria, mainly the layout structure and/or typographic/graphical characteristics of the historical document image content. Thus, the work conducted in this thesis presents an automatic approach for characterization and categorization of historical book pages. The proposed approach is applicable to a large variety of ancient books. In addition, it does not assume a priori knowledge regarding document image layout and content. It is based on the use of texture and graph algorithms to provide a rich and holistic description of the layout and content of the analyzed book pages to characterize and categorize historical book pages. The categorization is based on the characterization of the digitized page content by texture, shape, geometric and topological descriptors. This characterization is represented by a structural signature. More precisely, the signature-based characterization approach consists of two main stages. The first stage is extracting homogeneous regions. Then, the second one is proposing a graph-based page signature which is based on the extracted homogeneous regions, reflecting its layout and content. Afterwards, by comparing the different obtained graph-based signatures using a graph-matching paradigm, the similarities of digitized historical book page layout and/or content can be deduced. Subsequently, book pages with similar layout and/or content can be categorized and grouped, and a table of contents/summary of the analyzed digitized historical book can be provided automatically. As a consequence, numerous signature-based applications (e.g. information retrieval in digital libraries according to several criteria, page categorization) can be implemented for managing effectively a corpus or collections of books. To illustrate the effectiveness of the proposed page signature, a detailed experimental evaluation has been conducted in this work for assessing two possible categorization applications, unsupervised page classification and page stream segmentation. In addition, the different steps of the proposed approach have been evaluated on a large variety of historical document images.Les récents progrès dans la numérisation des collections de documents patrimoniaux ont ravivé de nouveaux défis afin de garantir une conservation durable et de fournir un accès plus large aux documents anciens. En parallèle de la recherche d'information dans les bibliothèques numériques ou l'analyse du contenu des pages numérisées dans les ouvrages anciens, la caractérisation et la catégorisation des pages d'ouvrages anciens a connu récemment un regain d'intérêt. Les efforts se concentrent autant sur le développement d'outils rapides et automatiques de caractérisation et catégorisation des pages d'ouvrages anciens, capables de classer les pages d'un ouvrage numérisé en fonction de plusieurs critères, notamment la structure des mises en page et/ou les caractéristiques typographiques/graphiques du contenu de ces pages. Ainsi, dans le cadre de cette thèse, nous proposons une approche permettant la caractérisation et la catégorisation automatiques des pages d'un ouvrage ancien. L'approche proposée se veut indépendante de la structure et du contenu de l'ouvrage analysé. Le principal avantage de ce travail réside dans le fait que l'approche s'affranchit des connaissances préalables, que ce soit concernant le contenu du document ou sa structure. Elle est basée sur une analyse des descripteurs de texture et une représentation structurelle en graphe afin de fournir une description riche permettant une catégorisation à partir du contenu graphique (capturé par la texture) et des mises en page (représentées par des graphes). En effet, cette catégorisation s'appuie sur la caractérisation du contenu de la page numérisée à l'aide d'une analyse des descripteurs de texture, de forme, géométriques et topologiques. Cette caractérisation est définie à l'aide d'une représentation structurelle. Dans le détail, l'approche de catégorisation se décompose en deux étapes principales successives. La première consiste à extraire des régions homogènes. La seconde vise à proposer une signature structurelle à base de texture, sous la forme d'un graphe, construite à partir des régions homogènes extraites et reflétant la structure de la page analysée. Cette signature assure la mise en œuvre de nombreuses applications pour gérer efficacement un corpus ou des collections de livres patrimoniaux (par exemple, la recherche d'information dans les bibliothèques numériques en fonction de plusieurs critères, ou la catégorisation des pages d'un même ouvrage). En comparant les différentes signatures structurelles par le biais de la distance d'édition entre graphes, les similitudes entre les pages d'un même ouvrage en termes de leurs mises en page et/ou contenus peuvent être déduites. Ainsi de suite, les pages ayant des mises en page et/ou contenus similaires peuvent être catégorisées, et un résumé/une table des matières de l'ouvrage analysé peut être alors généré automatiquement. Pour illustrer l'efficacité de la signature proposée, une étude expérimentale détaillée a été menée dans ce travail pour évaluer deux applications possibles de catégorisation de pages d'un même ouvrage, la classification non supervisée de pages et la segmentation de flux de pages d'un même ouvrage. En outre, les différentes étapes de l'approche proposée ont donné lieu à des évaluations par le biais d'expérimentations menées sur un large corpus de documents patrimoniaux

    Screen Space Reconfigured

    Get PDF
    Screen Space Reconfigured is the first edited volume that critically and theoretically examines the many novel renderings of space brought to us by 21st century screens. Exploring key cases such as post-perspectival space, 3D, vertical framing, haptics, and layering, this volume takes stock of emerging forms of screen space and spatialities as they move from the margins to the centre of contemporary media practice.Recent years have seen a marked scholarly interest in spatial dimensions and conceptions of moving image culture, with some theorists claiming that a 'spatial turn' has taken place in media studies and screen practices alike. Yet this is the first book-length study dedicated to on-screen spatiality as such.Spanning mainstream cinema, experimental film, video art, mobile screens, and stadium entertainment, the volume includes contributions from such acclaimed authors as Giuliana Bruno and Tom Gunning as well as a younger generation of scholars

    A Holmes and Doyle Bibliography, Volume 6: Periodical Articles, Subject Listing, By De Waal Category

    Get PDF
    This bibliography is a work in progress. It attempts to update Ronald B. De Waal’s comprehensive bibliography, The Universal Sherlock Holmes, but does not claim to be exhaustive in content. New works are continually discovered and added to this bibliography. Readers and researchers are invited to suggest additional content. Volume 6 presents the periodical literature arranged by subject categories (as originally devised for the De Waal bibliography and slightly modified here)

    A Holmes and Doyle Bibliography, Volume 9: All Formats—Combined Alphabetical Listing

    Get PDF
    This bibliography is a work in progress. It attempts to update Ronald B. De Waal’s comprehensive bibliography, The Universal Sherlock Holmes, but does not claim to be exhaustive in content. New works are continually discovered and added to this bibliography. Readers and researchers are invited to suggest additional content. This volume contains all listings in all formats, arranged alphabetically by author or main entry. In other words, it combines the listings from Volume 1 (Monograph and Serial Titles), Volume 3 (Periodical Articles), and Volume 7 (Audio/Visual Materials) into a comprehensive bibliography. (There may be additional materials included in this list, e.g. duplicate items and items not yet fully edited.) As in the other volumes, coverage of this material begins around 1994, the final year covered by De Waal's bibliography, but may not yet be totally up-to-date (given the ongoing nature of this bibliography). It is hoped that other titles will be added at a later date. At present, this bibliography includes 12,594 items
    corecore