
    Optimization of the search engine ElasticSearch

    This thesis presents work done in the Search on Demand team at Orange. It covers the optimization of the search engine Elasticsearch, the ways to bring data into it by means of an ETL, and how relevance can be tuned using Lucene's inverted indices.

    Indexation, memory, power and representations at the beginning of the 12th century: The rediscovery of pages from the tables to the "Liber de honoribus", the first cartulary of the collegiate church of St. Julian of Auvergne (Brioude)

    Publication of the oldest known alphabetical "back-of-the-book" index, compiled at the beginning of the 12th century for the Great Cartulary of Saint-Julien de Brioude, in Auvergne, in order to rank and organize the geographical entries for the principal landed properties mentioned in its charters.

    A web information system for the management and the dissemination of Cultural Heritage data.

    Safeguarding and exploiting Cultural Heritage produces large amounts of heterogeneous data. Managing these data is essential to using and disseminating the information gathered in the field. Previously, data handling was a manual task carried out with efficient, well-tried methods. With the growth of computer science, other methods have been developed for the digital preservation and processing of Cultural Heritage information. Developing computerized data management systems to store and make use of archaeological datasets is therefore a significant task today. For sites that were excavated and studied without computerized means in particular, it is now necessary to digitize all the data produced. This preserves the information digitally (in addition to the paper documents) and offers new possibilities for exploitation, such as the immediate linking of different kinds of data for analysis, or the digital documentation of the site for its enhancement. Geographical Information Systems have proven their potential in this area, but they are not always suited to managing features at the scale of a particular archaeological site. This paper therefore presents the development of a Virtual Research Environment dedicated to the exploitation of intra-site Cultural Heritage data. The Information System produced is based on open-source software modules designed for the Internet, so users can avoid being software driven and can record and consult data from different computers. The system supports exploratory analyses of the data, especially at spatial and temporal levels. It is compatible with every kind of Cultural Heritage site and allows the management of diverse types of data. Some experiments have been carried out on sites managed by the Service of the National Sites and Monuments of Luxembourg.

    An analysis of the use of graphics for information retrieval

    Several research groups have addressed the problem of retrieving vector graphics. This work has, however, either focused on domain-dependent areas or been based on very simple graphics languages. Here we take a fresh look at the issue of graphics retrieval in general, and in particular at the tasks which retrieval systems must support. The paper presents a series of case studies which explored the needs of professionals, in the hope that these needs can help direct future graphics IR research. Suggested modelling techniques for some of the graphic collections are also presented.

    Archimorphosis


    Enriching Historical Manuscripts: The Bovary Project

    In this paper we describe the Bovary Project, a project to digitize the manuscripts of the first great work of the famous French writer Gustave Flaubert, which should end in 2006 by providing online access to a hypertext edition of the set of drafts of "Madame Bovary". We first describe the global context of this project and its main objectives, and then focus particularly on the document analysis problem. Finally, we propose a new approach for the segmentation of handwritten documents.

    A framework for supporting knowledge representation – an ontological based approach

    Dissertation submitted for the degree of Master in Electrical and Computer Engineering. The World Wide Web has had a tremendous impact on society and business in just a few years by making information instantly available. During this transition from physical to electronic means for information transport, the content and encoding of information has remained natural language and is only identified by its URL. Today, this is perhaps the most significant obstacle to streamlining business processes via the web. In order that processes may execute without human intervention, knowledge sources, such as documents, must become more machine understandable and must contain other information besides their main contents and URLs. The Semantic Web is a vision of a future web of machine-understandable data. On a machine-understandable web, it will be possible for programs to easily determine what knowledge sources are about. This work introduces a conceptual framework and its implementation to support the classification and discovery of knowledge sources, supported by the above vision, where such sources’ information is structured and represented through a mathematical vector that semantically pinpoints the relevance of those knowledge sources within the domain of interest of each user. The presented work also addresses the enrichment of such knowledge representations, using the statistical relevance of keywords based on the classical vector space model concept, and extending it with ontological support, by using concepts and semantic relations, contained in a domain-specific ontology, to enrich knowledge sources’ semantic vectors. Semantic vectors are compared against each other, in order to obtain the similarity between them, and better support end users with knowledge source retrieval capabilities.
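
The abstract above does not give the dissertation's formulas, so the following is only a minimal sketch of the general idea it describes: keyword weights from the classical vector space model, a simple ontology-based enrichment step, and a comparison between the resulting vectors. The TF-IDF weighting, the cosine measure, the `related` mapping, the `factor` parameter and all function names are assumptions made for illustration, not the author's method.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Build one sparse TF-IDF vector (term -> weight) per document.

    docs maps a document name to its list of tokens.
    The IDF is smoothed with +1 so terms found in every document keep a weight.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))
    vectors = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        vectors[name] = {
            term: (count / len(tokens)) * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term, count in counts.items()
        }
    return vectors


def enrich(vector, related, factor=0.5):
    """Hypothetical enrichment: pass a fraction of each term's weight
    to the ontology concepts listed for it in `related`."""
    enriched = dict(vector)
    for term, weight in vector.items():
        for concept in related.get(term, ()):
            enriched[concept] = enriched.get(concept, 0.0) + factor * weight
    return enriched


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy knowledge sources and a toy ontology (both invented for this sketch).
docs = {
    "a": ["pump", "maintenance", "schedule"],
    "b": ["valve", "maintenance", "report"],
}
related = {"pump": ["equipment"], "valve": ["equipment"]}

vectors = tf_idf_vectors(docs)
plain = cosine(vectors["a"], vectors["b"])
enriched = cosine(enrich(vectors["a"], related), enrich(vectors["b"], related))
```

In this toy setting the two sources share only one keyword, but both mention terms that the assumed ontology links to the same concept, so the enriched vectors come out more similar than the plain keyword vectors.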