71 research outputs found

    Indexing of Reading Paths for a Structured Information Retrieval on the Web

    In this paper, we present a hyperdocument model that takes into account the essential aspects of information on the Web: content, composition (logical structure) and nonlinear reading (hypertext structure). We have developed a Structured Information Retrieval System (SIRS) based on this model. Its indexing and querying phases are based on a “reading paths” view of the Web: a Web site is considered as a set of potential reading paths, instead of a set of atomic and flat pages. We have developed a specific algorithm to index the reading paths. We present some experiments that evaluate the usefulness of our reading-path indexing process.

    The Use of Latent Semantic Indexing to Mitigate OCR Effects of Related Document Images

    Due to both the widespread and multipurpose use of document images and the current availability of a large number of document image repositories, robust information retrieval mechanisms and systems are increasingly in demand. This paper presents an approach to support the automatic generation of relationships among document images by exploiting Latent Semantic Indexing (LSI) and Optical Character Recognition (OCR). We developed the LinkDI (Linking of Document Images) service, which extracts and indexes document image content, computes its latent semantics, and defines relationships among images as hyperlinks. LinkDI was tested on document image repositories, and its performance was evaluated by comparing the quality of the relationships created among textual documents with that of the relationships created among their respective document images. Using those same document images, we ran further experiments to compare the performance of LinkDI with and without the LSI technique. Experimental results showed that LSI can mitigate the effects of common OCR misrecognition, which supports the feasibility of using LinkDI to relate heavily degraded OCR output. Funding: CNPq [557976/2008-1]; FAPESP [05/60038-5], [05/60729-8], [06/58984-2], [09/14292-8], [2009/05504-1]; Spanish Ministerio de Ciencia e Innovacion [TIN2008-06566-C04-04]; FEDER; Xunta de Galicia [07SIN005206PR]; Innolution Sistemas de Informatic

    Integrated content presentation for multilingual and multimedia information access

    For multilingual and multimedia information retrieval from multiple, potentially distributed collections, generating the output in the form of standard ranked lists may often mean that a user has to explore the contents of many lists before finding sufficient relevant or linguistically accessible material to satisfy their information need. In some situations, delivering an integrated multilingual multimedia presentation could enable the user to explore a topic, allowing them to select from a range of available content based on suitably chosen displayed metadata. A presentation of this type has similarities with the outputs of existing adaptive hypermedia systems. However, such systems are generated based on “closed” content with sophisticated user and domain models. Extending them to “open” domain information retrieval applications would raise many issues. We present an outline exploration of what will form a challenging new direction for research in multilingual information access.

    A Solid Surface from which to Explore: The Executive Summary as the Frontispiece of a Technical Hyperdocument

    This research analyzes the purpose and organization of a well-written executive summary, reviews the characteristics of hypermedia, and compares the strengths and weaknesses of a typical executive summary published originally on paper and now under consideration for transition to hypermedia, in order to determine which elements of an executive summary are essential for an effective transition to hypermedia. The executive summary of a technical report is a stand-alone description of what was done, how it was done, what the results were, why they matter, and where further information is found in the report body. Executive summaries have a wider audience than the entire report and thus are written in non-technical language. To use the features of hypermedia (linked text, sound, animation, pictures) to create an effective online executive summary, several crucial executive summary elements must be in place in order for it to serve as a solid hyperdocument anchor node: what was done, how it was done, what the results were, and why they matter. Traditional elements such as implications and an orientation to the body of the report need not be included in an online executive summary.

    DYNAMIC HYPERTEXT SYNTHESIS FOR INFORMATION RETRIEVAL

    Hypertext navigation alone is insufficient for efficient Information Retrieval (IR). Previous attempts to combine IR techniques with hypertext have been confined to the pre-authored structure of a document. In this paper we extend computer-science methods to synthesize a tailor-made hypertext document in response to each user's query. The synthesis technique can also be used to automatically create a pre-authored hypertext document according to an author's specifications. Information Systems Working Papers Serie

    "Scholarly Hypertext: Self-Represented Complexity"

    Scholarly hypertexts involve argument and explicit self-questioning, and can be distinguished from both informational and literary hypertexts. After making these distinctions, the essay presents general principles about attention, some suggestions for self-representational multi-level structures that would enhance scholarly inquiry, and a wish list of software capabilities to support such structures. The essay concludes with a discussion of possible conflicts between scholarly inquiry and hypertext.

    The guiding process in discovery hypertext learning environments for the Internet

    Hypertext is the dominant method of navigating the Internet, giving users freedom and control over their navigational behaviour. Existing educational material is increasingly being converted into Internet web pages, but weaknesses have been identified in current WWW learning systems: a lack of conceptual support for learning from hypertext, navigational disorientation and cognitive overload. This implies the need for an established pedagogical approach to developing the web as a teaching and learning medium. Guided Discovery Learning is proposed as an educational pedagogy suitable for supporting WWW learning. The first hypothesis is that a guided discovery environment will produce greater gains in learning and satisfaction than a non-adaptive hypertext environment. A second hypothesis is that combining concept maps with this specific educational paradigm will provide cognitive support. The third hypothesis is that student learning styles will not influence learning outcome or user satisfaction, which would provide evidence that the guided discovery learning paradigm can be used for many types of learning styles. This was investigated by building a guided discovery system and devising a framework for assessing teaching styles. The system provided varying discovery steps, guided advice, individualistic system instruction and navigational control. An 84-subject experiment compared a Guided discovery condition, a Map-only condition and an Unguided condition. Subjects were subdivided according to learning styles, with measures for learning outcome and user satisfaction. The results indicate that providing guidance results in a significant increase in level of learning. Guided discovery condition subjects, regardless of learning styles, experienced levels of satisfaction comparable to those in the other conditions. The concept mapping tool did not appear to affect learning outcome or user satisfaction.
    The conclusion was that using a particular approach to guidance would result in a more supportive environment for learning. This research contributes to the need for a better understanding of the pedagogic design that should be incorporated into WWW learning environments, with a recommendation for a guided discovery approach to alleviate major hypertext and WWW issues for distance learning.

    Conferentie informatiewetenschap 1999 : Centrum voor Wiskunde en Informatica, 12 november 1999 : proceedings

