1,087 research outputs found

    Conferentie informatiewetenschap 1999 : Centrum voor Wiskunde en Informatica, 12 november 1999 : proceedings

    Web Page Prediction for Web Personalization: A Review

    This paper presents a survey of Web Page Ranking for web personalization. Web page prefetching has been widely used to reduce access latency on the Internet. However, if most prefetched web pages are not visited by users in their subsequent accesses, the limited network bandwidth and server resources are not used efficiently, which may worsen the access delay problem. Therefore, an accurate prediction method is critical during prefetching. Techniques such as Markov models have been widely used to represent and analyze a user's navigational behavior (usage data) in the Web graph, using the transition probabilities between web pages as recorded in the web logs. The recorded users' navigation is used to extract popular web paths and predict current users' next steps.
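
As a rough illustration of the Markov-model idea mentioned in this abstract, the sketch below estimates first-order transition probabilities between pages from logged sessions and uses them to rank likely next pages. It is not taken from the surveyed paper: the function names `build_transitions` and `predict_next` and the sample sessions are hypothetical, and a first-order model is assumed for simplicity.

```python
from collections import Counter, defaultdict


def build_transitions(sessions):
    """Count page-to-page transitions over all logged sessions."""
    counts = defaultdict(Counter)
    for pages in sessions:
        # Pair each page with the page visited immediately after it.
        for current, nxt in zip(pages, pages[1:]):
            counts[current][nxt] += 1
    return counts


def predict_next(counts, page, k=1):
    """Return up to k (next_page, probability) pairs, most likely first."""
    followers = counts.get(page)
    if not followers:
        return []  # page never seen as a transition source
    total = sum(followers.values())
    return [(nxt, n / total) for nxt, n in followers.most_common(k)]


# Hypothetical sessions, invented for illustration only.
sessions = [
    ["/home", "/products", "/cart"],
    ["/home", "/products", "/reviews"],
    ["/home", "/about"],
    ["/products", "/cart"],
]
counts = build_transitions(sessions)
best = predict_next(counts, "/products", k=1)
```

A prefetcher built on this sketch would fetch a candidate page only when its estimated probability exceeds some threshold, which is one way to limit the wasted bandwidth the abstract warns about.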

    Content Recognition and Context Modeling for Document Analysis and Retrieval

    The nature and scope of available documents are changing significantly in many areas of document analysis and retrieval as complex, heterogeneous collections become accessible to virtually everyone via the web. The increasing level of diversity presents a great challenge for document image content categorization, indexing, and retrieval. Meanwhile, the processing of documents with unconstrained layouts and complex formatting often requires effective leveraging of broad contextual knowledge. In this dissertation, we first present a novel approach for document image content categorization, using a lexicon of shape features. Each lexical word corresponds to a scale and rotation invariant local shape feature that is generic enough to be detected repeatably and is segmentation free. A concise, structurally indexed shape lexicon is learned by clustering and partitioning feature types through graph cuts. Our idea finds successful application in several challenging tasks, including content recognition of diverse web images and language identification on documents composed of mixed machine printed text and handwriting. Second, we address two fundamental problems in signature-based document image retrieval. Facing continually increasing volumes of documents, detecting and recognizing unique, evidentiary visual entities (e.g., signatures and logos) provides a practical and reliable supplement to the OCR recognition of printed text. We propose a novel multi-scale framework to detect and segment signatures jointly from document images, based on the structural saliency under a signature production model. We formulate the problem of signature retrieval in the unconstrained setting of geometry-invariant deformable shape matching and demonstrate state-of-the-art performance in signature matching and verification. Third, we present a model-based approach for extracting relevant named entities from unstructured documents. In a wide range of applications that require structured information from diverse, unstructured document images, processing OCR text does not give satisfactory results due to the absence of linguistic context. Our approach enables learning of inference rules collectively based on contextual information from both page layout and text features. Finally, we demonstrate the importance of mining general web user behavior data for improving document ranking and other web search experience. The context of web user activities reveals their preferences and intents, and we emphasize the analysis of individual user sessions for creating aggregate models. We introduce a novel algorithm for estimating web page and web site importance, and discuss its theoretical foundation based on an intentional surfer model. We demonstrate that our approach significantly improves large-scale document retrieval performance.

    Adaptive hypertext and hypermedia : proceedings of the 2nd workshop, Pittsburgh, Pa., June 20-24, 1998

    Redefining the Hyperlink

    DIR 2011: Dutch-Belgian Information Retrieval Workshop Amsterdam

    Just-in-time hypermedia

    Many analytical applications, especially legacy systems, create documents and display screens in response to user queries dynamically or in real time. These documents and displays do not exist in advance, and thus hypermedia must be generated 'just in time', automatically and dynamically. This dissertation details the idea of 'just-in-time' hypermedia and discusses challenges encountered in this research area. A fully detailed literature review about the research issues and related research work is given. A framework for 'just-in-time' hypermedia compares virtual documents with static documents, as well as dynamic with static hypermedia functionality. A conceptual 'just-in-time' hypermedia architecture is proposed in terms of requirements and logical components. The 'just-in-time' hypermedia engine is described in terms of architecture, functional components, information flow, and implementation details. Then test results are described and evaluated. Lastly, contributions, limitations, and future work are discussed.