50 research outputs found

    Using Speech Input for Image Interpretation, Annotation, and Retrieval

    This research explores the interaction of textual and photographic information in an integrated text/image database environment. Specifically, three different applications involving the exploitation of linguistic context in vision are presented. Linguistic context is qualitative in nature and is obtained dynamically. By understanding text accompanying images or video, we are able to extract information useful in retrieving the picture and directing an image interpretation system to identify relevant objects (e.g., faces) in the picture. The latter constitutes a powerful technique for automatically indexing images. A multistage system, PICTION, which uses captions to identify human faces in an accompanying photograph, has been developed. We discuss the use of PICTION's output in content-based retrieval of images to satisfy focus of attention in queries. The design and implementation of a system called Show&Tell, a multimedia system for semi-automated image annotation, is discussed. This system, which combines advances in speech recognition, natural language processing (NLP), and image understanding (IU), is designed to assist in image annotation and to enhance image retrieval capabilities. An extension of this work to video annotation and retrieval is also presented.

    Exploiting Multimodal Context in Image Retrieval


    Use of lexical and syntactic techniques in recognizing handwritten text

    The output of handwritten word recognizers (WR) tends to be very noisy due to various factors. In order to compensate for this behaviour, several choices of the WR must be initially considered. In the case of handwritten sentence/phrase recognition, linguistic constraints may be applied in order to improve the results of the WR. This paper discusses two statistical methods of applying linguistic constraints to the output of a WR on input consisting of sentences/phrases. The first is based on collocations and can be used to promote lower ranked word choices or to propose new words. The second is a Markov model of syntax and is based on syntactic categories (tags) associated with words. In each case, we show the improvement in the word recognition rate as a result of applying these constraints.

    Use of Language Models in Handwriting Recognition

    Language models have been extensively used in natural language applications such as speech recognition, part-of-speech tagging, information extraction, etc. To a lesser extent, the value of language models in text recognition has also been proved, e.g., in the recognition of poor quality printed text and the recognition of extended handwriting. This survey describes how linguistic context, particularly probabilistic language models, is used in the recognition of handwritten text. The survey begins with how two handwriting recognition techniques, segmentation-free and segmentation-based, are integrated with language models in the recognition process. Next, language models at the word level in the post-processing step to improve the recognition results, and at the character level for handwriting recognition and correction of recognition results, are described. Finally, syntax-based techniques like lexical analysis using collocations, syntactic (n-gram) analysis using part-of-speech (POS) tags, and a hybrid syntactic technique comprising both a statistical and an analytical component are described. Language modeling has been found to be very helpful for all natural language applications. Language models have been seen to improve the performance of these applications by 25-50% when the text used in training is representative of that for which the model is intended.