102,733 research outputs found

    Exploiting multimedia content : a machine learning based approach

    Advisors: Prof. M Gopal, Prof. Santanu Chaudhury. Date and location of PhD thesis defense: 10 September 2013, Indian Institute of Technology Delhi.
    This thesis explores the use of machine learning for multimedia content management involving single or multiple features, modalities and concepts. We introduce a shape-based feature for binary patterns and apply it to recognition and retrieval in single-feature and multiple-feature architectures. The multiple-feature recognition and retrieval frameworks are based on the theory of multiple kernel learning (MKL). A binary pattern recognition framework is presented that combines binary MKL classifiers using a decision directed acyclic graph. It is evaluated on Indian script character recognition and MPEG7 shape symbol recognition. A word-image-based document indexing framework is presented using distance-based hashing (DBH) defined on learned pivot centres. We use a new multi-kernel learning scheme based on a Genetic Algorithm to develop a kernel DBH based document image retrieval system. The experimental evaluation is presented on document collections in Devanagari, Bengali and English scripts. Next, methods for document retrieval using multi-modal information fusion are presented. A text/graphics segmentation framework is presented for documents with a complex layout. We present a novel multi-modal document retrieval framework using the segmented regions. The approach is evaluated on English magazine pages. A document script identification framework is presented using decision-level aggregation of page-, paragraph- and word-level predictions. Latent Dirichlet Allocation based topic modelling with a modified edit distance is introduced for the retrieval of documents with recognition inaccuracies. A multi-modal indexing framework for such documents is presented, based on a learned combination of text-based and image-based properties. Experimental results are shown on Devanagari script documents.
    Finally, we investigate concept-based approaches for multimedia analysis. A multi-modal document retrieval framework is presented that combines generative and discriminative modelling to exploit the cross-modal correlation between modalities. The combination is also explored for semantic concept recognition using multi-modal components of the same document, and of different documents over a collection. An experimental evaluation of the framework is shown for semantic event detection in sports videos and for semantic labelling of components of multi-modal document images.
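
The distance-based hashing (DBH) mentioned in the abstract can be illustrated with a minimal sketch. It assumes the basic DBH formulation from the general literature (a pseudo line projection onto a pair of pivots, thresholded into one bit), not the thesis's learned pivot centres or its kernel variant. The names `dist`, `line_projection` and `dbh_bit`, and the thresholds `t1` and `t2`, are hypothetical and chosen only for illustration.

```python
import math


def dist(a, b):
    """Euclidean distance; any distance measure could be substituted."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def line_projection(x, p1, p2, distance=dist):
    """Project x onto the 'line' through pivots p1 and p2 using distances only."""
    d12 = distance(p1, p2)
    return (distance(x, p1) ** 2 + d12 ** 2 - distance(x, p2) ** 2) / (2 * d12)


def dbh_bit(x, p1, p2, t1, t2, distance=dist):
    """One binary hash bit: 1 if the projection falls inside [t1, t2], else 0."""
    return 1 if t1 <= line_projection(x, p1, p2, distance) <= t2 else 0
```

Several such bits, each with its own pivot pair and interval, would be concatenated into a hash key so that similar word images tend to land in the same bucket.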

    CONTENT BASED INFORMATION RETRIEVAL FOR DIGITAL LIBRARY USING DOCUMENT IMAGE

    In recent years, the growing use of mobile devices has created a need to improve the search experience in digital libraries, with applications such as education, location search and product retrieval. Existing approaches simply compare the query to the database images and retrieve the images that match. Search time and response time remain challenging issues in mobile document search. Much previous work on search engines retrieves documents from the database without analyzing the image. The proposed method performs information retrieval for image-based queries automatically, using a mobile document information retrieval framework in which FP-growth finds frequent patterns in the retrieved documents to optimize the results.

    Deep Lesion Graphs in the Wild: Relationship Learning and Organization of Significant Radiology Image Findings in a Diverse Large-scale Lesion Database

    Radiologists in their daily work routinely find and annotate significant abnormalities on a large number of radiology images. Such abnormalities, or lesions, have been collected over the years and stored in hospitals' picture archiving and communication systems. However, they are basically unsorted and lack semantic annotations such as type and location. In this paper, we aim to organize and explore them by learning a deep feature representation for each lesion. A large-scale and comprehensive dataset, DeepLesion, is introduced for this task. DeepLesion contains bounding boxes and size measurements of over 32K lesions. To model their similarity relationships, we leverage multiple sources of supervision, including types, self-supervised location coordinates and sizes. These require little manual annotation effort but describe useful attributes of the lesions. A triplet network is then used to learn lesion embeddings, with a sequential sampling strategy to capture their hierarchical similarity structure. Experiments show promising qualitative and quantitative results on lesion retrieval, clustering, and classification. The learned embeddings can be further employed to build a lesion graph for various clinically useful applications. We propose algorithms for intra-patient lesion matching and missing annotation mining. Experimental results validate their effectiveness. Comment: Accepted by CVPR2018. DeepLesion url adde
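
As a rough illustration of what a triplet network optimizes, the sketch below computes the standard hinge-style triplet loss on plain embedding vectors. It is an assumption-laden simplification: it does not reproduce the paper's sequential sampling strategy or its network, and the function names and the default `margin` value are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.1):
    """Zero once the positive is closer to the anchor than the negative by at least `margin`."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)
```

Minimizing this quantity over many triplets pulls similar lesions together in the embedding space and pushes dissimilar ones apart.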

    A lightweight web video model with content and context descriptions for integration with linked data

    The rapid increase of video data on the Web has created an urgent need for effective representation, management and retrieval of web videos. Recently, many studies have addressed the ontological representation of videos, using either domain-dependent or generic schemas such as MPEG-7, MPEG-4, and COMM. In spite of their extensive coverage and sound theoretical grounding, these schemas are not yet widely used. Two likely reasons are the complexities involved and a lack of tool support. We propose a lightweight video content model for content-context description and integration. The uniqueness of the model is that it tries to model the emerging social context to describe and interpret the video. Our approach is grounded on exploiting easily extractable, evolving contextual metadata and on the availability of existing data on the Web. This enables representational homogeneity and a firm basis for information integration among semantically-enabled data sources. The model uses many existing schemas to describe various ontology classes and shows the scope for interlinking with the Linked Data cloud.