
    Visual Information Retrieval in Endoscopic Video Archives

    In endoscopic procedures, surgeons work with live video streams from the inside of their subjects. A main source of documentation for these procedures is still frames from the video, identified and captured during the surgery. However, with growing demands and technical means, the streams are saved to storage servers and surgeons need to retrieve parts of the videos on demand. In this submission we present a demo application for video retrieval based on visual features and late fusion, which allows surgeons to re-find shots taken during the procedure. Comment: Paper accepted at the IEEE/ACM 13th International Workshop on Content-Based Multimedia Indexing (CBMI) in Prague (Czech Republic), 10-12 June 2015
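    As a rough illustration of the late-fusion idea described above (a sketch, not the authors' implementation), the snippet below combines per-feature similarity scores for a set of shots by a weighted sum; the feature names, weights and scores are made-up placeholders.

    import numpy as np

    def late_fusion(score_lists, weights):
        """Combine per-feature score arrays (one score per shot) by weighted sum."""
        fused = np.zeros_like(np.asarray(score_lists[0], dtype=float))
        for s, w in zip(score_lists, weights):
            s = np.asarray(s, dtype=float)
            rng = s.max() - s.min()
            # normalise each feature's scores to [0, 1] so the weights are comparable
            fused += w * ((s - s.min()) / rng if rng > 0 else s)
        return fused

    # hypothetical scores for 5 shots from a colour-histogram and an edge descriptor
    color_scores = [0.2, 0.9, 0.4, 0.7, 0.1]
    edge_scores = [0.3, 0.8, 0.9, 0.2, 0.5]
    fused = late_fusion([color_scores, edge_scores], weights=[0.6, 0.4])
    print(np.argsort(fused)[::-1])  # shot indices ranked best first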

    Review of Microscopic Image Processing techniques towards Malaria Infected Erythrocyte Detection from Thin Blood Smears

    Microscopic examination of stained blood smears has traditionally been the gold-standard test for diagnosing malaria. The process mainly entails preparing a blood smear on a glass slide, staining the blood and examining it under a microscope to observe parasites of the genus Plasmodium. Although several other kinds of diagnostic tests are available and can be adopted, microscopic analysis still suffers from a number of shortcomings. Presently, treatment is largely guided by symptoms, and a false negative can be fatal or lead to further complications. Malaria continues to cause a large number of deaths, so there is a dire need for early detection of malarial infection. This manuscript provides a review of current contributions regarding computer-aided strategies and microscopic image processing techniques for the detection of malaria, discussed on the basis of the contemporary literature.
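    As a purely illustrative sketch of one basic image processing step covered by such reviews (not taken from any specific surveyed method), the following segments erythrocyte candidates from a thin-smear image using Otsu thresholding and connected components; the file name and area range are assumptions.

    from skimage import io, color, filters, measure

    image = io.imread("thin_smear.png")              # hypothetical input image
    gray = color.rgb2gray(image)
    mask = gray < filters.threshold_otsu(gray)       # stained cells are darker than background
    labels = measure.label(mask)

    # keep components whose pixel area is plausible for a red blood cell (range is illustrative)
    cells = [r for r in measure.regionprops(labels) if 200 < r.area < 5000]
    print(f"candidate erythrocytes found: {len(cells)}")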

    An Extreme Learning Machine-Relevance Feedback Framework for Enhancing the Accuracy of a Hybrid Image Retrieval System

    The process of searching, indexing and retrieving images from a massive database is a challenging task, and the solution to these problems is an efficient image retrieval system. In this paper, a unique hybrid content-based image retrieval system is proposed in which different attributes of an image, such as texture, color and shape, are extracted using the gray-level co-occurrence matrix (GLCM), color moments and various region-property (regionprops) procedures, respectively. A hybrid feature vector (HFV) is formed by integrating the feature vectors of the three individual visual attributes. This HFV is given as input to an Extreme Learning Machine (ELM) classifier, a feed-forward neural network with a single hidden layer of neurons. The ELM performs efficient class prediction of the query image based on the pre-trained data. Lastly, to capture high-level human semantic information, relevance feedback (RF) is utilized to retrain or reformulate the training of the ELM. The advantage of the proposed system is that the combined ELM-RF framework yields a modified learning and intelligent classification system. To measure the efficiency of the proposed system, parameters such as precision, recall and accuracy are evaluated. Average precision of 93.05%, 81.03%, 75.8% and 90.14% is obtained on the Corel-1K, Corel-5K, Corel-10K and GHIM-10K benchmark datasets, respectively. The experimental analysis shows that the implemented technique outperforms many state-of-the-art approaches based on varied hybrid CBIR systems.
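    A hedged sketch of how a hybrid feature vector of this kind might be assembled follows: GLCM texture statistics, per-channel colour moments and simple region-property shape measures concatenated into one vector. The GLCM parameters, threshold and file name are assumptions for illustration, not the paper's settings.

    import numpy as np
    from scipy.stats import skew
    from skimage import io, color, filters, measure
    from skimage.feature import graycomatrix, graycoprops

    def hybrid_feature_vector(path):
        img = io.imread(path)
        gray = (color.rgb2gray(img) * 255).astype(np.uint8)

        # texture: GLCM contrast, correlation, energy and homogeneity
        glcm = graycomatrix(gray, distances=[1], angles=[0], levels=256, symmetric=True, normed=True)
        texture = [graycoprops(glcm, p)[0, 0] for p in ("contrast", "correlation", "energy", "homogeneity")]

        # colour: first three moments of each RGB channel
        chans = img.reshape(-1, img.shape[-1]).astype(float)
        colour = np.concatenate([chans.mean(0), chans.std(0), skew(chans, axis=0)])

        # shape: properties of the largest thresholded region
        regions = measure.regionprops(measure.label(gray > filters.threshold_otsu(gray)))
        largest = max(regions, key=lambda r: r.area)
        shape = [largest.area, largest.eccentricity, largest.solidity]

        return np.concatenate([texture, colour, shape])

    print(hybrid_feature_vector("query.jpg").shape)   # hypothetical query image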

    Towards Learning Representations in Visual Computing Tasks

    The performance of most visual computing tasks depends on the quality of the features extracted from the raw data. An insightful feature representation increases the performance of many learning algorithms by exposing the underlying explanatory factors of the output for the unobserved input. A good representation should also handle anomalies in the data, such as missing samples and noisy input caused by undesired external factors of variation, and should reduce data redundancy. Over the years, many feature extraction processes have been invented to produce good representations of raw images and videos. These processes can be categorized into three groups. The first group contains processes that are hand-crafted for a specific task. Hand-engineering features requires the knowledge of domain experts and manual labor, but the resulting feature extraction process is interpretable and explainable. The next group contains latent-feature extraction processes. While the original features lie in a high-dimensional space, the relevant factors for a task often lie on a lower-dimensional manifold. Latent-feature extraction employs hidden variables to expose underlying data properties that cannot be directly measured from the input, and imposes a specific structure, such as sparsity or low rank, on the derived representation through sophisticated optimization techniques. The last category is that of deep features, obtained by passing raw input data with minimal pre-processing through a deep network whose parameters are computed by iteratively minimizing a task-based loss. In this dissertation, I present four pieces of work in which I create and learn suitable data representations. The first task employs hand-crafted features to perform clinically relevant retrieval of diabetic retinopathy images. The second task uses latent features to perform content-adaptive image enhancement. The third task ranks a pair of images based on their aestheticism. The goal of the last task is to capture localized image artifacts in small datasets with patch-level labels. For both of these tasks, I propose novel deep architectures and show significant improvement over previous state-of-the-art approaches. A suitable combination of feature representations, augmented with an appropriate learning approach, can increase performance for most visual computing tasks.
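    As a minimal sketch of the third category mentioned above, deep features, the snippet below passes an image with standard pre-processing through a pretrained network and keeps the penultimate activations as the representation; the ResNet-18 backbone and file name are assumptions for illustration, not the dissertation's models.

    import torch
    from PIL import Image
    from torchvision import models
    from torchvision.models import ResNet18_Weights

    weights = ResNet18_Weights.DEFAULT
    backbone = models.resnet18(weights=weights)
    backbone.fc = torch.nn.Identity()                 # drop the classification head
    backbone.eval()

    preprocess = weights.transforms()                 # resize, crop and normalise
    img = Image.open("example.jpg").convert("RGB")    # hypothetical input image
    with torch.no_grad():
        feature = backbone(preprocess(img).unsqueeze(0))
    print(feature.shape)                              # (1, 512) descriptor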

    The TREC-2002 video track report

    TREC-2002 saw the second running of the Video Track, the goal of which was to promote progress in content-based retrieval from digital video via open, metrics-based evaluation. The track used 73.3 hours of publicly available digital video (in MPEG-1/VCD format) downloaded by the participants directly from the Internet Archive (Prelinger Archives) (internetarchive, 2002) and some from the Open Video Project (Marchionini, 2001). The material comprised advertising, educational, industrial, and amateur films produced between the 1930s and the 1970s by corporations, nonprofit organizations, trade associations, community and interest groups, educational institutions, and individuals. 17 teams representing 5 companies and 12 universities - 4 from Asia, 9 from Europe, and 4 from the US - participated in one or more of three tasks in the 2002 video track: shot boundary determination, feature extraction, and search (manual or interactive). Results were scored by NIST using manually created truth data for shot boundary determination and manual assessment of feature extraction and search results. This paper is an introduction to, and an overview of, the track framework - the tasks, data, and measures - the approaches taken by the participating groups, the results, and issues regarding the evaluation. For detailed information about the approaches and results, the reader should see the various site reports in the final workshop proceedings.
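    The following is an illustrative sketch, not NIST's scoring code, of how shot boundary determination results can be scored against manually created truth data: a detected boundary counts as correct if it falls within a small frame tolerance of a not-yet-matched reference boundary. The tolerance value is an assumption.

    def score_shot_boundaries(detected, reference, tolerance=5):
        """detected/reference: sorted lists of boundary frame numbers."""
        matched = set()
        hits = 0
        for d in detected:
            for i, r in enumerate(reference):
                if i not in matched and abs(d - r) <= tolerance:
                    matched.add(i)
                    hits += 1
                    break
        precision = hits / len(detected) if detected else 0.0
        recall = hits / len(reference) if reference else 0.0
        return precision, recall

    print(score_shot_boundaries(detected=[102, 250, 400], reference=[100, 251, 398, 600]))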

    Spatial orientations of visual word pairs to improve Bag-of-Visual-Words model

    This paper presents a novel approach to incorporating spatial information into the bag-of-visual-words model for category-level and scene classification. In the traditional bag-of-visual-words model, feature vectors are histograms of visual words. This representation is appearance based and does not contain any information about the arrangement of the visual words in the 2D image space. In this framework, we present a simple and efficient way to infuse spatial information. In particular, we are interested in explicit global relationships among the spatial positions of visual words, and therefore take advantage of the orientation of the segments formed by Pairs of Identical visual Words (PIW). An evenly distributed, normalized histogram of angles of PIW is computed. The histograms produced by each word type constitute a powerful description of intra-type visual word relationships. Experiments on challenging datasets demonstrate that our method is competitive with concurrent approaches. We also show that our method provides important complementary information to spatial pyramid matching and can improve the overall performance.
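    A hedged sketch of the core idea follows: for each visual word type, every Pair of Identical visual Words (PIW) contributes the orientation of the segment joining its two keypoint positions, and a normalised angle histogram is built per word. The bin count and data layout are assumptions for illustration.

    import numpy as np
    from itertools import combinations

    def piw_angle_histograms(positions, words, vocab_size, n_bins=8):
        """positions: (N, 2) keypoint coordinates; words: (N,) visual word ids."""
        hists = np.zeros((vocab_size, n_bins))
        for w in range(vocab_size):
            pts = positions[words == w]
            # orientation in [0, pi) of the segment joining each pair of identical words
            angles = [np.arctan2(q[1] - p[1], q[0] - p[0]) % np.pi for p, q in combinations(pts, 2)]
            if angles:
                hist, _ = np.histogram(angles, bins=n_bins, range=(0, np.pi))
                hists[w] = hist / hist.sum()
        return hists.ravel()      # spatial descriptor to concatenate with the BoVW histogram

    rng = np.random.default_rng(0)
    print(piw_angle_histograms(rng.random((50, 2)), rng.integers(0, 10, 50), vocab_size=10).shape)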

    A Method Of Content-based Image Retrieval For The Generation Of Image Mosaics

    An image mosaic is an artistic work that uses a number of smaller images creatively combined to form another, larger image. Each building-block image, or tessera, has its own distinctive and meaningful content, but when viewed from a distance the tesserae come together to form an aesthetically pleasing montage. This work presents the design and implementation of MosaiX, a computer software system that generates these image mosaics automatically. Several parameters are used within the system to control the mosaic creation process. Each parameter affects the overall mosaic quality, as well as the required processing time, in its own unique way, and a detailed analysis is performed to evaluate each parameter individually. Additionally, this work proposes two novel ways to evaluate the quality of an image mosaic quantitatively. One method focuses on the perceptual color accuracy of the mosaic reproduction, while the other concentrates on edge replication. Both measures include preprocessing to take into account the unique visual features present in an image mosaic. Doing so minimizes quality penalization due to the inherent properties of an image mosaic that make it visually appealing.
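    As a minimal sketch of the retrieval step in mosaic generation (not MosaiX's actual matching criterion), the snippet below picks, for each block of the target image, the tessera whose average colour is closest in RGB; the tile-library layout and block size are assumptions.

    import numpy as np

    def best_tile(block, tile_means):
        """block: (h, w, 3) region of the target image; tile_means: (n_tiles, 3) mean colours."""
        target_mean = block.reshape(-1, 3).mean(axis=0)
        distances = np.linalg.norm(tile_means - target_mean, axis=1)
        return int(np.argmin(distances))              # index of the closest tessera

    # hypothetical data: one 16x16 block and 100 candidate tile mean colours
    rng = np.random.default_rng(1)
    print(best_tile(rng.random((16, 16, 3)), rng.random((100, 3))))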