125,468 research outputs found

    Barcode Annotations for Medical Image Retrieval: A Preliminary Investigation

    This paper proposes generating and using barcodes to annotate medical images and/or their regions of interest, such as organs, tumors, and tissue types. A multitude of efficient feature-based image retrieval methods already exist that can assign a query image to a certain image class. Visual annotations may help to increase retrieval accuracy if combined with existing feature-based classification paradigms. Whereas annotations usually mean textual descriptions, this paper proposes barcode annotations. In particular, Radon barcodes (RBC) are introduced; in addition, local binary patterns (LBP) and local Radon binary patterns (LRBP) are implemented as barcodes. The IRMA x-ray dataset, with 12,677 training images and 1,733 test images, is used to verify how barcodes could facilitate image retrieval. (Comment: To be published in the proceedings of the IEEE International Conference on Image Processing (ICIP 2015), September 27-30, 2015, Quebec City, Canada)
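
    The Radon barcode construction lends itself to a short sketch: project a size-normalized image at a few angles, then binarize each projection at a per-projection threshold. The snippet below is a minimal sketch of that idea; the image size, angle count, and median-of-nonzero thresholding follow the commonly cited RBC formulation and are assumptions here, not details taken from this abstract.

    import numpy as np
    from skimage.transform import radon, resize

    def radon_barcode(image, n_angles=8, size=32):
        # Sketch of a Radon barcode (RBC): binarized Radon projections.
        # size, n_angles, and the threshold rule are assumed defaults.
        img = resize(image, (size, size), anti_aliasing=True)
        thetas = np.linspace(0.0, 180.0, n_angles, endpoint=False)
        sinogram = radon(img, theta=thetas, circle=False)  # one column per angle
        code = []
        for col in sinogram.T:
            nz = col[col > 0]
            thresh = np.median(nz) if nz.size else 0.0     # median of non-zeros
            code.append((col >= thresh).astype(np.uint8))
        return np.concatenate(code)                        # binary barcode vector

    Two such barcodes can then be compared with a cheap Hamming distance, which is what makes them attractive as annotations alongside feature-based retrieval.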

    Learning with Multi-modal Gradient Attention for Explainable Composed Image Retrieval

    We consider the problem of composed image retrieval, which takes an input query consisting of an image and a modification text indicating the desired changes to be made to the image, and retrieves images that match these changes. Current state-of-the-art techniques that address this problem use global features for retrieval, resulting in incorrect localization of the regions of interest to be modified because of the global nature of the features, especially in the case of real-world, in-the-wild images. Since modifier texts usually correspond to specific local changes in an image, it is critical that models learn local features to be able to both localize and retrieve better. To this end, our key novelty is a new gradient-attention-based learning objective that explicitly forces the model to focus on the local regions of interest being modified in each retrieval step. We achieve this by first proposing a new visual image attention computation technique, which we call multi-modal gradient attention (MMGrad), that is explicitly conditioned on the modifier text. We next demonstrate how MMGrad can be incorporated into an end-to-end model training strategy with a new learning objective that explicitly forces these MMGrad attention maps to highlight the correct local regions corresponding to the modifier text. By training retrieval models with this new loss function, we show improved grounding by means of better visual attention maps, leading to better explainability of the models as well as competitive quantitative retrieval performance on standard benchmark datasets.
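
    The abstract leaves the attention computation unspecified; one Grad-CAM-style reading is to score the match between a text-conditioned query embedding and a candidate target embedding, backpropagate that score to the query image's convolutional feature map, and weight spatial locations by the resulting gradients. The sketch below illustrates only that reading; the toy fusion (pooled image feature plus text embedding) and cosine scoring are assumptions, not the paper's MMGrad definition.

    import torch
    import torch.nn.functional as F

    def mmgrad_attention(feat_map, text_emb, target_emb):
        # feat_map:   (C, H, W) conv features of the query image
        # text_emb:   (C,) modifier-text embedding (matching dim assumed)
        # target_emb: (C,) embedding of a candidate target image
        feat_map = feat_map.detach().requires_grad_(True)
        query_emb = feat_map.mean(dim=(1, 2)) + text_emb   # toy fusion, an assumption
        score = F.cosine_similarity(query_emb, target_emb, dim=0)
        grads, = torch.autograd.grad(score, feat_map)
        weights = grads.mean(dim=(1, 2))                   # per-channel importance
        cam = torch.relu((weights[:, None, None] * feat_map).sum(dim=0))
        return cam / (cam.max() + 1e-8)                    # (H, W) attention map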

    Color image retrieval using taken images

    Nowadays, content-based image retrieval from large resources has become an area of wide interest in many applications. In this paper we present a color-based image retrieval system that uses color and texture as visual features to describe the content of an image region. To speed up retrieval and similarity computation, the database images are segmented and the extracted regions are clustered according to their feature vectors. This process is performed offline before query processing, so to answer a query our system need not search the entire image database; instead, only a number of candidate images need to be searched for image similarity. Our proposed system has the advantage of increasing retrieval accuracy while decreasing retrieval time. The experimental evaluation of the system is based on a database of 1,000 real (camera-taken) color images. The experimental results show that our system performs significantly better and faster than other existing systems. In our analysis, we provide a comparison between retrieval results based on relevancy for the given ten classes. The results demonstrate that each type of feature is effective for a particular type of image according to its semantic content, and that using a combination of them gives better retrieval results for almost all semantic classes.
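
    The offline speed-up described here, clustering region feature vectors so that a query touches only the candidate images owning regions in the nearest cluster, can be sketched briefly. The paper does not name its clustering algorithm; k-means, the file names, and the cluster count below are illustrative assumptions.

    import numpy as np
    from sklearn.cluster import KMeans

    # Offline stage: cluster region feature vectors (color/texture descriptors).
    region_feats = np.load("region_features.npy")   # (n_regions, d), hypothetical file
    region_owner = np.load("region_image_ids.npy")  # image id per region, hypothetical
    kmeans = KMeans(n_clusters=50, n_init=10).fit(region_feats)

    def candidate_images(query_region_feat):
        # Online stage: map a query region to its nearest cluster and return
        # only the images owning regions in that cluster, not the whole DB.
        cluster = kmeans.predict(query_region_feat.reshape(1, -1))[0]
        return np.unique(region_owner[kmeans.labels_ == cluster])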

    Unsupervised Segmentation of Action Segments in Egocentric Videos using Gaze

    Unsupervised segmentation of action segments in egocentric videos is a desirable feature in tasks such as activity recognition and content-based video retrieval. Reducing the search space to a finite set of action segments facilitates faster and less noisy matching. However, there exists a substantial gap in machine understanding of natural temporal cuts during a continuous human activity. This work reports on a novel gaze-based approach to segmenting action segments in videos captured with an egocentric camera. Gaze is used to locate the region of interest inside a frame. By tracking two simple motion-based parameters inside successive regions of interest, we discover a finite set of temporal cuts. We present several results using combinations (of the two parameters) on a dataset, BRISGAZE-ACTIONS, which contains egocentric videos depicting several daily-living activities. The quality of the temporal cuts is further improved by implementing two entropy measures. (Comment: To appear in the 2017 IEEE International Conference on Signal and Image Processing Applications)
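
    The abstract does not name the two motion-based parameters, so the sketch below tracks one plausible stand-in, the mean optical-flow magnitude inside a gaze-centered ROI, and flags temporal cuts where it jumps. The ROI size and the z-score cut rule are assumptions, not the paper's method.

    import cv2
    import numpy as np

    def roi_flow_magnitudes(frames, gaze_points, half=32):
        # Per-frame mean optical-flow magnitude inside a gaze-centered ROI.
        # frames: grayscale images; gaze_points: (x, y) per frame.
        mags = []
        for (prev, curr), (gx, gy) in zip(zip(frames, frames[1:]), gaze_points[1:]):
            flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                                0.5, 3, 15, 3, 5, 1.2, 0)
            mag = np.linalg.norm(flow, axis=2)          # flow magnitude per pixel
            x, y = int(gx), int(gy)
            roi = mag[max(0, y - half):y + half, max(0, x - half):x + half]
            mags.append(roi.mean())                     # motion inside the gaze ROI
        return np.array(mags)

    def temporal_cuts(mags, z=2.0):
        # Flag frames where ROI motion jumps z standard deviations (assumed rule).
        d = np.abs(np.diff(mags))
        return np.where(d > d.mean() + z * d.std())[0] + 1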

    CoHOG: A Light-Weight, Compute-Efficient, and Training-Free Visual Place Recognition Technique for Changing Environments

    This letter presents a novel, compute-efficient, and training-free approach based on the Histogram-of-Oriented-Gradients (HOG) descriptor for achieving state-of-the-art performance-per-compute-unit in Visual Place Recognition (VPR). The inspiration for this approach (named CoHOG) comes from the convolutional scanning and region-based feature extraction employed by Convolutional Neural Networks (CNNs). By using image entropy to extract regions of interest (ROIs) and performing regional-convolutional descriptor matching, our technique achieves successful place recognition in changing environments. We report this matching performance on viewpoint- and appearance-variant public VPR datasets, at a lower RAM commitment, with zero training requirements, and with 20 times lower feature encoding time compared to state-of-the-art neural networks. We also discuss the image retrieval time of CoHOG and the effect of CoHOG's parametric variation on its place matching performance and encoding time.
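
    The two stages the abstract names, entropy-based ROI extraction and regional descriptor matching, can be sketched as follows. The block size, the fraction of regions kept, the HOG parameters, and the best-match scoring are illustrative assumptions, not CoHOG's published settings.

    import numpy as np
    from skimage.feature import hog
    from skimage.filters.rank import entropy
    from skimage.morphology import disk
    from skimage.util import img_as_ubyte

    def regional_descriptors(gray, block=64, keep_frac=0.5):
        # gray: grayscale image in [0, 1]. Compute a local entropy map,
        # tile the image into blocks, and keep a HOG descriptor only for
        # the highest-entropy (most informative) blocks.
        ent = entropy(img_as_ubyte(gray), disk(5))
        descs, scores = [], []
        h, w = gray.shape
        for y in range(0, h - block + 1, block):
            for x in range(0, w - block + 1, block):
                scores.append(ent[y:y + block, x:x + block].mean())
                descs.append(hog(gray[y:y + block, x:x + block],
                                 pixels_per_cell=(16, 16), cells_per_block=(2, 2)))
        keep = np.argsort(scores)[::-1][: int(len(scores) * keep_frac)]
        return np.array(descs)[keep]                    # high-entropy ROIs only

    def place_similarity(desc_a, desc_b):
        # Best-match cosine similarity between two images' regional descriptors.
        a = desc_a / np.linalg.norm(desc_a, axis=1, keepdims=True)
        b = desc_b / np.linalg.norm(desc_b, axis=1, keepdims=True)
        return (a @ b.T).max(axis=1).mean()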

    Detecting regions of interest using eye tracking for CBIR

    Identifying Regions of Interest (ROIs) in images has been shown to be an effective way to enhance the performance of Content Based Image Retrieval (CBIR). Most existing ROI identification methods are based on salience detection, and the identified ROIs may not be the regions that users are really interested in. While manual selection of ROIs can directly reflect users' interests, it places extra cognitive overhead on users. To alleviate these limitations, in this paper we propose a novel eye-tracking based method to detect ROIs for CBIR in an unobtrusive way. Experimental results demonstrate that our model performs effectively compared with various state-of-the-art methods.
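
    The abstract does not detail how gaze data becomes ROIs; one common and plausible route is to cluster fixation points spatially and take a padded bounding box per cluster. The sketch below illustrates that route; DBSCAN and all its parameters are assumptions.

    import numpy as np
    from sklearn.cluster import DBSCAN

    def fixations_to_rois(fixations, eps=40, min_samples=5, pad=20):
        # fixations: (n, 2) array of (x, y) fixation coordinates in pixels.
        # Cluster nearby fixations; each dense cluster becomes one ROI box.
        labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(fixations)
        rois = []
        for lbl in set(labels) - {-1}:                  # -1 marks noise points
            pts = fixations[labels == lbl]
            x0, y0 = pts.min(axis=0) - pad
            x1, y1 = pts.max(axis=0) + pad
            rois.append((max(0, x0), max(0, y0), x1, y1))
        return rois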

    Query Region Determination based on Region Importance Index and Relative Position for Region-based Image Retrieval

    An efficient Region-Based Image Retrieval (RBIR) system must consider query region determination techniques and target regions in the retrieval process. A query region is a region that must contain a Region of Interest (ROI) or saliency region. A query region can be specified manually or automatically; however, manual determination is considered less efficient and tedious for users. The selected query region must determine specific target regions in the image collection to reduce the retrieval time. This study proposes a strategy for query region determination based on the Region Importance Index (RII) value and the relative position of the Saliency Region Overlapping Block (SROB) to produce a more efficient RBIR. The regions are formed using the mean shift segmentation method. The RII value is calculated from the percentage of the region's area and the region's distance to the center of the image, whereas the target regions are determined by considering the relative position of the SROB. The performance of the proposed method is tested on a CorelDB dataset. Experimental results show that the proposed method can reduce the average retrieval time to 0.054 seconds with a 5x5 block size configuration.
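
    The abstract names the two ingredients of the RII, a region's share of the image area and its distance to the image center, but not the formula. The sketch below is one illustrative way to combine them, with assumed weights; larger regions closer to the center score higher.

    import numpy as np

    def region_importance_index(region_mask, w_area=0.5, w_dist=0.5):
        # region_mask: boolean (H, W) mask of one segmented region.
        # Illustrative RII: weighted sum of area fraction and centrality.
        h, w = region_mask.shape
        area_frac = region_mask.mean()                  # fraction of image area
        ys, xs = np.nonzero(region_mask)
        dist = np.hypot(ys.mean() - h / 2, xs.mean() - w / 2)
        max_dist = np.hypot(h / 2, w / 2)
        return w_area * area_frac + w_dist * (1 - dist / max_dist)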

    Conditional Attention for Content-based Image Retrieval

    Deep learning based feature extraction combined with a visual attention mechanism has been shown to provide good results in content-based image retrieval (CBIR). Ideally, CBIR should rely on regions that contain the objects of interest appearing in the query image. However, most existing attention models simply predict the most likely region of interest based on knowledge learned from the training dataset, regardless of the content of the query image. As a result, they may attend to context outside the object of interest, especially when there are multiple potential objects of interest in a given image. In this paper, we propose a conditional attention model which is sensitive to the input query image content and can generate more accurate attention maps. A key-point detection and description based method is proposed for training data generation, so our model does not require any additional attention labels for training. The proposed attention model enables the spatial pooling feature extraction method (generalized mean pooling) to improve the image feature representation, leading to better image retrieval performance. The proposed framework is tested on a series of databases where it is shown to perform well in challenging situations.
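
    Generalized mean (GeM) pooling, mentioned in the abstract, has a standard form: each channel c of a feature map X is pooled as f_c = (mean over pixels of x_c^p)^(1/p), which reduces to average pooling at p = 1 and approaches max pooling as p grows. Below is a minimal sketch; weighting pixels by an attention map before pooling is our assumption about how the two pieces might combine, not a detail given in the abstract.

    import torch

    def gem_pool(feat_map, p=3.0, attn=None, eps=1e-6):
        # feat_map: (C, H, W); attn: optional (H, W) attention map.
        # p=3 is a commonly used default for GeM pooling.
        x = feat_map.clamp(min=eps).pow(p)
        if attn is not None:
            w = attn / (attn.sum() + eps)               # normalized pixel weights
            pooled = (x * w).sum(dim=(1, 2))            # attention-weighted mean
        else:
            pooled = x.mean(dim=(1, 2))
        return pooled.pow(1.0 / p)                      # (C,) image descriptor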