Barcode Annotations for Medical Image Retrieval: A Preliminary Investigation
This paper proposes to generate and to use barcodes to annotate medical
images and/or their regions of interest such as organs, tumors and tissue
types. A multitude of efficient feature-based image retrieval methods already
exist that can assign a query image to a certain image class. Visual
annotations may help to increase the retrieval accuracy if combined with
existing feature-based classification paradigms. Whereas with annotations we
usually mean textual descriptions, in this paper barcode annotations are
proposed. In particular, Radon barcodes (RBC) are introduced. In addition, local
binary patterns (LBP) and local Radon binary patterns (LRBP) are implemented as
barcodes. The IRMA x-ray dataset with 12,677 training images and 1,733 test
images is used to verify how barcodes could facilitate image retrieval.
Comment: To be published in proceedings of the IEEE International Conference on Image Processing (ICIP 2015), September 27-30, 2015, Quebec City, Canada
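As a rough illustration of the Radon barcode idea described in this abstract, here is a minimal sketch. It simplifies the full Radon transform to just the 0° and 90° projections (column and row sums); the function name, the projection length, and the median-thresholding rule are assumptions for illustration, not the paper's exact implementation.

```python
from statistics import median

def radon_barcode(image, proj_len=8):
    """Simplified Radon barcode (RBC) sketch: project the image at 0°
    and 90° (column sums and row sums), resample each projection to
    proj_len samples (proj_len >= 2), and binarize at its median."""
    rows, cols = len(image), len(image[0])
    col_sums = [sum(image[r][c] for r in range(rows)) for c in range(cols)]
    row_sums = [sum(row) for row in image]
    barcode = []
    for proj in (col_sums, row_sums):
        # resample the projection to a fixed length
        step = (len(proj) - 1) / (proj_len - 1)
        p = [proj[round(i * step)] for i in range(proj_len)]
        # binarize: 1 where the projection exceeds its median
        m = median(p)
        barcode.extend(1 if v > m else 0 for v in p)
    return barcode
```

Concatenating the binarized projections yields a short binary code that can be compared with Hamming distance, which is what makes barcode annotations cheap to index.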
Learning with Multi-modal Gradient Attention for Explainable Composed Image Retrieval
We consider the problem of composed image retrieval that takes an input query
consisting of an image and a modification text indicating the desired changes
to be made on the image, and retrieves images that match these changes. Current
state-of-the-art techniques for this problem use global features for retrieval;
the global nature of these features leads to incorrect localization of the
regions of interest to be modified, especially for real-world, in-the-wild
images. Since modifier texts usually correspond to
specific local changes in an image, it is critical that models learn local
features to be able to both localize and retrieve better. To this end, our key
novelty is a new gradient-attention-based learning objective that explicitly
forces the model to focus on the local regions of interest being modified in
each retrieval step. We achieve this by first proposing a new visual image
attention computation technique, which we call multi-modal gradient attention
(MMGrad) that is explicitly conditioned on the modifier text. We next
demonstrate how MMGrad can be incorporated into an end-to-end model training
strategy with a new learning objective that explicitly forces these MMGrad
attention maps to highlight the correct local regions corresponding to the
modifier text. By training retrieval models with this new loss function, we
show improved grounding by means of better visual attention maps, leading to
better explainability of the models as well as competitive quantitative
retrieval performance on standard benchmark datasets
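A toy numeric sketch of the gradient-attention idea sketched in this abstract: if the image-text similarity is the inner product of average-pooled image features with a text embedding, the gradient of that similarity with respect to each spatial feature vector is the (scaled) text embedding, and a Grad-CAM-style map follows by re-weighting the features with that gradient. The function name and this linear similarity model are illustrative assumptions, not the paper's actual MMGrad formulation.

```python
def mmgrad_attention(feat_map, text_emb):
    """Toy text-conditioned gradient attention (hypothetical form).
    With similarity s = <average-pooled features, text_emb>, the
    gradient of s w.r.t. the feature vector at each spatial cell is
    text_emb / (H*W); the attention map is ReLU(<features, gradient>)
    at every cell, normalized to [0, 1]."""
    hw = len(feat_map) * len(feat_map[0])
    attn = [[max(sum(f * t for f, t in zip(cell, text_emb)) / hw, 0.0)
             for cell in row]
            for row in feat_map]
    peak = max(max(row) for row in attn) or 1.0  # avoid 0-division
    return [[v / peak for v in row] for row in attn]
```

Cells whose features align with the text embedding light up, which is the property the paper's loss would encourage for the regions named in the modifier text.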
Color image retrieval using taken images
Nowadays, content-based image retrieval from large resources has become an area of wide interest in many applications. In this paper we present a color-based image retrieval system that uses color and texture as visual features to describe the content of an image region. To speed up retrieval and similarity computation, the database images are segmented and the extracted regions are clustered according to their feature vectors. This process is performed offline, before query processing, so to answer a query our system need not search the entire image database; instead, only a number of candidate images are searched for image similarity. Our proposed system has the advantage of increasing the retrieval accuracy and decreasing the retrieval time. The experimental evaluation of the system is based on a database of 1,000 real color images. The experimental results show that our system performs significantly better and faster than other existing systems. In our analysis, we provide a comparison of retrieval results based on relevancy for the ten given classes. The results demonstrate that each type of feature is effective for a particular type of image according to its semantic content, and that combining the features gives better retrieval results for almost all semantic classes
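The offline-clustering speedup this abstract describes can be sketched in a few lines: compute a quantized color histogram per image, cluster the database descriptors offline, and at query time match only against the nearest cluster's members. The function names, bin count, and squared-Euclidean distance are illustrative assumptions; the paper also uses texture features and region segmentation, which are omitted here.

```python
def color_histogram(pixels, bins=4):
    """Quantized RGB histogram as a simple color descriptor
    (bins per channel is an assumed parameter)."""
    hist = [0.0] * bins ** 3
    for r, g, b in pixels:
        q = (min(r * bins // 256, bins - 1) * bins * bins
             + min(g * bins // 256, bins - 1) * bins
             + min(b * bins // 256, bins - 1))
        hist[q] += 1
    n = len(pixels)
    return [h / n for h in hist]

def cluster_search(db_hists, labels, centers, query_hist):
    """Offline, database descriptors were clustered (labels, centers);
    online, rank only the nearest cluster's members, not the whole DB."""
    def d2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    c = min(range(len(centers)), key=lambda i: d2(centers[i], query_hist))
    cand = [i for i, lab in enumerate(labels) if lab == c]
    return sorted(cand, key=lambda i: d2(db_hists[i], query_hist))
```

Because only one cluster is scanned per query, retrieval time drops roughly in proportion to the number of clusters, at some risk of missing matches assigned to other clusters.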
Unsupervised Segmentation of Action Segments in Egocentric Videos using Gaze
Unsupervised segmentation of action segments in egocentric videos is a
desirable feature in tasks such as activity recognition and content-based video
retrieval. Reducing the search space to a finite set of action segments
facilitates faster and less noisy matching. However, there exists a
substantial gap in machine understanding of natural temporal cuts during a
continuous human activity. This work reports on a novel gaze-based approach for
segmenting action segments in videos captured using an egocentric camera. Gaze
is used to locate the region-of-interest inside a frame. By tracking two simple
motion-based parameters inside successive regions-of-interest, we discover a
finite set of temporal cuts. We present several results using combinations of
the two parameters on the BRISGAZE-ACTIONS dataset. The dataset contains
egocentric videos depicting several daily-living activities. The quality of the
temporal cuts is further improved by implementing two entropy measures.
Comment: To appear in 2017 IEEE International Conference on Signal and Image Processing Applications
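The core of the approach above, detecting temporal cuts from a tracked motion parameter, can be sketched as a simple change-point rule. The function name, and thresholding on the frame-to-frame jump of a single parameter, are assumptions for illustration; the paper tracks two motion parameters inside gaze-located regions of interest and refines cuts with entropy measures.

```python
def temporal_cuts(motion, threshold=2.0):
    """Toy cut detector: mark a temporal cut wherever a motion
    parameter (one value per frame/ROI) jumps between consecutive
    frames by more than `threshold` (an assumed value)."""
    return [t for t in range(1, len(motion))
            if abs(motion[t] - motion[t - 1]) > threshold]
```

Each detected index is a candidate boundary between action segments; a real system would combine both parameters and smooth the signal first.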
CoHOG: A Light-Weight, Compute-Efficient, and Training-Free Visual Place Recognition Technique for Changing Environments
This letter presents a novel, compute-efficient and training-free approach based on the Histogram-of-Oriented-Gradients (HOG) descriptor for achieving state-of-the-art performance per compute unit in Visual Place Recognition (VPR). The inspiration for this approach (namely CoHOG) comes from the convolutional scanning and region-based feature extraction employed by Convolutional Neural Networks (CNNs). By using image entropy to extract regions of interest (ROI) and performing regional-convolutional descriptor matching, our technique achieves successful place recognition in changing environments. We use viewpoint- and appearance-variant public VPR datasets to report this matching performance, at lower RAM commitment, zero training requirements and roughly 20 times lower feature encoding time compared to state-of-the-art neural networks. We also discuss the image retrieval time of CoHOG and the effect of CoHOG's parametric variation on its place-matching performance and encoding time
Detecting regions of interest using eye tracking for CBIR
Identifying Regions of Interest (ROIs) in images has been shown to be an effective way to enhance the performance of Content-Based Image Retrieval (CBIR). Most existing ROI identification methods are based on salience detection, and the identified ROIs may not be the regions that users are really interested in. While manual selection of ROIs can directly reflect users' interests, it imposes extra cognitive overhead on users. To alleviate these limitations, in this paper we propose a novel eye-tracking-based method to detect ROIs for CBIR in an unobtrusive way. Experimental results demonstrate that our model performs effectively compared with various state-of-the-art methods
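A minimal sketch of turning gaze data into an ROI, assuming fixations arrive as (x, y) points: take a padded bounding box around them. The function name and the padding margin are hypothetical; the paper's actual detector is more sophisticated than a bounding box.

```python
def fixation_roi(fixations, pad=10):
    """Toy ROI from eye-tracking fixations: the bounding box of the
    fixation points, expanded by `pad` pixels (an assumed margin).
    Returns (x_min, y_min, x_max, y_max)."""
    xs = [x for x, _ in fixations]
    ys = [y for _, y in fixations]
    return (min(xs) - pad, min(ys) - pad, max(xs) + pad, max(ys) + pad)
```

The appeal of the gaze-based approach is exactly this unobtrusiveness: the ROI falls out of where the user already looked, with no manual selection step.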
Query Region Determination based on Region Importance Index and Relative Position for Region-based Image Retrieval
An efficient Region-Based Image Retrieval (RBIR) system must consider query-region determination techniques and target regions in the retrieval process. A query region is a region that must contain a Region of Interest (ROI) or saliency region. A query region can be determined manually or automatically; however, manual determination is considered less efficient and tedious for users. The selected query region must determine specific target regions in the image collection to reduce the retrieval time. This study proposes a strategy of query region determination based on the Region Importance Index (RII) value and the relative position of the Saliency Region Overlapping Block (SROB) to produce a more efficient RBIR. The regions are formed by using the mean shift segmentation method. The RII value is calculated based on the percentage of the region area and the region's distance to the center of the image, whereas the target regions are determined by considering the relative position of the SROB. The performance of the proposed method is tested on a CorelDB dataset. Experimental results show that the proposed method can reduce the average retrieval time to 0.054 seconds with a 5x5 block size configuration
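Since the abstract only says the RII combines a region's area percentage with its distance to the image center, here is one plausible form, a sketch under stated assumptions: larger regions closer to the center score higher. The function name and the particular product formula are hypothetical; the paper's exact RII formula is not given here.

```python
from math import hypot

def region_importance_index(region_area, image_w, image_h, centroid):
    """Hypothetical RII: (area fraction) * (1 - normalized distance
    from the region centroid to the image center). Result in [0, 1];
    central, large regions score highest."""
    area_pct = region_area / (image_w * image_h)
    cx, cy = image_w / 2, image_h / 2
    d = hypot(centroid[0] - cx, centroid[1] - cy)
    d_max = hypot(cx, cy)  # farthest possible centroid distance
    return area_pct * (1 - d / d_max)
```

The query region would then be the segmented region with the highest RII, avoiding the manual selection the abstract calls tedious.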
Conditional Attention for Content-based Image Retrieval
Deep-learning-based feature extraction combined with a visual attention mechanism has been shown to provide good results in content-based image retrieval (CBIR). Ideally, CBIR should rely on regions which contain objects of interest that appear in the query image. However, most existing attention models just predict the most likely region of interest based on the knowledge learned from the training dataset, regardless of the content of the query image. As a result, they may attend to context outside the object of interest, especially when there are multiple potential objects of interest in a given image. In this paper, we propose a conditional attention model which is sensitive to the input query image content and can generate more accurate attention maps. A key-point detection and description based method is proposed for training data generation; consequently, our model does not require any additional attention labels for training. The proposed attention model enables the spatial pooling feature extraction method (generalized mean pooling) to improve image feature representation, leading to better image retrieval performance. The proposed framework is tested on a series of databases where it is shown to perform well in challenging situations
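Generalized mean (GeM) pooling, the spatial pooling method this abstract builds on, is standard and compact enough to state directly: pool each channel as ((1/HW) * sum x^p)^(1/p). The pure-Python sketch below is illustrative (feature maps as nested lists rather than tensors); the formula itself is the standard GeM definition.

```python
def gem_pool(feat_map, p=3.0, eps=1e-6):
    """Generalized mean (GeM) pooling over spatial positions,
    per channel: ((1/HW) * sum x^p)^(1/p). p=1 recovers average
    pooling; large p approaches max pooling. eps keeps the base
    positive so fractional powers are well defined."""
    hw = len(feat_map) * len(feat_map[0])
    channels = len(feat_map[0][0])
    pooled = []
    for c in range(channels):
        s = sum(max(cell[c], eps) ** p for row in feat_map for cell in row)
        pooled.append((s / hw) ** (1.0 / p))
    return pooled
```

With p > 1, GeM emphasizes strong activations, so an attention map that suppresses background regions (as the conditional attention model does) directly sharpens the pooled descriptor.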