993 research outputs found

Semantic Retrieval and Automatic Annotation: Linear Transformations, Correlation and Semantic Spaces

Author: Hare Jonathan
Lewis Paul
Publication date: 04/02/2010

This paper proposes a new technique for auto-annotation and semantic retrieval based upon the idea of linearly mapping an image feature space to a keyword space. The new technique is compared to several related techniques, and a number of salient points about each of the techniques are discussed and contrasted. The paper also discusses how these techniques might actually scale to a real-world retrieval problem, and demonstrates this through a case study of a semantic retrieval technique being used on a real-world data-set (with a mix of annotated and unannotated images) from a picture library.

CiteSeerX

Southampton (e-Prints Soton)

Crowd counting using group tracking and local features

Author: Denman Simon
Fookes Clinton
Ryan David
Sridharan Sridha
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010

In public venues, crowd size is a key indicator of crowd safety and stability. In this paper we propose a crowd counting algorithm that uses tracking and local features to count the number of people in each group as represented by a foreground blob segment, so that the total crowd estimate is the sum of the group sizes. Tracking is employed to improve the robustness of the estimate, by analysing the history of each group, including splitting and merging events. A simplified ground truth annotation strategy results in an approach with minimal setup requirements that is highly accurate

Queensland University of Technology ePrints Archive

A Novel Semantic Statistical Model for Automatic Image Annotation Using the Relationship between the Regions Based on Multi-Criteria Decision Making

Author: Deljooi Hengame
Eskandari Ahmad Reza
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 01/02/2014

Automatic image annotation has emerged as an important research topic owing to the semantic gap and to its potential applications in image retrieval and management. In this paper we present an approach which combines regional contexts and visual topics for automatic image annotation. Regional contexts model the relationships between the regions, whereas visual topics provide the global distribution of topics over an image. Conventional image annotation methods neglect the relationships between the regions in an image, even though these regions carry the image semantics; considering the relationships between them is therefore helpful for annotating images. The proposed model extracts regional contexts and visual topics from the image, and combines them using an MCDM (Multi-Criteria Decision Making) approach based on the TOPSIS (Technique for Order Preference by Similarity to the Ideal Solution) method. Regional contexts and visual topics are learned by PLSA (Probabilistic Latent Semantic Analysis) from the training data. The experiments on 5k Corel images show that integrating these two kinds of information is beneficial to image annotation. DOI: http://dx.doi.org/10.11591/ijece.v4i1.459

IAES journal

Institute of Advanced Engineering and Science

Automatic image analysis for gene expression patterns of fly embryos

Author: Eisen Michael B
Leung Garmay
Long Fuhui
Myers Eugene W
Peng Hanchuan
Zhou Jie
Publication venue: BioMed Central
Publication date: 01/01/2007

Springer - Publisher Connector

PubMed Central

Interactive Video Annotation Tool

Author: García Jesús
Molina José M.
Patricio Guisado Miguel Ángel
Serrano Miguel Á.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

Proceedings of: Fourth International Workshop on User-Centric Technologies and applications (CONTEXTS 2010). Valencia, 7-10 September, 2010. Abstract: The computer vision discipline increasingly needs annotated video databases to carry out assessment tasks. Manually providing ground truth data for multimedia resources is very expensive in terms of effort, time and economic resources. Automatic and semi-automatic video annotation and labeling is a faster and more economical way to get ground truth for quite large video collections. In this paper, we describe a new automatic and supervised video annotation tool. The annotation tool is a modified version of the ViPER-GT tool. The standard version of ViPER-GT allows manually editing and reviewing video metadata to generate assessment data. The automatic annotation capability is made possible by an incorporated tracking system which can deal with the visual data association problem in real time. The research aim is to offer a system that reduces the time spent producing valid assessment models.

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidad Carlos III de Madrid e-Archivo

Hybrid image representation methods for automatic image annotation: a survey

Author: Bechkoum Kamal
Benblidia Nadjia
Bouyerbou Hafidha
Oukid Saliha
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/09/2012

In most automatic image annotation systems, images are represented with low-level features using either global methods or local methods. In global methods, the entire image is used as a unit. Local methods divide images either into blocks, where fixed-size sub-image blocks are adopted as sub-units, or into regions, where segmented regions are used as sub-units. In contrast to typical automatic image annotation methods that use either global or local features exclusively, several recent methods have considered incorporating the two kinds of information, on the view that combining the two levels of features is beneficial in annotating images. In this paper, we provide a survey of automatic image annotation techniques according to one aspect, feature extraction, and, in order to complement existing surveys in the literature, we focus on the emerging image annotation methods: hybrid methods that combine both global and local features for image representation.

Crossref

NECTAR

Image annotation and retrieval based on multi-modal feature clustering and similarity propagation.

Author: Ismail Mohamed Maher Ben, 1979-
Publication venue: ThinkIR: The University of Louisville's Institutional Repository
Publication date: 01/05/2011

The performance of content-based image retrieval systems has proved to be inherently constrained by the low-level features used, and cannot give satisfactory results when the user's high-level concepts cannot be expressed by low-level features. In an attempt to bridge this semantic gap, recent approaches started integrating both low-level visual features and high-level textual keywords. Unfortunately, manual image annotation is a tedious process and may not be possible for large image databases. In this thesis we propose a system for image retrieval that has three main components. The first component of our system consists of a novel possibilistic clustering and feature weighting algorithm based on robust modeling of the Generalized Dirichlet (GD) finite mixture. Robust estimation of the mixture model parameters is achieved by incorporating two complementary types of membership degrees. The first one is a posterior probability that indicates the degree to which a point fits the estimated distribution. The second membership represents the degree of typicality and is used to identify and discard noise points. Robustness to noisy and irrelevant features is achieved by transforming the data to make the features independent and follow a Beta distribution, and learning an optimal relevance weight for each feature subset within each cluster. We extend our algorithm to find the optimal number of clusters in an unsupervised and efficient way by exploiting some properties of the possibilistic membership function. We also outline a semi-supervised version of the proposed algorithm. The second component of our system consists of a novel approach to unsupervised image annotation. Our approach is based on: (i) the proposed semi-supervised possibilistic clustering; (ii) a greedy selection and joining algorithm (GSJ); (iii) Bayes rule; and (iv) a probabilistic model that is based on possibilistic membership degrees to annotate an image.
The third component of the proposed system consists of an image retrieval framework based on multi-modal similarity propagation. The proposed framework is designed to deal with two data modalities: low-level visual features and high-level textual keywords generated by our proposed image annotation algorithm. The multi-modal similarity propagation system exploits the mutual reinforcement of relational data and results in a nonlinear combination of the different modalities. Specifically, it is used to learn the semantic similarities between images by leveraging the relationships between features from the different modalities. The proposed image annotation and retrieval approaches are implemented and tested with a standard benchmark dataset. We show the effectiveness of our clustering algorithm to handle high dimensional and noisy data. We compare our proposed image annotation approach to three state-of-the-art methods and demonstrate the effectiveness of the proposed image retrieval system

University of Louisville

Explainable and Advisable Learning for Self-driving Vehicles

Author: Kim Jinkyu
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Deep neural perception and control networks are likely to be a key component of self-driving vehicles. These models need to be explainable - they should provide easy-to-interpret rationales for their behavior - so that passengers, insurance companies, law enforcement, developers, etc., can understand what triggered a particular behavior. Explanations may be triggered by the neural controller, namely introspective explanations, or informed by the neural controller's output, namely rationalizations. Our work has focused on the challenge of generating introspective explanations of deep models for self-driving vehicles. In Chapter 3, we begin by exploring the use of visual explanations. These explanations take the form of real-time highlighted regions of an image that causally influence the network's output (steering control). In the first stage, we use a visual attention model to train a convolution network end-to-end from images to steering angle. The attention model highlights image regions that potentially influence the network's output. Some of these are true influences, but some are spurious. We then apply a causal filtering step to determine which input regions actually influence the output. This produces more succinct visual explanations and more accurately exposes the network's behavior. In Chapter 4, we add an attention-based video-to-text model to produce textual explanations of model actions, e.g. "the car slows down because the road is wet". The attention maps of controller and explanation model are aligned so that explanations are grounded in the parts of the scene that mattered to the controller. We explore two approaches to attention alignment, strong- and weak-alignment. These explainable systems represent an externalization of tacit knowledge. The network's opaque reasoning is simplified to a situation-specific dependence on a visible object in the image. This makes them brittle and potentially unsafe in situations that do not match training data. 
In Chapter 5, we propose to address this issue by augmenting training data with natural language advice from a human. Advice includes guidance about what to do and where to attend. We present the first step toward advice-giving, where we train an end-to-end vehicle controller that accepts advice. The controller adapts the way it attends to the scene (visual attention) and the control (steering and speed). Further, in Chapter 6, we propose a new approach that learns vehicle control with the help of long-term (global) human advice. Specifically, our system learns to summarize its visual observations in natural language, predict an appropriate action response (e.g. "I see a pedestrian crossing, so I stop"), and predict the controls, accordingly

eScholarship - University of California

A novel real-time computational framework for detecting catheters and rigid guidewires in cardiac catheterization procedures

Author: Alhrishy Mazen
Ma YingLiang
Mountney Peter
Narayan Srinivas Ananth
Rhode Kawal S
Publication venue: 'Wiley'
Publication date: 01/11/2018

Purpose: Catheters and guidewires are used extensively in cardiac catheterization procedures such as heart arrhythmia treatment (ablation), angioplasty and congenital heart disease treatment. Detecting their positions in fluoroscopic X-ray images is important for several clinical applications, for example, motion compensation, co-registration between 2D and 3D imaging modalities and 3D object reconstruction. Methods: For the generalized framework, a multiscale vessel enhancement filter is first used to enhance the visibility of wire-like structures in the X-ray images. After applying an adaptive binarization method, the centerlines of wire-like objects were extracted. Finally, the catheters and guidewires were detected as a smooth path which is reconstructed from centerlines of target wire-like objects. In order to classify electrode catheters, which are mainly used in electrophysiology procedures, additional steps were proposed. First, a blob detection method, which is embedded in the vessel enhancement filter with no additional computational cost, localizes electrode positions on catheters. Then the type of electrode catheter can be recognized by detecting the number of electrodes and also the shape created by a series of electrodes. Furthermore, for detecting guiding catheters or guidewires, a localized machine learning algorithm is added into the framework to distinguish between target wire objects and other wire-like artifacts. The proposed framework was tested on a total of 10,624 images from 102 image sequences acquired from 63 clinical cases. Results: Detection errors for the coronary sinus (CS) catheter, lasso catheter ring and lasso catheter body are 0.56 ± 0.28 mm, 0.64 ± 0.36 mm and 0.66 ± 0.32 mm, respectively, with success rates of 91.4%, 86.3% and 84.8%. The detection error for guidewires and guiding catheters is 0.62 ± 0.48 mm and the success rate is 83.5%.
Conclusion: The proposed computational framework does not require any user interaction or prior models, and it can detect multiple catheters or guidewires simultaneously and in real time. The accuracy of the proposed framework is sub-mm and the methods are robust toward low-dose X-ray fluoroscopic images, which are mainly used during procedures to maintain a low radiation dose.

Crossref

Coventry University Pure Portal

King's Research Portal

University of East Anglia digital repository

Stacked Denoising Autoencoders and Transfer Learning for Immunogold Particles Detection and Recognition

Author: Alexandre Luís A.
de Sá Joaquim Marques
Esteves Tiago
Figueiredo Francisco
Monjardino Paulo
Quelhas Pedro
Rocha Sara
Santos Jorge M.
Silva Luís M.
Sousa Ricardo Gamelas
Publication date: 07/12/2017

In this paper we present a system for the detection of immunogold particles and a Transfer Learning (TL) framework for the recognition of these immunogold particles. Immunogold particles are part of a high-magnification method for the selective localization of biological molecules at the subcellular level, visible only through Electron Microscopy. The number of immunogold particles in the cell walls allows the assessment of the differences in their compositions, providing a tool to analyse the quality of different plants. Quantifying them requires laborious manual labeling (or annotation) of images containing hundreds of particles. The system that is proposed in this paper can significantly alleviate the burden of this manual task. For particle detection we use a Laplacian of Gaussian (LoG) filter coupled with a Stacked Denoising Autoencoder (SDA). In order to improve the recognition, we also study the applicability of TL settings for immunogold recognition. TL reuses the learning model of a source problem on other datasets (target problems) containing particles of different sizes. The proposed system was developed to solve a particular problem on maize cells, namely to determine the composition of cell wall ingrowths in endosperm transfer cells. This novel dataset, as well as the code for reproducing our experiments, is made publicly available. We determined that the LoG detector alone attained more than 84% accuracy as measured by the F-measure. Developing immunogold recognition with TL also provided superior performance when compared with the baseline models, increasing the accuracy rates by 10%.

arXiv.org e-Print Archive

UBibliorum repositorio digital da ubi