Bag-Level Aggregation for Multiple Instance Active Learning in Instance Classification Problems
A growing number of applications, e.g. video surveillance and medical image
analysis, require training recognition systems from large amounts of weakly
annotated data while some targeted interactions with a domain expert are
allowed to improve the training process. In such cases, active learning (AL)
can reduce labeling costs for training a classifier by querying the expert to
provide the labels of most informative instances. This paper focuses on AL
methods for instance classification problems in multiple instance learning
(MIL), where data is arranged into sets, called bags, that are weakly labeled.
Most AL methods focus on single instance learning problems. These methods are
not suitable for MIL problems because they cannot account for the bag structure
of data. In this paper, new methods for bag-level aggregation of instance
informativeness are proposed for multiple instance active learning (MIAL). The
aggregated informativeness method identifies the most informative
instances based on classifier uncertainty, and queries bags incorporating the
most information. The other proposed method, called cluster-based
aggregative sampling, clusters data hierarchically in the instance space. The
informativeness of instances is assessed by considering bag labels, inferred
instance labels, and the proportion of labels that remain to be discovered in
clusters. Both proposed methods significantly outperform reference methods in
extensive experiments using benchmark data from several application domains.
Results indicate that using an appropriate strategy to address MIAL problems
yields a significant reduction in the number of queries needed to achieve the
same level of performance as single instance AL methods.
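As a rough illustration of bag-level aggregation, the sketch below scores each instance by classifier uncertainty and queries the bag with the highest total. The abstract does not specify the uncertainty measure or the aggregation rule, so binary entropy and a plain sum are assumptions made here, and the function names and bag contents are hypothetical.

```python
import math

def instance_uncertainty(p):
    """Binary entropy (in bits) of a predicted positive-class probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

def bag_informativeness(probs):
    """Aggregate instance uncertainties over one bag (assumed rule: sum)."""
    return sum(instance_uncertainty(p) for p in probs)

def select_bag(bags):
    """Return the id of the unlabeled bag with the highest aggregated score."""
    return max(bags, key=lambda bag_id: bag_informativeness(bags[bag_id]))

# Hypothetical classifier outputs: bag id -> positive-class probability per instance.
bags = {
    "bag_a": [0.95, 0.02, 0.99],  # confident predictions, little to learn
    "bag_b": [0.55, 0.48, 0.90],  # several uncertain instances
    "bag_c": [0.50],              # one maximally uncertain instance
}
query = select_bag(bags)
```

With a sum, a bag holding several moderately uncertain instances can outrank a bag holding a single maximally uncertain one; a mean or maximum would rank the bags differently.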
Socializing the Semantic Gap: A Comparative Survey on Image Tag Assignment, Refinement and Retrieval
Where previous reviews on content-based image retrieval emphasize what can
be seen in an image to bridge the semantic gap, this survey considers what
people tag about an image. A comprehensive treatment of three closely linked
problems, i.e., image tag assignment, refinement, and tag-based image retrieval
is presented. While existing works vary in terms of their targeted tasks and
methodology, they rely on the key functionality of tag relevance, i.e.
estimating the relevance of a specific tag with respect to the visual content
of a given image and its social context. By analyzing what information a
specific method exploits to construct its tag relevance function and how such
information is exploited, this paper introduces a taxonomy to structure the
growing literature, understand the ingredients of the main works, clarify their
connections and differences, and recognize their merits and limitations. For a
head-to-head comparison of state-of-the-art methods, a new experimental
protocol is presented, with training sets containing 10k, 100k and 1m images
and an evaluation on three test sets, contributed by various research groups.
Eleven representative works are implemented and evaluated. Putting all this
together, the survey aims to provide an overview of the past and foster
progress for the near future.
Comment: to appear in ACM Computing Surveys
A Training Framework of Robotic Operation and Image Analysis for Decision-Making in Bridge Inspection and Preservation
This project aims to create a framework for training engineers and policy makers in robotic operation and image analysis for the inspection and preservation of transportation infrastructure. Specifically, it develops a method for collecting camera-based bridge inspection data and algorithms for data processing and pattern recognition, and it creates tools that assist users in visually analyzing the processed image data and recognized patterns for inspection and preservation decision-making.
The project first developed a Siamese Neural Network to support bridge engineers in analyzing large volumes of video data. The network was initially trained by one-shot learning and is fine-tuned iteratively with a human in the loop. Bridge engineers first define a region of interest, then the algorithm retrieves all related regions in the video, so the engineers can inspect the bridge without exhaustively checking every frame. The network was evaluated on three bridge inspection videos, with promising results.
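The retrieval step described above can be sketched as a similarity search over region embeddings. This is only an illustration under stated assumptions: the embeddings are taken as already computed by some network, cosine similarity and the 0.9 threshold are choices made here rather than details from the project, and all names and values are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def retrieve_regions(query_embedding, candidates, threshold=0.9):
    """Return (frame, region) keys similar to the query, most similar first."""
    scored = [(cosine_similarity(query_embedding, emb), key)
              for key, emb in candidates.items()]
    hits = [(score, key) for score, key in scored if score >= threshold]
    hits.sort(reverse=True)
    return [key for _, key in hits]

# Hypothetical embedding of the engineer-defined region of interest.
query_embedding = [1.0, 0.0, 1.0]
# Hypothetical embeddings of candidate regions, keyed by (frame index, region id).
candidates = {
    (0, "r0"): [0.9, 0.1, 1.1],
    (0, "r1"): [0.0, 1.0, 0.0],
    (7, "r0"): [1.0, 0.0, 1.0],
}
matches = retrieve_regions(query_embedding, candidates)
```

In a human-in-the-loop setting, the engineer would review `matches` and the confirmed or rejected regions would feed the next fine-tuning round.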
Then, the project developed an assistive intelligence system to help inspectors efficiently and accurately detect and segment multiclass bridge elements in inspection videos. A Mask Region-based Convolutional Neural Network was transferred to the studied problem using a small initial training dataset labeled by the inspector. Temporal coherence analysis was then used to recover false negative detections of the transferred network. Finally, self-training with guidance from experienced inspectors was used to iteratively refine the network. Results from a case study demonstrate that the proposed method needs only a small amount of time and guidance from experienced inspectors to build an assistive intelligence system with excellent performance.
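One simple way to picture temporal coherence analysis is to fill in a detection that is missing in one frame but present in both neighboring frames. The sketch below is an assumption about how such a rule could look, not the project's actual procedure; it works on per-frame label sets rather than masks, and the element names are hypothetical.

```python
def recover_false_negatives(detections):
    """Add labels missing in a frame but detected in both adjacent frames.

    detections: list of sets of element labels, one set per video frame.
    Neighbors are read from the original detections, so a recovered label
    is never propagated further along the video.
    """
    recovered = [set(frame) for frame in detections]
    for t in range(1, len(detections) - 1):
        in_both_neighbors = detections[t - 1] & detections[t + 1]
        recovered[t] |= in_both_neighbors
    return recovered

# Hypothetical per-frame detections; "bearing" is missed in frame 1.
frames = [
    {"girder", "bearing"},
    {"girder"},
    {"girder", "bearing"},
    {"bearing"},
]
fixed = recover_false_negatives(frames)
```

A real system would match instances by mask or box overlap across frames rather than by label alone, and the recovered detections could then serve as pseudo-labels for self-training once an inspector has reviewed them.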