12,934 research outputs found
Multiple Instance Learning: A Survey of Problem Characteristics and Applications
Multiple instance learning (MIL) is a form of weakly supervised learning
where training instances are arranged in sets, called bags, and a label is
provided for the entire bag. This formulation is gaining interest because it
naturally fits various problems and allows to leverage weakly labeled data.
Consequently, it has been used in diverse application fields such as computer
vision and document classification. However, learning from bags raises
important challenges that are unique to MIL. This paper provides a
comprehensive survey of the characteristics which define and differentiate the
types of MIL problems. Until now, these problem characteristics have not been
formally identified and described. As a result, the variations in performance
of MIL algorithms from one data set to another are difficult to explain. In
this paper, MIL problem characteristics are grouped into four broad categories:
the composition of the bags, the types of data distribution, the ambiguity of
instance labels, and the task to be performed. Methods specialized to address
each category are reviewed. Then, the extent to which these characteristics
manifest themselves in key MIL application areas are described. Finally,
experiments are conducted to compare the performance of 16 state-of-the-art MIL
methods on selected problem characteristics. This paper provides insight on how
the problem characteristics affect MIL algorithms, recommendations for future
benchmarking and promising avenues for research
Supervised semantic labeling of places using information extracted from sensor data
Indoor environments can typically be divided into places with different functionalities like corridors, rooms or doorways. The ability to learn such semantic categories from sensor data enables a mobile robot to extend the representation of the environment facilitating interaction with humans. As an example, natural language terms like âcorridorâ or âroomâ can be used to communicate the position of the robot in a map in a more intuitive way. In this work, we first propose an approach based on supervised learning to classify the pose of a mobile robot into semantic classes. Our method uses AdaBoost to boost simple features extracted from sensor range data into a strong classifier. We present two main applications of this approach. Firstly, we show how our approach can be utilized by a moving robot for an online classification of the poses traversed along its path using a hidden Markov model. In this case we additionally use as features objects extracted from images. Secondly, we introduce an approach to learn topological maps from geometric maps by applying our semantic classification procedure in combination with a probabilistic relaxation method. Alternatively, we apply associative Markov networks to classify geometric maps and compare the results with a relaxation approach. Experimental results obtained in simulation and with real robots demonstrate the effectiveness of our approach in various indoor environments
On Classification with Bags, Groups and Sets
Many classification problems can be difficult to formulate directly in terms
of the traditional supervised setting, where both training and test samples are
individual feature vectors. There are cases in which samples are better
described by sets of feature vectors, that labels are only available for sets
rather than individual samples, or, if individual labels are available, that
these are not independent. To better deal with such problems, several
extensions of supervised learning have been proposed, where either training
and/or test objects are sets of feature vectors. However, having been proposed
rather independently of each other, their mutual similarities and differences
have hitherto not been mapped out. In this work, we provide an overview of such
learning scenarios, propose a taxonomy to illustrate the relationships between
them, and discuss directions for further research in these areas
SkyDOT (Sky Database for Objects in the Time Domain): A Virtual Observatory for Variability Studies at LANL
The mining of Virtual Observatories (VOs) is becoming a powerful new method
for discovery in astronomy. Here we report on the development of SkyDOT (Sky
Database for Objects in the Time domain), a new Virtual Observatory, which is
dedicated to the study of sky variability. The site will confederate a number
of massive variability surveys and enable exploration of the time domain in
astronomy. We discuss the architecture of the database and the functionality of
the user interface. An important aspect of SkyDOT is that it is continuously
updated in near real time so that users can access new observations in a timely
manner. The site will also utilize high level machine learning tools that will
allow sophisticated mining of the archive. Another key feature is the real time
data stream provided by RAPTOR (RAPid Telescopes for Optical Response), a new
sky monitoring experiment under construction at Los Alamos National Laboratory
(LANL).Comment: to appear in SPIE proceedings vol. 4846, 11 pages, 5 figure
A Comparison of Multi-instance Learning Algorithms
Motivated by various challenging real-world applications, such as drug activity prediction and image retrieval, multi-instance (MI) learning has attracted considerable interest in recent years. Compared with standard supervised learning, the MI learning task is more difficult as the label information of each training example is incomplete. Many MI algorithms have been proposed. Some of them are specifically designed for MI problems whereas others have been upgraded or adapted from standard single-instance learning algorithms. Most algorithms have been evaluated on only one or two benchmark datasets, and there is a lack of systematic comparisons of MI learning algorithms.
This thesis presents a comprehensive study of MI learning algorithms that aims to compare their performance and find a suitable way to properly address different MI problems. First, it briefly reviews the history of research on MI learning. Then it discusses five general classes of MI approaches that cover a total of 16 MI algorithms. After that, it presents empirical results for these algorithms that were obtained from 15 datasets which involve five different real-world application domains. Finally, some conclusions are drawn from these results: (1) applying suitable standard single-instance learners to MI problems can often generate the best result on the datasets that were tested, (2) algorithms exploiting the standard asymmetric MI assumption do not show significant advantages over approaches using the so-called collective assumption, and (3) different MI approaches are suitable for different application domains, and no MI algorithm works best on all MI problems
- âŠ