Ankara : The Department of Computer Engineering and the Graduate School of Engineering and Science of Bilkent Univ., 2013.
Thesis (Master's) -- Bilkent University, 2013.
Includes bibliographical references (leaves 55-61).
The Multiple Instance Learning (MIL) paradigm is useful in many application
domains, and it is particularly suitable for computer vision problems
because manual labels are difficult to obtain. Multiple Instance Learning
methods are widely applicable to a variety of challenging learning problems
in computer vision, including object recognition and detection, tracking, image
classification, scene classification and more.
As opposed to working with single instances as in standard supervised learning,
Multiple Instance Learning operates over bags of instances. A bag is labeled
as positive if it is known to contain at least one positive instance; otherwise it
is labeled as negative. The overall learning task is to learn a model for some
concept using a training set that is formed of bags. A vital step in applying
Multiple Instance Learning to computer vision is designing the abstraction of the
visual problem into a multi-instance representation, which involves determining what
the bag is and what the instances in the bag are.
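The bag labeling rule described above can be illustrated with a minimal Python sketch. The names `bag_label` and `bag_score` and the toy bags are hypothetical and are not taken from the thesis. Scoring a bag by its highest-scoring instance (max-pooling) is a common MIL convention that is assumed here for illustration, not necessarily the formulation used in this work.

```python
from typing import Dict, List, Sequence


def bag_label(instance_labels: Sequence[bool]) -> bool:
    """Standard MIL rule: a bag is positive iff it has at least one positive instance."""
    return any(instance_labels)


def bag_score(instance_scores: Sequence[float]) -> float:
    """Assumed max-pooling rule: the bag score is the highest instance score."""
    return max(instance_scores)


# Toy bags (hypothetical data): each bag is a list of instance-level labels.
bags: Dict[str, List[bool]] = {
    "bag_a": [False, False, True],   # one positive instance
    "bag_b": [False, False],         # no positive instance
}

labels = {name: bag_label(instances) for name, instances in bags.items()}
```

In a vision setting, a bag might be an image and its instances candidate regions, or a bag might be a video and its instances frames. Only the bag-level label is given during training.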
In this context, we consider three different computer vision problems and
propose solutions for each of them via novel representations. The first problem
is image retrieval and re-ranking; we propose a method that automatically
constructs multiple candidate multi-instance bags, which are likely to contain
relevant images. The second problem we look into is recognizing actions from
still images, where we extract several candidate object regions and approach the
problem of identifying related objects from a weakly supervised point of view.
Finally, we address the recognition of human interactions in videos within a MIL
framework. In human interaction recognition, videos may be composed of frames
of different activities, and the task is to identify the interaction in spite of irrelevant
activities that are scattered through the video. To overcome this problem,
we use Multiple Instance Learning to handle irrelevant actions when classifying
the whole video sequence. The approach proposed for each problem is evaluated
on benchmark datasets and compared with the state of the art.
The experimental results verify the advantages of the proposed MIL approaches
to these vision problems.
Uyanık, Özge
M.S.