Common-Frame Model for Object Recognition
A generative probabilistic model for objects in images is presented. An object consists of a constellation of features. Feature appearance and pose are modeled probabilistically. Scene images are generated by drawing
a set of objects from a given database, with random clutter sprinkled on the remaining image surface. Occlusion is allowed.
We study the case where features from the same object share a common reference frame. Moreover, parameters for shape and appearance densities are shared across features. This is to be contrasted with previous work on probabilistic ‘constellation’ models where features depend on
each other, and each feature and model have different pose and appearance statistics [1, 2]. These two differences allow us to build models containing hundreds of features, as well as to train each model from a single example. Our model may also be thought of as a probabilistic revisitation of Lowe’s model [3, 4].
We propose an efficient entropy-minimization inference algorithm that constructs the best interpretation of a scene as a collection of objects and clutter. We test our ideas with experiments on two image databases. We
compare with Lowe’s algorithm and demonstrate better performance, in particular in the presence of large amounts of background clutter.
Contextual Bag-Of-Visual-Words and ECOC-Rank for Retrieval and Multi-class Object Recognition
Master's final project at UPC, carried out in collaboration with the Dept. of Applied Mathematics and Analysis, Universitat de Barcelona.
Multi-class object categorization is an important line of research in the Computer Vision
and Pattern Recognition fields. An artificial intelligence system is able to interact with its environment if it is able to distinguish among a set of cases, instances, situations, objects, etc. The world is inherently multi-class, and thus the efficiency
of a system can be determined by its accuracy in discriminating among a set of cases.
A recently applied procedure in the literature is the Bag-Of-Visual-Words (BOVW).
This methodology is based on natural language processing theory, where a set of
sentences is defined based on word frequencies. Analogously, in the pattern recognition
domain, an object is described based on the frequency of appearance of its parts.
However, a general drawback of this method is that the dictionary construction
does not take into account geometrical information about object parts. In order to
include parts relations in the BOVW model, we propose the Contextual BOVW
(C-BOVW), where the dictionary construction is guided by a geometrically based
merging procedure. As a result, objects are described as sentences where geometrical
information is implicitly considered.
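The plain BOVW description mentioned above can be illustrated with a minimal sketch. It assumes a vocabulary of visual words is already available and that each local descriptor is assigned to its nearest word by Euclidean distance; the function names `nearest_word` and `bovw_histogram` and the toy 2-D descriptors are hypothetical, and the geometrically guided dictionary merging of C-BOVW is not reproduced here.

```python
from collections import Counter
import math


def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary entry closest to the descriptor (Euclidean)."""
    return min(range(len(vocabulary)),
               key=lambda i: math.dist(descriptor, vocabulary[i]))


def bovw_histogram(descriptors, vocabulary):
    """Normalized frequency of each visual word among the descriptors."""
    if not descriptors:
        return [0.0] * len(vocabulary)
    counts = Counter(nearest_word(d, vocabulary) for d in descriptors)
    return [counts[i] / len(descriptors) for i in range(len(vocabulary))]


# Toy data (assumed for illustration): three visual words, four descriptors.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (4.8, 5.1)]
histogram = bovw_histogram(descriptors, vocabulary)
```

The resulting histogram is the "sentence" describing the object; note that it discards where each part was found, which is the drawback the contextual variant is meant to address.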
In order to extend the proposed system to the multi-class case, we used the
Error-Correcting Output Codes (ECOC) framework. State-of-the-art multi-class
techniques are frequently defined as an ensemble of binary classifiers. In this sense, the ECOC framework, based on error-correcting principles, has proven to be a powerful tool, able to handle a large number of classes while correcting classification errors produced by the individual learners.
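As a rough illustration of the error-correcting idea, the sketch below decodes the outputs of an ensemble of binary classifiers by choosing the class whose codeword is nearest in Hamming distance. The code matrix, the class names, and the function names `hamming` and `ecoc_decode` are assumptions made for the example, not the configuration used in this work.

```python
def hamming(a, b):
    """Number of positions where two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def ecoc_decode(outputs, code_matrix):
    """Return the class whose codeword is closest to the binary outputs."""
    return min(code_matrix,
               key=lambda label: hamming(outputs, code_matrix[label]))


# Hypothetical code matrix: one row per class, one column per binary learner.
# Every pair of rows differs in at least 3 positions, so a single wrong
# learner output can still be corrected.
code_matrix = {
    "car":   (+1, +1, +1, -1, -1),
    "horse": (+1, -1, -1, +1, -1),
    "plane": (-1, +1, -1, -1, +1),
}

# Outputs for a "car" sample where the third learner made a mistake.
outputs = (+1, +1, -1, -1, -1)
predicted = ecoc_decode(outputs, code_matrix)
```

Here the corrupted output vector is at distance 1 from the "car" codeword and distance 2 from the other two, so the single error is absorbed by the code.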
In our case, the C-BOVW sentences are learnt by means of an ECOC configuration, obtaining high discriminative power. Moreover, we used the ECOC outputs obtained by the new methodology to rank classes. In some situations, more than
one label is required to work with multiple hypotheses and find similar cases, such
as in the well-known retrieval problems. In this sense, we also included contextual
and semantic information to modify the ECOC outputs and defined an ECOC-rank methodology. By altering the ECOC output values according to the adjacency of
classes based on features and to class relations based on ontologies, we also reported a significant improvement in class-retrieval problems.
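Ranking classes from ECOC outputs can be sketched, under simplifying assumptions, as sorting all classes by the Hamming distance between the binary outputs and each codeword. The code matrix and the function name `rank_classes` are hypothetical, and the contextual and ontology-based adjustment of the outputs described above is not reproduced.

```python
def rank_classes(outputs, code_matrix):
    """Order class labels from the closest codeword to the farthest."""
    def distance(label):
        return sum(o != c for o, c in zip(outputs, code_matrix[label]))
    # sorted() is stable, so ties keep the insertion order of code_matrix.
    return sorted(code_matrix, key=distance)


# Hypothetical code matrix: one row per class, one column per binary learner.
code_matrix = {
    "car":   (+1, +1, +1, -1, -1),
    "horse": (+1, -1, -1, +1, -1),
    "plane": (-1, +1, -1, -1, +1),
}

outputs = (+1, -1, -1, +1, +1)
ranking = rank_classes(outputs, code_matrix)
```

Returning the full ordering rather than only the top label is what allows several hypotheses to be retrieved for one query.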