No Spare Parts: Sharing Part Detectors for Image Categorization
This work aims for image categorization using a representation of distinctive
parts. Different from existing part-based work, we argue that parts are
naturally shared between image categories and should be modeled as such. We
motivate our approach with a quantitative and qualitative analysis by
backtracking where selected parts come from. Our analysis shows that in
addition to the category parts defining the class, the parts coming from the
background context and parts from other image categories improve categorization
performance. Part selection should not be done separately for each category,
but instead be shared and optimized over all categories. To incorporate part
sharing between categories, we present an algorithm based on AdaBoost to
jointly optimize part sharing and selection, as well as fusion with the global
image representation. We achieve results competitive with the state of the art on
object, scene, and action categories, further improving over deep convolutional
neural networks.
Information extraction from multimedia web documents: an open-source platform and testbed
The LivingKnowledge project aimed to enhance the current state of the art in search, retrieval and knowledge management on the web by advancing the use of sentiment and opinion analysis within multimedia applications. To achieve this aim, a diverse set of novel and complementary analysis techniques has been integrated into a single but extensible software platform on which such applications can be built. The platform combines state-of-the-art techniques for extracting facts, opinions and sentiment from multimedia documents, and unlike earlier platforms, it exploits both visual and textual techniques to support multimedia information retrieval. Foreseeing the usefulness of this software in the wider community, the platform has been made generally available as an open-source project. This paper describes the platform design, gives an overview of the analysis algorithms integrated into the system and describes two applications that utilise the system for multimedia information retrieval.
Multi-view alignment with database of features for an improved usage of high-end 3D scanners
The usability of high-precision and high-resolution 3D scanners is of crucial importance due to the increasing demand for 3D data in both professional and general-purpose applications. Simplified, intuitive and rapid object modeling requires effective and automated alignment pipelines capable of tracing back each independently acquired range image of the scanned object into a common reference system. To this end, we propose a reliable and fast feature-based multiple-view alignment pipeline that allows interactive registration of multiple views according to an unchained acquisition procedure. A robust alignment of each new view is estimated with respect to the previously aligned data through fast extraction, representation and matching of feature points detected in overlapping areas from different views. The proposed pipeline guarantees a highly reliable alignment of dense range image datasets on a variety of objects in a few seconds per million points.
Computational Modelling of Information Gathering
This thesis describes computational modelling of information gathering behaviour under active inference, a framework for describing Bayes optimal behaviour. Under active inference, perception, attention and action all serve the same purpose: minimising variational free energy. Variational free energy is an upper bound on surprise, and minimising it maximises an agent's evidence for its survival. An agent achieves this by acquiring information (resolving uncertainty) about the hidden states of the world and using the acquired information to act on the outcomes it prefers. In this work I placed special emphasis on the resolution of uncertainty about the states of the world. I first created a visual search task called the scene construction task. In this task one needs to accumulate evidence for competing hypotheses (different visual scenes) through sequential sampling of a visual scene and categorise it once there is sufficient evidence. I showed that a computational agent attends to the most salient (epistemically valuable) locations in this task. Next, this task was performed by healthy humans. Healthy people's exploration strategies provided evidence for uncertainty driven exploration. I also showed how different exploratory behaviours can be characterised using canonical correlation analysis. In the next study I showed how exploration of a visual scene under different instructions could be explained by appealing to the computational mechanisms that may correspond to attention. This entailed manipulating the precision of task irrelevant cues and their hidden causes as a function of instructions. In the final work, I was interested in characterising impulsive behaviour using a patch leaving paradigm. By varying the parameters of the MDP model, I showed that there could be at least three distinct causes of impulsive behaviour, namely a lower depth of planning, a lower capacity to maintain and process information, and an increased perceived value of immediate rewards.