2,673 research outputs found
An information-theoretic framework for semantic-multimedia retrieval
This article is set in the context of searching text and image repositories by keyword. We develop a unified probabilistic framework for text, image, and combined text and image retrieval that is based on the detection of keywords (concepts) using automated image annotation technology. Our framework is deeply rooted in information theory and lends itself to use with other media types.
We estimate a statistical model in a multimodal feature space for each possible query keyword. The key element of our framework is to identify feature space transformations that make them comparable in complexity and density. We select the optimal multimodal feature space with a minimum description length criterion from a set of candidate feature spaces that are computed with the average-mutual-information criterion for the text part and hierarchical expectation maximization for the visual part of the data. We evaluate our approach in three retrieval experiments (only text retrieval, only image retrieval, and text combined with image retrieval), verify the framework’s low computational complexity, and compare with existing state-of-the-art ad-hoc models.
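The selection step in the abstract above can be illustrated with a minimal sketch. It assumes the common two-part approximation of description length (negative log-likelihood plus half the parameter count times the log of the sample size); the abstract does not state the exact criterion used, and the candidate names and fit statistics below are invented purely for illustration.

```python
import math

def description_length(log_likelihood, n_params, n_samples):
    """Two-part code length in nats: data cost plus parameter cost."""
    return -log_likelihood + 0.5 * n_params * math.log(n_samples)

def select_by_mdl(candidates, n_samples):
    """Return the name of the candidate with the shortest description length.

    `candidates` maps a name to a (log_likelihood, n_params) pair.
    """
    return min(
        candidates,
        key=lambda name: description_length(*candidates[name], n_samples),
    )

# Hypothetical candidate feature spaces with made-up fit statistics.
candidates = {
    "small": (-1500.0, 10),
    "medium": (-1380.0, 40),
    "large": (-1370.0, 200),
}
best = select_by_mdl(candidates, n_samples=1000)
print(best)
```

The largest candidate fits the data slightly better, but its parameter cost outweighs the gain, which is the trade-off a description length criterion is meant to capture.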
Component Evolution Analysis in Descriptor Graphs for Descriptor Ranking
This paper presents a method based on graph behaviour analysis for the evaluation of descriptor graphs (applied to image/video datasets) for descriptor performance analysis and ranking. Starting from the Erdős-Rényi model on uniform random graphs, the paper presents results of investigating random geometric graph behaviour in relation with the appearance of the giant component as a basis for ranking descriptors based on their clustering properties. We analyse the phase transition and the evolution of components in such graphs, and based on their behaviour, the corresponding descriptors are compared, ranked, and validated in retrieval tests. The goal is to build an evaluation framework where descriptors can be analysed for automatic feature selection.
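The appearance of the giant component mentioned above can be observed in a small simulation. This is a sketch for the plain Erdős-Rényi model G(n, p) only, not the paper's descriptor graphs or random geometric graphs; the function name, graph size, and mean degrees are illustrative assumptions.

```python
import random

def largest_component_fraction(n, p, rng):
    """Sample G(n, p) and return the share of nodes in its largest component."""
    parent = list(range(n))
    size = [1] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:  # include edge (i, j) with probability p
                ri, rj = find(i), find(j)
                if ri != rj:
                    if size[ri] < size[rj]:
                        ri, rj = rj, ri
                    parent[rj] = ri  # union by size
                    size[ri] += size[rj]
    return max(size[find(i)] for i in range(n)) / n

rng = random.Random(0)
n = 400
for c in (0.5, 1.0, 1.5, 3.0):
    frac = largest_component_fraction(n, c / n, rng)
    print(f"mean degree {c:.1f}: largest component covers {frac:.2f} of nodes")
```

In this model the largest component stays small while the mean degree is below 1 and covers a large share of the nodes once the mean degree is well above 1, which is the phase transition the abstract refers to.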
Using contour information and segmentation for object registration, modeling and retrieval
This thesis considers different aspects of the utilization of contour information and syntactic and semantic image segmentation for object registration, modeling and retrieval in the context of content-based indexing and retrieval in large collections of images. Target applications include retrieval in collections of closed silhouettes, holistic word recognition in handwritten historical manuscripts and shape registration. Also, the thesis explores the feasibility of contour-based syntactic features for improving the correspondence of the output of bottom-up segmentation to semantic objects present in the scene and discusses the feasibility of different strategies for image analysis utilizing contour information, e.g. segmentation driven by visual features versus segmentation driven by shape models or semi-automatic segmentation in selected application scenarios.
There are three contributions in this thesis. The first contribution considers structure analysis based on the shape and spatial configuration of image regions (so-called syntactic visual features) and their utilization for automatic image segmentation. The second contribution is the study of novel shape features, matching algorithms and similarity measures. Various applications of the proposed solutions are presented throughout the thesis, providing the basis for the third contribution, which is a discussion of the feasibility of different recognition strategies utilizing contour information. In each case, the performance and generality of the proposed approach have been analyzed through extensive, rigorous experimentation on test collections that are as large as possible.
3D shape matching and registration: a probabilistic perspective
Dense correspondence is a key area in computer vision and medical image analysis. It has applications in registration and shape analysis. In this thesis, we develop a technique to recover dense correspondences between the surfaces of neuroanatomical objects over heterogeneous populations of individuals. We recover dense correspondences based on 3D shape matching, which is formulated under the framework of Markov Random Fields (MRFs). We represent the surfaces of neuroanatomical objects as genus zero voxel-based meshes. The surface meshes are projected into a Markov random field space. The projection carries both geometric and topological information in terms of Gaussian curvature and mesh neighbourhood from the original space to the random field space. Gaussian curvature is projected to the nodes of the MRF, and the mesh neighbourhood structure is projected to the edges. 3D shape matching between two surface meshes is then performed by solving an energy function minimisation problem formulated with MRFs. The outcome of the 3D shape matching is dense point-to-point correspondences. However, the minimisation of the energy function is NP-hard. We therefore use belief propagation to perform the probabilistic inference for 3D shape matching. A sparse update loopy belief propagation algorithm adapted to 3D shape matching is proposed to obtain an approximate global solution for the 3D shape matching problem. The sparse update loopy belief propagation algorithm demonstrates a significant efficiency gain compared to standard belief propagation. Analyses of the computational complexity and convergence properties of the sparse update loopy belief propagation algorithm are also conducted in the thesis. We also investigate randomised algorithms to minimise the energy function. In order to enhance the shape matching rate and increase the inlier support set, we propose a novel clamping technique.
The clamping technique is realized by combining the loopy belief propagation message updating rule with the feedback from 3D rigid body registration. By using this clamping technique, the correct shape matching rate is increased significantly. Finally, we investigate 3D shape registration techniques based on the 3D shape matching result. Based on the point-to-point dense correspondences obtained from the 3D shape matching, a three-point based transformation estimation technique is combined with the RANdom SAmple Consensus (RANSAC) algorithm to obtain the inlier support set. The global registration approach is purely dependent on point-wise correspondences between two meshed surfaces. It has the advantage that the need for orientation initialisation is eliminated and that it applies to all shapes of spherical topology. The experiments compare our MRF-based 3D registration approach with a state-of-the-art registration algorithm, the first order ellipsoid template. They show dense correspondences for pairs of hippocampi from two different data sets, each comprising around 20 healthy individuals aged 60 or over.
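The combination of a three-point transformation estimate with RANSAC described above can be sketched as follows. The abstract does not specify how the transformation is computed from three points; this sketch assumes one simple option, aligning orthonormal frames built from each triple, and every function name, tolerance, and the synthetic test data are hypothetical.

```python
import math
import random

def sub(a, b):
    return [a[i] - b[i] for i in range(3)]

def dot(a, b):
    return sum(a[i] * b[i] for i in range(3))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def normalize(v):
    norm = math.sqrt(dot(v, v))
    if norm < 1e-12:
        raise ValueError("degenerate (coincident or collinear) points")
    return [x / norm for x in v]

def frame(p0, p1, p2):
    """Right-handed orthonormal axes attached to a non-collinear triple."""
    e1 = normalize(sub(p1, p0))
    e3 = normalize(cross(e1, sub(p2, p0)))
    return [e1, cross(e3, e1), e3]

def transform(R, t, p):
    """Apply the rigid motion p -> R p + t."""
    return [dot(R[i], p) + t[i] for i in range(3)]

def rigid_from_triple(src3, dst3):
    """Rigid motion that carries the source triple's frame onto the target's."""
    a, b = frame(*src3), frame(*dst3)
    # R = sum_k b_k a_k^T sends each source axis a_k to the target axis b_k.
    R = [[sum(b[k][i] * a[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    t = sub(dst3[0], transform(R, [0.0, 0.0, 0.0], src3[0]))
    return R, t

def ransac_rigid(src, dst, iterations=200, tol=1e-3, seed=0):
    """Keep the three-point model supported by the most correspondences."""
    rng = random.Random(seed)
    idx = range(len(src))
    best_model, best_inliers = None, []
    for _ in range(iterations):
        i, j, k = rng.sample(idx, 3)
        try:
            R, t = rigid_from_triple([src[i], src[j], src[k]],
                                     [dst[i], dst[j], dst[k]])
        except ValueError:
            continue  # skip degenerate samples
        inliers = [m for m in idx
                   if math.dist(transform(R, t, src[m]), dst[m]) < tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (R, t), inliers
    return best_model, best_inliers

# Synthetic correspondences: a known rigid motion with every fifth match corrupted.
data_rng = random.Random(1)
src = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(30)]
c, s = math.cos(0.7), math.sin(0.7)
R_true = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
t_true = [0.5, -0.2, 1.0]
dst = [transform(R_true, t_true, p) for p in src]
for m in range(0, 30, 5):
    dst[m] = [data_rng.uniform(-1, 1) for _ in range(3)]

model, inliers = ransac_rigid(src, dst)
print(len(inliers), "of", len(src), "correspondences support the best model")
```

Because each hypothesis is built from correspondences alone, no initial orientation guess is required, which matches the advantage claimed in the abstract.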
Brain network mechanisms in learning behavior
The study of learning has been a central focus of psychology and neuroscience since their inception. Cognitive neuroscience’s traditional approach to understanding learning has been to decompose it into discrete cognitive processes with separable and localized underlying neural systems. While this focus on modular cognitive functions for individual brain areas has led to considerable progress, there is increasing evidence that much of learning behavior relies on overlapping cognitive and neural systems, which may be harder to disentangle than previously envisioned. This is not surprising, as the processes underlying learning must involve widespread integration of information from sensory, affective, and motor sources. The standard tools of cognitive neuroscience limit our ability to describe processes that rely on widespread coordination of brain activity. To understand learning, it will be necessary to characterize dynamic co-activation at the circuit level.
In this dissertation, I present three studies that seek to describe the roles of distributed brain networks in learning. I begin by giving an overview of our current understanding of multiple forms of learning, describing the neural and computational mechanisms thought to underlie incremental feedback-based learning and flexible episodic memory. I will focus in particular on the difficulties in separating these processes at the cognitive level and in localizing them to individual regions at the neural level. I will then describe recent findings that have begun to characterize the brain’s large-scale network structure, emphasizing the potential roles that distributed networks could play in understanding learning and cognition more generally. I will end the introduction by reviewing current attempts to characterize the dynamics of large-scale brain networks, which will be essential for providing a mechanistic link to learning behavior.
Chapter 2 is a study demonstrating that intrinsic connectivity between the hippocampus and the ventromedial prefrontal cortex, as well as between these regions and distributed brain networks, is related to individual differences in the transfer of learning on a sensory preconditioning task. The hippocampus and ventromedial prefrontal cortex have both been shown to be involved in this type of learning, and this study represents an early attempt to link connectivity between individual regions and broader networks to learning processes.
Chapter 3 is a study that takes advantage of recent developments in mathematical modeling of temporal networks to demonstrate a relationship between large-scale network dynamics and reinforcement learning within individuals. This study shows that the flexibility of network connectivity in the striatum is related to learning performance over time, as well as to individual differences in parameters estimated from computational models of reinforcement learning. Notably, connectivity between the striatum and visual as well as orbitofrontal regions increased over the course of the task, which is consistent with an integrative role for the region in learning value-based associations. Network flexibility in a distinct set of regions is associated with episodic memory for object images presented during the learning task.
Chapter 4 examines the role of dopamine, a neurotransmitter strongly linked to value updating in reinforcement learning, in the dynamic network changes occurring during learning. Patients with Parkinson’s disease, who experience a loss of dopaminergic neurons in the substantia nigra, performed a reversal-learning task while undergoing functional magnetic resonance imaging. Patients were scanned on and off of a dopamine precursor medication (levodopa) in a within-subject design in order to examine the impact of dopamine on brain network dynamics during learning. The reversal provided an experimental manipulation of dynamic connectivity, and patients on medication showed greater modulation of striatal-cortical connectivity. Similar results were found in a number of regions receiving midbrain projections including the prefrontal cortex and medial temporal lobe. This study indicates that dopamine inputs from the midbrain modulate large-scale network dynamics during learning, providing a direct link between reinforcement learning theories of value updating and network neuroscience accounts of dynamic connectivity.
Together, these results indicate that large-scale networks play a critical role in multiple forms of learning behavior. Each highlights the potential importance of understanding dynamic routing and integration of information across large-scale circuits for our conception of learning and other cognitive processes. Understanding the when, where, and how of this information flow in the brain may provide an alternative or complement to traditional theories of distinct learning systems. These studies also illustrate challenges in integrating this perspective with established theories in cognitive neuroscience. Chapter 5 will situate the studies in a broader discussion of how brain activity relates to cognition in general, while pointing out current roadblocks and potential ways forward for a cognitive network neuroscience of learning.
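The abstract above does not define network flexibility. As a rough illustration, the sketch below assumes a definition commonly used in work on temporal networks: the fraction of consecutive time windows in which a node changes its community assignment. It also assumes that community labels are already comparable across windows; the function name and the example labels are hypothetical.

```python
def node_flexibility(assignments):
    """Fraction of consecutive windows in which each node switches community.

    `assignments[t][i]` is the community label of node i in time window t.
    """
    n_windows = len(assignments)
    if n_windows < 2:
        raise ValueError("need at least two time windows")
    n_nodes = len(assignments[0])
    return [
        sum(assignments[t][i] != assignments[t + 1][i]
            for t in range(n_windows - 1)) / (n_windows - 1)
        for i in range(n_nodes)
    ]

# Hypothetical labels for 3 nodes across 4 time windows.
labels = [
    [0, 0, 1],
    [0, 1, 1],
    [0, 0, 1],
    [0, 1, 1],
]
flex = node_flexibility(labels)
print(flex)
```

Under this definition a node that never changes community has flexibility 0 and a node that changes at every step has flexibility 1, so per-region values can be related to behavioral measures such as learning performance.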
Content-based retrieval of visual information
In this dissertation, I investigate new approaches relevant to content-based image retrieval techniques. First, the MOD paradigm is proposed, a method for detecting salient points in images. These salient points are specifically designed to enhance image retrieval accuracy by maximizing distinctiveness. Second, the multi-dimensional maximum likelihood similarity measure is presented, which removes a critical limitation in prior research in this area and provides an improved method of comparing image features. Third, a texture classification method based on low-dimensional constructed texture features is introduced, which has very low computational complexity and would be suitable for real-time video understanding or interactive search of very large image databases. The new approaches are tested on well-respected international test sets containing representative imagery.
Physics based supervised and unsupervised learning of graph structure
Graphs are central tools to aid our understanding of biological, physical, and social systems. Graphs also play a key role in representing and understanding the visual world around us, 3D shapes and 2D images alike. In this dissertation, I propose the use of physical or natural phenomena to understand graph structure. I investigate four phenomena or laws in nature: (1) Brownian motion, (2) Gauss's law, (3) feedback loops, and (4) neural synapses, to discover patterns in graphs.
Cloud-Based Benchmarking of Medical Image Analysis
Medical imagin
Recuperação multimodal e interativa de informação orientada por diversidade (Multimodal and interactive diversity-oriented information retrieval)
Advisor: Ricardo da Silva Torres. Thesis (doctorate), Universidade Estadual de Campinas, Instituto de Computação.
Abstract: Information retrieval methods, especially those for multimedia data, have evolved towards integrating multiple sources of evidence when analyzing the relevance of items for a given user search task.
In this context, to attenuate the semantic gap between low-level features extracted from the content of digital objects and high-level semantic concepts (objects, categories, etc.), and to make systems adaptive to different user needs, interactive models have brought the user closer to the retrieval loop, allowing user-system interaction mainly through implicit or explicit relevance feedback. Analogously, diversity promotion has emerged as an alternative for tackling ambiguous or underspecified queries. Additionally, several works have addressed the issue of minimizing the user effort required to provide relevance assessments while keeping an acceptable overall effectiveness. This thesis discusses, proposes, and experimentally analyzes multimodal and interactive diversity-oriented information retrieval methods. This work comprehensively covers the interactive information retrieval literature and also discusses recent advances, the great research challenges, and promising research opportunities. We have proposed and evaluated two relevance-diversity trade-off enhancement workflows, which integrate multiple sources of information about images, such as visual features, textual metadata, geographic information, and user credibility descriptors. In turn, as an integration of interactive retrieval and diversity promotion techniques, for maximizing the coverage of multiple query interpretations/aspects and speeding up the information transfer between the user and the system, we have proposed and evaluated a multimodal learning-to-rank method trained with relevance feedback over diversified results. Our experimental analysis shows that the joint usage of multiple information sources positively impacted the relevance-diversity balancing algorithms. Our results also suggest that the integration of multimodal-relevance-based filtering and reranking was effective in improving result relevance and also boosted diversity promotion methods.
Beyond that, through a thorough experimental analysis, we have investigated several research questions related to the possibility of improving result diversity while keeping or even improving relevance in interactive search sessions. Moreover, we analyze how much the diversification effort affects overall search session results and how different diversification approaches behave for the different data modalities. By analyzing the overall and per-feedback-iteration effectiveness, we show that introducing diversity may harm initial results, whereas it significantly enhances the overall session effectiveness, considering not only relevance and diversity but also how early the user is exposed to the same amount of relevant items and diversity.
Doctorate; Computer Science; Doctor in Computer Science; P-4388/2010; 140977/2012-0; CAPES; CNP
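The trade-off between relevance and diversity discussed in this abstract can be illustrated with maximal marginal relevance (MMR), a well-known greedy re-ranking heuristic. MMR is used here only as a stand-in; the abstract does not say which balancing algorithms the thesis uses, and the function name, scores, and similarity values below are hypothetical.

```python
def mmr_rerank(relevance, similarity, k, trade_off=0.5):
    """Greedy maximal-marginal-relevance ordering of item indices.

    relevance[i]: relevance of item i to the query.
    similarity[i][j]: similarity between items i and j, in [0, 1].
    trade_off: 1.0 ranks purely by relevance; lower values penalize
    items that resemble something already selected.
    """
    remaining = set(range(len(relevance)))
    selected = []
    while remaining and len(selected) < k:
        def score(i):
            redundancy = max((similarity[i][j] for j in selected), default=0.0)
            return trade_off * relevance[i] - (1.0 - trade_off) * redundancy
        # Iterate in index order so that ties resolve deterministically.
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Items 0 and 1 are near-duplicates; item 2 is less relevant but different.
relevance = [0.9, 0.85, 0.6]
similarity = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
diverse = mmr_rerank(relevance, similarity, k=3, trade_off=0.5)
by_relevance = mmr_rerank(relevance, similarity, k=3, trade_off=1.0)
print(diverse, by_relevance)
```

With the balanced setting the dissimilar item is promoted above the near-duplicate, which shows how diversification can lower the relevance of early results while widening the coverage of query interpretations.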