14 research outputs found
Matching sets of features for efficient retrieval and recognition
Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. Includes bibliographical references (p. 145-153).
In numerous domains it is useful to represent a single example by the collection of local features or parts that comprise it. In computer vision in particular, local image features are a powerful way to describe images of objects and scenes. Their stability under variable image conditions is critical for success in a wide range of recognition and retrieval applications. However, many conventional similarity measures and machine learning algorithms assume vector inputs. Comparing and learning from images represented by sets of local features is therefore challenging, since each set may vary in cardinality and its elements lack a meaningful ordering. In this thesis I present computationally efficient techniques to handle comparisons, learning, and indexing with examples represented by sets of features. The primary goal of this research is to design and demonstrate algorithms that can effectively accommodate this useful representation in a way that scales with both the representation size and the number of images available for indexing or learning. I introduce the pyramid match algorithm, which efficiently forms an implicit partial matching between two sets of feature vectors. The matching has a linear time complexity, naturally forms a Mercer kernel, and is robust to clutter or outlier features, a critical advantage for handling images with variable backgrounds, occlusions, and viewpoint changes. I provide bounds on the expected error relative to the optimal partial matching. For very large databases, even extremely efficient pairwise comparisons may not offer adequately responsive query times. I show how to perform sub-linear time retrievals under the matching measure with randomized hashing techniques, even when input sets have varying numbers of features.
My results are focused on several important vision tasks, including applications to content-based image retrieval, discriminative classification for object recognition, kernel regression, and unsupervised learning of categories. I show how the dramatic increase in performance enables accurate and flexible image comparisons to be made on large-scale data sets, and removes the need to artificially limit the number of local descriptions used per image when learning visual categories.
by Kristen Lorraine Grauman. Ph.D.
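To illustrate the idea of an implicit partial matching between two feature sets, the following is a minimal sketch of a pyramid-style match score. It is not taken from the thesis: the function names, the choice of bin widths that double at each level, the weight 1/2^level for matches first found at a given level, and the assumption of non-negative coordinates below `diameter` are all illustrative assumptions.

```python
from collections import Counter


def histogram(points, side):
    """Count points per grid cell when every axis is cut into bins of width `side`."""
    return Counter(tuple(int(c // side) for c in p) for p in points)


def intersection(h1, h2):
    """Histogram intersection: how many points can be paired within shared cells."""
    return sum(min(count, h2[cell]) for cell, count in h1.items())


def pyramid_match(x, y, diameter):
    """Sum, over ever coarser grids, of newly matched points weighted by 1 / 2**level.

    Assumes (hypothetically) non-negative coordinates smaller than `diameter`.
    """
    score, previous, side, level = 0.0, 0, 1, 0
    while True:
        matched = intersection(histogram(x, side), histogram(y, side))
        score += (matched - previous) / (2 ** level)  # only matches new at this level
        previous = matched
        if side >= diameter:
            break
        side *= 2
        level += 1
    return score
```

Each level needs one pass over both sets, and no explicit pairing of features is ever built. Points that only meet in coarse cells contribute less to the score, and unmatched extras in the larger set contribute nothing.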
Doctor of Philosophy dissertation
Diffusion magnetic resonance imaging (dMRI) has become a popular technique to detect brain white matter structure. However, imaging noise, imaging artifacts, and modeling techniques create many uncertainties, which may generate misleading information for further analysis or applications, such as surgical planning. How to analyze, effectively visualize, and reduce these uncertainties has therefore become an important research question. In this dissertation, we present both rank-k decomposition and direct decomposition approaches based on spherical deconvolution to decompose the fiber directions more accurately for high angular resolution diffusion imaging (HARDI) data, which reduces the uncertainties of the fiber directions. By applying volume rendering techniques to an ensemble of 3D orientation distribution function (ODF) glyphs, which we call SIP functions of diffusion shapes, one can elucidate the complex heteroscedastic structural variation in these local diffusion shapes. Furthermore, we quantify the extent of this variation by measuring the fraction of the volume of these shapes that is consistent across all noise levels, which we call the certain volume ratio. To better understand the uncertainties in white matter fiber tracks, we propose three metrics to quantify the differences between the results of diffusion tensor magnetic resonance imaging (DT-MRI) fiber tracking algorithms: the area between corresponding fibers of each bundle, the Earth Mover's Distance (EMD) between two fiber bundle volumes, and the current distance between two fiber bundle volumes. Based on these metrics, we discuss an interactive fiber track comparison visualization toolkit we have developed to visualize these uncertainties more efficiently. Physical phantoms, with high repeatability and reproducibility, are also designed in the hope of validating the dMRI techniques.
In summary, this dissertation provides a better understanding of uncertainties in diffusion magnetic resonance imaging: Where are the uncertainties, and how large are they? How do we reduce these uncertainties? How can we validate our algorithms?
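As a rough illustration of the Earth Mover's Distance mentioned among the metrics, the sketch below computes it for the simplest possible case: two one-dimensional histograms over the same equally spaced bins. This is a deliberate simplification and not the dissertation's method, which compares fiber bundle volumes. The function name `emd_1d` and the normalization to unit mass are assumptions made for the example.

```python
from itertools import accumulate


def emd_1d(p, q):
    """Earth Mover's Distance between two histograms on the same unit-spaced bins.

    In one dimension, with both histograms normalized to unit mass, the
    distance equals the sum of absolute differences of the cumulative sums.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("histograms must have positive total mass")
    cdf_p = accumulate(v / total_p for v in p)
    cdf_q = accumulate(v / total_q for v in q)
    return sum(abs(a - b) for a, b in zip(cdf_p, cdf_q))
```

For example, moving all mass from the first of three bins to the last costs two bin widths, so `emd_1d([1, 0, 0], [0, 0, 1])` evaluates to 2.0. Higher-dimensional cases generally require solving a transportation problem instead of this closed form.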
Análise de distribuições de distâncias entre palavras genómicas
The investigation of DNA has been one of the most developed areas of
research in this and in the last century. However, there is a long way to go
to fully understand the DNA code. With the growing volume of sequenced DNA
data, mathematical methods play an important role in addressing the need
for efficient quantitative techniques for the detection of regions of interest
and overall characteristics in these sequences.
A feature of interest in the study of genomic words is their spatial distribution
along a DNA sequence, which can be characterized by the distances between
words. Counting such distances provides discrete distributions that may
be analyzed from a statistical point of view. In this work we explore the
distances between genomic words as a mathematical descriptor of DNA
sequences. The main goal is to design, develop and apply statistical methods
specially designed for their distributions, in order to capture information
about the primary and secondary structure of DNA.
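The notion of an inter-word distance distribution can be sketched as follows. This is only an illustration under stated assumptions: distance is taken here as the difference between the start positions of consecutive occurrences of a word, overlapping occurrences are counted, and the function names are hypothetical; the thesis may define these details differently.

```python
from collections import Counter


def inter_word_distances(sequence, word):
    """Distances between start positions of consecutive occurrences of `word`."""
    positions = []
    start = sequence.find(word)
    while start != -1:
        positions.append(start)
        start = sequence.find(word, start + 1)  # advance by 1 so overlaps are found
    return [b - a for a, b in zip(positions, positions[1:])]


def distance_distribution(sequence, word):
    """Empirical relative frequency of each observed inter-word distance."""
    distances = inter_word_distances(sequence, word)
    if not distances:
        return {}
    total = len(distances)
    return {d: n / total for d, n in sorted(Counter(distances).items())}
```

For the toy sequence "ACGACGTTACG" and the word "ACG", the occurrences start at positions 0, 3 and 8, so the distances are 3 and 5, each with relative frequency 0.5.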
The characterization of empirical inter-word distance distributions involves
the problem of the exponential increase in the number of distributions
as the word length increases, leading to the need for data reduction.
Moreover, if the data can be validly clustered, the class labels may provide
a meaningful description of similarities and differences between sets of
distributions. Therefore, we explore the potential of inter-word distance
distributions to obtain a word clustering able to highlight similar patterns
of word distributions as well as summarized characteristics of each set of
distributions.
With the aim of performing comparative studies between genomic sequences
and defining species signatures, we deduce exact distributions of inter-word
distances under random scenarios. Based on these theoretical distributions,
we define genomic signatures of species able to discriminate between species
and to capture their evolutionary relation. We presume that the study of
distribution similarities and the clustering procedure allow identifying words
whose distance distribution strongly differs from a reference distribution or
from the global behaviour of the majority of the words. One of the key topics
of our research focuses on the establishment of procedures that capture
distance distributions with atypical behaviours, herein referred to as atypical
distributions.
In the genomic context, words with an atypical distance distribution may
be related to some biological function (motifs). We expect that our
results may be used to provide some sort of classification of sequences,
identifying evolutionary patterns and allowing for the prediction of functional
properties, thereby contributing to the advancement of knowledge about
DNA sequences.
DNA research has been one of the most developed areas of this century and
the last. The growing number of sequenced genomes has demanded more
efficient quantitative techniques for identifying general and specific
characteristics of genomic sequences, and mathematical methods play an
important role in meeting that need.
A feature of particular interest in the study of genomic words is their
spatial distribution along DNA sequences, which can be characterized by
the distances between words. Counting these distances yields discrete
distributions amenable to statistical analysis. In this work, we explore
inter-word distances as a mathematical descriptor of DNA sequences, with
the aim of designing and developing statistical procedures specially
conceived for the study of their distributions.
The characterization of empirical distance distributions between genomic
words involves the problem of exponential growth in the number of
distributions as the word length increases, creating the need for data
reduction. Moreover, if the data can be validly grouped into classes, the
class representatives provide relevant information about similarities and
differences between the groups of distributions. We therefore explore the
potential of distance distributions to obtain a word clustering that groups
similar distance patterns and highlights the characteristics of each group.
With a view to the comparative study of genomic sequences and the
definition of species signatures, we focus on developing theoretical models
that describe inter-word distance distributions under random scenarios.
These models are used to define genomic signatures capable of
discriminating between species and of recovering the evolutionary
relationships among them. We presume that the study of similarities and
the cluster analysis of the distributions make it possible to identify words
whose distribution departs strongly from a reference distribution or from
the global behaviour of the majority of the words. One of the main research
topics focuses on detecting distributions with abnormal behaviour, referred
to here as atypical distributions.
In the genomic context, words with atypical distance distributions may be
related to some biological function (motifs). We hope that the results
obtained can be used to provide some form of sequence classification,
identifying evolutionary patterns and enabling the prediction of functional
properties, thus representing an additional step in building knowledge
about DNA sequences.
Programa Doutoral em Matemátic
A combined experimental and computational approach to investigate emergent network dynamics based on large-scale neuronal recordings
Sviluppo di un approccio integrato computazionale-sperimentale per lo studio di reti neuronali mediante registrazioni elettrofisiologich
Robust density modelling using the student's t-distribution for human action recognition
The extraction of human features from videos is often inaccurate and prone to outliers. Such outliers can severely affect density modelling when the Gaussian distribution is used as the model, since it is highly sensitive to outliers. The Gaussian distribution is also often used as the base component of graphical models for recognising human actions in videos (hidden Markov models and others), and the presence of outliers can significantly affect the recognition accuracy. In contrast, the Student's t-distribution is more robust to outliers and can be exploited to improve the recognition rate in the presence of abnormal data. In this paper, we present an HMM which uses mixtures of t-distributions as observation probabilities, and report experiments on two well-known datasets (Weizmann, MuHAVi) that show a remarkable improvement in classification accuracy. © 2011 IEEE
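The robustness argument can be made concrete by comparing log-densities at an outlying value. The sketch below is not the paper's model, which uses mixtures of t-distributions inside an HMM. It only evaluates the standard univariate Gaussian and Student's t log-densities, and the function names and the choice of 3 degrees of freedom in the usage note are assumptions for illustration.

```python
import math


def gaussian_logpdf(x, mu=0.0, sigma=1.0):
    """Log-density of a univariate Gaussian with mean `mu` and scale `sigma`."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi)


def student_t_logpdf(x, nu, mu=0.0, sigma=1.0):
    """Log-density of a univariate Student's t with `nu` degrees of freedom."""
    z = (x - mu) / sigma
    return (
        math.lgamma((nu + 1) / 2)
        - math.lgamma(nu / 2)
        - 0.5 * math.log(nu * math.pi)
        - math.log(sigma)
        - (nu + 1) / 2 * math.log1p(z * z / nu)
    )
```

At a point six scale units from the centre, `gaussian_logpdf(6.0)` is far lower than `student_t_logpdf(6.0, nu=3)`. The Gaussian penalty grows quadratically with distance while the t penalty grows only logarithmically, so a single outlier pulls a fitted t model much less than it pulls a fitted Gaussian.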
Localized matching using Earth Mover's Distance towards discovery of common patterns from small image samples
This paper proposes a new approach for the discovery of common patterns in a small set of images by region matching. Issues of feature robustness, matching robustness, and noise artifacts are addressed to explore the potential of using regions as the basic matching unit. We employ a many-to-many (M2M) matching strategy, specifically the Earth Mover’s Distance (EMD), to increase resilience to the structural inconsistency caused by improper region segmentation. However, the matching pattern of M2M is dispersed and unregulated in nature, which makes it challenging to mine a common pattern while identifying the underlying transformation. To avoid analysis of unregulated matching, we propose localized matching for the collaborative mining of common patterns from multiple images. The patterns are refined iteratively using the expectation-maximization algorithm by taking advantage of the ‘crowding’ phenomenon in the EMD flows. Experimental results show that our approach can handle images with significant image noise and background clutter. To illustrate the potential of Common Pattern Discovery (CPD), we further use image retrieval as an example to show the application of CPD for pattern learning in relevance feedback.
Mobile Robots Navigation
Mobile robot navigation comprises several interrelated activities: (i) perception, obtaining and interpreting sensory information; (ii) exploration, the strategy that guides the robot in selecting the next direction to go; (iii) mapping, the construction of a spatial representation from the sensory information perceived; (iv) localization, the strategy for estimating the robot's position within the spatial map; (v) path planning, the strategy for finding a path towards a goal location, whether optimal or not; and (vi) path execution, where motor actions are determined and adapted to environmental changes. The book addresses these activities by integrating results from the research work of several authors from all over the world. Research cases are documented in 32 chapters organized into 7 categories, described next