A transversal approach for patch-based label fusion via matrix completion
Recently, multi-atlas patch-based label fusion has received increasing interest in the medical image segmentation field. After warping the anatomical labels from the atlas images to the target image by registration, label fusion is the key step to determine the latent label for each target image point. Two popular types of patch-based label fusion approaches are (1) reconstruction-based approaches that compute the target labels as a weighted average of atlas labels, where the weights are derived by reconstructing the target image patch using the atlas image patches; and (2) classification-based approaches that determine the target label as a mapping of the target image patch, where the mapping function is often learned using the atlas image patches and their corresponding labels. Both approaches have their advantages and limitations. In this paper, we propose a novel patch-based label fusion method that combines the above two types of approaches via matrix completion (hence we call it transversal). As we will show, our method overcomes the individual limitations of both reconstruction-based and classification-based approaches. Since the labeling confidence may vary across the target image points, we further propose a sequential labeling framework that first labels the highly confident points and then gradually labels more challenging points in an iterative manner, guided by the label information determined in the previous iterations. We demonstrate the performance of our novel label fusion method in segmenting the hippocampus in the ADNI dataset, subcortical and limbic structures in the LONI dataset, and mid-brain structures in the SATA dataset. We achieve more accurate segmentation results than both reconstruction-based and classification-based approaches. Our label fusion method is also ranked 1st in the online SATA Multi-Atlas Segmentation Challenge.
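To make the reconstruction-based family described in this abstract concrete, here is a minimal sketch of weighted label voting for a single target point. It is not the matrix-completion method the paper proposes. As a simplifying assumption, the weights come from a Gaussian similarity between patches rather than from solving a patch reconstruction problem, and the names `fuse_label` and `sigma` are hypothetical.

```python
import math

def fuse_label(target_patch, atlas_patches, atlas_labels, sigma=1.0):
    """Pick the label with the largest similarity-weighted vote."""
    votes = {}
    for patch, label in zip(atlas_patches, atlas_labels):
        # Squared Euclidean distance between the target and atlas patches.
        sq_dist = sum((t - a) ** 2 for t, a in zip(target_patch, patch))
        # Assumed weighting: Gaussian kernel on patch distance.
        weight = math.exp(-sq_dist / (2.0 * sigma ** 2))
        votes[label] = votes.get(label, 0.0) + weight
    return max(votes, key=votes.get)

# Toy data: flattened intensity patches and the atlas label at each patch centre.
target = [0.9, 1.0, 1.1]
atlas_patches = [[1.0, 1.0, 1.0], [0.0, 0.1, 0.0], [0.8, 1.1, 1.0]]
atlas_labels = [1, 0, 1]
fused = fuse_label(target, atlas_patches, atlas_labels)
```

Two of the three atlas patches closely resemble the target and both carry label 1, so their combined weight dominates the vote.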
Sparse and low rank approximations for action recognition
Action recognition is a crucial area of research in computer vision with a wide range of
applications in surveillance, patient-monitoring systems, video indexing,
Human-Computer Interaction and many more. These applications require automated
action recognition. Robust classification methods remain sought after despite influential
research in this field over the past decade. The available data have grown
tremendously owing to the digital revolution, far beyond the meagre resources
of the past. The main limitation on a system
when dealing with video data is the computational burden due to large dimensions
and data redundancy. Sparse and low rank approximation methods, which aim at a
concise and meaningful representation of data, have evolved recently. This thesis
explores the application of sparse and low rank approximation methods in the
context of video data classification, with the following contributions.
1. An approach for solving the problem of action and gesture classification is
proposed within the sparse representation domain, effectively dealing with
large feature dimensions.
2. A low rank matrix completion approach is proposed to jointly classify more
than one action.
3. Deep features are proposed for robust classification of multiple actions
within a matrix completion framework that can handle data deficiencies.
This thesis starts with the applicability of sparse-representation-based
classification methods to the problem of action and gesture recognition. Random
projection is used to reduce the dimensionality of the features. These are referred
to as compressed features in this thesis. The dictionary formed with compressed
features has proved to be efficient for the classification task, achieving results
comparable to the state of the art.
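The dimensionality reduction step can be sketched as follows. This is a minimal illustration of a Gaussian random projection, assuming NumPy is available. The function name, the dimensions (1000 reduced to 200) and the random toy features are assumptions made for the example, not values from the thesis.

```python
import numpy as np

def random_projection(features, target_dim, seed=0):
    """Project row-wise feature vectors onto a random lower-dimensional subspace."""
    rng = np.random.default_rng(seed)
    source_dim = features.shape[1]
    # Gaussian entries scaled by 1/sqrt(target_dim) preserve squared norms in expectation.
    projection = rng.standard_normal((source_dim, target_dim)) / np.sqrt(target_dim)
    return features @ projection

# Toy data standing in for high-dimensional video descriptors.
rng = np.random.default_rng(1)
features = rng.standard_normal((5, 1000))
compressed = random_projection(features, target_dim=200)

# Pairwise distances are approximately preserved after projection.
original_dist = np.linalg.norm(features[0] - features[1])
compressed_dist = np.linalg.norm(compressed[0] - compressed[1])
```

Because distances between samples change only slightly, a dictionary built from the compressed features can stand in for one built from the original features at a fraction of the dimensionality.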
Next, this thesis addresses the more promising problem of simultaneous
classification of multiple actions. This is treated as a matrix completion problem
in a transductive setting. Matrix completion methods can be viewed as a generic
extension of sparse representation methods from a compressed sensing point
of view. The features and corresponding labels of the training and test data are
concatenated and placed as columns of a matrix. The unknown test labels are
the missing entries in that matrix. These entries are recovered using rank
minimization techniques, based on the assumption that the underlying complete
matrix is low rank. This approach has achieved results better than the state of
the art on datasets of varying complexity.
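The construction described in this paragraph can be sketched on toy data. The solver below is iterative singular value soft-thresholding (often called soft-impute), chosen here as one common rank minimization surrogate. The abstract does not say which solver the thesis uses, so the solver, the `shrinkage` value, the class prototypes and the noise level are all assumptions.

```python
import numpy as np

def soft_impute(matrix, observed, shrinkage=0.5, iterations=100):
    """Fill unobserved entries by iterative SVD soft-thresholding."""
    estimate = np.zeros_like(matrix)
    for _ in range(iterations):
        # Keep observed entries, use the current estimate for the missing ones.
        filled = np.where(observed, matrix, estimate)
        u, singular_values, vt = np.linalg.svd(filled, full_matrices=False)
        # Shrinking singular values pushes the estimate toward low rank.
        singular_values = np.maximum(singular_values - shrinkage, 0.0)
        estimate = (u * singular_values) @ vt
    return estimate

rng = np.random.default_rng(0)
prototypes = np.array([[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]])
train_classes = np.array([0, 0, 0, 1, 1, 1])
test_classes = np.array([0, 1])  # used only to generate the test features

def make_features(classes):
    # Noisy copies of the class prototypes, one row per sample.
    return prototypes[classes] + 0.05 * rng.standard_normal((len(classes), 4))

# Columns are samples: 6 training columns followed by 2 test columns.
features = np.vstack([make_features(train_classes), make_features(test_classes)]).T
labels = np.zeros((2, 8))
labels[train_classes, np.arange(6)] = 1.0  # one-hot training labels

stacked = np.vstack([features, labels])    # rows: 4 feature rows, then 2 label rows
observed = np.ones(stacked.shape, dtype=bool)
observed[4:, 6:] = False                   # test labels are the missing entries

completed = soft_impute(stacked, observed)
predicted = completed[4:, 6:].argmax(axis=0)
```

All test columns are completed jointly, so each test label is inferred from the structure shared with the training columns rather than from a separately trained classifier.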
This thesis then extends the matrix completion framework for joint classification
of actions to handle missing features in addition to missing test labels. In
this context, deep features from a convolutional neural network are proposed.
A convolutional neural network is trained on the training data, and features are
then extracted from both the training and test data using the trained network.
The performance of the deep features has proved to be promising when compared
to state-of-the-art hand-crafted features.
Image Understanding by Socializing the Semantic Gap
Several technological developments like the Internet, mobile devices and Social Networks have spurred the sharing of images in unprecedented volumes, making tagging and commenting a common habit. Despite the recent progress in image analysis, the problem of the Semantic Gap still hinders machines from fully understanding the rich semantics of a shared photo. In this book, we tackle this problem by exploiting social network contributions. A comprehensive treatise of three linked problems on image annotation is presented, with a novel experimental protocol used to test eleven state-of-the-art methods. Three novel approaches to annotate an image, understand its sentiment and predict its popularity are presented. We conclude with the many challenges and opportunities ahead for the multimedia community.
Joint Session-Item Encoding for Session-Based Recommendation: A Metric-Learning Approach with Temporal Smoothing
In recommendation systems, a system is in charge of providing relevant recommendations
to users who have either a clear target in mind or merely a vague mental representation.
Session-based recommendation targets a specific scenario in recommendation systems,
where users are anonymous. Thus the recommendation system must work under more
challenging conditions, having only the current session from which to extract user
preferences and provide recommendations.
This setting requires a model capable of understanding and relating different
interactions across different sessions involving different items. This dissertation
reflects such relationships in a jointly learned space for sessions and items. This
space is built using metric-learning, which captures these relationships so that
the distances between the elements (session and item embeddings) reflect how they relate
to each other. We then use this learned space as the intermediary to provide relevant
recommendations. This work continues and extends other relevant work showing
the potential of metric-learning for the session-based recommendation field.
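How a shared session-item space yields recommendations can be sketched as a nearest-neighbour lookup. In the dissertation the embeddings are learned through metric-learning. Here they are hand-picked hypothetical values, the `recommend` helper is an illustrative name, and the temporal smoothing component is not modelled.

```python
import math

def recommend(session_embedding, item_embeddings, top_k=2):
    """Rank items by Euclidean distance to the session in the shared space."""
    distances = {
        item: math.dist(session_embedding, vector)
        for item, vector in item_embeddings.items()
    }
    # Smaller distance means the item relates more closely to the session.
    return sorted(distances, key=distances.get)[:top_k]

# Hypothetical 2-D embeddings; a trained model would produce these.
item_embeddings = {
    "item_a": [0.0, 0.0],
    "item_b": [1.0, 1.0],
    "item_c": [0.9, 1.2],
    "item_d": [-2.0, 0.5],
}
session_embedding = [1.0, 1.1]
recommended = recommend(session_embedding, item_embeddings)
```

Since sessions and items live in the same space, no user profile is needed: the current session's position alone determines which items are returned.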
This dissertation proposes three significant contributions: (i) a novel joint
session-item encoding model with temporal smoothing, with fewer parameters and the
inclusion of temporal characteristics in learning (temporal proximity and temporal
recency); (ii) enhanced recommendation performance surpassing other state-of-the-art
metric-learning models for session-based recommendation; (iii) a thorough critical
analysis, addressing and raising awareness of common problems in the field of
session-based recommendation, discussing the reasons behind them and their impact on
model performance.
In recommendation systems, a system is responsible for providing relevant recommendations
to its users, who may have either a concrete idea of what they want or only a vague
mental representation. Session-based recommendation addresses a specific scenario in
recommendation systems, where users are anonymous. That is, these systems must be able
to operate under less favourable conditions, with only the current session available
as user input for making recommendations.
This context requires a model capable of understanding and relating different interactions
across several other sessions involving different items. This dissertation reflects
such interactions through a common, learned space for representing sessions and
items. This space is built using metric-learning, a technique that can capture such
relationships and construct the space in question, in which the distance between the
various elements (session and item embeddings) reflects how they relate to each other.
We use this learned space as an intermediary in providing relevant recommendations.
This work continues and extends other relevant work in the area that showed the
potential of applying metric-learning to the domain of session-based recommendation.
This dissertation proposes the following three main contributions: (i) a novel joint
session-item encoding model with temporal smoothing, with fewer parameters and with
temporal characteristics included in the learning process (temporal proximity and
recency); (ii) improved recommendation performance that surpasses other
state-of-the-art methods that use metric-learning techniques for session-based
recommendation; (iii) a careful analysis that focuses on and tries to highlight some
common errors in the field of session-based recommendation, discussing the reasons
behind such errors and their impact on model performance.