1,201 research outputs found

    A transversal approach for patch-based label fusion via matrix completion

    Recently, multi-atlas patch-based label fusion has received increasing interest in the medical image segmentation field. After warping the anatomical labels from the atlas images to the target image by registration, label fusion is the key step to determine the latent label for each target image point. Two popular types of patch-based label fusion approaches are (1) reconstruction-based approaches that compute the target labels as a weighted average of atlas labels, where the weights are derived by reconstructing the target image patch using the atlas image patches; and (2) classification-based approaches that determine the target label as a mapping of the target image patch, where the mapping function is often learned using the atlas image patches and their corresponding labels. Both approaches have their advantages and limitations. In this paper, we propose a novel patch-based label fusion method that combines the above two types of approaches via matrix completion (and hence, we call it transversal). As we will show, our method overcomes the individual limitations of both reconstruction-based and classification-based approaches. Since the labeling confidences may vary across the target image points, we further propose a sequential labeling framework that first labels the highly confident points and then gradually labels more challenging points in an iterative manner, guided by the label information determined in the previous iterations. We demonstrate the performance of our novel label fusion method in segmenting the hippocampus in the ADNI dataset, subcortical and limbic structures in the LONI dataset, and mid-brain structures in the SATA dataset. We achieve more accurate segmentation results than both reconstruction-based and classification-based approaches. Our label fusion method is also ranked first in the online SATA Multi-Atlas Segmentation Challenge.
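The reconstruction-based family described in this abstract can be illustrated with a minimal sketch. It is not the paper's transversal method: it assumes binary labels, a ridge-regularized least-squares reconstruction, and clipping of negative weights, and the function name `fuse_label` and the parameter `ridge` are hypothetical.

```python
import numpy as np

def fuse_label(target_patch, atlas_patches, atlas_labels, ridge=0.01):
    """Return (probability, hard_label) for the center point of target_patch.

    atlas_patches: (n_atlas, patch_dim) array, one flattened patch per row.
    atlas_labels:  (n_atlas,) array of 0/1 labels at each atlas patch center.
    Assumption: weights come from a ridge-regularized reconstruction.
    """
    A = np.asarray(atlas_patches, dtype=float).T   # (patch_dim, n_atlas)
    t = np.asarray(target_patch, dtype=float)
    labels = np.asarray(atlas_labels, dtype=float)

    # Solve w = argmin ||t - A w||^2 + ridge * ||w||^2 in closed form.
    gram = A.T @ A + ridge * np.eye(A.shape[1])
    w = np.linalg.solve(gram, A.T @ t)

    # Keep only non-negative weights and normalize them to sum to one.
    w = np.clip(w, 0.0, None)
    total = w.sum()
    if total == 0.0:
        w = np.full_like(w, 1.0 / w.size)  # fall back to plain majority voting
    else:
        w = w / total

    # The fused label is the weighted average of the atlas labels.
    prob = float(w @ labels)
    return prob, int(prob >= 0.5)
```

With three orthogonal atlas patches and a target patch that mostly resembles the first one, the fused probability follows the label of that first atlas.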

    Sparse and low rank approximations for action recognition

    Action recognition is a crucial area of research in computer vision, with a wide range of applications in surveillance, patient-monitoring systems, video indexing, human-computer interaction and many more. These applications require automated action recognition. Robust classification methods remain sought after despite influential research in this field over the past decade. Data resources have grown tremendously owing to the digital revolution and far exceed the meagre resources of the past. The main limitation on a system dealing with video data is the computational burden due to large dimensions and data redundancy. Sparse and low rank approximation methods, which aim at a concise and meaningful representation of data, have evolved recently. This thesis explores the application of sparse and low rank approximation methods to video data classification, with the following contributions: (1) an approach to action and gesture classification is proposed within the sparse representation domain, effectively dealing with large feature dimensions; (2) a low rank matrix completion approach is proposed to jointly classify more than one action; (3) deep features are proposed for robust classification of multiple actions within the matrix completion framework, which can handle data deficiencies. The thesis starts with the applicability of sparse-representation-based classification methods to the problem of action and gesture recognition. Random projection is used to reduce the dimensionality of the features; these are referred to as compressed features in this thesis. The dictionary formed with compressed features has proved efficient for the classification task, achieving results comparable to the state of the art. Next, this thesis addresses the more promising problem of simultaneous classification of multiple actions. This is treated as a matrix completion problem in a transductive setting. Matrix completion methods are considered a generic extension of sparse representation methods from the compressed sensing point of view. The features and corresponding labels of the training and test data are concatenated and placed as columns of a matrix. The unknown test labels are then the missing entries of that matrix. The problem is solved using rank minimization techniques, based on the assumption that the underlying complete matrix is low rank. This approach has achieved results better than the state of the art on datasets of varying complexity. This thesis then extends the matrix completion framework for joint classification of actions to handle missing features in addition to missing test labels. In this context, deep features from a convolutional neural network are proposed. A convolutional neural network is trained on the training data, and features are extracted from the training and test data using the trained network. The performance of the deep features has proved promising when compared to state-of-the-art hand-crafted features.
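The transductive setup described in this abstract (labels and features stacked as columns, test labels treated as missing entries) can be sketched as follows. This is an illustrative assumption, not the thesis's solver: it swaps rank minimization for a simpler iterated truncated SVD with a fixed rank, and the names `complete_low_rank` and `transductive_classify` are hypothetical.

```python
import numpy as np

def complete_low_rank(M, observed, rank, n_iter=200):
    """Fill unobserved entries of M by iterating a truncated SVD of fixed rank.

    Assumption: a fixed-rank projection stands in for rank minimization.
    """
    filled = np.where(observed, M, 0.0)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Keep known entries fixed; update only the missing ones.
        filled = np.where(observed, M, low_rank)
    return filled

def transductive_classify(X_train, y_train, X_test, n_classes, rank):
    """Stack one-hot labels above features, one sample per column, then complete."""
    n_train, n_test = len(X_train), len(X_test)
    Y = np.zeros((n_classes, n_train + n_test))
    Y[y_train, np.arange(n_train)] = 1.0          # one-hot training labels
    X = np.vstack([X_train, X_test]).T            # (n_features, n_samples)
    M = np.vstack([Y, X])
    observed = np.ones_like(M, dtype=bool)
    observed[:n_classes, n_train:] = False        # test labels are the missing entries
    completed = complete_low_rank(M, observed, rank)
    return completed[:n_classes, n_train:].argmax(axis=0)

# Toy data: two classes whose stacked columns span a rank-2 subspace.
X_train = np.array([[1, 0, 1]] * 4 + [[0, 1, 1]] * 4, dtype=float)
y_train = np.array([0] * 4 + [1] * 4)
X_test = np.array([[1, 0, 1], [0, 1, 1]], dtype=float)
predicted = transductive_classify(X_train, y_train, X_test, n_classes=2, rank=2)
```

All test samples are labeled jointly in one completion, which is what makes the setting transductive rather than a per-sample classifier.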

    Image Understanding by Socializing the Semantic Gap

    Several technological developments, such as the Internet, mobile devices and social networks, have spurred the sharing of images in unprecedented volumes, making tagging and commenting a common habit. Despite recent progress in image analysis, the semantic gap still prevents machines from fully understanding the rich semantics of a shared photo. In this book, we tackle this problem by exploiting social network contributions. A comprehensive treatment of three linked problems in image annotation is presented, with a novel experimental protocol used to test eleven state-of-the-art methods. Three novel approaches to annotate an image, understand its sentiment and predict its popularity are presented. We conclude with the many challenges and opportunities ahead for the multimedia community.

    Joint Session-Item Encoding for Session-Based Recommendation: A Metric-Learning Approach with Temporal Smoothing

    In recommendation systems, a system is in charge of providing relevant recommendations to users who have either a clear target in mind or only a vague mental representation. Session-based recommendation targets a specific scenario in which users are anonymous. The recommendation system must therefore work under more challenging conditions, with only the current session available from which to extract user preferences. This setting requires a model capable of understanding and relating different interactions across different sessions involving different items. This dissertation reflects such relationships in a common learned space for sessions and items. The space is built using metric learning, which captures these relationships so that the distances between elements (session and item embeddings) reflect how they relate to each other. We then use this learned space as the intermediary to provide relevant recommendations. This work continues and extends other relevant work showing the potential of metric learning for session-based recommendation. This dissertation makes three significant contributions: (i) a novel joint session-item encoding model with temporal smoothing, with fewer parameters and the inclusion of temporal characteristics in learning (temporal proximity and temporal recency); (ii) enhanced recommendation performance, surpassing other state-of-the-art metric-learning models for session-based recommendation; (iii) a thorough critical analysis, addressing and raising awareness of common problems in the field of session-based recommendation, discussing the reasons behind them and their impact on model performance.
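How a shared session-item space can serve recommendations is sketched below. The sketch does not reproduce the dissertation's model: the embeddings are hand-set rather than learned, and the recency-weighted mean in `encode_session`, the `decay` parameter, and both function names are hypothetical choices made only for illustration.

```python
import numpy as np

def encode_session(item_ids, item_embeddings, decay=0.5):
    """Hypothetical encoder: recency-weighted mean of the clicked items' embeddings."""
    n = len(item_ids)
    # The most recent click gets weight 1; older clicks decay geometrically.
    weights = np.array([decay ** (n - 1 - i) for i in range(n)])
    weights = weights / weights.sum()
    return weights @ item_embeddings[item_ids]

def recommend(session_embedding, item_embeddings, k=2, exclude=()):
    """Return indices of the k items closest to the session in the shared space."""
    dists = np.linalg.norm(item_embeddings - session_embedding, axis=1)
    if exclude:
        # Do not re-recommend items already seen in the session.
        dists[list(exclude)] = np.inf
    return np.argsort(dists)[:k].tolist()

# Hand-set 2-D item embeddings (assumption: a trained model would learn these).
items = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
session = encode_session([0, 1], items)
top_unseen = recommend(session, items, k=1, exclude=(0, 1))
```

Because sessions and items live in the same space, recommending reduces to a nearest-neighbor query around the session embedding.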