Search CORE

51,242 research outputs found

NM-Net: Mining Reliable Neighbors for Robust Feature Correspondences

Author: Cao Zhiguo
Li Chi
Li Xin
Yang Jiaqi
Zhao Chen
Publication venue
Publication date: 30/03/2019
Field of study

Feature correspondence selection is pivotal to many feature-matching based tasks in computer vision. Searching for spatially k-nearest neighbors is a common strategy for extracting local information in many previous works. However, there is no guarantee that the spatially k-nearest neighbors of correspondences are consistent because the spatial distribution of false correspondences is often irregular. To address this issue, we present a compatibility-specific mining method to search for consistent neighbors. Moreover, in order to extract and aggregate more reliable features from neighbors, we propose a hierarchical network named NM-Net with a series of convolution layers taking the generated graph as input, which is insensitive to the order of correspondences. Our experimental results have shown the proposed method achieves the state-of-the-art performance on four datasets with various inlier ratios and varying numbers of feature consistencies.Comment: IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2019) (oral

arXiv.org e-Print Archive

Video Object Segmentation Without Temporal Information

Author: Caelles Sergi
Chen Yuhua
Cremers Daniel
Leal-Taixé Laura
Maninis Kevis-Kokitsi
Pont-Tuset Jordi
Van Gool Luc
Publication venue
Publication date: 16/05/2018
Field of study

Video Object Segmentation, and video processing in general, has been historically dominated by methods that rely on the temporal consistency and redundancy in consecutive video frames. When the temporal smoothness is suddenly broken, such as when an object is occluded, or some frames are missing in a sequence, the result of these methods can deteriorate significantly or they may not even produce any result at all. This paper explores the orthogonal approach of processing each frame independently, i.e disregarding the temporal information. In particular, it tackles the task of semi-supervised video object segmentation: the separation of an object from the background in a video, given its mask in the first frame. We present Semantic One-Shot Video Object Segmentation (OSVOS-S), based on a fully-convolutional neural network architecture that is able to successively transfer generic semantic information, learned on ImageNet, to the task of foreground segmentation, and finally to learning the appearance of a single annotated object of the test sequence (hence one shot). We show that instance level semantic information, when combined effectively, can dramatically improve the results of our previous method, OSVOS. We perform experiments on two recent video segmentation databases, which show that OSVOS-S is both the fastest and most accurate method in the state of the art.Comment: Accepted to T-PAMI. Extended version of "One-Shot Video Object Segmentation", CVPR 2017 (arXiv:1611.05198). Project page: http://www.vision.ee.ethz.ch/~cvlsegmentation/osvos

arXiv.org e-Print Archive

Explainable AI for Trees: From Local Explanations to Global Understanding

Author: Bansal Nisha
Chen Hugh
DeGrave Alex
Erion Gabriel
Himmelfarb Jonathan
Katz Ronit
Lee Su-In
Lundberg Scott M.
Nair Bala
Prutkin Jordan M.
Publication venue
Publication date: 11/05/2019
Field of study

Tree-based machine learning models such as random forests, decision trees, and gradient boosted trees are the most popular non-linear predictive models used in practice today, yet comparatively little attention has been paid to explaining their predictions. Here we significantly improve the interpretability of tree-based models through three main contributions: 1) The first polynomial time algorithm to compute optimal explanations based on game theory. 2) A new type of explanation that directly measures local feature interaction effects. 3) A new set of tools for understanding global model structure based on combining many local explanations of each prediction. We apply these tools to three medical machine learning problems and show how combining many high-quality local explanations allows us to represent global structure while retaining local faithfulness to the original model. These tools enable us to i) identify high magnitude but low frequency non-linear mortality risk factors in the general US population, ii) highlight distinct population sub-groups with shared risk characteristics, iii) identify non-linear interaction effects among risk factors for chronic kidney disease, and iv) monitor a machine learning model deployed in a hospital by identifying which features are degrading the model's performance over time. Given the popularity of tree-based machine learning models, these improvements to their interpretability have implications across a broad set of domains

arXiv.org e-Print Archive

Nasal Patches and Curves for Expression-robust 3D Face Recognition

Author: Emambakhsh Mehryar
Evans Adrian
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019
Field of study

The potential of the nasal region for expression robust 3D face recognition is thoroughly investigated by a novel five-step algorithm. First, the nose tip location is coarsely detected and the face is segmented, aligned and the nasal region cropped. Then, a very accurate and consistent nasal landmarking algorithm detects seven keypoints on the nasal region. In the third step, a feature extraction algorithm based on the surface normals of Gabor-wavelet filtered depth maps is utilised and, then, a set of spherical patches and curves are localised over the nasal region to provide the feature descriptors. The last step applies a genetic algorithm-based feature selector to detect the most stable patches and curves over different facial expressions. The algorithm provides the highest reported nasal region-based recognition ranks on the FRGC, Bosphorus and BU-3DFE datasets. The results are comparable with, and in many cases better than, many state-of-the-art 3D face recognition algorithms, which use the whole facial domain. The proposed method does not rely on sophisticated alignment or denoising steps, is very robust when only one sample per subject is used in the gallery, and does not require a training step for the landmarking algorithm. https://github.com/mehryaragha/NoseBiometric

arXiv.org e-Print Archive

Even Trolls Are Useful: Efficient Link Classification in Signed Networks

Author: Falher Géraud Le
Vitale Fabio
Publication venue
Publication date: 29/02/2016
Field of study

We address the problem of classifying the links of signed social networks given their full structural topology. Motivated by a binary user behaviour assumption, which is supported by decades of research in psychology, we develop an efficient and surprisingly simple approach to solve this classification problem. Our methods operate both within the active and batch settings. We demonstrate that the algorithms we developed are extremely fast in both theoretical and practical terms. Within the active setting, we provide a new complexity measure and a rigorous analysis of our methods that hold for arbitrary signed networks. We validate our theoretical claims carrying out a set of experiments on three well known real-world datasets, showing that our methods outperform the competitors while being much faster.Comment: 17 pages, 3 figure

arXiv.org e-Print Archive

Recent Advance in Content-based Image Retrieval: A Literature Survey

Author: Li Houqiang
Tian Qi
Zhou Wengang
Publication venue
Publication date: 02/09/2017
Field of study

The explosive increase and ubiquitous accessibility of visual data on the Web have led to the prosperity of research activity in image search or retrieval. With the ignorance of visual content as a ranking clue, methods with text search techniques for visual retrieval may suffer inconsistency between the text words and visual content. Content-based image retrieval (CBIR), which makes use of the representation of visual content to identify relevant images, has attracted sustained attention in recent two decades. Such a problem is challenging due to the intention gap and the semantic gap problems. Numerous techniques have been developed for content-based image retrieval in the last decade. The purpose of this paper is to categorize and evaluate those algorithms proposed during the period of 2003 to 2016. We conclude with several promising directions for future research.Comment: 22 page

arXiv.org e-Print Archive

Multiple Kernel Learning and Automatic Subspace Relevance Determination for High-dimensional Neuroimaging Data

Author: Ayhan Murat Seckin
Initiative Alzheimer's disease Neuroimaging
Raghavan Vijay
Publication venue
Publication date: 02/06/2017
Field of study

Alzheimer's disease is a major cause of dementia. Its diagnosis requires accurate biomarkers that are sensitive to disease stages. In this respect, we regard probabilistic classification as a method of designing a probabilistic biomarker for disease staging. Probabilistic biomarkers naturally support the interpretation of decisions and evaluation of uncertainty associated with them. In this paper, we obtain probabilistic biomarkers via Gaussian Processes. Gaussian Processes enable probabilistic kernel machines that offer flexible means to accomplish Multiple Kernel Learning. Exploiting this flexibility, we propose a new variation of Automatic Relevance Determination and tackle the challenges of high dimensionality through multiple kernels. Our research results demonstrate that the Gaussian Process models are competitive with or better than the well-known Support Vector Machine in terms of classification performance even in the cases of single kernel learning. Extending the basic scheme towards the Multiple Kernel Learning, we improve the efficacy of the Gaussian Process models and their interpretability in terms of the known anatomical correlates of the disease. For instance, the disease pathology starts in and around the hippocampus and entorhinal cortex. Through the use of Gaussian Processes and Multiple Kernel Learning, we have automatically and efficiently determined those portions of neuroimaging data. In addition to their interpretability, our Gaussian Process models are competitive with recent deep learning solutions under similar settings.Comment: The material presented here is to promote the dissemination of scholarly and technical work in a timely fashion. Data in this article are from ADNI (adni.loni.usc.edu). As such, ADNI provided data but did not participate in writing of this repor

arXiv.org e-Print Archive

Fast and Accurate Tumor Segmentation of Histology Images using Persistent Homology and Deep Convolutional Features

Author: Epstein David
Nakane Kazuaki
Qaiser Talha
Rajpoot Nasir
Sakamoto Naoya
Taniyama Daiki
Tsang Yee-Wah
Publication venue
Publication date: 09/05/2018
Field of study

Tumor segmentation in whole-slide images of histology slides is an important step towards computer-assisted diagnosis. In this work, we propose a tumor segmentation framework based on the novel concept of persistent homology profiles (PHPs). For a given image patch, the homology profiles are derived by efficient computation of persistent homology, which is an algebraic tool from homology theory. We propose an efficient way of computing topological persistence of an image, alternative to simplicial homology. The PHPs are devised to distinguish tumor regions from their normal counterparts by modeling the atypical characteristics of tumor nuclei. We propose two variants of our method for tumor segmentation: one that targets speed without compromising accuracy and the other that targets higher accuracy. The fast version is based on the selection of exemplar image patches from a convolution neural network (CNN) and patch classification by quantifying the divergence between the PHPs of exemplars and the input image patch. Detailed comparative evaluation shows that the proposed algorithm is significantly faster than competing algorithms while achieving comparable results. The accurate version combines the PHPs and high-level CNN features and employs a multi-stage ensemble strategy for image patch labeling. Experimental results demonstrate that the combination of PHPs and CNN features outperforms competing algorithms. This study is performed on two independently collected colorectal datasets containing adenoma, adenocarcinoma, signet and healthy cases. Collectively, the accurate tumor segmentation produces the highest average patch-level F1-score, as compared with competing algorithms, on malignant and healthy cases from both the datasets. Overall the proposed framework highlights the utility of persistent homology for histopathology image analysis

arXiv.org e-Print Archive

Joint Point Cloud and Image Based Localization For Efficient Inspection in Mixed Reality

Author: Das Manash Pratim
Dong Zhen
Scherer Sebastian
Publication venue
Publication date: 05/11/2018
Field of study

This paper introduces a method of structure inspection using mixed-reality headsets to reduce the human effort in reporting accurate inspection information such as fault locations in 3D coordinates. Prior to every inspection, the headset needs to be localized. While external pose estimation and fiducial marker based localization would require setup, maintenance, and manual calibration; marker-free self-localization can be achieved using the onboard depth sensor and camera. However, due to limited depth sensor range of portable mixed-reality headsets like Microsoft HoloLens, localization based on simple point cloud registration (sPCR) would require extensive mapping of the environment. Also, localization based on camera image would face the same issues as stereo ambiguities and hence depends on viewpoint. We thus introduce a novel approach to Joint Point Cloud and Image-based Localization (JPIL) for mixed-reality headsets that use visual cues and headset orientation to register small, partially overlapped point clouds and save significant manual labor and time in environment mapping. Our empirical results compared to sPCR show average 10 fold reduction of required overlap surface area that could potentially save on average 20 minutes per inspection. JPIL is not only restricted to inspection tasks but also can be essential in enabling intuitive human-robot interaction for spatial mapping and scene understanding in conjunction with other agents like autonomous robotic systems that are increasingly being deployed in outdoor environments for applications like structural inspection

arXiv.org e-Print Archive

Statistical inference for template-based protein structure prediction

Author: Peng Jian
Publication venue
Publication date: 19/06/2013
Field of study

Protein structure prediction is one of the most important problems in computational biology. The most successful computational approach, also called template-based modeling, identifies templates with solved crystal structures for the query proteins and constructs three dimensional models based on sequence/structure alignments. Although substantial effort has been made to improve protein sequence alignment, the accuracy of alignments between distantly related proteins is still unsatisfactory. In this thesis, I will introduce a number of statistical machine learning methods to build accurate alignments between a protein sequence and its template structures, especially for proteins having only distantly related templates. For a protein with only one good template, we develop a regression-tree based Conditional Random Fields (CRF) model for pairwise protein sequence/structure alignment. By learning a nonlinear threading scoring function, we are able to leverage the correlation among different sequence and structural features. We also introduce an information-theoretic measure to guide the learning algorithm to better exploit the structural features for low-homology proteins with little evolutionary information in their sequence profile. For a protein with multiple good templates, we design a probabilistic consistency approach to thread the protein to all templates simultaneously. By minimizing the discordance between the pairwise alignments of the protein and templates, we are able to construct a multiple sequence/structure alignment, which leads to better structure predictions than any single-template based prediction

arXiv.org e-Print Archive