NM-Net: Mining Reliable Neighbors for Robust Feature Correspondences
Feature correspondence selection is pivotal to many feature-matching based
tasks in computer vision. Searching for spatially k-nearest neighbors is a
common strategy for extracting local information in many previous works.
However, there is no guarantee that the spatially k-nearest neighbors of
correspondences are consistent because the spatial distribution of false
correspondences is often irregular. To address this issue, we present a
compatibility-specific mining method to search for consistent neighbors.
Moreover, in order to extract and aggregate more reliable features from
neighbors, we propose a hierarchical network named NM-Net with a series of
convolution layers taking the generated graph as input, which is insensitive to
the order of correspondences. Our experimental results show that the proposed
method achieves state-of-the-art performance on four datasets with various
inlier ratios and varying numbers of feature consistencies.
Comment: IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2019) (oral)
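The spatial k-nearest-neighbour strategy the abstract critiques can be sketched in a few lines. This is a minimal illustration of plain spatial kNN, not NM-Net's compatibility-specific mining; the `k_nearest` helper and the toy points are hypothetical:

```python
# Minimal sketch: spatially k-nearest neighbours of feature correspondences,
# here represented only by their 2D keypoint locations. False correspondences
# with an irregular spatial distribution make such neighbourhoods inconsistent,
# which is the problem NM-Net's compatibility-specific mining addresses.
from math import dist

def k_nearest(points, query_idx, k):
    """Indices of the k spatially nearest points to points[query_idx]."""
    q = points[query_idx]
    order = sorted(
        (i for i in range(len(points)) if i != query_idx),
        key=lambda i: dist(points[i], q),
    )
    return order[:k]

pts = [(0, 0), (1, 0), (5, 5), (0.5, 0.2), (6, 6)]
print(k_nearest(pts, 0, 2))  # the two points closest to (0, 0)
```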
Video Object Segmentation Without Temporal Information
Video Object Segmentation, and video processing in general, has been
historically dominated by methods that rely on the temporal consistency and
redundancy in consecutive video frames. When the temporal smoothness is
suddenly broken, such as when an object is occluded, or some frames are missing
in a sequence, the result of these methods can deteriorate significantly or
they may not even produce any result at all. This paper explores the orthogonal
approach of processing each frame independently, i.e., disregarding the temporal
information. In particular, it tackles the task of semi-supervised video object
segmentation: the separation of an object from the background in a video, given
its mask in the first frame. We present Semantic One-Shot Video Object
Segmentation (OSVOS-S), based on a fully-convolutional neural network
architecture that is able to successively transfer generic semantic
information, learned on ImageNet, to the task of foreground segmentation, and
finally to learning the appearance of a single annotated object of the test
sequence (hence one shot). We show that instance level semantic information,
when combined effectively, can dramatically improve the results of our previous
method, OSVOS. We perform experiments on two recent video segmentation
databases, which show that OSVOS-S is both the fastest and most accurate method
in the state of the art.
Comment: Accepted to T-PAMI. Extended version of "One-Shot Video Object
Segmentation", CVPR 2017 (arXiv:1611.05198). Project page:
http://www.vision.ee.ethz.ch/~cvlsegmentation/osvos
Explainable AI for Trees: From Local Explanations to Global Understanding
Tree-based machine learning models such as random forests, decision trees,
and gradient boosted trees are the most popular non-linear predictive models
used in practice today, yet comparatively little attention has been paid to
explaining their predictions. Here we significantly improve the
interpretability of tree-based models through three main contributions: 1) The
first polynomial time algorithm to compute optimal explanations based on game
theory. 2) A new type of explanation that directly measures local feature
interaction effects. 3) A new set of tools for understanding global model
structure based on combining many local explanations of each prediction. We
apply these tools to three medical machine learning problems and show how
combining many high-quality local explanations allows us to represent global
structure while retaining local faithfulness to the original model. These tools
enable us to i) identify high magnitude but low frequency non-linear mortality
risk factors in the general US population, ii) highlight distinct population
sub-groups with shared risk characteristics, iii) identify non-linear
interaction effects among risk factors for chronic kidney disease, and iv)
monitor a machine learning model deployed in a hospital by identifying which
features are degrading the model's performance over time. Given the popularity
of tree-based machine learning models, these improvements to their
interpretability have implications across a broad set of domains.
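The game-theoretic explanations referred to above are Shapley values. The definition can be illustrated by brute-force enumeration of all feature coalitions; note this is exponential in the number of features, unlike the paper's polynomial-time algorithm for trees, and `model`, `x`, and `baseline` here are illustrative assumptions, not the paper's API:

```python
# Exact Shapley values by enumerating every coalition of features.
# Features absent from a coalition are replaced by a baseline value.
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline, n_features):
    """phi[i] = average marginal contribution of feature i over all coalitions."""
    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n_features)]
        return model(z)

    n = n_features
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            w = factorial(size) * factorial(n - size - 1) / factorial(n)
            for S in combinations(others, size):
                phi[i] += w * (value(set(S) | {i}) - value(set(S)))
    return phi
```

For a linear model the Shapley value of each feature reduces to its coefficient times the deviation from the baseline, which gives a quick sanity check.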
Nasal Patches and Curves for Expression-robust 3D Face Recognition
The potential of the nasal region for expression robust 3D face recognition
is thoroughly investigated by a novel five-step algorithm. First, the nose tip
location is coarsely detected and the face is segmented, aligned and the nasal
region cropped. Then, a very accurate and consistent nasal landmarking
algorithm detects seven keypoints on the nasal region. In the third step, a
feature extraction algorithm based on the surface normals of Gabor-wavelet
filtered depth maps is utilised and, then, a set of spherical patches and
curves are localised over the nasal region to provide the feature descriptors.
The last step applies a genetic algorithm-based feature selector to detect the
most stable patches and curves over different facial expressions. The algorithm
provides the highest reported nasal region-based recognition ranks on the FRGC,
Bosphorus and BU-3DFE datasets. The results are comparable with, and in many
cases better than, many state-of-the-art 3D face recognition algorithms, which
use the whole facial domain. The proposed method does not rely on sophisticated
alignment or denoising steps, is very robust when only one sample per subject
is used in the gallery, and does not require a training step for the
landmarking algorithm. https://github.com/mehryaragha/NoseBiometric
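One ingredient of the third step, surface normals of a depth map, can be estimated from finite differences. This is a generic NumPy sketch that omits the Gabor-wavelet filtering the paper applies first:

```python
# Per-pixel unit surface normals of a depth map via central differences.
# For a depth function z(x, y), an (unnormalised) normal is (-dz/dx, -dz/dy, 1).
import numpy as np

def surface_normals(depth):
    dz_dy, dz_dx = np.gradient(depth.astype(float))  # row and column gradients
    normals = np.dstack((-dz_dx, -dz_dy, np.ones_like(depth, dtype=float)))
    normals /= np.linalg.norm(normals, axis=2, keepdims=True)
    return normals

# A flat plane yields normals pointing straight along +z everywhere.
n = surface_normals(np.zeros((4, 4)))
```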
Even Trolls Are Useful: Efficient Link Classification in Signed Networks
We address the problem of classifying the links of signed social networks
given their full structural topology. Motivated by a binary user behaviour
assumption, which is supported by decades of research in psychology, we develop
an efficient and surprisingly simple approach to solve this classification
problem. Our methods operate in both the active and batch settings. We
demonstrate that the algorithms we developed are extremely fast in both
theoretical and practical terms. Within the active setting, we provide a new
complexity measure and a rigorous analysis of our methods that hold for
arbitrary signed networks. We validate our theoretical claims by carrying out
experiments on three well-known real-world datasets, showing that our
methods outperform the competitors while being much faster.
Comment: 17 pages, 3 figures
Recent Advance in Content-based Image Retrieval: A Literature Survey
The explosive increase and ubiquitous accessibility of visual data on the Web
have led to the prosperity of research activity in image search or retrieval.
Because they ignore visual content as a ranking clue, text-based
search techniques for visual retrieval may suffer from inconsistency between the
query words and the visual content. Content-based image retrieval (CBIR), which
makes use of the representation of visual content to identify relevant images,
has attracted sustained attention over the past two decades. Such a problem is
challenging due to the intention gap and the semantic gap problems. Numerous
techniques have been developed for content-based image retrieval in the last
decade. The purpose of this paper is to categorize and evaluate those
algorithms proposed during the period of 2003 to 2016. We conclude with several
promising directions for future research.
Comment: 22 pages
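The core retrieval loop behind CBIR is simple once images are represented as feature vectors; obtaining a good representation is the hard part the survey covers. A minimal sketch, assuming hypothetical precomputed feature vectors, ranks the database by cosine similarity to the query:

```python
# Minimal content-based retrieval: rank database feature vectors by
# cosine similarity to a query feature vector. The feature extraction
# step (the representation of visual content) is assumed done elsewhere.
from math import sqrt

def cosine(u, v):
    num = sum(x * y for x, y in zip(u, v))
    return num / (sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v)))

def retrieve(query, database, top_k=3):
    """Indices of the top_k most similar database vectors."""
    ranked = sorted(range(len(database)),
                    key=lambda i: cosine(query, database[i]),
                    reverse=True)
    return ranked[:top_k]
```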
Multiple Kernel Learning and Automatic Subspace Relevance Determination for High-dimensional Neuroimaging Data
Alzheimer's disease is a major cause of dementia. Its diagnosis requires
accurate biomarkers that are sensitive to disease stages. In this respect, we
regard probabilistic classification as a method of designing a probabilistic
biomarker for disease staging. Probabilistic biomarkers naturally support the
interpretation of decisions and evaluation of uncertainty associated with them.
In this paper, we obtain probabilistic biomarkers via Gaussian Processes.
Gaussian Processes enable probabilistic kernel machines that offer flexible
means to accomplish Multiple Kernel Learning. Exploiting this flexibility, we
propose a new variation of Automatic Relevance Determination and tackle the
challenges of high dimensionality through multiple kernels. Our research
results demonstrate that the Gaussian Process models are competitive with or
better than the well-known Support Vector Machine in terms of classification
performance even in the cases of single kernel learning. Extending the basic
scheme towards the Multiple Kernel Learning, we improve the efficacy of the
Gaussian Process models and their interpretability in terms of the known
anatomical correlates of the disease. For instance, the disease pathology
starts in and around the hippocampus and entorhinal cortex. Through the use of
Gaussian Processes and Multiple Kernel Learning, we have automatically and
efficiently determined those portions of neuroimaging data. In addition to
their interpretability, our Gaussian Process models are competitive with recent
deep learning solutions under similar settings.
Comment: The material presented here is to promote the dissemination of
scholarly and technical work in a timely fashion. Data in this article are
from ADNI (adni.loni.usc.edu). As such, ADNI provided data but did not
participate in writing of this report.
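Multiple Kernel Learning combines several base kernels into a single one. A minimal NumPy sketch uses a convex combination of RBF kernels with different length-scales; the length-scales and weights here are illustrative, whereas in the paper they would be learned (the Automatic Relevance Determination step):

```python
# A convex combination of RBF kernels remains a valid (PSD) kernel,
# which is the basic building block of Multiple Kernel Learning.
import numpy as np

def rbf_kernel(X, Y, lengthscale):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def multi_kernel(X, Y, lengthscales, weights):
    """Weighted sum of RBF kernels, one per (hypothetical) scale/subspace."""
    return sum(w * rbf_kernel(X, Y, l)
               for w, l in zip(weights, lengthscales))
```

With weights summing to one, the combined Gram matrix stays symmetric with a unit diagonal, so it can be dropped into a Gaussian Process classifier unchanged.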
Fast and Accurate Tumor Segmentation of Histology Images using Persistent Homology and Deep Convolutional Features
Tumor segmentation in whole-slide images of histology slides is an important
step towards computer-assisted diagnosis. In this work, we propose a tumor
segmentation framework based on the novel concept of persistent homology
profiles (PHPs). For a given image patch, the homology profiles are derived by
efficient computation of persistent homology, which is an algebraic tool from
homology theory. We propose an efficient way of computing topological
persistence of an image, alternative to simplicial homology. The PHPs are
devised to distinguish tumor regions from their normal counterparts by modeling
the atypical characteristics of tumor nuclei. We propose two variants of our
method for tumor segmentation: one that targets speed without compromising
accuracy and the other that targets higher accuracy. The fast version is based
on the selection of exemplar image patches from a convolutional neural network
(CNN) and patch classification by quantifying the divergence between the PHPs
of exemplars and the input image patch. Detailed comparative evaluation shows
that the proposed algorithm is significantly faster than competing algorithms
while achieving comparable results. The accurate version combines the PHPs and
high-level CNN features and employs a multi-stage ensemble strategy for image
patch labeling. Experimental results demonstrate that the combination of PHPs
and CNN features outperforms competing algorithms. This study is performed on
two independently collected colorectal datasets containing adenoma,
adenocarcinoma, signet and healthy cases. Collectively, the accurate tumor
segmentation produces the highest average patch-level F1-score, as compared
with competing algorithms, on malignant and healthy cases from both the
datasets. Overall, the proposed framework highlights the utility of persistent
homology for histopathology image analysis.
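Persistent homology in its simplest, zero-dimensional form tracks connected components of sublevel sets as a threshold sweeps upwards. A self-contained toy analogue for a 1D signal, using union-find and the elder rule, illustrates the idea; the paper's PHPs are computed on 2D image patches:

```python
def persistence_0d(values):
    """0-dimensional sublevel-set persistence pairs of a 1D signal.

    Components are born at local minima and die (elder rule: the younger
    component is absorbed) when they merge at a higher value. The global
    minimum's component never dies and is therefore not reported."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    parent, birth, pairs = {}, {}, []

    def find(i):  # union-find with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in order:  # sweep the threshold upwards
        parent[i], birth[i] = i, values[i]
        for j in (i - 1, i + 1):  # merge with already-born neighbours
            if j in parent:
                ri, rj = find(i), find(j)
                if ri != rj:
                    young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                    pairs.append((birth[young], values[i]))
                    parent[young] = old
    return [(b, d) for (b, d) in pairs if b < d]  # drop zero-persistence pairs
```

For the signal [0, 2, 1, 3] the secondary minimum at height 1 merges with the global minimum's component over the peak at height 2, giving the single pair (1, 2).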
Joint Point Cloud and Image Based Localization For Efficient Inspection in Mixed Reality
This paper introduces a method of structure inspection using mixed-reality
headsets to reduce the human effort in reporting accurate inspection
information such as fault locations in 3D coordinates. Prior to every
inspection, the headset needs to be localized. While external pose estimation
and fiducial marker based localization would require setup, maintenance, and
manual calibration; marker-free self-localization can be achieved using the
onboard depth sensor and camera. However, due to limited depth sensor range of
portable mixed-reality headsets like Microsoft HoloLens, localization based on
simple point cloud registration (sPCR) would require extensive mapping of the
environment. Also, localization based on camera images alone suffers from
stereo ambiguities and hence depends on viewpoint. We thus introduce
a novel approach to Joint Point Cloud and Image-based Localization (JPIL) for
mixed-reality headsets that use visual cues and headset orientation to register
small, partially overlapped point clouds and save significant manual labor and
time in environment mapping. Compared to sPCR, our empirical results show an
average 10-fold reduction in the required overlapping surface area, which could
save on average 20 minutes per inspection. JPIL is not only
restricted to inspection tasks but also can be essential in enabling intuitive
human-robot interaction for spatial mapping and scene understanding in
conjunction with other agents like autonomous robotic systems that are
increasingly being deployed in outdoor environments for applications like
structural inspection.
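Once point correspondences are fixed, the rigid-alignment core of simple point cloud registration has a closed-form solution, the Kabsch algorithm. A generic NumPy sketch (JPIL itself adds visual cues and headset orientation on top of registration):

```python
# Kabsch algorithm: closed-form least-squares rotation R and translation t
# aligning point set P onto point set Q (rows are corresponding 3D points).
import numpy as np

def kabsch(P, Q):
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)            # cross-covariance of centred sets
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cQ - R @ cP
    return R, t
```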
Statistical inference for template-based protein structure prediction
Protein structure prediction is one of the most important problems in
computational biology. The most successful computational approach, also called
template-based modeling, identifies templates with solved crystal structures
for the query proteins and constructs three dimensional models based on
sequence/structure alignments. Although substantial effort has been made to
improve protein sequence alignment, the accuracy of alignments between
distantly related proteins is still unsatisfactory. In this thesis, I will
introduce a number of statistical machine learning methods to build accurate
alignments between a protein sequence and its template structures, especially
for proteins having only distantly related templates. For a protein with only
one good template, we develop a regression-tree based Conditional Random Fields
(CRF) model for pairwise protein sequence/structure alignment. By learning a
nonlinear threading scoring function, we are able to leverage the correlation
among different sequence and structural features. We also introduce an
information-theoretic measure to guide the learning algorithm to better exploit
the structural features for low-homology proteins with little evolutionary
information in their sequence profile. For a protein with multiple good
templates, we design a probabilistic consistency approach to thread the protein
to all templates simultaneously. By minimizing the discordance between the
pairwise alignments of the protein and templates, we are able to construct a
multiple sequence/structure alignment, which leads to better structure
predictions than any single-template-based prediction.
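Pairwise sequence alignment of the kind described above ultimately rests on dynamic programming. A classic Needleman-Wunsch sketch with a toy linear scoring scheme illustrates the recurrence; the thesis replaces such fixed scores with a learned, nonlinear CRF threading function:

```python
# Needleman-Wunsch global alignment score by dynamic programming.
# F[i][j] = best score aligning a[:i] with b[:j]; each cell takes the max of
# a (mis)match, a gap in b, or a gap in a.
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    n, m = len(a), len(b)
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s,   # align a[i-1] with b[j-1]
                          F[i - 1][j] + gap,     # gap in b
                          F[i][j - 1] + gap)     # gap in a
    return F[n][m]
```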