Multi-Source Spatial Entity Linkage

Author: Isaj Suela
Pedersen Torben Bach
Zimányi Esteban
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 19/08/2019

Besides the traditional cartographic data sources, spatial information can also be derived from location-based sources. However, even though different location-based sources refer to the same physical world, each one has only partial coverage of the spatial entities, describes them with different attributes, and sometimes provides contradicting information. Hence, we introduce the spatial entity linkage problem, which identifies the pairs of spatial entities that refer to the same physical spatial entity. Our proposed solution (QuadSky) starts with a time-efficient spatial blocking technique (QuadFlex), compares the spatial entities in the same block pairwise, ranks the pairs using Pareto optimality with the SkyRank algorithm, and finally classifies the pairs with our novel SkyEx-* family of algorithms, which yield 0.85 precision and 0.85 recall for a manually labeled dataset of 1,500 pairs and 0.87 precision and 0.6 recall for a semi-manually labeled dataset of 777,452 pairs. Moreover, we provide a theoretical guarantee and formalize the SkyEx-FES algorithm, which explores only 27% of the skylines without any loss in F-measure. Furthermore, our fully unsupervised algorithm SkyEx-D approximates the optimal result with an F-measure loss of just 0.01. Finally, QuadSky provides the best trade-off between precision and recall and the best F-measure compared to the existing baselines and clustering techniques, and approximates the results of supervised learning solutions.
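The Pareto-optimality ranking mentioned in this abstract can be illustrated with a generic skyline layering. This is a minimal sketch, not the SkyRank or SkyEx-* algorithms from the paper: the function names and the two-dimensional similarity scores (for example, a name similarity and an address similarity per candidate pair) are assumptions made for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b in every score and strictly better in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline_ranks(points: List[Sequence[float]]) -> List[int]:
    """Give each point the index of the skyline layer it falls in (1 = first skyline)."""
    ranks = [0] * len(points)
    remaining = set(range(len(points)))
    level = 0
    while remaining:
        level += 1
        # A point belongs to the current layer if no remaining point dominates it.
        layer = {
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        }
        for i in layer:
            ranks[i] = level
        remaining -= layer
    return ranks


# Hypothetical similarity scores for five candidate pairs of spatial entities.
pairs = [(0.9, 0.8), (0.7, 0.95), (0.6, 0.6), (0.2, 0.1), (0.9, 0.8)]
ranks = skyline_ranks(pairs)
print(ranks)
```

Pairs in the earliest layers are the strongest match candidates. A classifier in the spirit of the abstract would then pick a cut-off layer, which this sketch does not attempt.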

arXiv.org e-Print Archive

Crossref

VBN


Interactive Segmentation in Multimodal Medical Imagery Using a Bayesian Transductive Learning Approach

Author: Caban Jesus
Ebadollahi Shahram
Laine Andrew F.
Lee Noah
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2009

Labeled training data in the medical domain is rare and expensive to obtain. The lack of labeled multimodal medical image data is a major obstacle to devising learning-based interactive segmentation tools. Transductive learning (TL) or semi-supervised learning (SSL) offers a workaround by leveraging unlabeled and labeled data to infer labels for the test set given a small portion of label information. In this paper we propose a novel algorithm for interactive segmentation using transductive learning and inference in conditional mixture naïve Bayes models (T-CMNB) with spatial regularization constraints. T-CMNB is an extension of the transductive naïve Bayes algorithm [1, 20]. The multimodal Gaussian mixture assumption on the class-conditional likelihood and spatial regularization constraints allow us to explain more complex distributions required for spatial classification in multimodal imagery. To simplify the estimation we reduce the parameter space by assuming naïve conditional independence between the feature space and the class label. The naïve conditional independence assumption allows efficient inference of marginal and conditional distributions for large scale learning and inference [19]. We evaluate the proposed algorithm on multimodal MRI brain imagery using ROC statistics and provide preliminary results. The algorithm shows promising segmentation performance, with a sensitivity and specificity of 90.37% and 99.74% respectively, and compares competitively to alternative interactive segmentation schemes.

Columbia University Academic Commons

Exhaustive and Efficient Constraint Propagation: A Semi-Supervised Learning Perspective and Its Applications

Author: Zhiwu Lu
Yuxin Peng
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 21/09/2011

This paper presents a novel pairwise constraint propagation approach that decomposes the challenging constraint propagation problem into a set of independent semi-supervised learning subproblems, each of which can be solved in quadratic time using label propagation based on k-nearest neighbor graphs. Considering that this time cost is proportional to the number of all possible pairwise constraints, our approach actually provides an efficient solution for exhaustively propagating pairwise constraints throughout the entire dataset. The resulting exhaustive set of propagated pairwise constraints is further used to adjust the similarity matrix for constrained spectral clustering. Beyond traditional constraint propagation on single-source data, our approach is also extended to the more challenging constraint propagation on multi-source data, where each pairwise constraint is defined over a pair of data points from different sources. This multi-source constraint propagation has an important application to cross-modal multimedia retrieval. Extensive results have shown the superior performance of our approach. Comment: The short version of this paper appears as an oral paper in ECCV 201
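To make the building block named in this abstract concrete, here is a minimal sketch of label propagation on a k-nearest neighbor graph for a single two-class subproblem. It is not the authors' constraint propagation method: the row-normalised weights, the damping factor `alpha`, the one-dimensional toy points, and all function names are assumptions chosen for brevity.

```python
def knn_graph(points, k):
    """Row-normalised adjacency of a symmetrised k-nearest-neighbour graph."""
    n = len(points)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Sort the other points by distance to point i and link the k closest.
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: abs(points[i] - points[j]))
        for j in order[:k]:
            adj[i][j] = adj[j][i] = 1.0
    return [[w / sum(row) for w in row] for row in adj]


def propagate(weights, seeds, alpha=0.8, iters=200):
    """Iterate f <- alpha * W f + (1 - alpha) * y, where y holds the seed labels."""
    n = len(weights)
    y = [seeds.get(i, 0.0) for i in range(n)]
    f = y[:]
    for _ in range(iters):
        f = [alpha * sum(weights[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return f


# Toy data: two well-separated groups, with one labeled seed in each.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
seeds = {0: 1.0, 5: -1.0}
scores = propagate(knn_graph(points, k=2), seeds)
labels = [1 if s > 0 else -1 for s in scores]
print(labels)
```

Each unlabeled point ends up with the sign of the seed in its own neighborhood. In the setting the abstract describes, one such subproblem would be solved per set of constraints and the results combined, a step this sketch leaves out.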

arXiv.org e-Print Archive

Crossref

Semi-Supervised Sparse Coding

Author: Gao Xin
Wang Jim Jing-Yan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 16/01/2015

Sparse coding approximates a data sample as a sparse linear combination of some basic codewords and uses the sparse codes as new representations. In this paper, we investigate learning discriminative sparse codes by sparse coding in a semi-supervised manner, where only a few training samples are labeled. By using the manifold structure spanned by the data set of both labeled and unlabeled samples and the constraints provided by the labels of the labeled samples, we learn the variable class labels for all the samples. Furthermore, to improve the discriminative ability of the learned sparse codes, we assume that the class labels can be predicted from the sparse codes directly using a linear classifier. By solving for the codebook, sparse codes, class labels and classifier parameters simultaneously in a unified objective function, we develop a semi-supervised sparse coding algorithm. Experiments on two real-world pattern recognition problems demonstrate the advantage of the proposed methods over supervised sparse coding methods on partially labeled data sets.
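The first sentence of this abstract, representing a sample as a sparse linear combination of codewords, can be sketched with plain iterative soft-thresholding (ISTA) on a fixed codebook. This covers only the basic unsupervised coding step, not the semi-supervised joint objective the paper proposes. The codebook, the penalty `lam`, the step size, and the function names are assumptions for illustration.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t (the proximal operator of the L1 penalty)."""
    return max(v - t, 0.0) - max(-v - t, 0.0)


def sparse_code(x, codebook, lam=0.1, step=0.25, iters=500):
    """ISTA for min_s 0.5 * ||x - sum_k s_k d_k||^2 + lam * ||s||_1."""
    codes = [0.0] * len(codebook)
    for _ in range(iters):
        # Current reconstruction of x from the codewords and its residual.
        recon = [sum(c * d[i] for c, d in zip(codes, codebook)) for i in range(len(x))]
        residual = [xi - ri for xi, ri in zip(x, recon)]
        # Gradient step on the squared error, then L1 shrinkage.
        codes = [
            soft_threshold(c + step * sum(di * ri for di, ri in zip(d, residual)),
                           step * lam)
            for c, d in zip(codes, codebook)
        ]
    return codes


# Hypothetical codebook: two axis-aligned codewords and one diagonal codeword.
diag = 2 ** -0.5
codebook = [(1.0, 0.0), (0.0, 1.0), (diag, diag)]
codes = sparse_code((1.0, 1.0), codebook)
print(codes)
```

For the sample (1, 1) the L1 penalty favors the single diagonal codeword over the two axis-aligned ones, so the code has one nonzero entry. The step size of 0.25 is chosen to stay below the stability limit for this particular codebook and would need adjusting for others.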

arXiv.org e-Print Archive

Crossref