Embedding Graphs under Centrality Constraints for Network Visualization
Visual rendering of graphs is a key task in the mapping of complex network
data. Although most graph drawing algorithms emphasize aesthetic appeal,
certain applications such as travel-time maps place more importance on
visualization of structural network properties. The present paper advocates two
graph embedding approaches with centrality considerations to comply with node
hierarchy. The problem is formulated first as one of constrained
multi-dimensional scaling (MDS), and it is solved via block coordinate descent
iterations with successive approximations and guaranteed convergence to a KKT
point. In addition, a regularization term enforcing graph smoothness is
incorporated with the goal of reducing edge crossings. A second approach
leverages the locally-linear embedding (LLE) algorithm which assumes that the
graph encodes data sampled from a low-dimensional manifold. Closed-form
solutions to the resulting centrality-constrained optimization problems are
determined yielding meaningful embeddings. Experimental results demonstrate the
efficacy of both approaches, especially for visualizing large networks on the
order of thousands of nodes. Comment: Submitted to IEEE Transactions on Visualization and Computer Graphics
CESI: Canonicalizing Open Knowledge Bases using Embeddings and Side Information
Open Information Extraction (OpenIE) methods extract (noun phrase, relation
phrase, noun phrase) triples from text, resulting in the construction of large
Open Knowledge Bases (Open KBs). The noun phrases (NPs) and relation phrases in
such Open KBs are not canonicalized, leading to the storage of redundant and
ambiguous facts. Recent research has posed canonicalization of Open KBs as
clustering over manually defined feature spaces. Manual feature engineering is
expensive and often sub-optimal. In order to overcome this challenge, we
propose Canonicalization using Embeddings and Side Information (CESI) - a novel
approach which performs canonicalization over learned embeddings of Open KBs.
CESI extends recent advances in KB embedding by incorporating relevant NP and
relation phrase side information in a principled manner. Through extensive
experiments on multiple real-world datasets, we demonstrate CESI's
effectiveness. Comment: Accepted at WWW 201
Continuous Multiclass Labeling Approaches and Algorithms
We study convex relaxations of the image labeling problem on a continuous
domain with regularizers based on metric interaction potentials. The generic
framework ensures existence of minimizers and covers a wide range of
relaxations of the originally combinatorial problem. We focus on two specific
relaxations that differ in flexibility and simplicity -- one can be used to
tightly relax any metric interaction potential, while the other one only covers
Euclidean metrics but requires less computational effort. For solving the
nonsmooth discretized problem, we propose a globally convergent
Douglas-Rachford scheme, and show that a sequence of dual iterates can be
recovered in order to provide a posteriori optimality bounds. In a quantitative
comparison to two other first-order methods, the approach shows competitive
performance on synthetic and real-world images. By combining the method with
an improved binarization technique for nonstandard potentials, we were able to
routinely recover discrete solutions within 1%-5% of the global optimum for
the combinatorial image labeling problem.
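As a rough illustration of the Douglas-Rachford iteration named in the abstract above, the sketch below applies it to a toy problem rather than to the image labeling problem itself. The toy objective (a quadratic plus a nonnegativity constraint), the vector `a`, and the function names are all assumptions chosen for illustration; none of them come from the paper.

```python
# Minimal Douglas-Rachford splitting sketch for min f(x) + g(x).
# Toy problem (an assumption, not the paper's): f(x) = 0.5 * ||x - a||^2,
# g = indicator of the nonnegative orthant, so the minimizer is max(a, 0).

def douglas_rachford(prox_f, prox_g, z0, iterations=200):
    """Run the iteration x = prox_f(z), y = prox_g(2x - z), z = z + y - x."""
    z = list(z0)
    x = list(z0)
    for _ in range(iterations):
        x = prox_f(z)
        reflected = [2 * xi - zi for xi, zi in zip(x, z)]
        y = prox_g(reflected)
        z = [zi + yi - xi for zi, yi, xi in zip(z, y, x)]
    return x

a = [1.0, -2.0, 2.0]  # hypothetical data vector

def prox_quadratic(z):
    # Proximal map of 0.5 * ||x - a||^2 with unit step size: (z + a) / 2.
    return [(zi + ai) / 2 for zi, ai in zip(z, a)]

def prox_nonneg(v):
    # Proximal map of the nonnegativity indicator: projection onto v >= 0.
    return [max(0.0, vi) for vi in v]

solution = douglas_rachford(prox_quadratic, prox_nonneg, [0.0, 0.0, 0.0])
```

Each step needs only the two proximal maps, which is what makes this kind of splitting attractive for nonsmooth objectives.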
Deep Adaptive Feature Embedding with Local Sample Distributions for Person Re-identification
Person re-identification (re-id) aims to match pedestrians observed by
disjoint camera views. It attracts increasing attention in computer vision due
to its importance to surveillance systems. To combat the major challenge of
cross-view visual variations, deep embedding approaches are proposed by
learning a compact feature space from images such that the Euclidean distances
correspond to their cross-view similarity metric. However, the global Euclidean
distance cannot faithfully characterize the ideal similarity in a complex
visual feature space because features of pedestrian images exhibit unknown
distributions due to large variations in poses, illumination and occlusion.
Moreover, intra-personal training samples within a local range can robustly
guide deep embedding against uncontrolled variations, but this local structure
cannot be captured by a global Euclidean distance. In this paper, we study the
problem of person re-id by proposing a novel sampling strategy that mines
suitable positives (i.e., intra-class samples) within a local range to improve the deep embedding in the
context of large intra-class variations. Our method is capable of learning a
deep similarity metric adaptive to local sample structure by minimizing each
sample's local distances while propagating through the relationship between
samples to attain the whole intra-class minimization. To this end, a novel
objective function is proposed to jointly optimize similarity metric learning,
local positive mining and robust deep embedding. This yields local
discriminations by selecting local-ranged positive samples, and the learned
features are robust to dramatic intra-class variations. Experiments on
benchmarks show state-of-the-art results achieved by our method. Comment: Published in Pattern Recognition
NETEMBED: A Network Resource Mapping Service for Distributed Applications
Emerging configurable infrastructures such as large-scale overlays and grids, distributed testbeds, and sensor networks comprise diverse sets of available computing resources (e.g., CPU and OS capabilities and memory constraints) and network conditions (e.g., link delay, bandwidth, loss rate, and jitter) whose characteristics are both complex and time-varying. At the same time, distributed applications to be deployed on these infrastructures exhibit increasingly complex constraints and requirements on resources they wish to utilize. Examples include selecting nodes and links to schedule an overlay multicast file transfer across the Grid, or embedding a network experiment with specific resource constraints in a distributed testbed such as PlanetLab. Thus, a common problem facing the efficient deployment of distributed applications on these infrastructures is that of "mapping" application-level requirements onto the network in such a manner that the requirements of the application are realized, assuming that the underlying characteristics of the network are known. We refer to this problem as the network embedding problem. In this paper, we propose a new approach to tackle this combinatorially-hard problem. Thanks to a number of heuristics, our approach greatly improves performance and scalability over previously existing techniques. It does so by pruning large portions of the search space without overlooking any valid embedding. We present a construction that allows a compact representation of candidate embeddings, which is maintained by carefully controlling the order in which candidate mappings are inserted and invalid mappings are removed. We present an implementation of our proposed technique, which we call NETEMBED – a service that identifies feasible mappings of a virtual network configuration (the query network) to an existing real infrastructure or testbed (the hosting network).
We present results of extensive performance evaluation experiments of NETEMBED using several combinations of real and synthetic network topologies. Our results show that our NETEMBED service is quite effective in identifying one (or all) possible embeddings for quite sizable queries and hosting networks – much larger than what any of the existing techniques or services are able to handle. National Science Foundation (CNS Cybertrust 0524477, NSF CNS NeTS 0520166, NSF CNS ITR 0205294, EIA RI 0202067
On the optimality of the neighbor-joining algorithm
The popular neighbor-joining (NJ) algorithm used in phylogenetics is a greedy
algorithm for finding the balanced minimum evolution (BME) tree associated to a
dissimilarity map. From this point of view, NJ is "optimal" when the
algorithm outputs the tree which minimizes the balanced minimum evolution
criterion. We use the fact that the NJ tree topology and the BME tree topology
are determined by polyhedral subdivisions of the spaces of dissimilarity maps
to study the optimality of the neighbor-joining
algorithm. In particular, we investigate and compare the polyhedral
subdivisions for . A key requirement is the measurement of volumes of
spherical polytopes in high dimension, which we obtain using a combination of
Monte Carlo methods and polyhedral algorithms. We show that highly unrelated
trees can be co-optimal in BME reconstruction, and that NJ regions are not
convex. We obtain the radius for neighbor-joining for and we
conjecture that the ability of the neighbor-joining algorithm to recover the
BME tree depends on the diameter of the BME tree.
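To make the "greedy" nature of neighbor-joining concrete, the sketch below shows a single selection step using the standard Q-criterion: for n taxa, Q(i, j) = (n - 2) d(i, j) - sum_k d(i, k) - sum_k d(j, k), and the pair with the smallest Q is joined. The function name and the 5-taxon dissimilarity map are hypothetical and chosen only for illustration. The sketch does not build the full tree or compute the BME criterion.

```python
# One greedy selection step of neighbor-joining (Q-criterion), as a sketch.
# The dissimilarity map below is a hypothetical example, not data from the paper.

def nj_pick_pair(d):
    """Return the pair (i, j) minimizing the NJ Q-criterion, and its Q value."""
    n = len(d)
    row_sums = [sum(row) for row in d]
    best_pair, best_q = None, float("inf")
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * d[i][j] - row_sums[i] - row_sums[j]
            if q < best_q:
                best_pair, best_q = (i, j), q
    return best_pair, best_q

# Symmetric dissimilarity map on five taxa, with a zero diagonal.
dissimilarity = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]

pair, q_value = nj_pick_pair(dissimilarity)
```

Note that the selected pair (taxa 0 and 1) is not the pair with the smallest raw dissimilarity (taxa 3 and 4). The Q-criterion corrects each distance by how far both taxa are from everything else.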