10,546 research outputs found
DancingLines: An Analytical Scheme to Depict Cross-Platform Event Popularity
Nowadays, events usually burst and are propagated online through multiple
modern media like social networks and search engines. There exists various
research discussing the event dissemination trends on individual medium, while
few studies focus on event popularity analysis from a cross-platform
perspective. Challenges come from the vast diversity of events and media,
limited access to aligned datasets across different media and a great deal of
noise in the datasets. In this paper, we design DancingLines, an innovative
scheme that captures and quantitatively analyzes event popularity between
pairwise text media. It contains two models: TF-SW, a semantic-aware popularity
quantification model, based on an integrated weight coefficient leveraging
Word2Vec and TextRank; and wDTW-CD, a pairwise event popularity time series
alignment model matching different event phases adapted from Dynamic Time
Warping. We also propose three metrics to interpret event popularity trends
between pairwise social platforms. Experimental results on eighteen real-world
event datasets from an influential social network and a popular search engine
validate the effectiveness and applicability of our scheme. DancingLines is
demonstrated to possess broad application potentials for discovering the
knowledge of various aspects related to events and different media
k-Nearest Neighbour Classifiers: 2nd Edition (with Python examples)
Perhaps the most straightforward classifier in the arsenal or machine
learning techniques is the Nearest Neighbour Classifier -- classification is
achieved by identifying the nearest neighbours to a query example and using
those neighbours to determine the class of the query. This approach to
classification is of particular importance because issues of poor run-time
performance is not such a problem these days with the computational power that
is available. This paper presents an overview of techniques for Nearest
Neighbour classification focusing on; mechanisms for assessing similarity
(distance), computational issues in identifying nearest neighbours and
mechanisms for reducing the dimension of the data.
This paper is the second edition of a paper previously published as a
technical report. Sections on similarity measures for time-series, retrieval
speed-up and intrinsic dimensionality have been added. An Appendix is included
providing access to Python code for the key methods.Comment: 22 pages, 15 figures: An updated edition of an older tutorial on kN
Elastic Registration of Geodesic Vascular Graphs
Vascular graphs can embed a number of high-level features, from morphological
parameters, to functional biomarkers, and represent an invaluable tool for
longitudinal and cross-sectional clinical inference. This, however, is only
feasible when graphs are co-registered together, allowing coherent multiple
comparisons. The robust registration of vascular topologies stands therefore as
key enabling technology for group-wise analyses. In this work, we present an
end-to-end vascular graph registration approach, that aligns networks with
non-linear geometries and topological deformations, by introducing a novel
overconnected geodesic vascular graph formulation, and without enforcing any
anatomical prior constraint. The 3D elastic graph registration is then
performed with state-of-the-art graph matching methods used in computer vision.
Promising results of vascular matching are found using graphs from synthetic
and real angiographies. Observations and future designs are discussed towards
potential clinical applications
A Survey on Metric Learning for Feature Vectors and Structured Data
The need for appropriate ways to measure the distance or similarity between
data is ubiquitous in machine learning, pattern recognition and data mining,
but handcrafting such good metrics for specific problems is generally
difficult. This has led to the emergence of metric learning, which aims at
automatically learning a metric from data and has attracted a lot of interest
in machine learning and related fields for the past ten years. This survey
paper proposes a systematic review of the metric learning literature,
highlighting the pros and cons of each approach. We pay particular attention to
Mahalanobis distance metric learning, a well-studied and successful framework,
but additionally present a wide range of methods that have recently emerged as
powerful alternatives, including nonlinear metric learning, similarity learning
and local metric learning. Recent trends and extensions, such as
semi-supervised metric learning, metric learning for histogram data and the
derivation of generalization guarantees, are also covered. Finally, this survey
addresses metric learning for structured data, in particular edit distance
learning, and attempts to give an overview of the remaining challenges in
metric learning for the years to come.Comment: Technical report, 59 pages. Changes in v2: fixed typos and improved
presentation. Changes in v3: fixed typos. Changes in v4: fixed typos and new
method
- …