Developments in the theory of randomized shortest paths with a comparison of graph node distances
There have lately been several suggestions for parametrized distances on a
graph that generalize the shortest path distance and the commute time or
resistance distance. The need for developing such distances has arisen from the
observation that the above-mentioned common distances in many situations fail
to take into account the global structure of the graph. In this article, we
develop the theory of one family of graph node distances, known as the
randomized shortest path dissimilarity, which has its foundation in statistical
physics. We show that the randomized shortest path dissimilarity can be easily
computed in closed form for all pairs of nodes of a graph. Moreover, we come up
with a new definition of a distance measure that we call the free energy
distance. The free energy distance can be seen as an upgrade of the randomized
shortest path dissimilarity as it defines a metric, in addition to which it
satisfies the graph-geodetic property. The derivation and computation of the
free energy distance are also straightforward. We then make a comparison
between a set of generalized distances that interpolate between the shortest
path distance and the commute time, or resistance distance. This comparison
focuses on the applicability of the distances in graph node clustering and
classification. The comparison, in general, shows that the parametrized
distances perform well in the tasks. In particular, we see that the results
obtained with the free energy distance are among the best in all the
experiments.
Comment: 30 pages, 4 figures, 3 tables
Dissimilarity-based representation for radiomics applications
Radiomics is a term which refers to the analysis of the large amount of
quantitative tumor features extracted from medical images to find useful
predictive, diagnostic or prognostic information. Many recent studies have
shown that radiomics can offer useful information that physicians cannot
extract from the medical images themselves and that can be associated with
other information such as gene or protein data. However, most of the classification
studies in radiomics report the use of feature selection methods without
identifying the machine learning challenges behind radiomics. In this paper, we
first show that the radiomics problem should be viewed as a high-dimensional,
low-sample-size, multi-view learning problem; we then compare different
solutions proposed in multi-view learning for classifying radiomics data. Our
experiments, conducted on several real-world multi-view datasets, show that the
intermediate integration methods work significantly better than filter and
embedded feature selection methods commonly used in radiomics.
Comment: conference, 6 pages, 2 figures