A machine learning pipeline for supporting differentiation of glioblastomas from single brain metastases
Machine learning has provided, over the last decades, tools for knowledge extraction in complex medical domains. Most of these tools, though, are ad hoc solutions and lack the systematic approach that would be required to become mainstream in medical practice. In this brief paper, we define a machine learning-based analysis pipeline to assist with a difficult problem in the field of neuro-oncology, namely the discrimination of brain glioblastomas from single brain metastases. This pipeline involves source extraction using k-Means-initialized Convex Non-negative Matrix Factorization and a collection of classifiers, including Logistic Regression, Linear Discriminant Analysis, AdaBoost, and Random Forests. Peer reviewed. Postprint (published version).
Total Jensen divergences: Definition, Properties and k-Means++ Clustering
We present a novel class of divergences induced by a smooth convex function
called total Jensen divergences. These total Jensen divergences are invariant
to rotations by construction, a feature yielding regularization of ordinary
Jensen divergences by a conformal factor. We analyze the relationships between
this novel class of total Jensen divergences and the recently introduced total
Bregman divergences. We then proceed by defining the total Jensen centroids as
average distortion minimizers, and study their robustness to
outliers. Finally, we prove that the k-means++ initialization that bypasses
explicit centroid computations is good enough in practice to probabilistically
guarantee a constant approximation factor to the optimal k-means
clustering. Comment: 27 pages
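
To illustrate why k-means++ seeding needs no explicit centroid computation, here is a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: it uses the ordinary Jensen divergence J_F(p, q) = (F(p) + F(q))/2 - F((p + q)/2) rather than the total (conformally scaled) variant, it takes F to be the squared Euclidean norm, and the function names `jensen_divergence`, `squared_norm`, and `kmeanspp_seeds` are hypothetical. Each new seed is drawn from the data points with probability proportional to its divergence to the nearest seed already chosen.

```python
import random


def jensen_divergence(F, p, q):
    """Ordinary Jensen divergence of a convex F (assumption: not the total variant)."""
    mid = [(a + b) / 2.0 for a, b in zip(p, q)]
    return (F(p) + F(q)) / 2.0 - F(mid)


def squared_norm(x):
    """Illustrative convex generator F(x) = ||x||^2 (an assumption for this sketch)."""
    return sum(v * v for v in x)


def kmeanspp_seeds(points, k, divergence, rng):
    """Pick k seeds among the data points; no centroid is ever computed."""
    seeds = [rng.choice(points)]
    # Cost of a point = divergence to its closest seed chosen so far.
    costs = [divergence(p, seeds[0]) for p in points]
    while len(seeds) < k:
        total = sum(costs)
        if total <= 0.0:
            # Every point coincides with a seed: fall back to a uniform draw.
            nxt = rng.choice(points)
        else:
            # Sample a point with probability proportional to its cost.
            r = rng.random() * total
            acc = 0.0
            nxt = points[-1]  # guard against floating-point round-off
            for p, c in zip(points, costs):
                acc += c
                if acc > r:
                    nxt = p
                    break
        seeds.append(nxt)
        costs = [min(c, divergence(p, nxt)) for p, c in zip(points, costs)]
    return seeds


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0), (20.0, 0.0), (21.0, 0.0)]
    div = lambda p, q: jensen_divergence(squared_norm, p, q)
    print(kmeanspp_seeds(pts, 3, div, random.Random(0)))
```

With the squared norm as generator, the Jensen divergence reduces to a quarter of the squared Euclidean distance, so this sketch behaves like standard k-means++ seeding. Any other divergence can be passed in through the `divergence` argument.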
Faster K-Means Cluster Estimation
There has been considerable work on improving the popular clustering algorithm
K-means in terms of both mean squared error (MSE) and speed. However, most
of the k-means variants tend to compute the distance of each data point to each
cluster centroid in every iteration. We propose a fast heuristic to overcome
this bottleneck with only a marginal increase in MSE. We observe that across all
iterations of K-means, a data point changes its membership only among a small
subset of clusters. Our heuristic predicts such clusters for each data point by
looking at nearby clusters after the first iteration of k-means. We augment
well-known variants of k-means with our heuristic to demonstrate its
effectiveness. For various synthetic and real-world datasets, our heuristic
achieves a speed-up of up to 3 times compared to efficient variants of
k-means. Comment: 6 pages, Accepted at ECIR 201
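
The candidate-cluster idea described above can be sketched in Python as follows. This is one plausible reading of the abstract, not the authors' code. It makes these assumptions: after a full first iteration, each cluster gets a list of its `n_candidates` nearest clusters (itself included), each point inherits the list of its first-iteration cluster, and later iterations compare the point only against that list. The names `candidate_kmeans` and `n_candidates` are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def update_centroids(points, labels, centroids):
    """Recompute each centroid as the mean of its members; empty clusters stay put."""
    new = []
    for j, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            new.append(tuple(sum(col) / len(members) for col in zip(*members)))
        else:
            new.append(old)
    return new


def candidate_kmeans(points, centroids, n_candidates, n_iter=20):
    k = len(centroids)
    # First iteration: ordinary assignment against all k centroids.
    labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
    centroids = update_centroids(points, labels, centroids)
    # For every cluster, the n_candidates closest clusters (assumed heuristic).
    neighbours = [
        sorted(range(k), key=lambda j: sq_dist(centroids[i], centroids[j]))[:n_candidates]
        for i in range(k)
    ]
    # Each point keeps the candidate list of its first-iteration cluster.
    candidates = [neighbours[lab] for lab in labels]
    for _ in range(n_iter):
        # Later iterations only evaluate n_candidates distances per point.
        new_labels = [
            min(cand, key=lambda j: sq_dist(p, centroids[j]))
            for p, cand in zip(points, candidates)
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        centroids = update_centroids(points, labels, centroids)
    return labels, centroids


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (20, 0), (20, 1), (21, 0)]
    labels, cents = candidate_kmeans(pts, [(0, 0), (10, 10), (20, 0)], n_candidates=2)
    print(labels)
```

Setting `n_candidates` equal to the number of clusters makes every candidate list complete, so the sketch falls back to standard Lloyd iterations. Smaller values reduce the per-iteration cost from k distance computations per point to `n_candidates`.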