176 research outputs found
Perplexity-free Parametric t-SNE
The t-distributed Stochastic Neighbor Embedding (t-SNE) algorithm is a
ubiquitously employed dimensionality reduction (DR) method. Its non-parametric
nature and impressive efficacy motivated its parametric extension. It is,
however, bound to a user-defined perplexity parameter, restricting its DR
quality compared to recently developed multi-scale perplexity-free approaches.
This paper hence proposes a multi-scale parametric t-SNE scheme, relieved from
the perplexity tuning and with a deep neural network implementing the mapping.
It produces reliable embeddings with out-of-sample extensions, competitive with
the best perplexity adjustments in terms of neighborhood preservation on
multiple data sets.
Comment: ESANN 2020 proceedings, European Symposium on Artificial Neural
Networks, Computational Intelligence and Machine Learning. Online event, 2-4
October 2020, i6doc.com publ., ISBN 978-2-87587-074-2. Available from
http://www.i6doc.com/en
Optimizing graph layout by t-SNE perplexity estimation
Perplexity is one of the key parameters of the t-distributed stochastic neighbor embedding (t-SNE) dimensionality reduction algorithm. In this paper, we investigate the relationship between t-SNE perplexity and graph layout evaluation metrics, including graph stress, preserved neighborhood information, and visual inspection. Having found that a small perplexity is correlated with a relatively higher normalized stress, while preserving neighborhood information with higher precision but less global structure information, we propose a method to estimate an appropriate perplexity based on either a modified standard t-SNE or the sklearn Barnes–Hut TSNE. Experimental results demonstrate the effectiveness and ease of use of our approach when tested on a set of benchmark datasets.
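For readers unfamiliar with the perplexity parameter this abstract refers to, the following is a minimal sketch of the standard t-SNE calibration step: for one data point, a Gaussian bandwidth is searched so that the perplexity (2 raised to the Shannon entropy) of its neighbour distribution matches a user-chosen target. It is not the estimation method proposed in the paper; the function names, the bisection scheme, and the toy distances are assumptions made for illustration.

```python
import math


def conditional_probs(sq_dists, beta):
    """Gaussian neighbour probabilities for precision beta = 1 / (2 * sigma**2)."""
    # Shift by the smallest distance so the largest weight is exp(0) = 1.
    d_min = min(sq_dists)
    weights = [math.exp(-beta * (d - d_min)) for d in sq_dists]
    total = sum(weights)
    return [w / total for w in weights]


def perplexity(probs):
    """Perplexity = 2 ** (Shannon entropy in bits)."""
    entropy = -sum(p * math.log2(p) for p in probs if p > 0.0)
    return 2.0 ** entropy


def calibrate_beta(sq_dists, target, tol=1e-5, max_iter=200):
    """Bisection on beta until the perplexity is within tol of target."""
    lo, hi = 0.0, None  # hi is unknown until the perplexity first drops below target
    beta = 1.0
    for _ in range(max_iter):
        perp = perplexity(conditional_probs(sq_dists, beta))
        if abs(perp - target) < tol:
            break
        if perp > target:
            # Distribution too flat: a larger beta (narrower Gaussian) is needed.
            lo = beta
            beta = beta * 2.0 if hi is None else (beta + hi) / 2.0
        else:
            # Distribution too peaked: a smaller beta (wider Gaussian) is needed.
            hi = beta
            beta = (lo + beta) / 2.0
    return beta


# Hypothetical squared distances from one point to its five neighbours.
sq_dists = [1.0, 4.0, 9.0, 16.0, 25.0]
beta = calibrate_beta(sq_dists, target=2.0)
calibrated = perplexity(conditional_probs(sq_dists, beta))
```

A small target perplexity yields a narrow Gaussian that emphasises the nearest neighbours, which is consistent with the trade-off between local precision and global structure described in the abstract.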
Unmasking Clever Hans Predictors and Assessing What Machines Really Learn
Current learning machines have successfully solved hard application problems,
reaching high accuracy and displaying seemingly "intelligent" behavior. Here we
apply recent techniques for explaining decisions of state-of-the-art learning
machines and analyze various tasks from computer vision and arcade games. This
showcases a spectrum of problem-solving behaviors ranging from naive and
short-sighted, to well-informed and strategic. We observe that standard
performance evaluation metrics can be oblivious to distinguishing these diverse
problem solving behaviors. Furthermore, we propose our semi-automated Spectral
Relevance Analysis that provides a practically effective way of characterizing
and validating the behavior of nonlinear learning machines. This helps to
assess whether a learned model indeed delivers reliably for the problem that it
was conceived for. Furthermore, our work intends to add a voice of caution to
the ongoing excitement about machine intelligence and pledges to evaluate and
judge some of these recent successes in a more nuanced manner.
Comment: Accepted for publication in Nature Communications
Perplexity-free t-SNE and twice Student tt-SNE
In dimensionality reduction and data visualisation, t-SNE has become a popular method. In this paper, we propose two variants to the Gaussian similarities used to characterise the neighbourhoods around each high-dimensional datum in t-SNE. A first alternative is to use t distributions, as already used in the low-dimensional embedding space; a variable degree of freedom accounts for the intrinsic dimensionality of data. The second variant relies on compounds of Gaussian neighbourhoods with growing widths, thereby suppressing the need for the user to adjust a single size or perplexity. In both cases, heavy-tailed distributions thus characterise the neighbourhood relationships in the data space. Experiments show that both variants are competitive with t-SNE, at no extra cost.
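The second variant described above can be illustrated with a small sketch: neighbour probabilities are computed for several Gaussian widths and then averaged, so that no single width has to be chosen. The doubling schedule of widths, the uniform averaging, the function names, and the toy distances are all assumptions for illustration; the abstract does not specify how the paper combines the scales.

```python
import math


def gaussian_probs(sq_dists, sigma):
    """Normalised Gaussian neighbour probabilities for one width sigma."""
    # Shift by the smallest distance for numerical stability.
    d_min = min(sq_dists)
    weights = [math.exp(-(d - d_min) / (2.0 * sigma ** 2)) for d in sq_dists]
    total = sum(weights)
    return [w / total for w in weights]


def multiscale_probs(sq_dists, sigmas):
    """Uniform average of the Gaussian neighbour distributions over all widths."""
    per_scale = [gaussian_probs(sq_dists, s) for s in sigmas]
    n_scales = len(per_scale)
    return [sum(p[j] for p in per_scale) / n_scales for j in range(len(sq_dists))]


# Hypothetical squared distances from one point to its five neighbours.
sq_dists = [1.0, 4.0, 9.0, 16.0, 25.0]
# Assumed schedule of growing widths: 0.5, 1, 2, 4.
sigmas = [0.5 * 2 ** k for k in range(4)]

narrow = gaussian_probs(sq_dists, sigmas[0])
mixed = multiscale_probs(sq_dists, sigmas)
```

In this toy example the averaged distribution gives the farthest neighbour far more mass than the narrowest Gaussian alone does, which is the heavy-tailed behaviour the abstract mentions.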
Inhibitor selectivity: profiling and prediction
Fewer than 1 in 10 drug candidates that enter phase 1 clinical trials actually get approved for human use. The high failure rate is in part due to unforeseen side effects or toxicity. A better understanding of the role of selectivity and better insight into the off-target activities of drug candidates could greatly aid in preventing candidates from failing for these reasons. This thesis has tried to address some aspects of this challenging part of drug discovery. The use of activity-based protein profiling (ABPP), as presented in Chapters 2 and 3 for drug discovery and hit-to-lead optimization, and in Chapters 5 and 6 for the interaction profiling of a drug candidate, highlights the versatility and importance of this chemical biology technique. Combined with knowledge derived from biochemical assays, such as that developed in Chapter 4, ABPP can greatly aid the medicinal chemist. The recent surge in popularity of machine learning algorithms, backed by exponential growth of the amount of biological data available, holds great promise for drug discovery. Chapters 7 and 8 showed the applicability of one such algorithm, which was able to predict interaction profiles quite reliably. The challenges in finding, determining and predicting selectivity are far from solved, but, by incrementally expanding our understanding of the binding of small molecules to their (off-)targets, truly selective inhibitors might at some point become a reality or their necessity might be mitigated.
Medical Biochemistr