264 research outputs found
Stability comparison of dimensionality reduction techniques attending to data and parameter variations
The analysis of large volumes of data requires efficient and robust dimension reduction techniques to represent data in lower-dimensional spaces, which ease human understanding. This paper presents a study of the stability, robustness and performance of some of these dimension reduction algorithms with respect to algorithm and data parameters, which usually have a major influence on the resulting embeddings. This analysis includes the performance of a large panel of techniques on both artificial and real datasets, focusing on the geometrical variations experienced when changing different parameters. The results are presented by identifying the visual weaknesses of each technique, providing some suitable data-processing tasks to enhance the stability.
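A minimal way to make such a stability check concrete is to re-run one DR method under different parameter values and measure how far the embeddings drift after Procrustes alignment. The sketch below is an illustration only, not the paper's protocol: scikit-learn's t-SNE, the digits dataset, the perplexity grid, and the disparity measure are all assumptions.

```python
# Hedged sketch (not the paper's protocol): quantify how much a t-SNE embedding
# changes when its perplexity parameter varies, using Procrustes alignment so
# that rotations/reflections/scalings of the embedding are not counted as drift.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE
from scipy.spatial import procrustes

X, _ = load_digits(return_X_y=True)

embeddings = {}
for perplexity in (5, 30, 50):  # assumed parameter grid, for illustration only
    embeddings[perplexity] = TSNE(
        n_components=2, perplexity=perplexity, random_state=0
    ).fit_transform(X)

reference = embeddings[30]
for perplexity, emb in embeddings.items():
    # Procrustes disparity: 0 means identical up to a similarity transform.
    _, _, disparity = procrustes(reference, emb)
    print(f"perplexity={perplexity}: disparity vs. reference = {disparity:.3f}")
```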
Reviewing, indicating, and counting books for modern research evaluation systems
In this chapter, we focus on the specialists who have helped to improve the
conditions for book assessments in research evaluation exercises, with
empirically based data and insights supporting their greater integration. Our
review highlights the research carried out by four types of expert communities,
referred to as the monitors, the subject classifiers, the indexers and the
indicator constructionists. Many challenges lie ahead for scholars affiliated
with these communities, particularly the latter three. By acknowledging their
unique, yet interrelated roles, we show where the greatest potential is for
both quantitative and qualitative indicator advancements in book-inclusive
evaluation systems.
Comment: Forthcoming in Glanzel, W., Moed, H.F., Schmoch, U., Thelwall, M. (2018), Springer Handbook of Science and Technology Indicators, Springer. Some corrections made in subsection 'Publisher prestige or quality'.
The effect of niobium on austenite evolution during hot rolling of advanced high strength steel
Perplexity-free Parametric t-SNE
The t-distributed Stochastic Neighbor Embedding (t-SNE) algorithm is a
ubiquitously employed dimensionality reduction (DR) method. Its non-parametric
nature and impressive efficacy motivated its parametric extension. It is, however, tied to a user-defined perplexity parameter, which restricts its DR quality compared to recently developed multi-scale perplexity-free approaches. This paper hence proposes a multi-scale parametric t-SNE scheme, relieved of perplexity tuning and with a deep neural network implementing the mapping. It produces reliable embeddings with out-of-sample extensions, competitive with the best perplexity adjustments in terms of neighborhood preservation on multiple data sets.
Comment: ESANN 2020 proceedings, European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. Online event, 2-4 October 2020, i6doc.com publ., ISBN 978-2-87587-074-2. Available from http://www.i6doc.com/en
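The parametric idea can be illustrated with a much simpler stand-in than the paper's multi-scale, perplexity-free scheme: train a neural network to reproduce a reference (non-parametric) t-SNE embedding, which yields an explicit mapping and hence an out-of-sample extension. Everything below (scikit-learn's TSNE and MLPRegressor, the layer sizes, the digits data) is an assumption for illustration.

```python
# Minimal sketch of the *parametric* idea only (an assumed simplification, not
# the paper's multi-scale, perplexity-free scheme): learn a neural-network
# mapping x -> 2D that reproduces a reference t-SNE embedding, so new points
# can be projected without rerunning t-SNE.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

X, _ = load_digits(return_X_y=True)
X_train, X_test = train_test_split(X, test_size=0.2, random_state=0)

# Reference (non-parametric) embedding of the training set.
Y_train = TSNE(n_components=2, random_state=0).fit_transform(X_train)

# Network approximating the high-D -> 2D mapping; layer sizes are assumptions.
net = MLPRegressor(hidden_layer_sizes=(500, 500, 2000),
                   max_iter=500, random_state=0)
net.fit(X_train, Y_train)

# Out-of-sample extension: embed unseen points with the learned mapping.
Y_test = net.predict(X_test)
print(Y_test.shape)  # (n_test, 2)
```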
Median topographic maps for biomedical data sets
Median clustering extends popular neural data analysis methods such as the
self-organizing map or neural gas to general data structures given by a
dissimilarity matrix only. This offers flexible and robust global data inspection methods which are particularly suited to the variety of data that occurs in biomedical domains. In this chapter, we give an overview of median clustering, its properties and extensions, with a particular focus on efficient implementations adapted to large-scale data analysis.
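The core "median" trick is that prototypes are restricted to data points, so only a dissimilarity matrix is needed. The sketch below shows a batch median clustering loop in that spirit; it is a simplified k-medoids-style variant of my own, not the median SOM or median neural gas covered in the chapter.

```python
# Hedged sketch of the median idea (my simplification; the chapter treats median
# SOM / neural gas): prototypes are restricted to data points, so the algorithm
# only needs a dissimilarity matrix D, never vectorial coordinates.
import numpy as np

def median_clustering(D, k, n_iter=50, seed=0):
    """Batch median (k-medoids-style) clustering from a dissimilarity matrix D."""
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    prototypes = rng.choice(n, size=k, replace=False)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest prototype.
        labels = np.argmin(D[:, prototypes], axis=1)
        # Median step: within each cluster, pick the data point minimizing the
        # sum of dissimilarities to all other cluster members.
        new_prototypes = prototypes.copy()
        for j in range(k):
            members = np.where(labels == j)[0]
            if members.size:
                costs = D[np.ix_(members, members)].sum(axis=1)
                new_prototypes[j] = members[np.argmin(costs)]
        if np.array_equal(new_prototypes, prototypes):
            break
        prototypes = new_prototypes
    labels = np.argmin(D[:, prototypes], axis=1)
    return prototypes, labels

# Toy dissimilarity matrix from random points (for illustration only).
X = np.random.default_rng(1).normal(size=(100, 5))
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
protos, labels = median_clustering(D, k=3)
print(protos, np.bincount(labels))
```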
Mutual information for the selection of relevant variables in spectrometric nonlinear modelling
Data from spectrophotometers form vectors of a large number of exploitable
variables. Building quantitative models using these variables most often
requires using a smaller set of variables than the initial one. Indeed, too large a number of input variables to a model results in too many parameters, leading to overfitting and poor generalization abilities. In this paper, we suggest the use of the mutual information measure to select variables from the initial set. The mutual information measures the information content of input variables with respect to the model output, without making any assumption on the model that will be used; it is thus suitable for nonlinear modelling. In addition, it leads to the selection of variables among the initial set, and not to linear or nonlinear combinations of them. It therefore allows greater interpretability of the results without decreasing model performance compared to other variable projection methods.
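A rough sketch of the overall recipe, with scikit-learn's mutual_info_regression standing in for the MI estimator (an assumption, not necessarily the estimator used in the paper): rank variables by mutual information with the output, keep the top-ranked subset, and fit a nonlinear model on it.

```python
# Hedged sketch of MI-based variable selection followed by nonlinear modelling.
# The data, the number of selected variables, and the MLP model are assumptions
# made purely for illustration.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import mutual_info_regression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor

# Synthetic stand-in for spectrometric data: many input variables, few informative.
X, y = make_regression(n_samples=200, n_features=300, n_informative=10,
                       noise=1.0, random_state=0)

# Rank variables by their mutual information with the output, keep the top 10.
mi = mutual_info_regression(X, y, random_state=0)
selected = np.argsort(mi)[::-1][:10]

# Nonlinear model fitted on the selected variables only.
model = MLPRegressor(hidden_layer_sizes=(50,), max_iter=2000, random_state=0)
score = cross_val_score(model, X[:, selected], y, cv=5, scoring="r2").mean()
print("selected variables:", sorted(selected.tolist()))
print("cross-validated R^2 on the reduced variable set:", round(score, 3))
```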
The Flow of Trust: A Visualization Framework to Externalize, Explore, and Explain Trust in ML Applications
We present a conceptual framework for the development of visual interactive techniques to formalize and externalize trust in machine learning (ML) workflows. Currently, trust in ML applications is an implicit process that takes place in the user's mind. As such, there is no method of feedback or communication of trust that can be acted upon. Our framework will be instrumental in developing interactive visualization approaches that will help users to efficiently and effectively build and communicate trust in ways that fit each of the ML process stages. We formulate several research questions and directions that include: 1) a typology/taxonomy of trust objects, trust issues, and possible reasons for (mis)trust; 2) formalisms to represent trust in machine-readable form; 3) means by which users can express their state of trust by interacting with a computer system (e.g., text, drawing, marking); 4) ways in which a system can facilitate users' expression and communication of the state of trust; and 5) creation of visual interactive techniques for representation and exploration of trust over all stages of an ML pipeline.
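As a purely hypothetical illustration of research direction 2 (machine-readable formalisms for trust), a trust statement could be captured as a small structured record; the field names, the enumeration of pipeline stages, and the numeric trust level below are my assumptions, not part of the framework.

```python
# Hypothetical sketch of a machine-readable "trust record"; all names and fields
# are assumptions made for illustration, not the framework's formalism.
from dataclasses import dataclass, field
from enum import Enum
from typing import List

class PipelineStage(Enum):
    DATA = "data"
    FEATURES = "features"
    MODEL = "model"
    EVALUATION = "evaluation"
    DEPLOYMENT = "deployment"

@dataclass
class TrustRecord:
    user: str                      # who is expressing trust
    trust_object: str              # what the trust refers to
    stage: PipelineStage           # where in the ML pipeline the object lives
    level: float                   # e.g., 0.0 (distrust) .. 1.0 (full trust)
    reasons: List[str] = field(default_factory=list)  # coded or free-text reasons

record = TrustRecord(
    user="analyst_01",
    trust_object="feature importance explanation",
    stage=PipelineStage.MODEL,
    level=0.4,
    reasons=["explanation unstable across retrainings"],
)
print(record)
```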
Preculture sugarcane tissue in sucrose-supplemented culture medium to induce desiccation tolerance
All downhill from the PhD? The typical impact trajectory of US academic careers
Within academia, mature researchers tend to be more senior, but do they also tend to write higher impact articles? This article assesses long-term publishing (16+ years) United States (US) researchers, contrasting them with shorter-term publishing researchers (1, 6 or 10 years). A long-term US researcher is operationalised as having a first Scopus-indexed journal article in exactly 2001 and one in 2016-2019, with US main affiliations in their first and last articles. Researchers publishing in large teams (11+ authors) were excluded. The average field and year normalised citation impact of long- and shorter-term US researchers’ journal articles decreases over time relative to the national average, with especially large falls for the last articles published that may be at least partly due to a decline in self-citations. In many cases researchers start by publishing above US average citation impact research and end by publishing below US average citation impact research. Thus, research managers should not assume that senior researchers will usually write the highest impact papers.
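The "field and year normalised citation impact" used here is commonly computed by dividing an article's citation count by the mean citation count of articles from the same field and publication year. The sketch below shows that standard MNCS-style computation on toy data; it is an assumption about the indicator, not a reproduction of the paper's exact method.

```python
# Hedged sketch of field-and-year citation normalisation (standard MNCS-style
# computation, assumed here; not necessarily the paper's exact indicator):
# each article's citations are divided by the mean citations of its field-year group.
import pandas as pd

# Toy records, fabricated purely for illustration.
articles = pd.DataFrame({
    "author":    ["A", "A", "B", "B", "C", "C"],
    "field":     ["CS", "CS", "CS", "Bio", "CS", "Bio"],
    "year":      [2001, 2016, 2001, 2016, 2001, 2016],
    "citations": [30, 4, 10, 6, 20, 2],
})

# Baseline: mean citations per field-year group.
baseline = articles.groupby(["field", "year"])["citations"].transform("mean")
articles["normalised_impact"] = articles["citations"] / baseline

# Average normalised impact per author, e.g. for early vs. late career comparison.
print(articles.groupby("author")["normalised_impact"].mean())
```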
Prerequisites for assessing the quality of humanities research: a synthesis of the findings from four empirical studies
