181 research outputs found
Median topographic maps for biomedical data sets
Median clustering extends popular neural data analysis methods such as the
self-organizing map or neural gas to general data structures given by a
dissimilarity matrix only. This offers flexible and robust global data
inspection methods which are particularly suited for a variety of data as
occurs in biomedical domains. In this chapter, we give an overview about median
clustering and its properties and extensions, with a particular focus on
efficient implementations adapted to large scale data analysis
Stability comparison of dimensionality reduction techniques attending to data and parameter variations
The analysis of the big volumes of data requires efficient and robust dimension reduction techniques to represent data into lower-dimensional spaces, which ease human understanding. This paper presents a study of the stability, robustness and performance of some of these dimension reduction algorithms with respect to algorithm and data parameters, which usually have a major influence in the resulting embeddings. This analysis includes the performance of a large panel of techniques on both artificial and real datasets, focusing on the geometrical variations experimented when changing different parameters. The results are presented by identifying the visual weaknesses of each technique, providing some suitable data-processing tasks to enhance the stabilit
Reviewing, indicating, and counting books for modern research evaluation systems
In this chapter, we focus on the specialists who have helped to improve the
conditions for book assessments in research evaluation exercises, with
empirically based data and insights supporting their greater integration. Our
review highlights the research carried out by four types of expert communities,
referred to as the monitors, the subject classifiers, the indexers and the
indicator constructionists. Many challenges lie ahead for scholars affiliated
with these communities, particularly the latter three. By acknowledging their
unique, yet interrelated roles, we show where the greatest potential is for
both quantitative and qualitative indicator advancements in book-inclusive
evaluation systems.Comment: Forthcoming in Glanzel, W., Moed, H.F., Schmoch U., Thelwall, M.
(2018). Springer Handbook of Science and Technology Indicators. Springer Some
corrections made in subsection 'Publisher prestige or quality
Mutual information for the selection of relevant variables in spectrometric nonlinear modelling
Data from spectrophotometers form vectors of a large number of exploitable
variables. Building quantitative models using these variables most often
requires using a smaller set of variables than the initial one. Indeed, a too
large number of input variables to a model results in a too large number of
parameters, leading to overfitting and poor generalization abilities. In this
paper, we suggest the use of the mutual information measure to select variables
from the initial set. The mutual information measures the information content
in input variables with respect to the model output, without making any
assumption on the model that will be used; it is thus suitable for nonlinear
modelling. In addition, it leads to the selection of variables among the
initial set, and not to linear or nonlinear combinations of them. Without
decreasing the model performances compared to other variable projection
methods, it allows therefore a greater interpretability of the results
Recommended from our members
The Flow of Trust: A Visualization Framework to Externalize, Explore, and Explain Trust in ML Applications
We present a conceptual framework for the development of visual interactive techniques to formalize and externalize trust in machine learning (ML) workflows. Currently, trust in ML applications is an implicit process that takes place in the user's mind. As such, there is no method of feedback or communication of trust that can be acted upon. Our framework will be instrumental in developing interactive visualization approaches that will help users to efficiently and effectively build and communicate trust in ways that fit each of the ML process stages. We formulate several research questions and directions that include: 1) a typology/taxonomy of trust objects, trust issues, and possible reasons for (mis)trust; 2) formalisms to represent trust in machine-readable form; 3) means by which users can express their state of trust by interacting with a computer system (e.g., text, drawing, marking); 4) ways in which a system can facilitate users' expression and communication of the state of trust; and 5) creation of visual interactive techniques for representation and exploration of trust over all stages of an ML pipeline
- …