181 research outputs found

    Median topographic maps for biomedical data sets

    Median clustering extends popular neural data analysis methods such as the self-organizing map or neural gas to general data structures given only by a dissimilarity matrix. This offers flexible and robust global data inspection methods that are particularly suited to the variety of data that occurs in biomedical domains. In this chapter, we give an overview of median clustering, its properties, and its extensions, with a particular focus on efficient implementations adapted to large-scale data analysis.
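As a rough illustration of the idea in this abstract, the sketch below runs a median k-means style iteration directly on a dissimilarity matrix. It is the simplest relative of the median self-organizing map and median neural gas, with no neighborhood cooperation, and it is not the chapter's implementation. The function names `assign` and `median_kmeans`, the toy data, and the initial prototype choice are all assumptions made for this example.

```python
def assign(D, prototypes):
    """Label each item with the position of its closest prototype."""
    return [min(range(len(prototypes)), key=lambda c: D[i][prototypes[c]])
            for i in range(len(D))]


def median_kmeans(D, init, max_iter=100):
    """Median k-means sketch on a dissimilarity matrix D (list of lists).

    Prototypes are restricted to data items and stored as indices into D,
    so only pairwise dissimilarities are needed, never vector coordinates.
    `init` is a list of initial prototype indices (an assumption: the
    caller picks them).
    """
    prototypes = list(init)
    for _ in range(max_iter):
        labels = assign(D, prototypes)
        updated = []
        for c, old in enumerate(prototypes):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                # Keep the old prototype when a cluster receives no items.
                updated.append(old)
                continue
            # Generalized median: the member with the smallest summed
            # dissimilarity to all members of its cluster.
            updated.append(min(members,
                               key=lambda j: sum(D[i][j] for i in members)))
        if updated == prototypes:
            break
        prototypes = updated
    return prototypes, assign(D, prototypes)


# Toy data (hypothetical): six items described only by pairwise distances.
values = [0, 1, 2, 10, 11, 12]
D = [[abs(a - b) for b in values] for a in values]
prototypes, labels = median_kmeans(D, init=[0, 1])
```

Because each update scans cluster members pairwise, the cost per iteration grows quadratically with cluster size, which is one reason efficient implementations matter for large data sets.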

    Stability comparison of dimensionality reduction techniques attending to data and parameter variations

    Analyzing large volumes of data requires efficient and robust dimension reduction techniques that represent data in lower-dimensional spaces, which eases human understanding. This paper presents a study of the stability, robustness and performance of some of these dimension reduction algorithms with respect to algorithm and data parameters, which usually have a major influence on the resulting embeddings. The analysis covers the performance of a large panel of techniques on both artificial and real datasets, focusing on the geometrical variations observed when different parameters are changed. The results are presented by identifying the visual weaknesses of each technique, providing some suitable data-processing tasks to enhance the stabilit

    Reviewing, indicating, and counting books for modern research evaluation systems

    In this chapter, we focus on the specialists who have helped to improve the conditions for book assessments in research evaluation exercises, with empirically based data and insights supporting their greater integration. Our review highlights the research carried out by four types of expert communities, referred to as the monitors, the subject classifiers, the indexers and the indicator constructionists. Many challenges lie ahead for scholars affiliated with these communities, particularly the latter three. By acknowledging their unique, yet interrelated roles, we show where the greatest potential lies for both quantitative and qualitative indicator advancements in book-inclusive evaluation systems. Comment: Forthcoming in Glanzel, W., Moed, H.F., Schmoch U., Thelwall, M. (2018). Springer Handbook of Science and Technology Indicators. Springer. Some corrections made in subsection 'Publisher prestige or quality

    Mutual information for the selection of relevant variables in spectrometric nonlinear modelling

    Data from spectrophotometers form vectors of a large number of exploitable variables. Building quantitative models from these variables most often requires using a smaller set of variables than the initial one. Indeed, too large a number of input variables results in too large a number of model parameters, leading to overfitting and poor generalization. In this paper, we suggest using the mutual information measure to select variables from the initial set. Mutual information measures the information content of the input variables with respect to the model output, without making any assumption about the model that will be used; it is thus suitable for nonlinear modelling. In addition, it leads to the selection of variables from the initial set, and not to linear or nonlinear combinations of them. Without decreasing model performance compared to other variable projection methods, it therefore allows greater interpretability of the results.
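To make the selection criterion concrete, here is a minimal sketch that ranks candidate variables by an estimate of their mutual information with the output. It assumes a simple equal-width histogram estimator and a plain ranking, which the abstract does not specify; the paper's actual estimator and search procedure may differ. The names `discretize`, `mutual_information`, `rank_variables`, the bin count, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def discretize(values, bins):
    """Map continuous values to equal-width bin indices 0..bins-1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # constant columns fall into bin 0
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=4):
    """Histogram estimate of I(X; Y) in nats (an assumed, crude estimator)."""
    bx, by = discretize(x, bins), discretize(y, bins)
    n = len(x)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    # Sum of p(a, b) * log(p(a, b) / (p(a) * p(b))) over occupied cells.
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())


def rank_variables(columns, y, bins=4):
    """Return variable names sorted by decreasing estimated MI, plus scores."""
    scores = {name: mutual_information(col, y, bins)
              for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True), scores


# Toy data (hypothetical): the output depends nonlinearly on "x1" only.
x1 = [float(v) for v in range(-8, 8)]
flat = [1.0] * len(x1)
y = [v * v for v in x1]
ranking, scores = rank_variables({"flat": flat, "x1": x1}, y)
```

The ranking keeps original variables rather than combinations of them, which is the interpretability benefit the abstract points to. Histogram estimates are sensitive to the bin count and sample size, so the scores here should be read as relative, not absolute.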