
    The Importance of Being Clustered: Uncluttering the Trends of Statistics from 1970 to 2015

    In this paper we retrace the recent history of statistics by analyzing all the papers published since 1970 in five prestigious statistical journals, namely Annals of Statistics, Biometrika, Journal of the American Statistical Association, Journal of the Royal Statistical Society, Series B, and Statistical Science. The aim is to construct a kind of "taxonomy" of statistical papers by organizing and clustering them into main themes. In this sense, being identified in a cluster means being important enough to be uncluttered in the vast and interconnected world of statistical research. Since the main statistical research topics are naturally born, evolve, or die over time, we also develop a dynamic clustering strategy, in which a group in one time period is allowed to migrate or to merge into different groups in the following one. Results show that statistics is a very dynamic and evolving science, stimulated by the rise of new research questions and types of data.

    High-entropy high-hardness metal carbides discovered by entropy descriptors

    High-entropy materials have attracted considerable interest due to their combination of useful properties and promising applications. Predicting their formation remains the major hindrance to the discovery of new systems. Here we propose a descriptor - entropy forming ability - for addressing synthesizability from first principles. The formalism, based on the energy distribution spectrum of randomized calculations, captures the accessibility of equally-sampled states near the ground state and quantifies configurational disorder capable of stabilizing high-entropy homogeneous phases. The methodology is applied to disordered refractory 5-metal carbides - promising candidates for high-hardness applications. The descriptor correctly predicts the ease with which compositions can be experimentally synthesized as rock-salt high-entropy homogeneous phases, validating the ansatz and, in some cases, going beyond intuition. Several of these materials exhibit hardness up to 50% higher than rule-of-mixtures estimates. The entropy descriptor method has the potential to accelerate the search for high-entropy systems by rationally combining first principles with experimental synthesis and characterization. Comment: 12 pages, 2 figures

    Cluster validation by measurement of clustering characteristics relevant to the user

    There are many cluster analysis methods that can produce quite different clusterings on the same dataset. Cluster validation is about the evaluation of the quality of a clustering; "relative cluster validation" is about using such criteria to compare clusterings. This can be used to select one of a set of clusterings from different methods, or from the same method run with different parameters such as different numbers of clusters. There are many cluster validation indexes in the literature. Most of them attempt to measure the overall quality of a clustering by a single number, but this can be inappropriate. Various characteristics of a clustering can be relevant in practice, depending on the aim of clustering, such as low within-cluster distances and high between-cluster separation. In this paper, a number of validation criteria are introduced that refer to different desirable characteristics of a clustering, and that characterise a clustering in a multidimensional way. In specific applications the user may be interested in some of these criteria rather than others. A focus of the paper is on methodology to standardise the different characteristics so that users can aggregate them in a suitable way, specifying weights for the various criteria that are relevant in the clustering application at hand. Comment: 20 pages, 2 figures

    Shrinkage estimation of variance components with applications to microarray data


    Recent advances in directional statistics

    Mainstream statistical methodology is generally applicable to data observed in Euclidean space. There are, however, numerous contexts of considerable scientific interest in which the natural supports for the data under consideration are Riemannian manifolds like the unit circle, torus, sphere and their extensions. Typically, such data can be represented using one or more directions, and directional statistics is the branch of statistics that deals with their analysis. In this paper we provide a review of the many recent developments in the field since the publication of Mardia and Jupp (1999), still the most comprehensive text on directional statistics. Many of those developments have been stimulated by interesting applications in fields as diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics, image analysis, text mining, environmetrics, and machine learning. We begin by considering developments for the exploratory analysis of directional data before progressing to distributional models, general approaches to inference, hypothesis testing, regression, nonparametric curve estimation, methods for dimension reduction, classification and clustering, and the modelling of time series, spatial and spatio-temporal data. An overview of currently available software for analysing directional data is also provided, and potential future developments are discussed. Comment: 61 pages