44,013 research outputs found

    Beyond subjective and objective in statistics

    We argue that the words "objectivity" and "subjectivity" in statistics discourse are used in a mostly unhelpful way, and we propose to replace each of them with broader collections of attributes, with objectivity replaced by transparency, consensus, impartiality, and correspondence to observable reality, and subjectivity replaced by awareness of multiple perspectives and context dependence. The advantage of these reformulations is that the replacement terms do not oppose each other. Instead of debating over whether a given statistical method is subjective or objective (or normatively debating the relative merits of subjectivity and objectivity in statistical practice), we can recognize desirable attributes such as transparency and acknowledgment of multiple perspectives as complementary goals. We demonstrate the implications of our proposal with recent applied examples from pharmacology, election polling, and socioeconomic stratification. Comment: 35 pages

    Clustering methods based on variational analysis in the space of measures

    We formulate clustering as a minimisation problem in the space of measures by modelling the cluster centres as a Poisson process with unknown intensity function. We derive a Ward-type clustering criterion which, under the Poisson assumption, can easily be evaluated explicitly in terms of the intensity function. We show that asymptotically, i.e. for increasing total intensity, the optimal intensity function is proportional to a dimension-dependent power of the density of the observations. For fixed finite total intensity, no explicit solution seems available. However, the Ward-type criterion to be minimised is convex in the intensity function, so that the steepest descent method of Molchanov and Zuyev (2001) can be used to approximate the global minimum. It turns out that the gradient is similar in form to the functional to be optimised. If we discretise over a grid, the steepest descent algorithm at each iteration step increases the current intensity function at those points where the gradient is minimal at the expense of regions with a large gradient value. The algorithm is applied to a toy one-dimensional example, a simulation from a popular spatial cluster model and a real-life dataset from Strauss (1975) concerning the positions of redwood seedlings. Finally, we discuss the relative merits of our approach compared to classical hierarchical and partition clustering techniques as well as to modern model-based clustering methods using Markov point processes and mixture distributions.
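    The grid-based descent described in this abstract can be illustrated with a rough Python sketch. It is not the authors' implementation, and the abstract does not give the exact criterion. As an assumption, the code uses a stand-in: the expected squared distance from each observation to its nearest centre, capped at a hypothetical radius `R`, where the number of centres at grid point j is Poisson with mean `w[j]`. This stand-in is convex in the weights, and its gradient is a partial sum of the same terms as the criterion itself. The function names, step size `eps` and backtracking rule are all illustrative choices.

```python
import numpy as np

def criterion_and_gradient(x, grid, w, R=1.0):
    """Stand-in Ward-type criterion (an assumption, not the paper's formula).

    Sum over observations of E[min(D^2, R^2)], D = distance to the nearest
    centre, when the count of centres at grid[j] is Poisson(w[j]).
    Returns the value and its gradient with respect to w.
    """
    n = x.shape[0]
    d2 = np.minimum((x[:, None] - grid[None, :]) ** 2, R ** 2)   # (n, m)
    order = np.argsort(d2, axis=1)              # grid points, nearest first
    d2_sorted = np.take_along_axis(d2, order, axis=1)
    cum_mass = np.cumsum(w[order], axis=1)      # mass within the k nearest points
    # Breakpoints 0 = b_0 <= b_1 <= ... <= b_m <= b_{m+1} = R^2.
    b = np.concatenate([np.zeros((n, 1)), d2_sorted, np.full((n, 1), R ** 2)], axis=1)
    seg = np.diff(b, axis=1)                    # (n, m+1)
    # P(no centre among the k nearest grid points) = exp(-cumulative mass).
    surv = np.exp(-np.concatenate([np.zeros((n, 1)), cum_mass], axis=1))
    terms = seg * surv
    value = terms.sum()
    # d/dw of the l-th nearest point: minus the tail sum of terms from l on.
    tail = np.cumsum(terms[:, ::-1], axis=1)[:, ::-1]
    grad = np.zeros_like(w, dtype=float)
    np.add.at(grad, order, -tail[:, 1:])
    return value, grad

def descend(x, grid, total=5.0, steps=200, eps=0.05, R=1.0):
    """Move mass from the largest-gradient point to the smallest-gradient one."""
    w = np.full(grid.size, total / grid.size)
    history = []
    for _ in range(steps):
        value, grad = criterion_and_gradient(x, grid, w, R)
        history.append(value)
        lo = int(np.argmin(grad))                            # receives mass
        hi = int(np.argmax(np.where(w > 0, grad, -np.inf)))  # donates mass
        if grad[hi] - grad[lo] < 1e-12:
            break
        move, accepted = min(eps, w[hi]), False
        while move > 1e-9 and not accepted:     # simple backtracking
            trial = w.copy()
            trial[hi] -= move
            trial[lo] += move
            accepted = criterion_and_gradient(x, grid, trial, R)[0] < value
            move /= 2
        if not accepted:
            break
        w = trial
    return w, history

if __name__ == "__main__":
    rng = np.random.default_rng(0)              # hypothetical toy data
    x = np.clip(np.concatenate([rng.normal(0.25, 0.04, 30),
                                rng.normal(0.75, 0.04, 30)]), 0.0, 1.0)
    grid = np.linspace(0.0, 1.0, 41)
    w, history = descend(x, grid)
    print("criterion: first", history[0], "last", history[-1])
    print("grid points carrying the most mass:", grid[np.argsort(w)[-4:]])
```

    Each step conserves the total mass, which mirrors the fixed total intensity in the abstract. The single-pair exchange is a deliberate simplification of a measure-valued steepest descent.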

    Typical Phone Use Habits: Intense Use Does Not Predict Negative Well-Being

    Not all smartphone owners use their device in the same way. In this work, we uncover broad, latent patterns of mobile phone use behavior. We conducted a study where, via a dedicated logging app, we collected daily mobile phone activity data from a sample of 340 participants for a period of four weeks. Through an unsupervised learning approach and a methodologically rigorous analysis, we reveal five generic phone use profiles which describe at least 10% of the participants each: limited use, business use, power use, and personality- & externally induced problematic use. We provide evidence that intense mobile phone use alone does not predict negative well-being. Instead, our approach automatically revealed two groups with tendencies for lower well-being, which are characterized by nightly phone use sessions. Comment: 10 pages, 6 figures, conference paper

    From efficacy to equity: Literature review of decision criteria for resource allocation and healthcare decisionmaking

    Objectives: Resource allocation is a challenging issue faced by health policy decisionmakers requiring careful consideration of many factors. Objectives of this study were to identify decision criteria and their frequency reported in the literature on healthcare decisionmaking. Method: An extensive literature search was performed in Medline and EMBASE to identify articles reporting healthcare decision criteria. Studies conducted with decisionmakers (e.g., focus groups, surveys, interviews), conceptual and review articles and articles describing multicriteria tools were included. Criteria were extracted, organized using a classification system derived from the EVIDEM framework and applying multicriteria decision analysis (MCDA) principles, and the frequency of their occurrence was measured. Results: Out of 3146 records identified, 2790 were excluded. Out of 356 articles assessed for eligibility, 40 studies were included. Criteria were identified from studies performed in several regions of the world involving decisionmakers at micro, meso and macro levels of decision and from studies reporting on multicriteria tools. Large variations in terminology used to define criteria were observed and 360 different terms were identified. These were assigned to 58 criteria which were classified in 9 different categories including: health outcomes; types of benefit; disease impact; therapeutic context; economic impact; quality of evidence; implementation complexity; priority, fairness and ethics; and overall context. The most frequently mentioned criteria were: equity/fairness (32 times), efficacy/effectiveness (29), stakeholder interests and pressures (28), cost-effectiveness (23), strength of evidence (20), safety (19), mission and mandate of health system (19), organizational requirements and capacity (17), patient-reported outcomes (17) and need (16).
Conclusion: This study highlights the importance of considering both normative and feasibility criteria for fair allocation of resources and optimized decisionmaking for coverage and use of healthcare interventions. This analysis provides a foundation to develop a questionnaire for an international survey of decisionmakers on criteria and their relative importance. The ultimate objective is to develop sound multicriteria approaches to enlighten healthcare decisionmaking and priority-setting.

    SMART: Unique splitting-while-merging framework for gene clustering

    Copyright © 2014 Fa et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Successful clustering algorithms are highly dependent on parameter settings. The clustering performance degrades significantly unless parameters are properly set, and yet, it is difficult to set these parameters a priori. To address this issue, in this paper, we propose a unique splitting-while-merging clustering framework, named “splitting merging awareness tactics” (SMART), which does not require any a priori knowledge of either the number of clusters or even the possible range of this number. Unlike existing self-splitting algorithms, which over-cluster the dataset to a large number of clusters and then merge some similar clusters, our framework has the ability to split and merge clusters automatically during the process and produces the most reliable clustering results, by intrinsically integrating many clustering techniques and tasks. The SMART framework is implemented with two distinct clustering paradigms in two algorithms: competitive learning and finite mixture model. Nevertheless, within the proposed SMART framework, many other algorithms can be derived for different clustering paradigms. The minimum message length algorithm is integrated into the framework as the clustering selection criterion. The usefulness of the SMART framework and its algorithms is tested in demonstration datasets and simulated gene expression datasets. Moreover, two real microarray gene expression datasets are studied using this approach. Based on the performance of many metrics, all numerical results show that SMART is superior to the compared existing self-splitting algorithms and traditional algorithms.
Three main properties of the proposed SMART framework are summarized as: (1) needing no parameters dependent on the respective dataset or a priori knowledge about the datasets, (2) extendible to many different applications, (3) offering superior performance compared with counterpart algorithms. National Institute for Health Research

    Cluster validity in clustering methods


    Multivariate Approaches to Classification in Extragalactic Astronomy

    Clustering objects into synthetic groups is a natural activity of any science. Astrophysics is not an exception and is now facing a deluge of data. For galaxies, the century-old Hubble classification and the Hubble tuning fork are still largely in use, together with numerous mono- or bivariate classifications most often made by eye. However, a classification must be driven by the data, and sophisticated multivariate statistical tools are used more and more often. In this paper we review these different approaches in order to situate them in the general context of unsupervised and supervised learning. We insist on the astrophysical outcomes of these studies to show that multivariate analyses provide an obvious path toward a renewal of our classification of galaxies and are invaluable tools to investigate the physics and evolution of galaxies. Comment: Open Access paper. http://www.frontiersin.org/milky_way_and_galaxies/10.3389/fspas.2015.00003/abstract. 10.3389/fspas.2015.00003