What are the true clusters?
Constructivist philosophy and Hasok Chang's active scientific realism are
used to argue that the idea of "truth" in cluster analysis depends on the
context and the clustering aims. Different characteristics of clusterings are
required in different situations. Researchers should state explicitly which
requirements and which idea of "true clusters" their research is based on, because
clustering becomes scientific not through uniqueness but through transparent
and open communication. The idea of "natural kinds" is a human construct, but
it highlights the human experience that the reality outside the observer's
control seems to make certain distinctions between categories inevitable.
Various desirable characteristics of clusterings and various approaches to
defining a context-dependent truth are listed, and I discuss what impact these
ideas can have on the comparison of clustering methods, on the choice of a
clustering method, and on related decisions in practice.
Strong validity, consonance, and conformal prediction
Valid prediction of future observations is an important and challenging
problem. The two general approaches for quantifying uncertainty about the
future value employ prediction regions and predictive distributions,
respectively, with the latter usually considered to be more informative because
it performs other prediction-related tasks. Standard notions of validity focus
on the former, i.e., coverage probability bounds for prediction regions, but a
notion of validity relevant to the other prediction-related tasks performed by
the latter is lacking. In this paper, we present a new notion---strong
prediction validity---relevant to these more general prediction tasks. We show
that strong validity is connected to more familiar notions of coherence, and
argue that imprecise probability considerations are required in order to
achieve it. We go on to show that strong prediction validity can be achieved by
interpreting the conformal prediction output as the contour function of a
consonant plausibility function. We also offer an alternative characterization,
based on a new nonparametric inferential model construction, wherein the
appearance of consonance is more natural, and prove strong prediction validity.
Comment: 34 pages, 3 figures, 2 tables. Comments welcome at
https://www.researchers.one/article/2020-01-1
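To make the idea of reading conformal output as a plausibility contour concrete, here is a minimal sketch. It is not the paper's construction: the nonconformity score (distance from the leave-one-out mean), the function names, and the finite candidate sets are all assumptions chosen for illustration. The conformal p-value of a candidate future value is treated as the contour, and the plausibility of a set of candidates is taken as the maximum contour value over that set, which is how a consonant plausibility function is determined by its contour.

```python
def conformal_plausibility(data, candidate):
    """Conformal p-value of `candidate` as the next observation after `data`.

    Assumed nonconformity score: absolute distance of each point from the
    mean of the other points in the augmented sample.
    """
    augmented = list(data) + [candidate]
    n = len(augmented)
    total = sum(augmented)
    scores = [abs(z - (total - z) / (n - 1)) for z in augmented]
    # Fraction of points at least as nonconforming as the candidate.
    return sum(s >= scores[-1] for s in scores) / n


def set_plausibility(data, candidates):
    """Consonant plausibility of a finite set: maximum contour value over it."""
    return max(conformal_plausibility(data, y) for y in candidates)


def prediction_region(data, grid, alpha):
    """Grid points whose plausibility contour exceeds alpha."""
    return [y for y in grid if conformal_plausibility(data, y) > alpha]


data = [1.0, 2.0, 3.0, 4.0]
central = conformal_plausibility(data, 2.5)    # candidate at the sample mean
extreme = conformal_plausibility(data, 100.0)  # candidate far from the data
region = prediction_region(data, [0.0, 2.5, 100.0], alpha=0.2)
```

Under the usual exchangeability assumption, thresholding the contour at alpha gives a prediction region with coverage at least 1 - alpha; the set-level plausibility is what allows statements about assertions other than regions.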
Efficient segmentation based on Eikonal and diffusion equations
Segmentation of regions of interest in an image has important applications in medical image analysis, particularly in computer-aided diagnosis. Segmentation can enable further quantitative analysis of anatomical structures. We present efficient image segmentation schemes based on the solution of distinct partial differential equations (PDEs). For each known image region, a PDE is solved, the solution of which locally represents the weighted distance from a region known to have a certain segmentation label. To achieve this goal, we propose the use of two separate PDEs, the Eikonal equation and a diffusion equation. In each method, the segmentation labels are obtained by a competition criterion between the solutions to the PDEs corresponding to each region. We discuss how each method applies the concept of information propagation from the labelled image regions to the unknown image regions. Experimental results are presented on magnetic resonance, computed tomography, and ultrasound images and for both two-region and multi-region segmentation problems. These results demonstrate the high level of efficiency as well as the accuracy of the proposed methods.
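The competition between per-region distance maps can be sketched as follows. This is an illustration under stated assumptions, not the authors' solver: the weighted distance is approximated with Dijkstra's algorithm on a 4-connected pixel grid rather than by solving the Eikonal or diffusion equation, and the step cost `1 + beta * |intensity difference|`, the parameter `beta`, and all function names are hypothetical choices.

```python
import heapq


def weighted_distance(image, seeds, beta=10.0):
    """Dijkstra distance on a 4-connected grid from a set of seed pixels."""
    rows, cols = len(image), len(image[0])
    dist = [[float("inf")] * cols for _ in range(rows)]
    heap = []
    for r, c in seeds:
        dist[r][c] = 0.0
        heapq.heappush(heap, (0.0, r, c))
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist[r][c]:
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # Assumed cost: crossing an intensity edge is expensive.
                step = 1.0 + beta * abs(image[nr][nc] - image[r][c])
                if d + step < dist[nr][nc]:
                    dist[nr][nc] = d + step
                    heapq.heappush(heap, (d + step, nr, nc))
    return dist


def segment(image, seeds_by_label, beta=10.0):
    """Give each pixel the label whose seeds are closest in weighted distance."""
    maps = {label: weighted_distance(image, seeds, beta)
            for label, seeds in seeds_by_label.items()}
    rows, cols = len(image), len(image[0])
    return [[min(maps, key=lambda lab: maps[lab][r][c]) for c in range(cols)]
            for r in range(rows)]


# Toy image: dark left half, bright right half, one seed pixel per region.
image = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
labels = segment(image, {"A": [(0, 0)], "B": [(0, 3)]})
```

One distance map is computed per labelled region, so the cost grows linearly with the number of labels; the same competition step applies unchanged to two-region and multi-region problems.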
Stream network analysis and geomorphic flood plain mapping from orbital and suborbital remote sensing imagery application to flood hazard studies in central Texas
The author has identified the following significant results. Development of a quantitative hydrogeomorphic approach to flood hazard evaluation was hindered by (1) problems of resolution and definition of the morphometric parameters which have hydrologic significance, and (2) mechanical difficulties in creating the necessary volume of data for meaningful analysis. Measures of network resolution such as drainage density and basin Shreve magnitude indicated that large-scale topographic maps offered greater resolution than small-scale suborbital imagery and orbital imagery. The disparity in network resolution capabilities between orbital and suborbital imagery formats depends on factors such as rock type, vegetation, and land use. The problem of morphometric data analysis was approached by developing a computer-assisted method for network analysis. The system allows rapid identification of network properties which can then be related to measures of flood response.
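The two network-resolution measures named above can be computed from a digitized stream network in a few lines. The sketch below is illustrative only and does not reproduce the report's system: the link table, link names, lengths, and basin area are made-up values, and the function names are hypothetical. Shreve magnitude is the number of source (headwater) links upstream of a link, and drainage density is total channel length divided by basin area.

```python
def shreve_magnitude(downstream, outlet):
    """Shreve magnitude of every link in a tree-shaped stream network.

    `downstream` maps each link id to the link it drains into
    (None for the outlet link).
    """
    upstream = {}
    for link, down in downstream.items():
        upstream.setdefault(link, [])
        if down is not None:
            upstream.setdefault(down, []).append(link)

    magnitude = {}

    def visit(link):
        tributaries = upstream[link]
        # A source link has magnitude 1; magnitudes add at each junction.
        magnitude[link] = 1 if not tributaries else sum(visit(t) for t in tributaries)
        return magnitude[link]

    visit(outlet)
    return magnitude


def drainage_density(link_lengths_km, basin_area_km2):
    """Total channel length per unit basin area, in km per square km."""
    return sum(link_lengths_km.values()) / basin_area_km2


# Hypothetical network: sources a and b join to form c; c and source d join to form e.
downstream = {"a": "c", "b": "c", "c": "e", "d": "e", "e": None}
lengths_km = {"a": 1.0, "b": 2.0, "c": 1.5, "d": 0.5, "e": 5.0}

magnitude = shreve_magnitude(downstream, outlet="e")
density = drainage_density(lengths_km, basin_area_km2=20.0)
```

A map source that misses small headwater channels lowers both numbers, which is why they serve as measures of network resolution. The recursive traversal is adequate for small networks; a very deep network would call for an iterative version.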