Basic statistics for probabilistic symbolic variables: a novel metric-based approach
In data mining, it is usual to describe a set of individuals using some
summaries (means, standard deviations, histograms, confidence intervals) that
generalize individual descriptions into a typology description. In this case,
data can be described by several values. In this paper, we propose an approach
for computing basic statistics for such data, and, in particular, for data
described by numerical multi-valued variables (intervals, histograms, discrete
multi-valued descriptions). We propose to treat all numerical multi-valued
variables as distributional data, i.e. as individuals described by
distributions. To obtain new basic statistics for measuring the variability and
the association between such variables, we extend the classic measure of
inertia, calculated with the Euclidean distance, using the squared Wasserstein
distance defined between probability measures. The distance is a generalization
of the Wasserstein distance, which is a distance between the quantile functions of
two distributions. Some properties of such a distance are shown. Among them, we
prove the Huygens theorem of decomposition of the inertia. We show the use of
the Wasserstein distance and of the basic statistics by presenting a k-means-like
clustering algorithm, for the clustering of a set of data described by modal
numerical variables (distributional variables), on a real data set. Keywords:
Wasserstein distance, inertia, dependence, distributional data, modal
variables. Comment: 19 pages, 3 figures
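The abstract above describes a squared Wasserstein distance between quantile functions and a Huygens-type decomposition of inertia. The following is a minimal sketch of that idea, not the authors' implementation. It assumes each distribution is given as an empirical sample of the same size, so that the sorted sample stands in for the quantile function. The function names `wasserstein2_sq`, `barycenter`, and `inertia` are hypothetical.

```python
def wasserstein2_sq(x, y):
    """Squared L2 Wasserstein distance between two equal-size empirical samples.

    Assumption: the sorted sample is used as a discretized quantile function.
    """
    if len(x) != len(y):
        raise ValueError("this sketch assumes equal sample sizes")
    qx, qy = sorted(x), sorted(y)
    return sum((a - b) ** 2 for a, b in zip(qx, qy)) / len(qx)


def barycenter(samples):
    """Pointwise average of the quantile functions (sorted samples)."""
    sorted_samples = [sorted(s) for s in samples]
    n = len(sorted_samples)
    return [sum(col) / n for col in zip(*sorted_samples)]


def inertia(samples, center):
    """Sum of squared Wasserstein distances from each sample to `center`."""
    return sum(wasserstein2_sq(s, center) for s in samples)


# Illustrative data only (not from the paper).
data = [[1.0, 2.0, 4.0], [0.0, 3.0, 3.0], [2.0, 2.0, 5.0]]
center = [0.0, 1.0, 2.0]
bar = barycenter(data)

# Huygens-type decomposition: inertia about any center equals inertia about
# the barycenter plus n times the squared distance barycenter-to-center.
lhs = inertia(data, center)
rhs = inertia(data, bar) + len(data) * wasserstein2_sq(bar, center)
```

Under these assumptions the two sides agree up to floating-point error, because the distance reduces to a Euclidean distance between sorted samples.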
Using the Bootstrap Method for a Statistical Significance Test of Differences between Summary Histograms
A new method is proposed to compare statistical differences between summary histograms, which are the histograms summed over a large ensemble of individual histograms. It consists of choosing a distance statistic for measuring the difference between summary histograms and using a bootstrap procedure to calculate the statistical significance level. Bootstrapping is an approach to statistical inference that makes few assumptions about the underlying probability distribution that describes the data. Three distance statistics are compared in this study. They are the Euclidean distance, the Jeffries-Matusita distance and the Kuiper distance. The data used in testing the bootstrap method are satellite measurements of cloud systems called cloud objects. Each cloud object is defined as a contiguous region/patch composed of individual footprints or fields of view. A histogram of measured values over footprints is generated for each parameter of each cloud object and then summary histograms are accumulated over all individual histograms in a given cloud-object size category. The results of statistical hypothesis tests using all three distances as test statistics are generally similar, indicating the validity of the proposed method. The Euclidean distance is determined to be most suitable after comparing the statistical tests of several parameters with distinct probability distributions among three cloud-object size categories. Impacts on the statistical significance levels resulting from differences in the total lengths of satellite footprint data between two size categories are also discussed.
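The procedure outlined in the abstract above (a distance statistic between summary histograms plus a bootstrap significance level) can be sketched roughly as follows. The abstract does not give the resampling scheme, so this sketch assumes a pooled resampling of individual histograms under the null hypothesis that both groups share one distribution, and it uses only the Euclidean distance. The names `summary_histogram`, `euclidean`, and `bootstrap_p_value` are hypothetical, and the data are illustrative.

```python
import random


def summary_histogram(histograms):
    """Sum individual histograms bin by bin and normalize to unit total."""
    total = [sum(col) for col in zip(*histograms)]
    n = sum(total)
    return [c / n for c in total]


def euclidean(p, q):
    """Euclidean distance between two normalized histograms."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def bootstrap_p_value(group_a, group_b, n_boot=1000, seed=0):
    """Fraction of bootstrap distances at least as large as the observed one.

    Assumption: under the null hypothesis, individual histograms from both
    groups are exchangeable, so both resamples are drawn from the pooled set.
    """
    rng = random.Random(seed)
    observed = euclidean(summary_histogram(group_a), summary_histogram(group_b))
    pooled = group_a + group_b
    exceed = 0
    for _ in range(n_boot):
        resample_a = [rng.choice(pooled) for _ in group_a]
        resample_b = [rng.choice(pooled) for _ in group_b]
        d = euclidean(summary_histogram(resample_a), summary_histogram(resample_b))
        if d >= observed:
            exceed += 1
    return exceed / n_boot


# Illustrative data: two groups of individual 3-bin histograms.
same = [[4, 3, 3], [3, 4, 3], [3, 3, 4], [5, 2, 3]]
left = [[10, 0, 0] for _ in range(5)]
right = [[0, 0, 10] for _ in range(5)]

p_same = bootstrap_p_value(same, same, n_boot=200)
p_diff = bootstrap_p_value(left, right, n_boot=200)
```

A small value such as `p_diff` suggests the two summary histograms differ by more than resampling noise, while comparing a group with itself gives an observed distance of zero and therefore a value of 1.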
Accurate Distances Measures and Machine Learning of the Texture-Property Relation for Crystallographic Textures Represented by One-Point Statistics
The crystallographic texture of metallic materials is a key microstructural
feature that is responsible for the anisotropic behavior, e.g., important in
forming operations. In materials science, crystallographic texture is commonly
described by the orientation distribution function, which is defined as the
probability density function of the orientations of the monocrystal grains
constituting a polycrystalline material. For representing the orientation
distribution function, there are several approaches such as using generalized
spherical harmonics, orientation histograms, and pole figure images. Measuring
distances between crystallographic textures is essential for any task that
requires assessing texture similarities, e.g. to guide forming processes.
Therefore, we introduce novel distance measures based on (i) the Earth Mover's
Distance that takes into account local distance information encoded in
histogram-based texture representations and (ii) a distance measure based on
pole figure images. For this purpose, we evaluate and compare existing distance
measures for selected use-cases. The present study gives insights into
advantages and drawbacks of using certain texture representations and distance
measures with emphasis on applications in materials design and optimal process
control.
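To illustrate why an Earth Mover's Distance can capture the "local distance information" mentioned in the abstract above, here is a deliberately simplified sketch. It assumes one-dimensional histograms with unit bin spacing, which is not the orientation-space setting of the paper, and the function names `emd_1d` and `euclidean` are hypothetical.

```python
def emd_1d(p, q):
    """Earth Mover's Distance between two 1-D histograms with unit bin spacing.

    Both histograms are normalized to unit mass; in 1-D the distance equals
    the sum of absolute differences between the cumulative distributions.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    sp, sq = sum(p), sum(q)
    cum = 0.0
    total = 0.0
    for a, b in zip(p, q):
        cum += a / sp - b / sq
        total += abs(cum)
    return total


def euclidean(p, q):
    """Bin-wise Euclidean distance, which ignores how far apart bins are."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


# Illustrative histograms: all mass in one bin.
p = [1, 0, 0, 0]
q_near = [0, 1, 0, 0]  # mass moved by one bin
q_far = [0, 0, 0, 1]   # mass moved by three bins

near, far = emd_1d(p, q_near), emd_1d(p, q_far)
```

In this toy case the bin-wise Euclidean distance is the same for `q_near` and `q_far`, whereas the Earth Mover's Distance grows with how far the mass has to move.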
Transition from tunneling to direct contact in tungsten nanojunctions
We apply the mechanically controllable break junctions technique to
investigate the transition from tunneling to direct contact in tungsten. This
transition is quite different from that of other metals and is determined by
the local electronic properties of the tungsten surface and the relief of the
electrodes at the point of their closest proximity. The conductance traces show
a rich variety of patterns from the avalanche-like jump to a mesoscopic contact
to the completely smooth transition between direct contact and tunneling. Due
to the occasional absence of an adhesive jump the conductance of the contact
can be continuously monitored at ultra-small electrode separations. The
conductance histograms of tungsten are either featureless or show two distinct
peaks related to the sequential opening of spatially separated groups of
conductance channels. The role of surface states of tungsten and their
contribution to the junction conductance at sub-Angstrom electrode separations
are discussed. Comment: 6 pages, 6 figures
Formation and properties of metal-oxygen atomic chains
Suspended chains consisting of single noble metal and oxygen atoms have been
formed. We provide evidence that oxygen can react with and be incorporated into
metallic one-dimensional atomic chains. Oxygen incorporation reinforces the
linear bonds in the chain, which facilitates the creation of longer atomic
chains. The mechanical and electrical properties of these diatomic chains have
been investigated by determining local vibration modes of the chain and by
measuring the dependence of the average chain-conductance on the length of the
chain. Additionally, we have performed calculations that give insight into the
physical mechanism of the oxygen-induced strengthening of the linear bonds and
the conductance of the metal-oxygen chains. Comment: 10 pages, 9 figs
Analyzing Citation-Distance Networks for Evaluating Publication Impact
Studying citation patterns of scholarly articles has been of interest to many researchers from various disciplines. While the relationship between citations and scientific impact has been widely studied in the literature, in this paper we develop the idea of analyzing the semantic distance of scholarly articles in a citation network (citation-distance network) to uncover patterns that reflect scientific impact. More specifically, we compare two types of publications in terms of their citation-distance patterns, seminal publications and literature reviews, and focus on their referencing patterns as well as on publications which cite them. We show that seminal publications are associated with a larger semantic distance, measured using the content of the articles, between their references and the citing publications, while literature reviews tend to cite publications from a wider range of topics. Our motivation is to understand and utilize this information to create new research evaluation metrics which would better reflect scientific impact.