Estimating Mutual Information
We present two classes of improved estimators for mutual information
$M(X,Y)$, from samples of random points distributed according to some joint
probability density $\mu(x,y)$. In contrast to conventional estimators based on
binnings, they are based on entropy estimates from $k$-nearest neighbour
distances. This means that they are data efficient (with $k=1$ we resolve
structures down to the smallest possible scales), adaptive (the resolution is
higher where data are more numerous), and have minimal bias. Indeed, the bias
of the underlying entropy estimates is mainly due to non-uniformity of the
density at the smallest resolved scale, typically giving systematic errors
which scale as functions of $k/N$ for $N$ points. Numerically, we find that
both families become exact for independent distributions, i.e. the
estimator $\hat M(X,Y)$ vanishes (up to statistical fluctuations) if
$\mu(x,y) = \mu(x)\mu(y)$. This holds for all tested marginal distributions and for all
dimensions of $x$ and $y$. In addition, we give estimators for redundancies
between more than 2 random variables. We compare our algorithms in detail with
existing algorithms. Finally, we demonstrate the usefulness of our estimators
for assessing the actual independence of components obtained from independent
component analysis (ICA), for improving ICA, and for estimating the reliability
of blind source separation.
Comment: 16 pages, including 18 figures
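To illustrate what a $k$-nearest-neighbour mutual information estimate can look like in practice, here is a minimal Python sketch. The abstract does not state a formula, so the expression used here, $\psi(k) + \psi(N) - \langle \psi(n_x+1) + \psi(n_y+1) \rangle$ with max-norm distances, is an assumption based on one widely used form of such estimators and may differ from either family described in the paper. The function names are hypothetical, the search is brute force (quadratic in the number of points), and the data are assumed to be continuous with no duplicate points.

```python
import numpy as np

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at positive integers: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    n = np.asarray(n, dtype=int)
    # harmonic[m] holds the m-th harmonic number H_m, with H_0 = 0.
    harmonic = np.concatenate(([0.0], np.cumsum(1.0 / np.arange(1, n.max() + 1))))
    return -EULER_GAMMA + harmonic[n - 1]


def knn_mutual_information(x, y, k=3):
    """Sketch of a k-NN mutual information estimate in nats (assumed formula)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    n = len(x)

    # Pairwise max-norm distances in each marginal space and in the joint space.
    dx = np.abs(x[:, None, :] - x[None, :, :]).max(axis=2)
    dy = np.abs(y[:, None, :] - y[None, :, :]).max(axis=2)
    dz = np.maximum(dx, dy)

    # Exclude each point from its own neighbourhood.
    for d in (dx, dy, dz):
        np.fill_diagonal(d, np.inf)

    # Distance from each point to its k-th nearest neighbour in the joint space.
    eps = np.sort(dz, axis=1)[:, k - 1]

    # Count marginal neighbours strictly closer than that distance.
    nx = (dx < eps[:, None]).sum(axis=1)
    ny = (dy < eps[:, None]).sum(axis=1)

    return (digamma_int(k) + digamma_int(n)
            - np.mean(digamma_int(nx + 1) + digamma_int(ny + 1)))
```

For independent samples the estimate should fluctuate around zero, while for correlated Gaussian samples with correlation $r$ it can be compared against the known value $-\tfrac{1}{2}\log(1-r^2)$.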
What Price Recreation in Finland?—A Contingent Valuation Study of Non-Market Benefits of Public Outdoor Recreation Areas
Basic services in Finnish national parks and state-owned recreation areas have traditionally been publicly financed and thus free of charge for users. Since the benefits of public recreation are not captured by market demand, government spending on recreation services must be motivated in some other way. Here, we elicit people’s willingness to pay (WTP) for services in the country’s state-owned parks to obtain an estimate of the value of outdoor recreation in monetary terms. A variant of the Tobit model is used in the econometric analysis to examine the WTP responses elicited by a payment card format. We also study who the current users of recreation services are in order to enable policymakers to anticipate the redistribution effects of a potential implementation of user fees. Finally, we discuss the motives for WTP, which reveal concerns such as equity and ability to pay that are relevant for planning public recreation in general and for the introduction of fees in particular.
Real-Time Definition of Non-Randomness in the Distribution of Genomic Events
Features such as mutations or structural characteristics can be non-randomly or non-uniformly distributed within a genome. So far, computer simulations have been required for statistical inferences on the distribution of sequence motifs. Here, we show that these analyses are possible using an analytical, mathematical approach. For the assessment of non-randomness, our calculations only require information including genome size, number of (sampled) sequence motifs and distance parameters. We have developed computer programs that evaluate our analytical formulas for the real-time determination of expected values and p-values. This approach permits a flexible cluster definition that can be applied to identify non-random or non-uniform sequence motif distribution most effectively. As an example, we show the effectiveness and reliability of our mathematical approach in clinical retroviral vector integration site distribution.