1,236 research outputs found
MDL Convergence Speed for Bernoulli Sequences
The Minimum Description Length principle for online sequence
estimation/prediction in a proper learning setup is studied. If the underlying
model class is discrete, then the total expected square loss is a particularly
interesting performance measure: (a) this quantity is finitely bounded,
implying convergence with probability one, and (b) it additionally specifies
the convergence speed. For MDL, in general one can only have loss bounds which
are finite but exponentially larger than those for Bayes mixtures. We show that
this is even the case if the model class contains only Bernoulli distributions.
We derive a new upper bound on the prediction error for countable Bernoulli
classes. This implies a small bound (comparable to the one for Bayes mixtures)
for certain important model classes. We discuss the application to Machine
Learning tasks such as classification and hypothesis testing, and
generalization to countable classes of i.i.d. models. Comment: 28 pages
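The contrast this abstract draws between MDL and Bayes mixture prediction can be illustrated with a minimal sketch. It assumes a finite Bernoulli class with prior weights; the function names, the parameter grid, and the weights are hypothetical and not taken from the paper. The MDL predictor commits to the single model maximizing prior weight times likelihood, while the Bayes mixture averages over all models by posterior weight.

```python
import math

def log_likelihood(theta, ones, zeros):
    # Log-probability of a binary sequence with the given counts under Bernoulli(theta).
    # Assumes 0 < theta < 1 so both logarithms are finite.
    return ones * math.log(theta) + zeros * math.log(1.0 - theta)

def mdl_predict(thetas, weights, ones, zeros):
    # MDL (two-part code) estimate: pick the single model that maximizes
    # log prior weight + log likelihood, and predict with its parameter.
    scored = [(math.log(w) + log_likelihood(t, ones, zeros), t)
              for t, w in zip(thetas, weights)]
    return max(scored)[1]

def bayes_predict(thetas, weights, ones, zeros):
    # Bayes mixture: posterior-weighted average of the parameters.
    logs = [math.log(w) + log_likelihood(t, ones, zeros)
            for t, w in zip(thetas, weights)]
    top = max(logs)  # subtract the maximum for numerical stability
    post = [math.exp(l - top) for l in logs]
    return sum(t * p for t, p in zip(thetas, post)) / sum(post)

# Hypothetical three-model class with prior weights summing to one.
THETAS = [0.25, 0.5, 0.75]
WEIGHTS = [0.5, 0.25, 0.25]

if __name__ == "__main__":
    for ones, zeros in [(0, 0), (3, 1), (30, 10)]:
        print(ones, zeros,
              mdl_predict(THETAS, WEIGHTS, ones, zeros),
              round(bayes_predict(THETAS, WEIGHTS, ones, zeros), 4))
```

With no data the MDL predictor returns the parameter with the largest prior weight, whereas the mixture returns the prior mean; as observations accumulate, both move toward the model that best fits the observed frequencies.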
Challenges of Religious Literacy in Education : Islam and the Governance of Religious Diversity in Multi-faith Schools
This chapter seeks to take part in an emerging line of research in which religion is approached as a whole-school endeavor. Previous research and policy recommendations have typically focused on teaching about religion in school, but the accommodation of religious diversity in the wider school culture merits more attention. Based on observations in our multiple case studies, we discuss the multi-level governance of religious diversity in Finnish multi-faith schools with a particular focus on the challenges of religious literacy for educators. The three examples we present focus on the inclusion of Muslims in Finnish schools and in particular on the challenges for educators (1) in interpreting the distinction between religion and culture, (2) in recognizing and handling intra-religious diversity, and (3) in being aware of Protestant conceptions of religion and culture. A theme cutting across these examples is how they reflect the tendencies either to see different situations merely through the lens of religion (religionisation), or not to recognize the importance of religion at all (religion-blindness). We argue that religious literacy should be recognized and developed as a vital part of the intercultural competencies of educators. Peer reviewed
Does antibacterial treatment for urinary tract infection contribute to the risk of breast cancer?
Low lignan status has been reported to be related to an elevated risk of breast cancer. Since lignan status is reduced by antibacterial medications, it is plausible to hypothesize that repeated use of antibiotics may also be a risk factor for breast cancer. History of treatment for urinary tract infection was studied as a predictor of breast cancer among 9461 Finnish women 19–89 years of age and initially cancer-free. During a follow-up in 1973–1991, a total of 157 breast cancer cases were diagnosed. Women reporting previous or present medication for urinary tract infection at baseline showed an elevated breast cancer risk in comparison with other women. The age-adjusted relative risk was 1.34 (95% confidence interval (CI) = 0.98–1.83). The association was concentrated among women under 50 years of age. The relative risk for these women was 1.74 (95% CI 1.13–2.68), whereas it was 0.97 (95% CI 0.59–1.58) for older women. The relative risk in the younger age group was 1.47 (95% CI 0.73–2.97) during the first 10 years of follow-up, and 1.93 (95% CI 1.11–3.37) for follow-up times longer than 10 years. These data suggest that premenopausal women using long-term medication for urinary tract infections show a possible elevated risk of future breast cancer. The results are, however, still inconclusive and the hypothesis needs to be tested by other studies. © 2000 Cancer Research Campaign
Detecting periodicity in experimental data using linear modeling techniques
Fourier spectral estimates and, to a lesser extent, the autocorrelation
function are the primary tools to detect periodicities in experimental data in
the physical and biological sciences. We propose a new method which is more
reliable than traditional techniques, and is able to make clear identification
of periodic behavior when traditional techniques do not. This technique is
based on an information theoretic reduction of linear (autoregressive) models
so that only the essential features of an autoregressive model are retained.
These models we call reduced autoregressive models (RARM). The essential
features of reduced autoregressive models include any periodicity present in
the data. We provide theoretical and numerical evidence from both experimental
and artificial data, to demonstrate that this technique will reliably detect
periodicities if and only if they are present in the data. There are strong
information theoretic arguments to support the statement that RARM detects
periodicities if they are present. Surrogate data techniques are used to ensure
the converse. Furthermore, our calculations demonstrate that RARM is more
robust, more accurate, and more sensitive, than traditional spectral
techniques. Comment: 10 pages (revtex) and 6 figures. To appear in Phys Rev E. Modified
styl
Algorithmic statistics revisited
The mission of statistics is to provide adequate statistical hypotheses
(models) for observed data. But what is an "adequate" model? To answer this
question, one needs to use the notions of algorithmic information theory. It
turns out that for every data string one can naturally define a
"stochasticity profile", a curve that represents a trade-off between the complexity
of a model and its adequacy. This curve has four different equivalent
definitions in terms of (1) randomness deficiency, (2) minimal description
length, (3) position in the lists of simple strings and (4) Kolmogorov
complexity with decompression time bounded by the busy beaver function. We present
a survey of the corresponding definitions and results relating them to each
other.
Affect systems, changes in body mass index, disordered eating and stress: An 18-month longitudinal study in women
© 2017 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Background: Evidence suggests that stress plays a role in changes in body weight and disordered eating. The present study examined the effect of mood, affect systems (attachment and social rank) and affect regulatory processes (self-criticism, self-reassurance) on the stress process and how this impacts on changes in weight and disordered eating. Methods: A large sample of women participated in a community-based prospective, longitudinal online study in which body mass index (BMI), disordered eating, perceived stress, attachment, social rank, mood, and self-criticism/reassurance were measured at 6-monthly intervals over an 18-month period. Results: Latent Growth Curve Modelling showed that BMI increased over 18 months while stress and disordered eating decreased and that these changes were predicted by high baseline levels of these constructs. Independently of this, however, increases in stress predicted a reduction in BMI which was, itself, predicted by baseline levels of self-hatred and unfavourable social comparison. Conclusions: This study adds support to the evidence that stress is important in weight change. In addition, this is the first study to show, in a longitudinal design, that social rank and self-criticism (as opposed to self-reassurance) at times of difficulty predict increases in stress and, thus, suggests a role for these constructs in weight regulation. Peer reviewed. Final published version
Sequence alignment, mutual information, and dissimilarity measures for constructing phylogenies
Existing sequence alignment algorithms use heuristic scoring schemes which
cannot be used as objective distance metrics. Therefore one relies on measures
like the p- or log-det distances, or makes explicit, and often simplistic,
assumptions about sequence evolution. Information theory provides an
alternative, in the form of mutual information (MI) which is, in principle, an
objective and model independent similarity measure. MI can be estimated by
concatenating and zipping sequences, yielding thereby the "normalized
compression distance". So far this has produced promising results, but with
uncontrolled errors. We describe a simple approach to get robust estimates of
MI from global pairwise alignments. Using standard alignment algorithms, this
gives for animal mitochondrial DNA estimates that are strikingly close to
estimates obtained from the alignment free methods mentioned above. Our main
result uses algorithmic (Kolmogorov) information theory, but we show that
similar results can also be obtained from Shannon theory. Because it is not
additive, normalized compression distance is not an optimal metric
for phylogenetics, but we propose a simple modification that overcomes this
issue. We test several versions of our MI based distance measures
on a large number of randomly chosen quartets and demonstrate that they all
perform better than traditional measures like the Kimura or log-det (resp.
paralinear) distances. Even a simplified version based on single letter Shannon
entropies, which can be easily incorporated in existing software packages, gave
superior results throughout the entire animal kingdom. But we see the main
virtue of our approach in a more general way. For example, it can also help to
judge the relative merits of different alignment algorithms, by estimating the
significance of specific alignments. Comment: 19 pages + 16 pages of supplementary material
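The "concatenating and zipping" estimate mentioned in this abstract can be sketched as follows. This is a minimal illustration of the standard normalized compression distance, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), using zlib as the compressor; the choice of zlib, the function names, and the toy sequences are assumptions, and this is not the alignment-based estimator or the modified distance the authors propose.

```python
import random
import zlib

def compressed_size(data: bytes) -> int:
    # Length in bytes of the zlib-compressed input, used as a stand-in
    # for the (uncomputable) Kolmogorov complexity.
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    # Normalized compression distance from the compressed sizes of
    # x, y, and their concatenation.
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

def random_dna(length: int, seed: int) -> bytes:
    # Hypothetical helper: a pseudo-random DNA-like string for the demo.
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length)).encode()

if __name__ == "__main__":
    repetitive = b"ACGT" * 200
    unrelated = random_dna(800, seed=0)
    print("same sequence:     ", round(ncd(repetitive, repetitive), 3))
    print("unrelated sequence:", round(ncd(repetitive, unrelated), 3))
```

A real compressor only approximates the ideal quantity, so values depend on the compressor and on sequence length, which is one source of the uncontrolled errors the abstract refers to.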
Algorithmic statistics: forty years later
Algorithmic statistics has two different (and almost orthogonal) motivations.
From the philosophical point of view, it tries to formalize how statistics
works and why some statistical models are better than others. Once the notion
of a "good model" is introduced, a natural question arises: is it possible that
for some piece of data there is no good model? If so, how often do such bad
("non-stochastic") data appear "in real life"?
Another, more technical motivation comes from algorithmic information theory.
In this theory a notion of complexity of a finite object (=amount of
information in this object) is introduced; it assigns to every object some
number, called its algorithmic complexity (or Kolmogorov complexity).
Algorithmic statistics provides a more fine-grained classification: for each
finite object some curve is defined that characterizes its behavior. It turns
out that several different definitions give (approximately) the same curve.
In this survey we try to provide an exposition of the main results in the
field (including full proofs for the most important ones), as well as some
historical comments. We assume that the reader is familiar with the main
notions of algorithmic information (Kolmogorov complexity) theory. Comment: Missing proofs added
Observations of ozone depletion events in a Finnish boreal forest
We investigated the concentrations and vertical profiles of ozone over a 20-year period (1996–2016) at the SMEAR II station in southern Finland. Our results showed that the typical daily median ozone concentrations were in the range of 20–50 ppb with clear diurnal and annual patterns. In general, ozone concentrations increased with height. The main aim of our study was to address the frequency and strength of ozone depletion events at this boreal forest site. We observed more than a thousand 10 min periods at 4.2 m with ozone concentrations below 10 ppb, and a few tens of cases with ozone concentrations below 2 ppb. Among these observations, a number of ozone depletion events that lasted for more than 3 h were identified, and they occurred mainly in autumn and winter months. The low ozone concentrations were likely related to the formation of a low mixing layer under conditions of low temperature, low wind speed, high relative humidity and limited intensity of solar radiation. Peer reviewed