    MDL Convergence Speed for Bernoulli Sequences

    The Minimum Description Length principle for online sequence estimation/prediction in a proper learning setup is studied. If the underlying model class is discrete, then the total expected square loss is a particularly interesting performance measure: (a) this quantity is finitely bounded, implying convergence with probability one, and (b) it additionally specifies the convergence speed. For MDL, in general one can only have loss bounds which are finite but exponentially larger than those for Bayes mixtures. We show that this is even the case if the model class contains only Bernoulli distributions. We derive a new upper bound on the prediction error for countable Bernoulli classes. This implies a small bound (comparable to the one for Bayes mixtures) for certain important model classes. We discuss the application to Machine Learning tasks such as classification and hypothesis testing, and generalization to countable classes of i.i.d. models. Comment: 28 pages

    A new determination of the orbit and masses of the Be binary system delta Scorpii

    The binary star delta Sco (HD143275) underwent remarkable brightening in the visible in 2000, and continues to be irregularly variable. The system was observed with the Sydney University Stellar Interferometer (SUSI) in 1999, 2000, 2001, 2006 and 2007. The 1999 observations were consistent with predictions based on the previously published orbital elements. The subsequent observations can only be explained by assuming that an optically bright emission region with an angular size of > 2 +/- 1 mas formed around the primary in 2000. By 2006/2007 the size of this region grew to an estimated > 4 mas. We have determined a consistent set of orbital elements by simultaneously fitting all the published interferometric and spectroscopic data as well as the SUSI data reported here. The resulting elements and the brightness ratio for the system measured prior to the outburst in 2000 have been used to estimate the masses of the components. We find Ma = 15 +/- 7 Msun and Mb = 8.0 +/- 3.6 Msun. The dynamical parallax is estimated to be 7.03 +/- 0.15 mas, which is in good agreement with the revised HIPPARCOS parallax. Comment: 8 pages, 4 figs. Accepted for publication in MNRAS

    Dietary fat, cholesterol and colorectal cancer in a prospective study

    The relationships between consumption of total fat, major dietary fatty acids, cholesterol, consumption of meat and eggs, and the incidence of colorectal cancers were studied in a cohort based on the Finnish Mobile Clinic Health Examination Survey. Baseline (1967–1972) information on habitual food consumption over the preceding year was collected from 9959 men and women free of diagnosed cancer. A total of 109 new colorectal cancer cases were ascertained through late 1999. High cholesterol intake was associated with increased risk for colorectal cancers. The relative risk between the highest and lowest quartiles of dietary cholesterol was 3.26 (95% confidence interval 1.54–6.88) after adjusting for age, sex, body mass index, occupation, smoking, geographic region, energy intake and consumption of vegetables, fruits and cereals. Consumption of total fat and intake of saturated, monounsaturated, or polyunsaturated fatty acids were not significantly associated with colorectal cancer risk. Nonsignificant associations were found between consumption of meat and eggs and colorectal cancer risk. The results of the present study indicate that high cholesterol intake may increase colorectal cancer risk, but do not suggest the presence of significant effects of dietary fat intake on colorectal cancer incidence. © 2001 Cancer Research Campaign http://www.bjcancer.co

    Chains of infinite order, chains with memory of variable length, and maps of the interval

    We show how to construct a topological Markov map of the interval whose invariant probability measure is the stationary law of a given stochastic chain of infinite order. In particular we characterize the maps corresponding to stochastic chains with memory of variable length. The problem treated here is the converse of the classical construction of the Gibbs formalism for Markov expanding maps of the interval.

    PAC-Bayesian Bounds for Randomized Empirical Risk Minimizers

    The aim of this paper is to generalize the PAC-Bayesian theorems proved by Catoni in the classification setting to more general problems of statistical inference. We show how to control the deviations of the risk of randomized estimators. Particular attention is paid to randomized estimators drawn in a small neighborhood of classical estimators, whose study leads to control of the risk of the latter. These results make it possible to bound the risk of very general estimation procedures, as well as to perform model selection.

    FingerReader: A Wearable Device to Support Text Reading on the Go

    Visually impaired people report numerous difficulties with accessing printed text using existing technology, including problems with alignment, focus, accuracy, mobility and efficiency. We present a finger worn device that assists the visually impaired with effectively and efficiently reading paper-printed text. We introduce a novel, local-sequential manner for scanning text which enables reading single lines, blocks of text or skimming the text for important sections while providing real-time auditory and tactile feedback. The design is motivated by preliminary studies with visually impaired people, and it is small-scale and mobile, which enables a more manageable operation with little setup

    Sequence alignment, mutual information, and dissimilarity measures for constructing phylogenies

    Existing sequence alignment algorithms use heuristic scoring schemes which cannot be used as objective distance metrics. Therefore one relies on measures like the p- or log-det distances, or makes explicit, and often simplistic, assumptions about sequence evolution. Information theory provides an alternative, in the form of mutual information (MI) which is, in principle, an objective and model-independent similarity measure. MI can be estimated by concatenating and zipping sequences, yielding thereby the "normalized compression distance". So far this has produced promising results, but with uncontrolled errors. We describe a simple approach to get robust estimates of MI from global pairwise alignments. Using standard alignment algorithms, this gives for animal mitochondrial DNA estimates that are strikingly close to estimates obtained from the alignment-free methods mentioned above. Our main result uses algorithmic (Kolmogorov) information theory, but we show that similar results can also be obtained from Shannon theory. Due to the fact that it is not additive, normalized compression distance is not an optimal metric for phylogenetics, but we propose a simple modification that overcomes the issue of additivity. We test several versions of our MI-based distance measures on a large number of randomly chosen quartets and demonstrate that they all perform better than traditional measures like the Kimura or log-det (resp. paralinear) distances. Even a simplified version based on single letter Shannon entropies, which can be easily incorporated in existing software packages, gave superior results throughout the entire animal kingdom. But we see the main virtue of our approach in a more general way. For example, it can also help to judge the relative merits of different alignment algorithms, by estimating the significance of specific alignments. Comment: 19 pages + 16 pages of supplementary material

    Handwritten digit recognition by bio-inspired hierarchical networks

    The human brain processes information with learning and prediction abilities, but the underlying neuronal mechanisms remain unknown. Recently, many studies have shown that neuronal networks are capable of both generalization and association of sensory inputs. In this paper, following a body of neurophysiological evidence, we propose a learning framework with strong biological plausibility that mimics prominent functions of cortical circuitries. We developed the Inductive Conceptual Network (ICN), a hierarchical bio-inspired network able to learn invariant patterns by Variable-order Markov Models implemented in its nodes. The outputs of the top-most node of the ICN hierarchy, representing the highest input generalization, allow for automatic classification of inputs. We found that the ICN clustered MNIST images with an error of 5.73% and USPS images with an error of 12.56%.

    Challenges of Religious Literacy in Education : Islam and the Governance of Religious Diversity in Multi-faith Schools

    This chapter seeks to take part in an emerging line of research in which religion is approached as a whole-school endeavor. Previous research and policy recommendations typically focused on teaching about religion in school, but the accommodation of religious diversity in the wider school culture merits more attention. Based on observations in our multiple case studies, we discuss the multi-level governance of religious diversity in Finnish multi-faith schools with a particular focus on the challenges of religious literacy for educators. The three examples we present focus on the inclusion of Muslims in Finnish schools and in particular on the challenges for educators (1) in interpreting the distinction between religion and culture, (2) in recognizing and handling intra-religious diversity, and (3) in being aware of Protestant conceptions of religion and culture. A theme cutting across these examples is how they reflect the tendencies either to see different situations merely through the lens of religion (religionisation), or not to recognize the importance of religion at all (religion-blindness). We argue that religious literacy should be recognized and developed as a vital part of the intercultural competencies of educators. Peer reviewed

    Compressing web Geodata for real-time environmental applications

    The advent of connected mobile devices has caused an unprecedented availability of geo-referenced user-generated content, which can be exploited for environment monitoring. In particular, Augmented Reality (AR) mobile applications can be designed to enable citizens to collect observations, by overlaying relevant meta-data on their current view. This class of applications relies on multiple meta-data, which must be properly compressed for transmission and real-time usage. This paper presents a two-stage approach for the compression of Digital Elevation Model (DEM) data and geographic entities for a mountain environment monitoring mobile AR application. The proposed method is generic and could be applied to other types of geographical data.