MDL Convergence Speed for Bernoulli Sequences
The Minimum Description Length principle for online sequence
estimation/prediction in a proper learning setup is studied. If the underlying
model class is discrete, then the total expected square loss is a particularly
interesting performance measure: (a) this quantity is finitely bounded,
implying convergence with probability one, and (b) it additionally specifies
the convergence speed. For MDL, in general one can only have loss bounds which
are finite but exponentially larger than those for Bayes mixtures. We show that
this is even the case if the model class contains only Bernoulli distributions.
We derive a new upper bound on the prediction error for countable Bernoulli
classes. This implies a small bound (comparable to the one for Bayes mixtures)
for certain important model classes. We discuss the application to Machine
Learning tasks such as classification and hypothesis testing, and
generalization to countable classes of i.i.d. models. Comment: 28 pages
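As a rough illustration of the comparison described above, the following sketch runs a Bayes mixture and an MDL predictor over a small Bernoulli class and accumulates each one's square loss against the true parameter along one sampled sequence. It takes MDL to mean "predict with the single model that maximises prior weight times likelihood", which is an assumption about the setup. The class `thetas`, the prior `weights`, the function name `total_square_loss` and the sequence length are all hypothetical choices for the example, not values from the paper.

```python
import random

def total_square_loss(thetas, weights, true_theta, n_steps, seed=0):
    """Return (bayes_loss, mdl_loss): cumulative squared error of each
    predictor's probability of a 1, measured against true_theta."""
    rng = random.Random(seed)
    post = list(weights)  # posterior weights, proportional to w_i * P(data | theta_i)
    bayes_loss = mdl_loss = 0.0
    for _ in range(n_steps):
        total = sum(post)
        # Bayes mixture: posterior-weighted average of the model predictions.
        bayes_pred = sum(w * t for w, t in zip(post, thetas)) / total
        # MDL (assumed form): predict with the model of maximal posterior weight.
        best = max(range(len(thetas)), key=lambda i: post[i])
        mdl_pred = thetas[best]
        bayes_loss += (bayes_pred - true_theta) ** 2
        mdl_loss += (mdl_pred - true_theta) ** 2
        # Sample the next bit from the true Bernoulli source.
        x = 1 if rng.random() < true_theta else 0
        # Multiply each weight by the likelihood of the bit, then renormalise
        # so the weights do not underflow on long sequences.
        post = [w * (t if x == 1 else 1.0 - t) for w, t in zip(post, thetas)]
        s = sum(post)
        post = [w / s for w in post]
    return bayes_loss, mdl_loss

# Hypothetical five-element class with a prior that favours the wrong models.
thetas = [0.1, 0.25, 0.5, 0.75, 0.9]
weights = [0.4, 0.3, 0.15, 0.1, 0.05]
print(total_square_loss(thetas, weights, true_theta=0.75, n_steps=500))
```

A single run only samples the loss; averaging over many seeds would approximate the expected total square loss that the abstract refers to.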
Associations of tissue transglutaminase antibody seropositivity with coronary heart disease: Findings from a prospective cohort study.
Clinical experience and observational studies suggest that individuals with coeliac disease are at increased risk of coronary heart disease (CHD), but the precise mechanism for this is unclear. Laboratory studies suggest that it may relate to tissue transglutaminase antibodies (tTGAs). Our aim was to examine whether seropositivity for tTGAs and endomysial antibodies (EMAs) is associated with incident CHD in humans.
We used data from the Mini-Finland Health Survey, a prospective cohort study of Finnish men and women aged 35-80 at study baseline in 1978-80. tTGA and EMA seropositivities were ascertained from baseline blood samples, and incident CHD events were identified from national hospitalisation and death registers. Cox regression was used to examine the associations between antibody seropositivity and incident CHD. Of 6887 men and women, 562 were seropositive for tTGAs and 72 for EMAs. During a median follow-up of 26 years, 2367 individuals experienced a CHD event. We found no clear evidence for an association between tTGA positivity (hazard ratio, HR: 1.04, 95% confidence interval, CI: 0.83, 1.30) or EMA positivity (HR: 1.16, 95% CI: 0.77, 1.74) and incident CHD, once pre-existing CVD and known CHD risk factors had been adjusted for.
We found no clear evidence for an association of tTGA or EMA seropositivity with incident CHD outcomes, suggesting that tTG autoimmunity is unlikely to be the biological link between coeliac disease and CHD.
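To make the reported intervals concrete, the sketch below recovers the standard error of the log hazard ratio from each 95% confidence interval and derives the corresponding z statistic. It assumes the intervals are Wald-type, that is, symmetric on the log scale with a 1.96 multiplier; the abstract does not state this, and the function names `log_hr_se` and `wald_z` are hypothetical.

```python
import math

def log_hr_se(lower, upper, z=1.96):
    """Standard error of log(HR) implied by a Wald-type 95% CI (assumed)."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)

def wald_z(hr, lower, upper):
    """z statistic for the null hypothesis HR = 1 under the same assumption."""
    return math.log(hr) / log_hr_se(lower, upper)

# Hazard ratios and interval limits as quoted in the abstract.
for label, hr, lo, hi in [("tTGA", 1.04, 0.83, 1.30), ("EMA", 1.16, 0.77, 1.74)]:
    print(label, round(log_hr_se(lo, hi), 3), round(wald_z(hr, lo, hi), 2))
```

Under this assumption both z statistics fall well inside ±1.96, which matches the abstract's statement that neither interval excludes a hazard ratio of 1.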
Does antibacterial treatment for urinary tract infection contribute to the risk of breast cancer?
Low lignan status has been reported to be related to an elevated risk of breast cancer. Since lignan status is reduced by antibacterial medications, it is plausible to hypothesize that repeated use of antibiotics may also be a risk factor for breast cancer. History of treatment for urinary tract infection was studied for its prediction of breast cancer among 9461 Finnish women 19–89 years of age and initially cancer-free. During a follow-up in 1973–1991, a total of 157 breast cancer cases were diagnosed. Women reporting previous or present medication for urinary tract infection at baseline showed an elevated breast cancer risk in comparison with other women. The age-adjusted relative risk was 1.34 (95% confidence interval (CI) = 0.98–1.83). The association was concentrated to women under 50 years of age. The relative risk for these women was 1.74 (95% CI 1.13–2.68), whereas it was 0.97 (95% CI 0.59–1.58) for older women. The relative risk in the younger age-group was 1.47 (95% CI 0.73–2.97) during the first 10 years of follow-up, and 1.93 (95% CI 1.11–3.37) for follow-up times longer than 10 years. These data suggest that premenopausal women using long-term medication for urinary tract infections show a possible elevated risk of future breast cancer. The results are, however, still inconclusive and the hypothesis needs to be tested by other studies. © 2000 Cancer Research Campaign
Dietary fat, cholesterol and colorectal cancer in a prospective study
The relationships between consumption of total fat, major dietary fatty acids, cholesterol, consumption of meat and eggs, and the incidence of colorectal cancers were studied in a cohort based on the Finnish Mobile Clinic Health Examination Survey. Baseline (1967–1972) information on habitual food consumption over the preceding year was collected from 9959 men and women free of diagnosed cancer. A total of 109 new colorectal cancer cases were ascertained late 1999. High cholesterol intake was associated with increased risk for colorectal cancers. The relative risk between the highest and lowest quartiles of dietary cholesterol was 3.26 (95% confidence interval 1.54–6.88) after adjusting for age, sex, body mass index, occupation, smoking, geographic region, energy intake and consumption of vegetables, fruits and cereals. Consumption of total fat and intake of saturated, monounsaturated, or polyunsaturated fatty acids were not significantly associated with colorectal cancer risk. Nonsignificant associations were found between consumption of meat and eggs and colorectal cancer risk. The results of the present study indicate that high cholesterol intake may increase colorectal cancer risk, but do not suggest the presence of significant effects of dietary fat intake on colorectal cancer incidence. © 2001 Cancer Research Campaign http://www.bjcancer.co
Mutual Information of Population Codes and Distance Measures in Probability Space
We studied the mutual information between a stimulus and a large system
consisting of stochastic, statistically independent elements that respond to a
stimulus. The Mutual Information (MI) of the system saturates exponentially
with system size. A theory of the rate of saturation of the MI is developed. We
show that this rate is controlled by a distance function between the response
probabilities induced by different stimuli. This function, which we term the
{\it Confusion Distance} between two probabilities, is related to the Renyi
$\alpha$-Information. Comment: 11 pages, 3 figures, accepted to PR
An aldehyde as a rapid source of secondary aerosol precursors: theoretical and experimental study of hexanal autoxidation
Aldehydes are common constituents of natural and polluted atmospheres,
and their gas-phase oxidation has recently been reported to yield
highly oxygenated organic molecules (HOMs) that are key players in the
formation of atmospheric aerosol. However, insights into the molecular-level mechanism of this oxidation reaction have been scarce. While OH-initiated
oxidation of small aldehydes, with two to five carbon atoms,
under high-NOx conditions generally leads to
fragmentation products, longer-chain aldehydes involving an initial
non-aldehydic hydrogen abstraction can be a path to molecular
functionalization and growth. In this work, we conduct a joint
theoretical–experimental analysis of the autoxidation chain reaction
of a common aldehyde, hexanal. We computationally study the initial
steps of OH oxidation at the
RHF-RCCSD(T)-F12a/VDZ-F12//ωB97X-D/aug-cc-pVTZ level and show that
both aldehydic (on C1) and non-aldehydic (on C4) H-abstraction
channels contribute to HOMs via autoxidation. The oxidation products
predominantly form through the H abstraction from C1 and C4, followed
by fast unimolecular 1,6 H-shifts with rate coefficients of 1.7×10⁻¹ and 8.6×10⁻¹ s⁻¹, respectively.
Experimental flow reactor measurements at variable reaction times show
that hexanal oxidation products including HOM monomers up to
C6H11O7 and accretion products C12H22O9–10
form within 3 s reaction time. Kinetic modeling simulations including
atmospherically relevant precursor concentrations agree with the
experimental results and the expected timescales. Finally, we estimate
the hexanal HOM yields up to seven O atoms with mechanistic details
through both C1 and C4 channels.
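For a first-order (unimolecular) step, the mean lifetime of the reacting species is the reciprocal of the rate coefficient, and the fraction reacted after time t is 1 - exp(-kt). The sketch below applies this to the two H-shift rate coefficients quoted above and to the 3 s reaction time. It assumes each H-shift proceeds with no competing loss channel, which is a simplification made for the example, and the function names are hypothetical.

```python
import math

def lifetime_s(k_per_s):
    """Mean lifetime (s) of a species removed by one first-order process."""
    return 1.0 / k_per_s

def fraction_reacted(k_per_s, t_s):
    """Fraction converted after t_s seconds of first-order decay."""
    return 1.0 - math.exp(-k_per_s * t_s)

# Rate coefficients (s^-1) for the 1,6 H-shifts following C1 and C4 abstraction.
for channel, k in [("C1", 1.7e-1), ("C4", 8.6e-1)]:
    print(channel, round(lifetime_s(k), 2), round(fraction_reacted(k, 3.0), 2))
```

Both lifetimes come out on the order of seconds, the same scale as the flow reactor residence time mentioned in the abstract.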
Nonparametric Hierarchical Clustering of Functional Data
In this paper, we deal with the problem of curve clustering. We propose a
nonparametric method which partitions the curves into clusters and discretizes
the dimensions of the curve points into intervals. The cross-product of these
partitions forms a data-grid which is obtained using a Bayesian model selection
approach while making no assumptions regarding the curves. Finally, a
post-processing technique, aiming at reducing the number of clusters in order
to improve the interpretability of the clustering, is proposed. It consists in
optimally merging the clusters step by step, which corresponds to an
agglomerative hierarchical classification whose dissimilarity measure is the
variation of the criterion. Interestingly, this measure is none other than the
sum of the Kullback-Leibler divergences between cluster distributions before
and after the merges. The practical interest of the approach for functional
data exploratory analysis is presented and compared with an alternative
approach on an artificial and a real world data set.
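The merge dissimilarity described above can be illustrated with a small sketch. For two clusters summarised by counts over the same grid cells, it sums the Kullback-Leibler divergences from each cluster's distribution to the distribution of the merged cluster. Weighting each divergence by cluster size is an assumption of this example, as are the names `kl_divergence`, `normalise` and `merge_cost`; the paper's exact criterion is not reproduced here.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions on the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def normalise(counts):
    """Turn non-negative counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]

def merge_cost(counts_a, counts_b):
    """Size-weighted sum of KL divergences from each cluster to their merge."""
    merged = [a + b for a, b in zip(counts_a, counts_b)]
    q = normalise(merged)  # q is positive wherever either cluster has mass
    n_a, n_b = sum(counts_a), sum(counts_b)
    return (n_a * kl_divergence(normalise(counts_a), q)
            + n_b * kl_divergence(normalise(counts_b), q))

# Hypothetical cell counts: the first pair is similar, the second is not.
print(merge_cost([8, 1, 1], [7, 2, 1]))
print(merge_cost([8, 1, 1], [1, 1, 8]))
```

An agglomerative pass would repeatedly merge the pair of clusters with the smallest cost, so similar clusters are joined first.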
FingerReader: A Wearable Device to Support Text Reading on the Go
Visually impaired people report numerous difficulties with accessing printed text using existing technology, including problems with alignment, focus, accuracy, mobility and efficiency. We present a finger-worn device that assists the visually impaired with effectively and efficiently reading paper-printed text. We introduce a novel, local-sequential manner for scanning text which enables reading single lines, blocks of text or skimming the text for important sections while providing real-time auditory and tactile feedback. The design is motivated by preliminary studies with visually impaired people, and it is small-scale and mobile, which enables a more manageable operation with little setup.