294 research outputs found

    Large language models in machine translation

    This paper reports on the benefits of large-scale statistical language modeling in machine translation. A distributed infrastructure is proposed which we use to train on up to 2 trillion tokens, resulting in language models having up to 300 billion n-grams. It is capable of providing smoothed probabilities for fast, single-pass decoding. We introduce a new smoothing method, dubbed Stupid Backoff, that is inexpensive to train on large data sets and approaches the quality of Kneser-Ney Smoothing as the amount of training data increases.

    Second-Order Belief Hidden Markov Models

    Hidden Markov Models (HMMs) are learning methods for pattern recognition. Probabilistic HMMs have been among the most widely used techniques based on the Bayesian model. First-order probabilistic HMMs were adapted to the theory of belief functions such that Bayesian probabilities were replaced with mass functions. In this paper, we present a second-order Hidden Markov Model using belief functions. Previous work on belief HMMs has focused on first-order HMMs. We extend them to the second-order model.

    Intake of nitrate and nitrite and the risk of gastric cancer: a prospective cohort study.

    The association between the intake of nitrate or nitrite and gastric cancer risk was investigated in a prospective cohort study started in 1986 in the Netherlands, of 120,852 men and women aged 55-69 years. At baseline, data on dietary intake, smoking habits and other covariates were collected by means of a self-administered questionnaire. For data analysis, a case-cohort approach was used, in which the person-years at risk were estimated from a randomly selected subcohort (1688 men and 1812 women). After 6.3 years of follow-up, 282 microscopically confirmed incident cases of stomach cancer were detected: 219 men and 63 women. We did not find a higher risk of gastric cancer among people with a higher nitrate intake from food [rate ratio (RR) highest/lowest quintile = 0.80, 95% CI 0.47-1.37, trend-P = 0.18], a higher nitrate intake from drinking water (RR highest/lowest quintile = 0.88, 95% CI 0.59-1.32, trend-P = 0.39) or a higher intake of nitrite (RR highest/lowest quintile = 1.44, 95% CI 0.95-2.18, trend-P = 0.24). Rate ratios for gastric cancer were also computed for each tertile of nitrate intake from foods within tertiles of vitamin C intake and intake of beta-carotene, but no consistent pattern was found. Therefore, our study does not support a positive association between the intake of nitrate or nitrite and gastric cancer risk.

    Validation of a food frequency questionnaire to assess folate intake of Dutch elderly people

    Folate is required for 1-carbon metabolism and deficiency in folate leads to megaloblastic anemia. Low levels of folate have been associated with increased risk of vascular disease. To investigate whether RDA of folate are met, habitual folate intake needs to be assessed reliably. We developed a FFQ to specifically measure folate intake over the previous 3 months in elderly people in the Netherlands. Major sources of folate intake, i.e. foods contributing to at least 80 % of the average folate intake, were identified through an analysis of the second Dutch Food Consumption Survey for the sub-population of men and women aged 50–70. In 2000 and 2001, folate intake was estimated with this questionnaire in 1286 individuals aged 50-75 years. Concentrations of serum and erythrocyte folate served as biomarkers with which relative validity of the questionnaire was assessed. The same FFQ was repeated after 3 years in 803 subjects in order to assess long-term reproducibility. Mean folate intake was estimated to be 196 (sd 69) µg/d. Spearman correlation coefficients between folate intake and serum and erythrocyte concentrations were 0·14 (P <0·01) and 0·05 (P = 0·06) respectively. Spearman correlations between folate intakes measured at baseline and after 3 years were 0·58 (P <0·01). 47 % of the participants were classified in the same quartiles on the two occasions. Our FFQ showed a weak correlation between folate intake and blood folate concentrations and reproducibility was acceptable. This FFQ is able to rank subjects according to their folate intake.

    Positive words carry less information than negative words

    We show that the frequency of word use is not only determined by the word length \cite{Zipf1935} and the average information content \cite{Piantadosi2011}, but also by its emotional content. We have analyzed three established lexica of affective word usage in English, German, and Spanish, to verify that these lexica have a neutral, unbiased, emotional content. Taking into account the frequency of word usage, we find that words with a positive emotional content are more frequently used. This lends support to the Pollyanna hypothesis \cite{Boucher1969} that there should be a positive bias in human expression. We also find that negative words contain more information than positive words, as the informativeness of a word increases uniformly as its valence decreases. Our findings support earlier conjectures about (i) the relation between word frequency and information content, and (ii) the impact of positive emotions on communication and social links. Comment: 16 pages, 3 figures, 3 tables

    Structural Invariance of Sunspot Umbrae Over the Solar Cycle: 1993-2004

    Measurements of maximum magnetic flux, minimum intensity, and size are presented for 12 967 sunspot umbrae detected on the NASA/NSO spectromagnetograms between 1993 and 2004 to study umbral structure and strength during the solar cycle. The umbrae are selected using an automated thresholding technique. Measured umbral intensities are first corrected for a confirming observation of umbral limb-darkening. Log-normal fits to the observed size distribution confirm that the size spectrum shape does not vary with time. The intensity-magnetic flux relationship is found to be steady over the solar cycle. The dependences of umbral size on the magnetic flux and minimum intensity are also independent of cycle phase and give linear and quadratic relations, respectively. While the large sample size does show a low amplitude oscillation in the mean minimum intensity and maximum magnetic flux correlated with the solar cycle, this can be explained in terms of variations in the mean umbral size. These size variations, however, are small and do not substantiate a meaningful change in the size spectrum of the umbrae generated by the Sun. Thus, in contrast to previous reports, the observations suggest that the equilibrium structure, as testified by the invariant size-magnetic field relationship, as well as the mean size (i.e. strength) of sunspot umbrae do not significantly depend on solar cycle phase. Comment: 17 pages, 6 figures. Published in Solar Physics

    Politieke kennis en effecten van nieuws (Political knowledge and effects of news)

    Which news matters and how much news matters, and to what extent does this depend on the political knowledge of the recipient? This article describes a longitudinal study of the election campaign for the Dutch parliament in 2006.

    The Effects of Media and their Logic on Legitimacy Sources within Local Governance Networks: A Three-Case Comparative Study

    Although theoretical and empirical work on the democratic legitimacy of governance networks is growing, little attention has been paid to the impact of mediatisation on democracies. Media have their own logic of news-making led by the media’s rules, aims, production routines and constraints, which affect political decision-making processes. In this article, we specifically study how media and their logic affect three democratic legitimacy sources of political decision-making within governance networks: voice, due deliberation and accountability. We conducted a comparative case study of three local governance networks using a mixed method design, combining extensive qualitative case studies, interviews and a quantitative content analysis of media reports. In all three cases, media logic increased voice possibilities for citizen groups. Furthermore, it broadened the deliberation process, although this did not improve the quality of this process per se, because the media focus on drama and negativity. Finally, media logic often pushed political authorities into a reactive communication style as they had to fight against negative images in the media. Proactive communication about projects, such as public relations (PR) strategies and branding, is difficult in such a media landscape.