196 research outputs found

    Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer

    Transformer architecture has shown impressive performance in multiple research domains and has become the backbone of many neural network models. However, there is limited understanding of how it works. In particular, with a simple predictive loss, how the representation emerges from the gradient training dynamics remains a mystery. In this paper, for a 1-layer transformer with one self-attention layer plus one decoder layer, we analyze its SGD training dynamics for the task of next token prediction in a mathematically rigorous manner. We open the black box of the dynamic process of how the self-attention layer combines input tokens, and reveal the nature of the underlying inductive bias. More specifically, under the assumptions of (a) no positional encoding, (b) long input sequences, and (c) a decoder layer that learns faster than the self-attention layer, we prove that self-attention acts as a discriminative scanning algorithm: starting from uniform attention, it gradually attends more to distinct key tokens for a specific next token to be predicted, and pays less attention to common key tokens that occur across different next tokens. Among distinct tokens, it progressively drops attention weights, following the order of low to high co-occurrence between the key and the query token in the training set. Interestingly, this procedure does not lead to winner-takes-all, but decelerates due to a phase transition that is controllable by the learning rates of the two layers, leaving an (almost) fixed token combination. We verify this scan and snap dynamics on synthetic and real-world data (WikiText).
    Comment: Fix minor issues in the proofs and figures. Update figures to reflect the main conclusions more accurately.

    Nanoquartz in Late Permian C1 coal and the high incidence of female lung cancer in the Pearl River Origin area: a retrospective cohort study

    Background: The Pearl River Origin area, Qujing District of Yunnan Province, has one of the highest female lung cancer mortality rates in China. Smoking was excluded as a cause of the lung cancer excess because almost all women were non-smokers. Crystalline silica embedded in the soot emissions from coal combustion was found to be associated with the lung cancer risk in a geographical correlation study. Lung cancer rates tend to be higher in places where the Late Permian C1 coal is produced. Therefore, we have hypothesized the two processes: C1 coal combustion --> nanoquartz in ambient air --> lung cancer excess in non-smoking women.
    Methods/Design: We propose to conduct a retrospective cohort study to test the hypothesis above. We will search historical records and compile an inventory of the coal mines in operation during 1930–2009. To estimate the study subjects' retrospective exposure, we will reconstruct the historical exposure scenario by burning the coal samples, collected from operating or deserted coal mines by coal geologists, in a traditional firepit of an old house. Indoor air particulate samples will be collected for nanoquartz and polycyclic aromatic hydrocarbons (PAHs) analyses. Bulk quartz content will be quantified by X-ray diffraction analysis. Size distribution of quartz will be examined by electron microscopes and by centrifugation techniques. Lifetime cumulative exposure to nanoquartz will be estimated for each subject. Using the epidemiology data, we will examine whether the use of C1 coal and the cumulative exposure to nanoquartz are associated with an elevated risk of lung cancer.
    Discussion: The high incidence rate of lung cancer in Xuan Wei, one of the counties in the current study area, was once attributed to high indoor air concentrations of PAHs. The research results have been cited for qualitative and quantitative cancer risk assessment of PAHs by the World Health Organization and other agencies. If nanoquartz is found to be the main underlying cause of the lung cancer epidemic in the study area, cancer potency estimates for PAHs by the international agencies based on the lung cancer data in this study setting should then be updated.

    Chronic Hepatitis B Virus Infection and Risk of Stroke Types: A Prospective Cohort Study of 500 000 Chinese Adults

    BACKGROUND: Stroke is a leading cause of mortality and permanent disability in China, with large and unexplained geographic variations in rates of different stroke types. Chronic hepatitis B virus infection is prevalent among Chinese adults and may play a role in stroke cause. // METHODS: The prospective China Kadoorie Biobank included >500 000 adults aged 30 to 79 years who were recruited from 10 (5 urban and 5 rural) geographically diverse areas of China from 2004 to 2008, with determination of hepatitis B surface antigen (HBsAg) positivity at baseline. During 11 years of follow-up, a total of 59 117 incident stroke cases occurred, including 11 318 intracerebral hemorrhage (ICH), 49 971 ischemic stroke, 995 subarachnoid hemorrhage, and 3036 other/unspecified stroke. Cox regression models were used to estimate adjusted hazard ratios (HRs) for risk of stroke types associated with HBsAg positivity. In a subset of 17 833 participants, liver enzymes and lipids levels were measured and compared by HBsAg status. // RESULTS: Overall, 3.0% of participants were positive for HBsAg. HBsAg positivity was associated with an increased risk of ICH (adjusted HR, 1.29 [95% CI, 1.16–1.44]), similarly for fatal (n=5982; adjusted HR, 1.36 [95% CI, 1.16–1.59]) and nonfatal (n=5336; adjusted HR, 1.23 [95% CI, 1.06–1.44]) ICH. There were no significant associations of HBsAg positivity with risks of ischemic stroke (adjusted HR, 0.97 [95% CI, 0.92–1.03]), subarachnoid hemorrhage (adjusted HR, 0.87 [95% CI, 0.57–1.33]), or other/unspecified stroke (adjusted HR, 1.12 [95% CI, 0.89–1.42]). Compared with HBsAg-negative counterparts, HBsAg-positive individuals had lower lipid and albumin levels and higher liver enzyme levels. After adjustment for liver enzymes and albumin, the association with ICH from HBsAg positivity attenuated to 1.15 (0.90–1.48), suggesting possible mediation by abnormal liver function.
    // CONCLUSIONS: Among Chinese adults, chronic hepatitis B virus infection is associated with an increased risk of ICH but not other stroke types, which may be mediated through liver dysfunction and altered lipid metabolism.

    Cancer risk from gaseous carbonyl compounds in indoor environment generated from household coal combustion in Xuanwei, China

    Airborne carbonyls emitted from indoor coal combustion were characterized. Samples were collected in Xuanwei (Yunnan Province), a region in China with a high rate of lung cancer. Eleven of 19 types of samples (58%) demonstrated formaldehyde concentrations higher than the World Health Organization exposure limit (a 30-min average of 100 μg m⁻³). Different significant positive correlations were observed between glyoxal/methylglyoxal and formaldehyde/acetaldehyde concentrations, suggesting possibly different emission characteristics between the two pairs of carbonyl compounds. The sample with the highest inhalation risk showed a risk 29.2 times higher than that of the lowest sample, suggesting that different coal sampling locations could contribute to the variation in inhalation risk. Inhabitants of Xuanwei also tend to spend more time cooking and more days per year indoors than the national average. The calculated cancer risk ranged from 2.2 to 63 × 10⁻⁵, which places 13 types of samples at a high-risk level. The cumulative effect of different carbonyls in combination could have contributed additively to the actual inhalation cancer risk. There is a need to explicitly address the health effects of environmentally relevant doses, considering life-long exposure in indoor dwellings.