37 research outputs found

    InfoDiffusion: Information Entropy Aware Diffusion Process for Non-Autoregressive Text Generation

    Diffusion models have garnered considerable interest in the field of text generation. Several studies have explored text diffusion models with different structures and applied them to various tasks, including named entity recognition and summarization. However, there exists a notable disparity between the "easy-first" text generation process of current diffusion models and the "keyword-first" natural text generation process of humans, which has received limited attention. To bridge this gap, we propose InfoDiffusion, a non-autoregressive text diffusion model. Our approach introduces a "keyinfo-first" generation strategy and incorporates a noise schedule based on the amount of text information. In addition, InfoDiffusion combines self-conditioning with a newly proposed partially noising model structure. Experimental results show that InfoDiffusion outperforms the baseline model in terms of generation quality and diversity, and exhibits higher sampling efficiency. Comment: EMNLP 2023 Findings

    Quantifying and predicting ecological and human health risks for binary heavy metal pollution accidents at the watershed scale using Bayesian Networks

    The accidental leakage of industrial wastewater containing heavy metals from enterprises poses great risks to resident health, social stability, and ecological safety. During 2005-2018, heavy metal mixed pollution accidents comprised approximately 33% of the major environmental accidents in China. A Bayesian Networks-based probabilistic approach is developed to quantitatively predict ecological and human health risks for heavy metal mixed pollution accidents at the watershed scale. To estimate the probability distributions of joint ecological exposure once a heavy metal mixed pollution accident occurs, a Copula-based joint exposure calculation method, comprising a hydrodynamic model, an emergent heavy metal pollution transport model, and the Copula functions, is embedded. This approach was applied to the risk assessment of acute Cr6+-Hg2+ mixed pollution accidents at 76 electroplating enterprises in 24 risk sub-watersheds of the Dongjiang River downstream watershed. The results indicated that nine sub-watersheds created high ecological risks, while only five created high human health risks. In addition, the ecological and human health risk levels were highest in the tributary (the Xizhijiang River), while the ecological risk was more critical in the river network, and the human health risk was more serious in the mainstream of the Dongjiang River. The quantitative risk assessment provides substantial support for incident prevention and control, risk management, and regulatory decision making for electroplating enterprises. (C) 2020 Elsevier Ltd. All rights reserved. Peer reviewed

    Acceptable Risk Analysis for Abrupt Environmental Pollution Accidents in Zhangjiakou City, China

    Abrupt environmental pollution accidents cause considerable damage worldwide to the ecological environment, human health, and property. The concept of acceptable risk aims to answer whether or not a given environmental pollution risk exceeds a societally determined criterion. This paper presents a case study on acceptable environmental pollution risk conducted through a questionnaire survey carried out between August and October 2014 in five representative districts and two counties of Zhangjiakou City, Hebei Province, China. Here, environmental risk primarily arises from accidental water pollution, accidental air pollution, and tailings dam failure. Based on 870 valid questionnaires, demographic and regional differences in public attitudes towards abrupt environmental pollution risks were analyzed, and risk acceptance impact factors determined. The results showed that females, people between 21 and 40 years of age, people with higher levels of education, public servants, and people with higher income had lower risk tolerance. People with lower perceived risk, low-level risk knowledge, high-level familiarity and satisfaction with environmental management, and without experience of environmental accidents had higher risk tolerance. Multiple logistic regression analysis indicated that public satisfaction with environmental management was the most significant factor in risk acceptance, followed by perceived risk of abrupt air pollution, occupation, perceived risk of tailings dam failure, and sex. These findings should be helpful to local decision-makers concerned with environmental risk management (e.g., selecting target groups for effective risk communication) in the context of abrupt environmental accidents.

    Vertebral fractures among breast cancer survivors in China: a cross-sectional study of prevalence and health services gaps

    Background: Breast cancer survivors are at high risk for fracture due to cancer treatment-induced bone loss; however, data are scarce regarding the scope of this problem from an epidemiologic and health services perspective among Chinese women with breast cancer. Methods: We designed a cross-sectional study comparing prevalence of vertebral fractures among age- and BMI-matched women from two cohorts. Women in the Breast Cancer Survivors cohort were enrolled from a large cancer hospital in Beijing. Eligibility criteria included age 50–70 years, initiation of treatment for breast cancer at least 5 years prior to enrollment, and no history of metabolic bone disease or bone metastases. Data collected included sociodemographic characteristics; fracture-related risk factors, screening and preventive measures; breast cancer history; and thoracolumbar x-ray. The matched comparator group was selected from participants enrolled in the Peking Vertebral Fracture Study, an independent cohort of healthy community-dwelling postmenopausal women from Beijing. Results: Two hundred breast cancer survivors were enrolled (mean age 57.5 ± 4.9 years), and compared with 200 matched healthy women. Twenty-two (11%) vertebral fractures were identified among breast cancer survivors compared with 7 (3.5%) vertebral fractures in the comparison group, yielding an adjusted odds ratio for vertebral fracture of 4.16 (95% CI 1.69–10.21, p < 0.01). The majority had early stage (85.3%) and estrogen and/or progesterone receptor positive (84.6%) breast cancer. Approximately half of breast cancer survivors reported taking calcium supplements, 6.1% reported taking vitamin D supplements, and only 27% reported having a bone density scan since being diagnosed with breast cancer.
    Conclusions: Despite a four-fold increased odds of prevalent vertebral fracture among Chinese breast cancer survivors in our study, rates of screening for osteoporosis and fracture risk were low, reflecting a lack of standardization of care regarding cancer treatment-induced bone loss.

    The Factuality Status of Chinese Necessity Modals. Exploring the Distribution Via Corpus-Based Approach

    This paper is intended to test the deontic vs anankastic hypothesis outlined by Sparvoli (2012). The stipulation is that, in past contexts, deontic modals trigger a counterfactual inference, while anankastic modals (here called ‘goal-oriented modals’) trigger either an actuality entailment effect (‘only possibility’ modals) or a generic non-factual reading (‘mere necessity’ modals). The result of this corpus-based study, conducted in a Chinese-English parallel corpus, confirms the crucial role played by the deontic vs goal-oriented contrast in the marking of factuality in Chinese and shows that the factuality value decreases across a cline from goal-oriented to deontic modals.

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research. Peer reviewed

    Statistical Modeling of Tropical Cyclone Climatology

    Tropical cyclones (TCs) are among the most destructive natural hazards on Earth. The changing character of TC-related risks under climate change is of significant concern to both coastal and inland areas. In a risk assessment framework, a large number of synthetic TCs are often needed to evaluate the risk posed to a specific region. However, current TC statistical modeling methods that are used to generate storms often rely very little on the environment but heavily on storm characteristics. This dissertation aims to improve existing tropical cyclone statistical models by developing a new TC probabilistic model that is dependent on climate predictors so that it is suitable for climate change studies. Starting from TC intensity modeling, the dependence of TC intensification on the environment is first explored through various statistical modeling approaches. Mixture modeling is performed to capture the heterogeneity in TC intensification, and potential physically-based environmental predictors are carefully examined. Based on these analyses, a hidden Markov model, the MeHiM (short for Markov environment-dependent hurricane intensity model), is developed to simulate tropical cyclone intensity evolution dependent on six essential environmental and storm predictors. The MeHiM is then coupled with a clustering-based genesis model and a data-driven track model to form a complete TC probabilistic model, PepC (Princeton environment-dependent probabilistic tropical cyclone model). All components of PepC are dependent on local environmental predictors. PepC is capable of generating large samples of synthetic TCs efficiently. The synthetic storms match well with observational records under the current climate condition.
    PepC is applied to investigate climate change effects on TCs by generating large numbers of TCs under current and future climates, with environmental conditions taken from the Geophysical Fluid Dynamics Laboratory (GFDL) High-Resolution Forecast-Oriented Low Ocean Resolution (HiFLOR) model. Storms are shown to become more intense under future climate; however, no significant change in TC frequency is detected by the end of the 21st century. The intensity component of PepC, the MeHiM, is applied and examined for real-time forecasting. The performance is compared with the state-of-the-art TC statistical forecasting skill. The MeHiM shows great potential when coupled with an accurate rapid intensification indicator. Inspired by this finding, TC real-time satellite imagery data is used to predict the onset of rapid intensification using deep learning. Following these two example applications, PepC is potentially a powerful tool for TC risk assessment to inform strategic risk management and policy making.

    Additional Northwestern Pacific Data for the Paper: An Environment-Dependent Probabilistic Tropical Cyclone Model

    This is the first version of the PepC-generated data for the Northwestern Pacific basin. It is only a test version.