19 research outputs found

    A Study of Metrics of Distance and Correlation Between Ranked Lists for Compositionality Detection

    Compositionality in language refers to how much the meaning of a phrase can be decomposed into the meaning of its constituents and the way these constituents are combined. Based on the premise that substitution by synonyms is meaning-preserving, compositionality can be approximated as the semantic similarity between a phrase and a version of that phrase where words have been replaced by their synonyms. Different ways of representing such phrases exist (e.g., vectors [1] or language models [2]), and the choice of representation affects the measurement of semantic similarity. We propose a new compositionality detection method that represents phrases as ranked lists of term weights. Our method approximates the semantic similarity between two ranked list representations using a range of well-known distance and correlation metrics. In contrast to most state-of-the-art approaches in compositionality detection, our method is completely unsupervised. Experiments with a publicly available dataset of 1048 human-annotated phrases show that, compared to strong supervised baselines, our approach provides superior measurement of compositionality using any of the distance and correlation metrics considered.
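
The idea of comparing two ranked lists of term weights with a correlation metric can be illustrated with a minimal sketch. It uses Spearman's rank correlation as one example of a well-known metric; the abstract does not say which metrics the authors used or how they handle terms missing from one list, so the zero-weight default and all function names here are assumptions made for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks (largest value gets rank 1), averaging ties."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of positions that share the same value.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def spearman_similarity(weights_a, weights_b):
    """Spearman correlation between two term-weight dictionaries.

    Assumption: a term absent from one phrase representation is given
    weight 0.0 so both lists cover the same vocabulary.
    """
    terms = sorted(set(weights_a) | set(weights_b))
    a = [weights_a.get(t, 0.0) for t in terms]
    b = [weights_b.get(t, 0.0) for t in terms]
    return pearson(average_ranks(a), average_ranks(b))


# Hypothetical term weights for a phrase and its synonym-substituted variant.
original = {"red": 0.9, "tape": 0.7, "bureaucracy": 0.4}
substituted = {"red": 0.8, "tape": 0.2, "bureaucracy": 0.5}
similarity = spearman_similarity(original, substituted)
```

Under this reading, a similarity close to 1 would suggest that synonym substitution preserved the term ranking, which the abstract treats as a sign of compositionality.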

    Pre-vaccination care-seeking in females reporting severe adverse reactions to HPV vaccine. A registry based case-control study

    BACKGROUND: Since 2013 the number of suspected adverse reactions to the quadrivalent human papillomavirus (HPV) vaccine reported to the Danish Medicines Agency (DMA) has increased. Due to the resulting public concerns about vaccine safety, the coverage of HPV vaccinations in the childhood vaccination programme has declined. The aim of the present study was to determine health care-seeking prior to the first HPV vaccination among females who suspected adverse reactions to HPV vaccine. METHODS: In this registry-based case-control study, we included as cases vaccinated females with reports to the DMA of suspected severe adverse reactions. We selected controls without reports of adverse reactions from the Danish vaccination registry, matched by year of vaccination, age at vaccination, and municipality, and obtained from the Danish National Patient Registry and The National Health Insurance Service Register the history of health care usage in the two years prior to the first vaccine. We analysed the data by logistic regression while adjusting for the matching variables. RESULTS: The study included 316 cases who received their first HPV vaccine between 2006 and 2014. The age range of cases was 11 to 52 years, with a peak at 12 years, corresponding to the recommended age at vaccination, and another peak at 19 to 28 years, corresponding to a catch-up programme targeting young women. Compared with 163,910 controls, cases had increased care-seeking in the two years before receiving the first HPV vaccine. A multivariable model showed higher use of telephone/email consultations (OR 1.9; 95% CI 1.2-3.2), physiotherapy (OR 2.1; 95% CI 1.6-2.8) and psychologist/psychiatrist (OR 1.9; 95% CI 1.3-2.7). Cases were more likely to have a diagnosis in the ICD-10 chapters of diseases of the digestive system (OR 1.6; 95% CI 1.0-2.4), of the musculoskeletal system (OR 1.6; 95% CI 1.1-2.2), symptoms or signs not classified elsewhere (OR 1.8; 95% CI 1.3-2.5) as well as injuries (OR 1.5; 95% CI 1.2-1.9). CONCLUSION: Before receiving the first HPV vaccination, females who suspected adverse reactions had symptoms and a health care-seeking pattern that differed from those of the matched population. Pre-vaccination morbidity should be taken into account in the evaluation of vaccine safety signals.

    Seasonal Web Search Query Selection for Influenza-Like Illness (ILI) Estimation

    Influenza-like illness (ILI) estimation from web search data is an important web analytics task. The basic idea is to use the frequencies of queries in web search logs that are correlated with past ILI activity as features when estimating current ILI activity. It has been noted that since influenza is seasonal, this approach can lead to spurious correlations with features/queries that also exhibit seasonality, but have no relationship with ILI. Spurious correlations can, in turn, degrade performance. To address this issue, we propose modeling the seasonal variation in ILI activity and selecting queries that are correlated with the residual between the seasonal model and the observed ILI signal. Experimental results show that re-ranking queries obtained by Google Correlate based on their correlation with the residual strongly favours ILI-related queries.
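
The residual-based selection described above can be sketched as follows. The abstract does not specify the seasonal model, so this sketch assumes the simplest possible one (the mean ILI value at each phase of a fixed-length cycle) and uses Pearson correlation for the ranking; the function names, query names, and numbers are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def seasonal_profile(series, period):
    """Assumed seasonal model: mean of the series at each phase of the cycle."""
    sums = [0.0] * period
    counts = [0] * period
    for t, value in enumerate(series):
        sums[t % period] += value
        counts[t % period] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def seasonal_residual(series, period):
    """Observed signal minus the seasonal profile at the matching phase."""
    profile = seasonal_profile(series, period)
    return [value - profile[t % period] for t, value in enumerate(series)]


def rank_queries_by_residual(ili, query_freqs, period):
    """Order queries by their correlation with the ILI residual, highest first."""
    residual = seasonal_residual(ili, period)
    scored = [(pearson(residual, freqs), query)
              for query, freqs in query_freqs.items()]
    return [query for _, query in sorted(scored, key=lambda pair: -pair[0])]


# Toy data: two cycles of length 4, with an unusually strong second season.
ili = [1.0, 2.0, 3.0, 2.0, 1.0, 4.0, 5.0, 2.0]
queries = {
    "ski holiday": [1.0, 2.0, 3.0, 2.0, 1.0, 2.0, 3.0, 2.0],   # seasonal only
    "flu symptoms": [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0],  # tracks the excess
}
ranking = rank_queries_by_residual(ili, queries, period=4)
```

In this toy setup the purely seasonal query is uncorrelated with the residual, while the query that rises only during the stronger season correlates positively and is ranked first.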

    Time-Series Adaptive Estimation of Vaccination Uptake Using Web Search Queries

    Estimating vaccination uptake is an integral part of ensuring public health. It was recently shown that vaccination uptake can be estimated automatically from web data, instead of slowly collected clinical records or population surveys. All prior work in this area assumes that features of vaccination uptake collected from the web are temporally regular. We present the first method to remove this assumption from vaccination uptake estimation: our method dynamically adapts to temporal fluctuations in the time-series web data used to estimate vaccination uptake. We show that our method outperforms competitive state-of-the-art baselines that use not only web data but also curated clinical data. This performance improvement is more pronounced for vaccines whose uptake has been irregular due to negative media attention (HPV-1 and HPV-2) or problems in vaccine supply (DiTeKiPol), and for vaccines targeted at 12-year-old children (whose vaccination is more irregular than that of younger children).

    Web Data Mining for Public Health Purposes


    Ensemble learned vaccination uptake prediction using web search queries

    We present a method that uses ensemble learning to combine clinical and web-mined time-series data in order to predict future vaccination uptake. The clinical data come from official vaccination registries, and the web data are query frequencies collected from Google Trends. Experiments with official vaccine records show that our method predicts vaccination uptake effectively (root mean squared error of 4.7). Whereas performance is best when combining clinical and web data, using solely web data yields comparable performance. To our knowledge, this is the first study to predict vaccination uptake using web data (with and without clinical data).