510 research outputs found

    A FRAMEWORK FOR ARABIC SENTIMENT ANALYSIS USING MACHINE LEARNING CLASSIFIERS

    In recent years, the use of the Internet and the volume of online comments expressed in natural language text have increased significantly. However, it is difficult for humans to read all these comments and classify them appropriately. Consequently, an automatic approach is required to classify the unstructured data. In this paper, we propose a framework for the Arabic language comprising three steps: pre-processing, feature extraction and machine learning classification. The main aim of the proposed framework is to exploit the combination of different Arabic linguistic features. We evaluate the framework using two benchmark Arabic tweets datasets (ASTD, ATA), which enable sentiment polarity detection in general Arabic and Jordanian dialects. Comparative simulation results show that machine learning classifiers such as Support Vector Machine (SVM), Naive Bayes, MultiLayer Perceptron (MLP) and Logistic Regression produce the best performance when using a combination of n-gram features from the Arabic tweets datasets. Finally, we evaluate the performance of our proposed framework using an Ensemble classifier approach, with promising results.

    Policy labs in Europe: political innovation, structure and content analysis on Twitter

    Recent years have seen a veritable boom in the creation of policy labs. These institution-based innovation laboratories aim to open up the processes of public policy design to the social stakeholders involved. In 2016, the European Union Policy Lab commissioned a report that identified 64 such laboratories in Europe. In the present study, we use network analysis to reveal the structure of the relationships between the 42 of these labs that have a presence on Twitter. We then conduct a content analysis of their tweets to identify the topics of interest. Our results describe a fragmented, country-based network and the principal concepts and key issues addressed by these institutions.

    Measuring the Severity of Depression from Text using Graph Representation Learning

    The common practice in psychology for measuring the severity of a patient's depressive symptoms is based on an interactive conversation between a clinician and the patient. In this dissertation, we focus on predicting a score representing the severity of depression from the text of such a conversation. We first present a generic graph neural network (GNN) to automatically rate severity using patient transcripts. We also test a few sequence-based deep models on the same task. We then propose a novel form for node attributes within a GNN-based model that captures a node-specific embedding for every word in the vocabulary. This provides a global representation of each node, coupled with node-level updates according to associations between words in a transcript. Furthermore, we evaluate the performance of our GNN-based model on a Twitter sentiment dataset to classify three different sentiments, and on Alzheimer's data to differentiate individuals with Alzheimer's disease from healthy individuals. In addition to applying the GNN model to learn a prediction model from the text, we provide post-hoc explanations of the model's decisions for all three tasks using the model's gradients.

    Altmetric analysis of research on sleep quality disorders published in 2021

    Introduction: sleep is a biological function of vital importance for most living beings. The number of published research articles related to sleep disturbances and the sleep-wake rhythm is unprecedented and shows the intense efforts of the global research community to understand the different aspects of these pathologies and address them. Objective: to analyze the impact of research on sleep quality disorders published in 2021, based on the media, social and scientific attention received. Methods: an altmetric, observational, descriptive-retrospective and cross-sectional study was carried out, in which the impact and use of research on sleep quality impairments published in 2021 was analyzed on social and scientific platforms by means of altmetric indicators. Results: of the 60 articles with the greatest altmetric attention, 50 were publications in journals, nine were on preprint servers, and one was a monograph. Most of the online attention the publications received was on Twitter (1,685,152 total tweets). The journals that published the most influential articles on the topic are classified in SJR Q3, with relatively high H indices. Conclusions: the research on sleep quality disorders that predominated in this study was published in scientific journals, with the most influential articles appearing in quartile-three journals. Most of the online attention received by these publications came from Twitter.

    #REVAL: a semantic evaluation framework for hashtag recommendation

    Automatic evaluation of hashtag recommendation models is a fundamental task in many online social network systems. In the traditional evaluation method, the hashtags recommended by an algorithm are first compared with the ground truth hashtags for exact correspondences. The number of exact matches is then used to calculate the hit rate, hit ratio, precision, recall, or F1-score. This way of evaluating hashtag similarities is inadequate because it ignores the semantic correlation between the recommended and ground truth hashtags. To tackle this problem, we propose a novel semantic evaluation framework for hashtag recommendation, called #REval. This framework includes an internal module referred to as BERTag, which automatically learns the hashtag embeddings. We investigate how the #REval framework performs under different word embedding methods and different numbers of synonyms and hashtags in the recommendation, using our proposed #REval-hit-ratio measure. Our experiments with the proposed framework on three large datasets show that #REval gave more meaningful hashtag synonyms for hashtag recommendation evaluation. Our analysis also highlights the sensitivity of the framework to the word embedding technique, with #REval based on BERTag superior to #REval based on FastText and Word2Vec. Comment: 18 pages, 4 figures
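
    The traditional exact-match evaluation described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `exact_match_scores`, the case-insensitive comparison, and the sample hashtags are all assumptions made for the example.

```python
def exact_match_scores(recommended, ground_truth):
    """Return (precision, recall, F1) computed from exact hashtag matches only."""
    # Assumption: hashtags are compared case-insensitively and duplicates are ignored.
    rec = {h.lower() for h in recommended}
    truth = {h.lower() for h in ground_truth}
    hits = len(rec & truth)  # number of exact correspondences
    precision = hits / len(rec) if rec else 0.0
    recall = hits / len(truth) if truth else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Hypothetical example: "#ml" and "#machinelearning" are related in meaning,
# but exact matching counts "#ml" as a plain miss.
scores = exact_match_scores(["#nlp", "#ai", "#ml"], ["#nlp", "#machinelearning"])
print(scores)
```

    In this made-up case only "#nlp" counts as a hit, which illustrates the limitation the abstract points out: a semantically close recommendation earns no credit under exact matching.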

    2019 SDSU Data Science Symposium Abstracts

    Infodemiology and Infoveillance: Scoping Review

    Background: Web-based sources are increasingly employed in the analysis, detection, and forecasting of diseases and epidemics, and in predicting human behavior toward several health topics. This use of the internet has come to be known as infodemiology, a concept introduced by Gunther Eysenbach. Infodemiology and infoveillance studies use web-based data and have become an integral part of health informatics research over the past decade. Objective: The aim of this paper is to provide a scoping review of the state-of-the-art in infodemiology along with the background and history of the concept, to identify sources and health categories and topics, to elaborate on the validity of the employed methods, and to discuss the gaps identified in current research. Methods: The PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines were followed to extract the publications that fall under the umbrella of infodemiology and infoveillance from the JMIR, PubMed, and Scopus databases. A total of 338 documents were extracted for assessment. Results: Of the 338 studies, the vast majority (n=282, 83.4%) were published with JMIR Publications. The Journal of Medical Internet Research features almost half of the publications (n=168, 49.7%), and JMIR Public Health and Surveillance has more than one-fifth of the examined studies (n=74, 21.9%). The interest in the subject has been increasing every year, with 2018 featuring more than one-fourth of the total publications (n=89, 26.3%), and the publications in 2017 and 2018 combined accounted for more than half (n=171, 50.6%) of the total number of publications in the last decade. The most popular source was Twitter with 45.0% (n=152), followed by Google with 24.6% (n=83), websites and platforms with 13.9% (n=47), blogs and forums with 10.1% (n=34), Facebook with 8.9% (n=30), and other search engines with 5.6% (n=19). 
As for the subjects examined, conditions and diseases with 17.2% (n=58) and epidemics and outbreaks with 15.7% (n=53) were the most popular categories identified in this review, followed by health care (n=39, 11.5%), drugs (n=40, 10.4%), and smoking and alcohol (n=29, 8.6%). Conclusions: The field of infodemiology is becoming increasingly popular, employing innovative methods and approaches for health assessment. The use of web-based sources, which provide information that would not otherwise be accessible and address the issues arising from time-consuming traditional methods, shows that infodemiology plays an important role in health informatics research.

    ‘This may be the most dangerous thing Donald Trump believes’: eugenic populism and the American body politic

    The 2016 election of a self-declared eugenicist to the most powerful political role in the world signified a widespread and worrying forgetting of America’s eugenic past. This essay shows how America’s current president employs similar rhetorical and fictive devices to those employed by eugenicists and politicians in the 1920s and 1930s, strategies that he now uses to fuel his supremacist fantasies. By linking up Trump’s lifelong belief in his genetic superiority (and thereby the apparent “truth” of eugenics more broadly) with earlier eugenic beliefs of the 1920s and 1930s, this paper explores how, despite being scientifically discredited, eugenics steadfastly remained a popular ideological staple of American meritocratic and supremacist belief.