183 research outputs found

    Enriching Semantic Knowledge Bases for Opinion Mining in Big Data Applications

    This paper presents a novel method for contextualizing and enriching large semantic knowledge bases for opinion mining, with a focus on Web intelligence platforms and other high-throughput big data applications. The method is applicable not only to traditional sentiment lexicons, but also to more comprehensive, multi-dimensional affective resources such as SenticNet. It comprises the following steps: (i) identify ambiguous sentiment terms, (ii) provide context information extracted from a domain-specific training corpus, and (iii) ground this contextual information to structured background knowledge sources such as ConceptNet and WordNet. A quantitative evaluation shows a significant improvement when using an enriched version of SenticNet for polarity classification. Crowdsourced gold standard data, in conjunction with a qualitative evaluation, shed light on the strengths and weaknesses of the concept grounding, and on the quality of the enrichment process.
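
    Steps (i) and (ii) above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `context_profiles`, the `min_share` threshold, and the toy corpus are all assumptions made for illustration, and step (iii), grounding to ConceptNet or WordNet, is left out. A lexicon term is treated as ambiguous when it occurs in both positive and negative training documents, and its co-occurring tokens are collected per polarity.

```python
from collections import Counter, defaultdict

def context_profiles(labeled_docs, lexicon_terms, min_share=0.3):
    """Collect per-polarity context tokens for ambiguous lexicon terms.

    labeled_docs: list of (tokens, label) pairs, label is 'pos' or 'neg'.
    lexicon_terms: set of sentiment terms to inspect.
    min_share: hypothetical threshold; a term counts as ambiguous when its
               minority polarity covers at least this share of its documents.
    """
    label_counts = defaultdict(Counter)                   # term -> label counts
    contexts = defaultdict(lambda: defaultdict(Counter))  # term -> label -> tokens
    for tokens, label in labeled_docs:
        present = set(tokens)
        for term in lexicon_terms & present:
            label_counts[term][label] += 1
            contexts[term][label].update(present - {term})

    profiles = {}
    for term, counts in label_counts.items():
        total = sum(counts.values())
        minority_share = min(counts["pos"], counts["neg"]) / total
        if minority_share >= min_share:  # step (i): term is ambiguous
            # step (ii): keep the context tokens seen with each polarity
            profiles[term] = {lab: dict(c) for lab, c in contexts[term].items()}
    return profiles

# Toy training corpus (invented for illustration only).
docs = [
    (["long", "battery", "life"], "pos"),
    (["long", "waiting", "time"], "neg"),
    (["great", "battery"], "pos"),
    (["great", "screen"], "pos"),
]
profiles = context_profiles(docs, {"long", "great"})
```

    In this toy corpus only "long" is flagged, because "great" never appears in a negative document.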

    Sentiment Analysis Using Common-Sense and Context Information

    Sentiment analysis research has grown tremendously in recent times due to its wide range of business and social applications, and sentiment analysis of unstructured natural language text has received considerable attention from the research community. In this paper, we propose a novel sentiment analysis model based on context information and common-sense knowledge extracted from a ConceptNet-based ontology. The ConceptNet-based ontology is used to determine the domain-specific concepts, which in turn yield the important domain-specific features. Further, the polarities of the extracted concepts are determined using a contextual polarity lexicon that we developed by considering the context information of a word. Finally, the semantic orientations of the domain-specific features of the review document are aggregated based on the importance of each feature with respect to the domain. The importance of a feature is determined by its depth in the ontology. Experimental results show the effectiveness of the proposed methods.
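
    The final aggregation step can be sketched as a weighted mean. The abstract does not say how depth maps to importance, so the weight `1 / depth` used here (features nearer the ontology root count more) is purely an assumption, as are the function names and the toy values.

```python
from typing import Dict

def feature_weight(depth: int) -> float:
    """Hypothetical importance: features nearer the ontology root weigh more."""
    if depth < 1:
        raise ValueError("depth must be >= 1 (depth 1 = directly below the root)")
    return 1.0 / depth

def aggregate_polarity(polarities: Dict[str, float], depths: Dict[str, int]) -> float:
    """Depth-weighted mean of feature polarities; 0.0 when there are no features."""
    weighted_sum = 0.0
    total_weight = 0.0
    for feature, polarity in polarities.items():
        weight = feature_weight(depths[feature])
        weighted_sum += weight * polarity
        total_weight += weight
    return weighted_sum / total_weight if total_weight else 0.0

# Toy review with invented features, polarities in [-1, 1], and depths.
polarities = {"camera": 0.8, "lens": -0.4}
depths = {"camera": 1, "lens": 2}
score = aggregate_polarity(polarities, depths)  # (1*0.8 + 0.5*-0.4) / 1.5
```

    A different mapping from depth to weight can be swapped into `feature_weight` without touching the aggregation.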

    Scalable Knowledge Extraction and Visualization for Web Intelligence

    Understanding stakeholder perceptions and assessing the impact of campaigns are key concerns for communication experts. Web intelligence platforms help to address them, provided that they are scalable enough to analyze and visualize information flows from volatile online sources in real time. This paper presents a distributed architecture for aggregating Web content repositories from Web sites and social media streams, memory-efficient methods to extract factual and affective knowledge, and interactive visualization techniques to explore the extracted knowledge. The presented examples stem from the Media Watch on Climate Change, a public Web portal that aggregates environmental content from a range of online sources.

    Modeling Emotion Dynamics in Song Lyrics with State Space Models

    Most previous work in music emotion recognition assumes a single or a few song-level labels for the whole song. While it is known that different emotions can vary in intensity within a song, annotated data for this setup is scarce and difficult to obtain. In this work, we propose a method to predict emotion dynamics in song lyrics without song-level supervision. We frame each song as a time series and employ a State Space Model (SSM), combining a sentence-level emotion predictor with an Expectation-Maximization (EM) procedure to generate the full emotion dynamics. Our experiments show that applying our method consistently improves the performance of sentence-level baselines without requiring any annotated songs, making it ideal for limited training data scenarios. Further analysis through case studies shows the benefits of our method while also indicating its limitations and pointing to future directions.
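
    The idea of treating a song as a time series can be sketched with a one-dimensional local-level state space model. This is a simplified stand-in, not the paper's method: the noise variances `q` and `r` are fixed by hand here rather than learned with EM, the sentence-level scores are invented, and the function name is hypothetical. A Kalman filter followed by a Rauch-Tung-Striebel backward pass smooths the noisy per-sentence predictions.

```python
from typing import List

def kalman_smooth(scores: List[float], q: float = 0.05, r: float = 0.5) -> List[float]:
    """Smooth per-sentence emotion scores with a random-walk state space model.

    State:       x_t = x_{t-1} + w_t,  w_t ~ N(0, q)
    Observation: y_t = x_t + v_t,      v_t ~ N(0, r)
    q and r are assumed constants (the paper's EM step would estimate them).
    """
    if not scores:
        return []

    # Forward pass (Kalman filter), initialised at the first observation.
    filt_mean = [scores[0]]
    filt_var = [r]
    for y in scores[1:]:
        pred_var = filt_var[-1] + q              # predict: mean stays, variance grows
        gain = pred_var / (pred_var + r)         # Kalman gain
        filt_mean.append(filt_mean[-1] + gain * (y - filt_mean[-1]))
        filt_var.append((1.0 - gain) * pred_var)

    # Backward pass (Rauch-Tung-Striebel smoother).
    smooth = filt_mean[:]
    for t in range(len(scores) - 2, -1, -1):
        pred_var = filt_var[t] + q
        g = filt_var[t] / pred_var
        smooth[t] = filt_mean[t] + g * (smooth[t + 1] - filt_mean[t])
    return smooth

# Invented sentence-level predictions for one song.
sentence_scores = [0.1, 0.9, 0.2, 0.8, 0.7]
smoothed = kalman_smooth(sentence_scores)
```

    A larger `r` relative to `q` trusts the sentence-level predictor less and yields a flatter emotion curve.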

    Analyzing the Public Discourse on Works of Fiction: Detection and Visualization of Emotion in Online Coverage about HBO's Game of Thrones

    This paper presents a Web intelligence portal that captures and aggregates news and social media coverage about "Game of Thrones", an American drama television series created for the HBO television network based on George R.R. Martin's series of fantasy novels. The system collects content from the Web sites of Anglo-American news media as well as from four social media platforms: Twitter, Facebook, Google+ and YouTube. An interactive dashboard with trend charts and synchronized visual analytics components not only shows how often Game of Thrones events and characters are being mentioned by journalists and viewers, but also provides a real-time account of concepts that are being associated with the unfolding storyline and each new episode. Positive or negative sentiment is computed automatically, which sheds light on the perception of actors and new plot elements.

    Characterization of Time-variant and Time-invariant Assessment of Suicidality on Reddit using C-SSRS

    Suicide is the 10th leading cause of death in the U.S. (1999-2019). However, predicting when someone will attempt suicide has been nearly impossible. In the modern world, many individuals suffering from mental illness seek emotional support and advice on well-known and easily accessible social media platforms such as Reddit. While prior artificial intelligence research has demonstrated the ability to extract valuable information from social media on suicidal thoughts and behaviors, these efforts have not considered both severity and temporality of risk. The insights made possible by access to such data have enormous clinical potential, most dramatically envisioned as a trigger to employ timely and targeted interventions (i.e., voluntary and involuntary psychiatric hospitalization) to save lives. In this work, we address this knowledge gap by developing deep learning algorithms to assess suicide risk in terms of severity and temporality from Reddit data based on the Columbia Suicide Severity Rating Scale (C-SSRS). In particular, we employ two deep learning approaches, time-variant and time-invariant modeling, for user-level suicide risk assessment, and evaluate their performance against a clinician-adjudicated gold standard Reddit corpus annotated based on the C-SSRS. Our results suggest that the time-variant approach outperforms the time-invariant method in the assessment of suicide-related ideations and supportive behaviors (AUC: 0.78), while the time-invariant model performed better in predicting suicide-related behaviors and suicide attempt (AUC: 0.64). The proposed approach can be integrated with clinical diagnostic interviews for improving suicide risk assessments. Comment: 24 pages, 8 tables, 6 figures; accepted by PLoS One; one of the two datasets mentioned in the manuscript has closed access. We will make it public after PLoS One produces the manuscript.

    Visualizing Contextual Information in Aggregated Web Content Repositories

    Understanding stakeholder perceptions and the impact of campaigns are key insights for communication experts and policy makers. A structured analysis of Web content can help provide these insights, particularly if this analysis involves the ability to extract, disambiguate and visualize contextual information. After summarizing methods used for acquiring and annotating Web content repositories, we present visualization techniques to explore the lexical, geospatial and relational context of entities in these repositories. The examples stem from the Media Watch on Climate Change, a publicly available Web portal that aggregates environmental resources from various online sources.

    Extracting Knowledge from the Web and Social Media for Progress Monitoring in Public Outreach and Science Communication

    Given the intense attention that environmental topics such as climate change attract in news and social media coverage, key questions for large science agencies such as the National Oceanic and Atmospheric Administration (NOAA) are how different stakeholders perceive the observable threats and policy options, how public media react to new scientific insights, and how journalists present climate science knowledge to the public. This paper investigates the potential of semantic technologies to address these questions. It introduces the NOAA Media Watch and presents a detailed case study of how the metrics and visualizations of the webLyzard Web intelligence platform are used to track information flows across online media channels. Building upon this platform, we present a novel framework to measure the impact of science communication and public outreach campaigns through a combination of quantitative and visual methods that go beyond sentiment analysis and related opinion mining approaches.

    Metadata Enriched Visualization of Keywords in Context

    This paper presents an interactive, synchronized and metadata-enriched implementation of the Word Tree metaphor, an interactive visualization technique for showing Keywords-in-Context (KWIC). Embedded into a Web intelligence platform focusing on climate change coverage, it provides users with a tool to better understand the usage of terms in large document collections. One of the novelties is the implementation of filters for the Word Tree, which shift the focus of attention directly onto significant phrases instead of the punctuation and fill-words inherent to natural language usage.
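
    The effect of such filters can be sketched in a few lines. This is an illustrative assumption, not the platform's code: the `FILL_WORDS` list, the function name, and the sample text are invented. The sketch counts the right-hand context of a keyword (the branches a Word Tree would draw) after dropping punctuation and fill-words.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny fill-word list for illustration.
FILL_WORDS = {"the", "a", "an", "of", "is", "to", "and"}

def kwic_branches(text: str, keyword: str, width: int = 1) -> Counter:
    """Count the `width` significant tokens following each keyword occurrence."""
    tokens = re.findall(r"\w+", text.lower())  # tokenising on \w+ drops punctuation
    kept = [t for t in tokens if t == keyword or t not in FILL_WORDS]
    branches = Counter()
    for i, token in enumerate(kept):
        if token == keyword:
            context = tuple(kept[i + 1 : i + 1 + width])
            if context:
                branches[context] += 1
    return branches

sample = ("Climate change is a threat. Climate change policy matters, "
          "and the climate is warming.")
branches = kwic_branches(sample, "climate")
```

    Without the filter, the third occurrence would branch to "is" rather than "warming". Note that this simplification filters before windowing and therefore ignores sentence boundaries.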