22,066 research outputs found
Recommended from our members
Detecting Important Life Events on Twitter Using Frequent Semantic and Syntactic Subgraphs
Identifying global events from social media has been the focus of much research in recent years. However, the identification of personal life events poses new requirements and challenges that have received relatively little research attention. In this paper we explore a new approach for life event identification, where we expand social media posts into both semantic, and syntactic networks of content. Frequent graph patterns are mined from these networks and used as features to enrich life-event classifiers. Results show that our approach significantly outperforms the best performing baseline in accuracy (by 4.48% points) and F-measure (by 4.54% points) when used to identify five major life events identified from the psychology literature: Getting Married, Having Children, Death of a Parent, Starting School, and Falling in Love. In addition, our results show that, while semantic graphs are effective at discriminating the theme of the post (e.g. the topic of marriage), syntactic graphs help identify whether the post describes a personal event (e.g. someone getting married)
SeLeCT: a lexical cohesion based news story segmentation system
In this paper we compare the performance of three distinct approaches to lexical cohesion based text segmentation. Most work in this area has focused on the discovery of textual units that discuss subtopic structure within documents. In contrast our segmentation task requires the discovery of topical units of text i.e., distinct news stories from broadcast news programmes. Our approach to news story segmentation (the SeLeCT system) is based on an analysis of lexical cohesive strength between textual units using a linguistic technique called lexical chaining. We evaluate the relative performance of SeLeCT with respect to two other cohesion based segmenters: TextTiling and C99. Using a recently introduced evaluation metric WindowDiff, we contrast the segmentation accuracy of each system on both "spoken" (CNN news transcripts) and "written" (Reuters newswire) news story test sets extracted from the TDT1 corpus
Statistical keyword detection in literary corpora
Understanding the complexity of human language requires an appropriate
analysis of the statistical distribution of words in texts. We consider the
information retrieval problem of detecting and ranking the relevant words of a
text by means of statistical information referring to the "spatial" use of the
words. Shannon's entropy of information is used as a tool for automatic keyword
extraction. By using The Origin of Species by Charles Darwin as a
representative text sample, we show the performance of our detector and compare
it with another proposals in the literature. The random shuffled text receives
special attention as a tool for calibrating the ranking indices.Comment: Published version. 11 pages, 7 figures. SVJour for LaTeX2
Semi-Supervised Approach to Monitoring Clinical Depressive Symptoms in Social Media
With the rise of social media, millions of people are routinely expressing
their moods, feelings, and daily struggles with mental health issues on social
media platforms like Twitter. Unlike traditional observational cohort studies
conducted through questionnaires and self-reported surveys, we explore the
reliable detection of clinical depression from tweets obtained unobtrusively.
Based on the analysis of tweets crawled from users with self-reported
depressive symptoms in their Twitter profiles, we demonstrate the potential for
detecting clinical depression symptoms which emulate the PHQ-9 questionnaire
clinicians use today. Our study uses a semi-supervised statistical model to
evaluate how the duration of these symptoms and their expression on Twitter (in
terms of word usage patterns and topical preferences) align with the medical
findings reported via the PHQ-9. Our proactive and automatic screening tool is
able to identify clinical depressive symptoms with an accuracy of 68% and
precision of 72%.Comment: 8 pages, Advances in Social Networks Analysis and Mining (ASONAM),
2017 IEEE/ACM International Conferenc
- …