3 research outputs found

    Social Media Based Algorithmic Clinical Decision Support Learning from Behavioral Predispositions

    Behavioral disorders are disabilities characterized by an individual’s mood, thinking, and social interactions. The prevalence of behavioral disorders in the United States population has increased in the last few years, with an estimated 50% of all Americans diagnosed with a behavioral disorder at some point in their lifetime. Attention-Deficit/Hyperactivity Disorder (ADHD) is one such behavioral disorder that is a severe public health concern because of its high prevalence, incurable nature, and significant impact on domestic life and peer relationships. Symptomatically, in theory, ADHD is characterized by inattention, hyperactivity, and impulsivity. Access to providers who can diagnose and treat the disorder varies by location. The ever-increasing use of social media can be effectively employed in the diagnosis and treatment of the disorder. The study of behavior and, by extension, of individuals with behavioral disorders is made easier by the uninhibited setting in which posts are created on social media platforms. Outside the United States, diagnosis rates of the disorder are low, as it is mainly considered to be an American disorder. This impression was reinforced by the perception that the disorder is caused by social and cultural factors common to American society. However, in reality, the disorder can just as readily affect people of different races and cultures worldwide, but recognition of the disorder in the medical community has been slow. This may be due to its adverse impact on an individual, their families, and society. This dissertation focuses on providing clinicians with a clinical decision support system to overcome the societal stigma associated with the disorder and to ensure the accurate and efficient diagnosis of individuals with the disorder. The results provided in this dissertation assist in the diagnosis of individuals with Attention-Deficit/Hyperactivity Disorder.
Data for individuals with the disorder is collected through posts of self-reported diagnoses on Twitter using the Twitter API. Previous research has shown that there are differences in behavior before and after the diagnosis of the disorder. To capitalize on this, symptomatic differences of the disorder before and after diagnosis are discovered and evaluated. The symptoms of the disorder, namely inattention, hyperactivity, and impulsivity, are quantified using measures of sentiment and semantics. A separate group of users without the disorder, the control group, is collected for validation. The analysis poses a three-class classification problem, with the classes being the pre-diagnosed, post-diagnosed, and control groups. Decision trees are used to force all possible outcomes in the semantic and sentiment differences in the three classes of users to create a clear delineation. Behavioral disorders diagnosed by a clinician are based on identifying whether a patient deviates from an identified normal. This is evaluated by answering a set list of questions that quantify behavior. To achieve the same without manual intervention while retaining ease of interpretability, decision trees are chosen. Classification using a decision tree is performed on a tweet level and a user level. Four cases are used in both analyses: pre-diagnosed vs. post-diagnosed group, pre-diagnosed vs. control group, post-diagnosed vs. control group, and pre-diagnosed vs. post-diagnosed vs. control group. The analysis on a user level provides a higher degree of accuracy, with 93% accuracy for the post-diagnosed vs. control group case. The accuracy of each case indicates the proportion of people who are correctly classified into their respective groups. Low accuracy for the tweet-level results reinforces the view that the sparsity of information in tweet-level analysis is a disadvantage. This is overcome by analyzing on a user level.
The accuracy of the classifier can be further improved by the addition of features such as age and gender. The addition of these features may also be useful in predicting time to remission and peak of the disorder in future studies.
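The difference between tweet-level and user-level analysis can be illustrated with a minimal Python sketch. It assumes each tweet already carries numeric sentiment and semantic scores. The field names (`sentiment`, `semantic`), the sample values, and the choice of a simple mean are hypothetical and are not taken from the dissertation.

```python
from collections import defaultdict
from statistics import mean

def aggregate_to_user_level(tweets):
    """Average per-tweet feature scores into one feature record per user."""
    by_user = defaultdict(list)
    for tweet in tweets:
        by_user[tweet["user"]].append(tweet)
    return {
        user: {
            "sentiment": mean(t["sentiment"] for t in rows),
            "semantic": mean(t["semantic"] for t in rows),
            "n_tweets": len(rows),
        }
        for user, rows in by_user.items()
    }

# Hypothetical tweet-level scores for two users.
tweets = [
    {"user": "u1", "sentiment": 0.5, "semantic": 1.0},
    {"user": "u1", "sentiment": -0.5, "semantic": 3.0},
    {"user": "u2", "sentiment": 1.0, "semantic": 2.0},
]
user_features = aggregate_to_user_level(tweets)
```

Pooling a user's tweets in this way gives a classifier one denser record per user instead of many sparse ones, which is one plausible reading of why the user-level results are stronger.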

    Semantic Lexicon Induction from Twitter with Pattern Relatedness and Flexible Term Length

    With the rise of social media, learning from informal text has become increasingly important. We present a novel semantic lexicon induction approach that is able to learn new vocabulary from social media. Our method is robust to the idiosyncrasies of informal and open-domain text corpora. Unlike previous work, it does not impose restrictions on the lexical features of candidate terms (e.g., by restricting entries to nouns or noun phrases) while still being able to accurately learn multiword phrases of variable length. Starting with a few seed terms for a semantic category, our method first explores the context around seed terms in a corpus, and identifies context patterns that are relevant to the category. These patterns are used to extract candidate terms, i.e., multiword segments that are further analyzed to ensure meaningful term boundary segmentation. We show that our approach is able to learn high-quality semantic lexicons from informally written social media text on Twitter, and can achieve accuracy as high as 92% in the top 100 learned category members.
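The seed-to-pattern-to-candidate loop described above can be sketched in a few lines of Python. This is a heavily simplified stand-in, not the paper's method: it treats a context pattern as just the word to the left and right of a seed term, and it omits the pattern relatedness scoring and term boundary analysis the abstract mentions. The function name, the `max_len` parameter, and the sample sentences are all hypothetical.

```python
from collections import Counter

def induce_candidates(sentences, seeds, max_len=3):
    """Collect (left, right) contexts of seed terms, then harvest other
    segments of up to max_len tokens that occur in those same contexts."""
    tokenized = [s.lower().split() for s in sentences]
    seed_tuples = {tuple(s.lower().split()) for s in seeds}

    def segments(tokens):
        # Yield (left word, segment, right word) for every inner segment.
        last = len(tokens) - 1
        for start in range(1, last):
            for end in range(start + 1, min(start + max_len, last) + 1):
                yield tokens[start - 1], tuple(tokens[start:end]), tokens[end]

    # Step 1: learn context patterns from occurrences of the seed terms.
    patterns = Counter()
    for tokens in tokenized:
        for left, seg, right in segments(tokens):
            if seg in seed_tuples:
                patterns[(left, right)] += 1

    # Step 2: extract variable-length candidates matching those patterns.
    candidates = Counter()
    for tokens in tokenized:
        for left, seg, right in segments(tokens):
            if (left, right) in patterns and seg not in seed_tuples:
                candidates[" ".join(seg)] += 1
    return candidates

sentences = [
    "i love pizza so much",
    "i love ice cream so much",
    "we love tacos so much",
]
candidates = induce_candidates(sentences, seeds=["pizza"])
```

Because segments are not restricted by part of speech or length (beyond `max_len`), the multiword phrase "ice cream" is picked up alongside the single word "tacos".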

    Doctor of Philosophy in Computer Science

    Over the last decade, social media has emerged as a revolutionary platform for informal communication and social interactions among people. Publicly expressing thoughts, opinions, and feelings is one of the key characteristics of social media. In this dissertation, I present research on automatically acquiring knowledge from social media that can be used to recognize people's affective state (i.e., what someone feels at a given time) in text. This research addresses two types of affective knowledge: 1) hashtag indicators of emotion consisting of emotion hashtags and emotion hashtag patterns, and 2) affective understanding of similes (a form of figurative comparison). My research introduces a bootstrapped learning algorithm for learning hashtag indicators of emotions from tweets with respect to five emotion categories: Affection, Anger/Rage, Fear/Anxiety, Joy, and Sadness/Disappointment. With a few seed emotion hashtags per emotion category, the bootstrapping algorithm iteratively learns new hashtags and more generalized hashtag patterns by analyzing emotion in tweets that contain these indicators. Emotion phrases are also harvested from the learned indicators to train additional classifiers that use the surrounding word context of the phrases as features. This is the first work to learn hashtag indicators of emotions. My research also presents a supervised classification method for classifying affective polarity of similes in Twitter. Using lexical, semantic, and sentiment properties of different simile components as features, supervised classifiers are trained to classify a simile into a positive or negative affective polarity class. The property of comparison is also fundamental to the affective understanding of similes.
My research introduces a novel framework for inferring implicit properties that 1) uses syntactic constructions, statistical association, dictionary definitions, and word embedding vector similarity to generate and rank candidate properties, 2) re-ranks the top properties using influence from multiple simile components, and 3) aggregates the ranks of each property from different methods to create a final ranked list of properties. The inferred properties are used to derive additional features for the supervised classifiers to further improve affective polarity recognition. Experimental results show substantial improvements in affective understanding of similes over the use of existing sentiment resources.
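Step 3 of the framework, aggregating ranks from different methods, can be sketched as follows. The abstract does not say which aggregation rule is used, so this example assumes a plain average of rank positions, with a property absent from a list given the rank just past that list's end. The function name and the sample rankings are hypothetical.

```python
def aggregate_ranks(ranked_lists):
    """Combine several ranked lists by average rank (lower is better)."""
    items = {item for ranking in ranked_lists for item in ranking}
    mean_rank = {}
    for item in items:
        ranks = [
            # Assumption: a missing item is ranked one past the list's end.
            ranking.index(item) + 1 if item in ranking else len(ranking) + 1
            for ranking in ranked_lists
        ]
        mean_rank[item] = sum(ranks) / len(ranks)
    # Sort by mean rank, breaking ties alphabetically for determinism.
    return sorted(mean_rank, key=lambda item: (mean_rank[item], item))

# Hypothetical candidate properties for one simile, ranked by three methods.
rankings = [
    ["slow", "small", "green"],  # e.g. statistical association
    ["slow", "green", "small"],  # e.g. dictionary definitions
    ["small", "slow", "old"],    # e.g. embedding similarity
]
final = aggregate_ranks(rankings)
```

Averaging positions rather than raw scores avoids having to put the different methods' scores on a common scale, which is a common reason for aggregating at the rank level.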