5 research outputs found

    Analysing and Visualizing Tweets for U.S. President Popularity

    In our society we are continually inundated by a stream of information (opinions, preferences, comments, etc.). Twitter users react, in real time and with interest, to news and to events that they attend or take part in. In this context it becomes essential to have appropriate tools to analyze and extract the data and information hidden in the large number of tweets they post. Social networks are a source of information with no rivals in terms of the amount and variety of information that can be extracted from them. We propose an approach to analyze, with the help of automated tools, comments and opinions taken from social media in a real-time environment. We developed a software system in R based on the Bayesian approach to text categorization. We aim to identify the sentiments expressed in tweets posted on the Twitter social platform. Analyzing the sentiment spread across social networks makes it possible to identify freely and authentically expressed thoughts. In particular, we analyze the sentiments related to U.S. President popularity, also visualizing tweets on a map. This enables an additional analysis of people's real-time reactions by associating the reaction of each person who posted a tweet with their real-time position in the United States. In particular, we provide a visualization based on the geographical analysis of the sentiments of the users who posted the tweets.
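
    The Bayesian approach to text categorization mentioned in the abstract can be illustrated with a minimal multinomial naive Bayes sketch. This is not the authors' system, which the abstract says was written in R; it is a Python illustration, and the toy tweets, labels, and function names are assumptions made up for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train(examples):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def classify(model, text):
    """Return the label with the highest log-posterior under naive Bayes."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior: share of training documents carrying this label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothed word likelihood.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data, invented for illustration only.
TOY_TWEETS = [
    ("great speech today", "positive"),
    ("love the new policy", "positive"),
    ("terrible decision", "negative"),
    ("hate the new policy", "negative"),
]

model = train(TOY_TWEETS)
print(classify(model, "love great speech"))
print(classify(model, "hate terrible decision"))
```

    A real system would need a labeled tweet corpus, better tokenization, and a separate step to attach each classified tweet to its location before plotting it on a map.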

    A COMPARATIVE STUDY ON ONTOLOGY GENERATION AND TEXT CLUSTERING USING VSM, LSI, AND DOCUMENT ONTOLOGY MODELS

    Although using ontologies to assist information retrieval and text document processing has recently attracted increasing attention, existing ontology-based approaches have not shown advantages over the traditional keyword-based Latent Semantic Indexing (LSI) method. This paper proposes an algorithm to extract a concept forest (CF) from a document with the assistance of a natural language ontology, the WordNet lexical database. With concept forests representing the semantics of text documents, the semantic similarity of two documents is measured as the commonality of their concept forests. Performance studies of text document clustering based on different document similarity measures show that the CF-based similarity measure is an effective alternative to the existing keyword-based methods. In particular, the CF-based approach has clear advantages over the existing keyword-based methods, including LSI, when dealing with text abstract databases such as MEDLINE, or in P2P environments where it is impractical to collect the entire document corpus for analysis.
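
    The idea of measuring document similarity as the commonality of ontology concepts can be sketched roughly as follows. This is not the paper's concept forest algorithm, whose details the abstract does not give; it is a simplified set-overlap illustration. The tiny hypernym table stands in for WordNet, and every name in it is hypothetical.

```python
# Hypothetical child -> parent hypernym links, standing in for WordNet.
TOY_HYPERNYMS = {
    "dog": "canine",
    "canine": "mammal",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
    "car": "vehicle",
    "truck": "vehicle",
}


def concept_set(text, hypernyms):
    """Collect every ontology concept a document touches, including ancestors."""
    known = set(hypernyms) | set(hypernyms.values())
    concepts = set()
    for token in text.lower().split():
        if token not in known:
            continue  # words outside the ontology contribute nothing
        node = token
        # Walk up the hypernym chain; stop at a root or an already-seen concept.
        while node is not None and node not in concepts:
            concepts.add(node)
            node = hypernyms.get(node)
    return concepts


def concept_similarity(text_a, text_b, hypernyms):
    """Jaccard overlap of the two documents' concept sets (0.0 if both empty)."""
    a = concept_set(text_a, hypernyms)
    b = concept_set(text_b, hypernyms)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


doc1 = "the dog chased the cat"
doc2 = "a cat sat in the truck"
doc3 = "car truck"

print(concept_similarity(doc1, doc2, TOY_HYPERNYMS))
print(concept_similarity(doc1, doc3, TOY_HYPERNYMS))
```

    Because each document is compared through its own concept set, no corpus-wide statistics are needed, which is consistent with the abstract's point about settings where the whole corpus cannot be collected.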

    CREATING A BIOMEDICAL ONTOLOGY INDEXED SEARCH ENGINE TO IMPROVE THE SEMANTIC RELEVANCE OF RETRIEVED MEDICAL TEXT

    Medical Subject Headings (MeSH) is a controlled vocabulary used by the National Library of Medicine to index medical articles, abstracts, and journals contained within the MEDLINE database. Although MeSH imposes uniformity and consistency on the indexing process, it has been shown that using MeSH indices results in only a small increase in precision over free-text indexing. Moreover, studies have shown that the use of controlled vocabularies in the indexing process is not an effective method for increasing semantic relevance in information retrieval. To address the need for semantic relevance, we present an ontology-based information retrieval system for the MEDLINE collection that results in a 37.5% increase in precision when compared to free-text indexing systems. The presented system uses the ontology to provide an alternative to text representation for medical articles, to find relationships among co-occurring terms in abstracts, and to index terms that appear in the text as well as the discovered relationships. The presented system is then compared to existing MeSH and free-text information retrieval systems. This dissertation provides a proof of concept for an online retrieval system capable of providing increased semantic relevance when searching through medical abstracts in MEDLINE.

    Using Clustering Methods to Improve Ontology-Based Query Term Disambiguation

    In this article we describe the results of our research on the disambiguation of user queries using ontologies for categorization. We present an approach to cluster search results by using classes or "Sense Folders" (prototype categories) derived from the concepts of an assigned ontology, in our case WordNet. Using the semantic relations provided by such a resource, we can assign categories to previously unannotated documents. Disambiguating query terms in documents with respect to a user-specific ontology is an important step toward improving retrieval performance for the user. Furthermore, we show that a clustering process can enhance the semantic classification of documents, and we discuss how this clustering process can be further enhanced by using only the most descriptive classes of the ontology.