383 research outputs found

    A Survey on Semantic Processing Techniques

    Semantic processing is a fundamental research domain in computational linguistics. In the era of powerful pre-trained language models and large language models, the advancement of research in this domain appears to be decelerating. However, the study of semantics is multi-dimensional in linguistics. The research depth and breadth of computational semantic processing can be largely improved with new technologies. In this survey, we analyze five semantic processing tasks, namely word sense disambiguation, anaphora resolution, named entity recognition, concept extraction, and subjectivity detection. We study relevant theoretical research in these fields, advanced methods, and downstream applications. We connect the surveyed tasks with downstream applications because this may inspire future scholars to fuse these low-level semantic processing tasks with high-level natural language processing tasks. The review of theoretical research may also inspire new tasks and technologies in the semantic processing domain. Finally, we compare the different semantic processing techniques and summarize their technical trends, application trends, and future directions. Comment: Published at Information Fusion, Volume 101, 2024, 101988, ISSN 1566-2535. The equal contribution mark is missing from the published version due to the publication policies. Please contact Prof. Erik Cambria for detail

    Analyzing Social Media Data using Sentiment Mining and Bi-gram Analysis for the Recommendation of YouTube Videos.

    In this work we combine sentiment analysis with graph theory to analyze user posts, likes/dislikes on a variety of social media to provide recommendations for YouTube videos. We focus on the topic of climate change/global warming which has caused much alarm and controversy over recent years. Our intention is to recommend informative YouTube videos to those seeking a balanced viewpoint of this area and the key arguments/issues. To this end we analyze Twitter data; Reddit comments and posts; user comments, view statistics and likes/dislikes of YouTube videos. The combination of sentiment analysis with raw statistics and linking users with their posts gives deeper insights into their needs and quest for quality information. Sentiment analysis provides the insights into user likes and dislikes, graph theory provides the linkage patterns and relationships between users, posts and sentiment

    Investigating value propositions in social media: studies of brand and customer exchanges on Twitter

    Social media presents one of the richest forums to investigate publicly explicit brand value propositions and their corresponding customer engagement. Seldom have researchers investigated the nature of value propositions available on social media and the insights that can be unearthed from available data. This work bridges this gap by studying the value propositions available on the Twitter platform. This thesis presents six different studies conducted to examine the nature of value propositions. The first study presents a value taxonomy comprising 15 value propositions that are identified in brand tweets. This taxonomy is tested for construct validity using a Delphi panel of 10 experts – 5 from information science and 5 from marketing. The second study demonstrates the utility of the taxonomy developed by identifying the 15 value propositions from brand tweets (nb=658) of the top-10 coffee brands using content analysis. The third study investigates the feedback provided by customers (nc=12077) on the values proposed by the top-10 coffee brands (for the 658 brand tweets). It also investigates which value propositions embedded in brand tweets attract ‘shallow’ vs. ‘deep’ engagement from customers. The fourth study is a replication of studies 2 and 3 for a different time period. The data considered for studies 2 and 3 covered a 3-month period in 2015. In the fourth study, Twitter data for the same brands was analysed for a different (nb=290, nc=8811) 3-month period in 2018. This study thus examines the nature of change in value propositions across brands over time. The fifth study addresses generalizability and replicates the investigation of brand and customer tweets (nb=635, nc=7035) in the market domain of the top-10 car brands in 2018. Lastly, study six evaluates a software system called Value Analysis Toolkit (VAT) that was constructed based on the research findings in studies 1 - 5. 
This tool is targeted at researchers and practitioners who can use it to obtain value proposition-based insights from social media data (brand value propositions and the corresponding feedback from customers). The developed tool is evaluated for external validity using 35 students and 5 industry participants in three dimensions (the tool’s analytics features, usability and usefulness). Overall, the contributions of this thesis are: a) a taxonomy to identify value propositions in Twitter (study 1), b) an approach to extract value proposition-based insights in brand tweets and the corresponding feedback from customers in the process of value co-creation (studies 2 - 5) for the top-10 coffee and car brands, and c) an operational tool (study 6) that can be used to analyse value propositions of various brands (e.g., compare value propositions of different brands), and identify which value propositions attract positive electronic word of mouth (eWOM). These value proposition-based insights can be used by social media managers to devise social-media strategies that are likely to stimulate positive discussions about a brand in social media

    Mining Twitter Sequences of Product Opinions with Multi-Word Aspect Terms

    Social media platforms have opened doors to users' opinions and perceptions. Text remains the most popular means of contact on social media, despite the availability of other means of communication (audio/video and images). Twitter is one such microblogging platform that allows people to express their thoughts within 280 characters per message. The freedom of expression has made it difficult to understand the polarity (Positive, Negative, or Neutral) of the tweets/posts. Given a corpus of microblog texts (e.g., the new iPhone battery life is good, but camera quality is bad), mining the aspects (e.g., battery life, camera quality) and opinions (e.g., good, bad) of these products is challenging due to the vast data being generated. Aspect-Based Opinion Mining (ABOM) is thus a combination of aspect extraction and opinion mining that allows an enterprise to analyze the data in detail automatically, saving time and money. Existing systems such as Hate Crime Twitter Sentiment (HCTS) and Microblog Aspect Miner (MAM) have been recently proposed to perform ABOM on Twitter. These systems generally go through the four-step approach of obtaining microblog posts, identifying frequent nouns (candidate aspects), pruning the candidate aspects, and getting opinion polarity. However, they differ in how well they prune their candidate features. HCTS uses Apriori-based association rule mining to find the important aspects (single and multi-word) of a given product. However, the Apriori-based system generates many candidate sequences, which produce redundant candidate aspects, and HCTS also fails to summarize the category of the aspects (Camera? Battery?). MAM follows an approach similar to that of HCTS for finding the relevant aspects, but it further clusters the frequent nouns (aspects) to obtain the relevant aspects. However, it does not identify the multi-word aspects and the aspect category of a product. 
This thesis proposes a system called Microblog Aspect Sequence Miner (MASM) as an extension of Microblog Aspect Miner (MAM) by replacing the Apriori algorithm with a modified frequent sequential pattern mining algorithm. The system uses the power of sequential pattern mining for aspect extraction in ABOM. The sentiments of the tweets are unknown, so we build our approach in an unsupervised learning manner. The input posts are first classified to separate the tweets that contain an opinion (subjective) from those that do not (objective). Then we extract the part-of-speech tags for the explicit aspects to identify the frequent nouns. The novel frequent pattern mining framework (CM-SPAM) is applied to segment the single and multi-word aspects, which generates fewer sequences than previous approaches. This prior knowledge helps us to operate a topic modeling framework (Latent Dirichlet Allocation) to determine the summary of the most common aspects (Aspect Category) and their sentiments for a product. The findings demonstrate that the MASM model has a promising performance in finding relevant aspects, with a reduction in average vector size (cost of candidate/aspect generation) against MAM and HCTS using the Sanders Twitter corpus dataset. Experimental results with evaluation metrics of execution time, precision, recall, and F-measure indicate that our approach has higher recall and precision than the existing systems

    From Field to Film: Mosquito surveillance and survey of US adults' knowledge and attitudes towards arthropod-borne disease vectors

    Mosquito-borne disease is a public health challenge that warrants an active surveillance program for the identification of mosquito populations and the education of the public for prevention and protection against disease-transmitting arthropods. The communication of science to the public is necessary to prevent disease, change behavior, and promote a dialog between scientists and the public. People are accustomed to high quality entertainment, which begs the question, “If we made science more entertaining, would the public be more interested?” To address these issues, the objectives of this study are: 1) identify mosquito species and abundance at the US Meat Animal Research Center (MARC) in Clay County, Nebraska; 2) create two entertaining educational videos about arthropod disease vectors using puppets, song, and humor; 3) determine the effectiveness of these videos on the behavior and knowledge of adults in the US. The surveillance of mosquito species abundance at the MARC showed there to be a high number of mosquito species that can serve as vectors of disease. The educational videos were shown to be a successful form of science communication, and the videos developed and presented for this study were found to not only be entertaining, but significantly increased the participants’ engagement, knowledge, and behavior towards personal protection and management of mosquitoes and ticks. Advisor: Troy Anderso

    Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020

    On behalf of the Program Committee, a very warm welcome to the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020). This edition of the conference is held in Bologna and organised by the University of Bologna. The CLiC-it conference series is an initiative of the Italian Association for Computational Linguistics (AILC) which, after six years of activity, has clearly established itself as the premier national forum for research and development in the fields of Computational Linguistics and Natural Language Processing, where leading researchers and practitioners from academia and industry meet to share their research results, experiences, and challenges

    Sentiments in Sustainability Data Collection : Understanding User Sentiment in Collaborative, Social Tagging Environmental Sciences Platforms

    Many scientists throughout the world are exploring how data can be analyzed and used for sustainability purposes. With these investigations there is a significant need to collect and explore the data with mobile collaborative tools to enable real-time interactions among the interested users and the researchers. The principal objective of this study is to explore an application of sentiment analysis on the collected heterogeneous data in the form of text, images, and location details that were collected using the Geotagger application and observe and infer users’ reactions and opinions of the geographical data they collect about