USING LONG SHORT-TERM MEMORY NETWORKS FOR NATURAL LANGUAGE PROCESSING
Emotion classification is a complex, non-trivial language-interpretation task owing to the structure and dynamic nature of natural language. The significance of the study lies in addressing the important problem of automatically processing client feedback, collecting opinions, and catching trends. In this work, a number of existing solutions to the emotion classification problem are reviewed, with their shortcomings and advantages illustrated. The performance of the reviewed models was evaluated on emotion classification over four classes: Happy, Sad, Angry, and Others. A model for emotion classification in three-sentence conversations is proposed, based on smileys and word embeddings with domain specificity for present-day conversations on the Internet. The importance of taking into account the information extracted from smileys as an additional source of emotional colouring is investigated. The model's performance is evaluated against the language-processing model BERT (Bidirectional Encoder Representations from Transformers); the proposed model classifies emotions better than BERT (F1 score of 78 versus 75). It should be noted that further work is needed to improve the model's handling of mixed reviews represented by the Others class. More broadly, current models for language representation and understanding still do not reach human performance, and a variety of factors must be considered when choosing word embeddings and training methods to design the model architecture.
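The idea of treating smileys as an additional source of emotional colouring can be illustrated with a minimal sketch. Everything here is hypothetical (the smiley lexicon, the function names, and the simple counting rule) and far simpler than the proposed embedding-based model; it only shows how smiley signals can feed a four-class decision over a three-turn conversation.

```python
# Hypothetical mapping from smileys to the abstract's emotion classes.
EMOJI_SENTIMENT = {":)": "Happy", ":D": "Happy", ":(": "Sad", ">:(": "Angry"}

def smiley_features(text):
    """Count occurrences of each emotion-bearing smiley in one turn."""
    counts = {"Happy": 0, "Sad": 0, "Angry": 0}
    for smiley, emotion in EMOJI_SENTIMENT.items():
        counts[emotion] += text.count(smiley)
    return counts

def classify(conversation):
    """Toy rule: pick the dominant smiley emotion across the turns,
    falling back to 'Others' when no smiley signal is present."""
    totals = {"Happy": 0, "Sad": 0, "Angry": 0}
    for turn in conversation:
        for emotion, n in smiley_features(turn).items():
            totals[emotion] += n
    best = max(totals, key=totals.get)
    return best if totals[best] > 0 else "Others"

print(classify(["How are you?", "Great, thanks :D", "Good to hear :)"]))  # Happy
```

In the full model, these smiley counts would be one feature stream alongside domain-specific word embeddings rather than the sole signal.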
Netizens, Academicians, and Information Professionals' Opinions About AI With Special Reference To ChatGPT
This study aims to understand the perceptions and opinions of academicians towards ChatGPT-3 by collecting and analyzing social media comments, complemented by a survey of library and information science professionals. Using a content analysis method, it finds that while ChatGPT-3 can be a valuable tool for research and writing, it is not fully accurate and its output should be cross-checked. The study also finds that although some academicians may not accept ChatGPT-3, most are starting to accept it. The study is beneficial for academicians, content developers, and librarians.
CAMEL: Communicative Agents for "Mind" Exploration of Large Scale Language Model Society
The rapid advancement of conversational and chat-based language models has
led to remarkable progress in complex task-solving. However, their success
heavily relies on human input to guide the conversation, which can be
challenging and time-consuming. This paper explores the potential of building
scalable techniques to facilitate autonomous cooperation among communicative
agents and provide insight into their "cognitive" processes. To address the
challenges of achieving autonomous cooperation, we propose a novel
communicative agent framework named role-playing. Our approach involves using
inception prompting to guide chat agents toward task completion while
maintaining consistency with human intentions. We showcase how role-playing can
be used to generate conversational data for studying the behaviors and
capabilities of chat agents, providing a valuable resource for investigating
conversational language models. Our contributions include introducing a novel
communicative agent framework, offering a scalable approach for studying the
cooperative behaviors and capabilities of multi-agent systems, and
open-sourcing our library to support research on communicative agents and
beyond. The GitHub repository of this project is publicly available at:
https://github.com/lightaime/camel
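The role-playing setup the abstract describes can be sketched roughly as two chat agents, an AI assistant and an AI user, each primed with an inception prompt so they can cooperate on a task without further human input. The prompt wording and function name below are hypothetical, loosely following the paper's idea rather than reproducing its actual prompts.

```python
def inception_prompts(task, assistant_role, user_role):
    """Build the hypothetical system ("inception") prompts that keep two
    cooperating chat agents in role while they solve the given task."""
    assistant_sys = (
        f"Never forget you are a {assistant_role} and I am a {user_role}. "
        f"We share a common interest in completing the task: {task}. "
        "I will instruct you; reply with one concrete solution per turn."
    )
    user_sys = (
        f"Never forget you are a {user_role} and I am a {assistant_role}. "
        f"We must complete the task: {task}. "
        "Give me one instruction at a time; say <TASK_DONE> when finished."
    )
    return assistant_sys, user_sys

a_sys, u_sys = inception_prompts(
    task="design a trading bot",
    assistant_role="Python programmer",
    user_role="stock trader")
print(a_sys.startswith("Never forget you are a Python programmer"))  # True
```

Each system prompt would then seed a separate LLM session, and the two sessions exchange messages turn by turn, generating the conversational data the abstract mentions.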
Endogenous measures for contextualising large-scale social phenomena: a corpus-based method for mediated public discourse
This work presents an interdisciplinary methodology for developing endogenous measures of group membership through analysis of pervasive linguistic patterns in public discourse. Focusing on political discourse, this work critiques the conventional approach to the study of political participation, which is premised on decontextualised, exogenous measures to characterise groups. Considering the theoretical and empirical weaknesses of decontextualised approaches to large-scale social phenomena, this work suggests that contextualisation using endogenous measures might provide a complementary perspective to mitigate such weaknesses.
This work develops a sociomaterial perspective on political participation in mediated discourse as affiliatory action performed through language. While the affiliatory function of language is often performed consciously (such as statements of identity), this work is concerned with unconscious features (such as patterns in lexis and grammar). This work argues that pervasive patterns in such features that emerge through socialisation are resistant to change and manipulation, and thus might serve as endogenous measures of sociopolitical contexts, and thus of groups.
In terms of method, the work takes a corpus-based approach to the analysis of data from the Twitter messaging service, whereby patterns in users' speech are examined statistically in order to trace potential community membership. The method is applied in the US state of Michigan during the second half of 2018, 6 November having been the date of the midterm (i.e. non-Presidential) elections in the United States. The corpus is assembled from the original posts of 5,889 users, who are nominally geolocalised to 417 municipalities. These users are clustered according to pervasive language features. Comparing the linguistic clusters according to the municipalities they represent reveals regular sociodemographic differentials across clusters. This is understood as an indication of social structure, suggesting that endogenous measures derived from pervasive patterns in language may indeed offer a complementary, contextualised perspective on large-scale social phenomena.
Guiding Large Language Models via Directional Stimulus Prompting
We introduce a new framework, Directional Stimulus Prompting, that uses a
tuneable language model (LM) to provide guidance for the black-box frozen large
language model (LLM) on downstream tasks. Unlike prior work that manually or
automatically finds the optimal prompt for each task, we train a policy LM to
generate discrete tokens as directional stimulus of each input, which is a
hint/cue such as keywords of an article for summarization. The directional
stimulus is then combined with the original input and fed into the LLM to guide
its generation toward the desired target. The policy LM can be trained through
1) supervised learning from annotated data and 2) reinforcement learning from
offline and online rewards to explore directional stimulus that better aligns
LLMs with human preferences. This framework is flexibly applicable to various
LMs and tasks. To verify its effectiveness, we apply our framework to
summarization and dialogue response generation tasks. Experimental results
demonstrate that it can significantly improve LLMs' performance with a small
collection of training data: a T5 (780M) trained with 2,000 samples from the
CNN/Daily Mail dataset improves Codex (175B)'s performance by 9.0% in ROUGE-Avg
scores; only 80 dialogues can boost the combined score by 39.7%, achieving
comparable or even better performance than some fully trained models on the
MultiWOZ dataset. The code and data are publicly available at:
https://github.com/Leezekun/Directional-Stimulus-Prompting
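The mechanism described above can be sketched in miniature. The `policy_keywords` stand-in below is hypothetical (a frequency heuristic rather than a trained policy LM), and only illustrates how a directional stimulus is combined with the original input before it is fed to the frozen LLM.

```python
def policy_keywords(article):
    """Stand-in for the trained policy LM: pick the first few distinct
    long words as hint keywords for the summary."""
    words = [w.strip(".,").lower() for w in article.split() if len(w) > 5]
    seen, hints = set(), []
    for w in words:
        if w not in seen:
            seen.add(w)
            hints.append(w)
    return hints[:3]

def build_prompt(article):
    """Combine the directional stimulus (keywords) with the input before
    sending it to the black-box LLM for summarization."""
    hints = policy_keywords(article)
    return (f"Article: {article}\n"
            f"Keywords: {', '.join(hints)}\n"
            "Summarize the article, covering the keywords above.")

prompt = build_prompt("Researchers released a prompting framework for "
                      "guiding frozen language models with keyword hints.")
print("Keywords:" in prompt)  # True
```

In the actual framework, the policy LM would be trained with supervised learning and then reinforcement learning so that the generated keywords steer the frozen LLM toward higher-reward outputs.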
How would Stance Detection Techniques Evolve after the Launch of ChatGPT?
Stance detection refers to the task of extracting the standpoint (Favor,
Against, or Neither) towards a target in given texts. Such research has gained
increasing attention with the proliferation of social media content. The
conventional framework of handling stance detection is converting it into text
classification tasks. Deep learning models have already replaced rule-based
models and traditional machine learning models in solving such problems.
Current deep neural networks face two main challenges: insufficient labeled
data and information in social media posts, and the unexplainable nature of
deep learning models. A new pre-trained language model, ChatGPT, was launched
on 30 November 2022. For stance detection tasks, our
experiments show that ChatGPT can achieve SOTA or similar performance for
commonly used datasets including SemEval-2016 and P-Stance. At the same time,
ChatGPT can provide explanations for its own predictions, which is beyond the
capability of any existing model. The explanations for the cases where it cannot
provide classification results are especially useful. ChatGPT has the potential
to be the best AI model for stance detection tasks in NLP, or at least change
the research paradigm of this field. ChatGPT also opens up the possibility of
building explanatory AI for stance detection.
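The framing described above, stance detection as prompting a chat model for a label plus an explanation, can be sketched as follows. The prompt wording and the parsing helper are hypothetical illustrations, not the paper's actual protocol.

```python
def stance_prompt(text, target):
    """Frame stance detection as a generation task for a chat model,
    asking for a label and a brief explanation."""
    return (f"Text: {text}\n"
            f"Target: {target}\n"
            "What is the stance of the text towards the target? "
            "Answer Favor, Against, or Neither, then explain briefly.")

def parse_stance(reply):
    """Extract the label from the model's (hypothetical) reply."""
    for label in ("Favor", "Against", "Neither"):
        if reply.lstrip().startswith(label):
            return label
    return "Neither"

print(parse_stance("Against. The text criticises the policy."))  # Against
```

The free-text remainder of the reply is where the explanation lives, which is the part the abstract argues existing classifiers cannot provide.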
Enlightened Participation: SME Perspectives about Net Zero on Social Media using the Action Case Approach
Aims/Objectives: This study aims to examine a linked future for a Net Zero global economy. Such a future is examined through network-driven change and informed by co-action and shared business management practices. Methodology: We employ an action case (AC) approach to understand the impact of national and worldwide Net Zero policy for small and medium-sized enterprises (SMEs). We drew upon a qualitative survey of SMEs alongside a social network analysis (SNA) of Twitter data. Findings: We discovered a substantial predictive effect of policy support in SME social media material regarding Net Zero attitudes. Our findings indicate that reinforcing messages on policy support and assisting enterprises in adopting the new objectives may considerably enhance Net Zero accountability and serve as the foundation for an intervention strategy in policy-focused programmes for SMEs. (Output status: forthcoming.)
From RSSE to BotSE: Potentials and Challenges Revisited after 15 Years
Both recommender systems and bots should proactively and smartly answer the
questions of software developers or other project stakeholders to assist them
in performing their tasks more efficiently. This paper reflects on the
achievements from the more mature area of Recommendation Systems in Software
Engineering (RSSE) as well as the rising area of Bots in Software Engineering
(BotSE). We discuss the similarities and differences, briefly review the
current state of the art, and highlight three particular areas in which the full
potential is yet to be tapped: a more socio-technical context awareness,
assisting knowledge sharing in addition to knowledge access, as well as
covering repetitive or stimulative scenarios related to requirements and
user-developer interaction.
A Survey on Data-Driven Evaluation of Competencies and Capabilities Across Multimedia Environments
The rapid evolution of technology directly impacts the skills and jobs needed in the next decade. Users can, intentionally or unintentionally, develop different skills by creating, interacting with, and consuming content from online environments and portals where informal learning can emerge. These environments generate large amounts of data; therefore, big data can have a significant impact on education. Moreover, the educational landscape has been shifting from a focus on contents to a focus on competencies and capabilities that will prepare our society for an unknown future during the 21st century. Therefore, the main goal of this literature survey is to examine diverse technology-mediated environments that can generate rich data sets through users' interaction and where data can be used to explicitly or implicitly perform a data-driven evaluation of different competencies and capabilities. We thoroughly and comprehensively surveyed the state of the art to identify and analyse digital environments, the data they are producing, and the capabilities they can measure and/or develop. Our survey revealed four key multimedia environments that fulfilled our goal: sites for content sharing and consumption, video games, online learning, and social networks. Moreover, different methods were used to measure a large array of diverse capabilities, such as expertise, language proficiency, and soft skills. Our results demonstrate the potential of data from diverse digital environments to support the development of lifelong and lifewide 21st-century capabilities for the future society.