8 research outputs found

    Topic Discovery of Online Course Reviews Using LDA with Leveraging Reviews Helpfulness

    Despite the popularity of Massive Open Online Courses, little research has been done to understand the factors that influence the teaching-learning process on massive online platforms. When topic modeling is applied directly, the results contain terms that require prior knowledge to interpret, e.g. "Chuck", the instructor's name. We therefore propose applying topic modeling to helpful subjective reviews. The results show five influential factors, including “learn easy excellent class program”, “python learn class easy lot”, “Program learn easy python time game”, and “learn class python time game”. The results also show that the proposed method improves the perplexity score of the LDA model.

    FINDING TOPICS IN CREATIVE WRITING ON ENVIRONMENTAL PRESERVATION FOR BETTER TEACHING STRATEGIES: A CASE OF STUDY IN AN ELEMENTARY SCHOOL FROM COLOMBIA

    In this research, essays on tree preservation written by fourth-grade students (at an elementary school in Colombia) were evaluated with Latent Dirichlet Allocation (LDA). The objective was to extract the fundamental topics in order to understand the students’ behavior and awareness towards the environment from their creative writing. The computational results suggest that the students’ reflections on environmental preservation focus on five main topics: Teach-Learn to care for the environment, Explore-discover the environment, Well-being of the environment, Concern for the environment, and Restoration and conservation of the environment. This text analysis by LDA can complement teachers' manual analysis, avoiding the veracity bias and allowing the enhancement of teaching strategies.

    Re-ranking words to improve interpretability of automatically generated topics

    Topic models, such as LDA, are widely used in Natural Language Processing. Making their output interpretable is an important area of research with applications to areas such as the enhancement of exploratory search interfaces and the development of interpretable machine learning models. Conventionally, topics are represented by their n most probable words; however, these representations are often difficult for humans to interpret. This paper explores the re-ranking of topic words to generate more interpretable topic representations. A range of approaches are compared and evaluated in two experiments. The first uses crowdworkers to associate topics represented by different word rankings with related documents. The second experiment is an automatic approach based on a document retrieval task applied to multiple domains. Results in both experiments demonstrate that re-ranking words improves topic interpretability and that the most effective re-ranking schemes were those that combine information about the importance of words within topics with their relative frequency in the entire corpus. In addition, the close correlation between the results of the two evaluation approaches suggests that the automatic method proposed here could be used to evaluate re-ranking methods without the need for human judgements.
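The idea of combining a word's importance within a topic with its frequency in the whole corpus can be illustrated with a minimal sketch. The scoring function below, p(w|t) * log(p(w|t) / p(w)), is an assumption chosen for illustration and is not necessarily one of the schemes the paper evaluates; the function name `rerank_topic_words` and the toy probabilities are likewise hypothetical.

```python
import math


def rerank_topic_words(topic_word_probs, corpus_word_probs, top_n=5):
    """Re-rank one topic's words by p(w|t) * log(p(w|t) / p(w)).

    topic_word_probs:  dict word -> probability of the word within the topic
    corpus_word_probs: dict word -> relative frequency of the word in the corpus
    Illustrative scoring only; an assumption, not the paper's method.
    """
    scores = {}
    for word, p_wt in topic_word_probs.items():
        p_w = corpus_word_probs[word]
        # Words that are common everywhere are penalised by the log ratio.
        scores[word] = p_wt * math.log(p_wt / p_w)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Toy numbers (made up): "the" is the most probable word in the topic,
# but it is even more frequent in the corpus as a whole.
topic = {"the": 0.30, "python": 0.20, "course": 0.15, "learn": 0.10}
corpus = {"the": 0.40, "python": 0.01, "course": 0.05, "learn": 0.05}

by_probability = sorted(topic, key=topic.get, reverse=True)[:3]
ranked = rerank_topic_words(topic, corpus, top_n=3)
print(by_probability)
print(ranked)
```

With these toy values the plain top-n list starts with the uninformative word "the", while the re-ranked list promotes words that are distinctive for the topic.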

    Unsupervised keyword extraction from microblog posts via hashtags

    © River Publishers. Nowadays, huge amounts of text are being generated for social networking purposes on the Web. Keyword extraction from such texts, like microblog posts, benefits many applications such as advertising, search, and content filtering. Unlike traditional web pages, a microblog post usually has some special social feature like a hashtag that is topical in nature and generated by users. Extracting keywords related to hashtags can reflect the intents of users and thus provide a better understanding of post content. In this paper, we propose a novel unsupervised keyword extraction approach for microblog posts that treats hashtags as topical indicators. Our approach consists of two hashtag-enhanced algorithms. One is a topic model algorithm that infers topic distributions biased towards hashtags on a collection of microblog posts. The words are ranked by their average topic probabilities. Our topic model algorithm can not only find the topics of a collection, but also extract hashtag-related keywords. The other is a random-walk-based algorithm. It first builds a weighted word-post graph by taking into account the posts themselves. Then, a hashtag-biased random walk is applied to this graph, which guides the algorithm to extract keywords according to hashtag topics. Finally, the ranking score of a word is determined by its stationary probability after a number of iterations. We evaluate our proposed approach on a collection of real Chinese microblog posts. Experiments show that our approach is more effective in terms of precision than traditional approaches that do not consider hashtags. The combination of the two algorithms performs even better than each individual algorithm.
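The random-walk component described above can be sketched as a random walk with restart on a small word-post graph. This is only a sketch under stated assumptions: the unit edge weights, the damping factor of 0.85, the restart distribution concentrated on the hashtag node, and the names `build_graph` and `biased_random_walk` are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def build_graph(posts):
    """Build an undirected, weighted bipartite graph between posts and tokens."""
    edges = defaultdict(dict)
    for post_id, tokens in posts.items():
        for token in tokens:
            # Assumption: weight = number of times the token occurs in the post.
            edges[post_id][token] = edges[post_id].get(token, 0.0) + 1.0
            edges[token][post_id] = edges[token].get(post_id, 0.0) + 1.0
    return edges


def biased_random_walk(edges, restart, damping=0.85, iterations=50):
    """Random walk with restart; `restart` biases the walk towards chosen nodes."""
    nodes = list(edges)
    total = sum(restart.values())
    restart = {n: restart.get(n, 0.0) / total for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        # With probability (1 - damping) the walker jumps back to a restart node.
        new_rank = {n: (1 - damping) * restart[n] for n in nodes}
        for node in nodes:
            out_weight = sum(edges[node].values())
            for neighbor, weight in edges[node].items():
                new_rank[neighbor] += damping * rank[node] * weight / out_weight
        rank = new_rank
    return rank


# Toy posts (made up); tokens starting with '#' are hashtags.
posts = {
    "p1": ["#nba", "lakers", "win"],
    "p2": ["#nba", "lakers", "trade"],
    "p3": ["coffee", "morning", "win"],
}
graph = build_graph(posts)
rank = biased_random_walk(graph, restart={"#nba": 1.0})

# Keep only ordinary words as keyword candidates, ranked by score.
words = [n for n in graph if n not in posts and not n.startswith("#")]
keywords = sorted(words, key=rank.get, reverse=True)
print(keywords)
```

In this toy graph, words that share posts with the hashtag receive more probability mass than words that only appear in unrelated posts, which is the intuition behind biasing the walk towards hashtags.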

    If I Hear You Correctly: Building and Evaluating Interview Chatbots with Active Listening Skills

    Interview chatbots engage users in a text-based conversation to draw out their views and opinions. It is, however, challenging to build effective interview chatbots that can handle users' free-text responses to open-ended questions and deliver an engaging user experience. As a first step, we are investigating the feasibility and effectiveness of using publicly available, practical AI technologies to build effective interview chatbots. To demonstrate feasibility, we built a prototype scoped to enable interview chatbots with a subset of active listening skills: the abilities to comprehend a user's input and respond properly. To evaluate the effectiveness of our prototype, we compared the performance of interview chatbots with and without active listening skills on four common interview topics in a live evaluation with 206 users. Our work presents practical design implications for building effective interview chatbots, hybrid chatbot platforms, and empathetic chatbots beyond interview tasks. Comment: Working draft. To appear in the ACM CHI Conference on Human Factors in Computing Systems (CHI 2020).

    The statistics of topic modelling.

    This research project aims to provide a clear and concise guide to latent Dirichlet allocation, which is a form of topic modelling. The aim is to help researchers who do not have a strong background in mathematics or statistics to feel comfortable using topic modelling in their work. To achieve this, the thesis provides a step-by-step explanation of how topic modelling works. A range of tools that can be used to perform a topic model analysis are also described. The first chapter explains how topic modelling, and more specifically latent Dirichlet allocation, works; it offers a very basic explanation and then provides an easy-to-follow mathematical explanation. The second chapter explains how to perform a topic model analysis by describing each step involved, from the type of dataset through to the software packages available. The third section provides an example topic model analysis based on the Philpapers dataset. The final section discusses the highlights of each chapter and areas for further research.

    Learning domain-specific sentiment lexicons with applications to recommender systems

    Search is now going beyond looking for factual information, and people wish to search for the opinions of others to help them in their own decision-making. Sentiment expressions or opinion expressions are used by users to express their opinion and embody important pieces of information, particularly in online commerce. The main problem that the present dissertation addresses is how to model text to find meaningful words that express a sentiment. In this context, I investigate the viability of automatically generating a sentiment lexicon for opinion retrieval and sentiment classification applications. For this research objective, we propose to capture sentiment words that are derived from online users’ reviews. In this approach, we tackle a major challenge in sentiment analysis, which is the detection of words that express subjective preference and of domain-specific sentiment words such as jargon. To this end, we present a fully generative method that automatically learns a domain-specific lexicon and is fully independent of external sources. Sentiment lexicons can be applied in a broad set of applications; however, popular recommendation algorithms have been largely disconnected from sentiment analysis. Therefore, we present a study that explores the viability of applying sentiment analysis techniques to infer ratings in a recommendation algorithm. Furthermore, entities’ reputation is intrinsically associated with sentiment words that have a positive or negative relation with those entities. Hence, we provide a study that examines the viability of using a domain-specific lexicon to compute entities’ reputation. Finally, a recommendation system algorithm is improved with the use of sentiment-based ratings and entities’ reputation.

    Discourse, Power Dynamics, and Risk Amplification in Disaster Risk Management in Canada

    The domain of disaster risk management is rife with discursive contentions, whereby dominant discourses amplify the powers of risk actors to precipitate and reinforce political, economic, and environmental inequalities that predispose different sections of the population to unequal disaster risk vulnerabilities. This thesis identified important actors (government, risk experts, media, and NGOs) that shape the power dynamics in disaster risk management in Canada and explained their roles, influences, and the dimensions in which their powers negotiate each other through risk discourses. The patterns of these power dynamics in the three aspects of power (communication, assessment, and social trust) were also developed to provide a detailed description of how they form hegemonies that produce disaster inequality. The Power Amplified Risk Discourse (PARD) framework provides a theoretical framework for investigating the roles of discourses in creating and sustaining these power imbalances. PARD is an adaptation of the Social Amplification of Risk Framework (SARF), which can explain the complex cognitive, technical, and social dimensions of selective risk interpretations. Accordingly, PARD uses documentary and critical discourse analyses to investigate the roles of discourses in shaping the assessment and interpretation practices that reflect risk power imbalances. Analyses of the discursive and social practices also revealed that in many cases, these powers do not oppose each other, but rather work cooperatively to foist a risk hegemony as a means of self-perpetuation in risk management decision-making. The study also concludes that technical expertise, social trust, and privileged access to media constitute the biggest power factors for shaping risk discourse. Additionally, topic modeling and thematic analysis of social media data revealed social impacts that can be directly attributed to these discursive power dynamics.
    The study suggests that decentralized access to risk information and growing distrust of institutional expertise significantly account for the social responses to power amplification in risk discourses. The study recommends a more inclusive approach to risk management and calls for the restoration of trust between institutions and the public. Recommendations were also made for future research.