354 research outputs found

    Trending Topic Analysis of Social Media Conversations Using Topic Modelling Based on the LDA Algorithm

    WhatsApp is one of the most popular chat applications, especially in Indonesia. WhatsApp data is distinctive because its message patterns and topics are diverse and change very quickly, so identifying a topic from such a collection of messages is difficult and time-consuming when done manually. One way to extract the implicit information in this social medium is topic modeling. This study analyzes the application of the Latent Dirichlet Allocation (LDA) method to identify which topics are being discussed in WhatsApp groups at Universitas Islam Majapahit, and experiments with topic modeling by adding a time attribute when composing documents. The study produces topic models and f-measure evaluation scores for those models based on the experiments conducted. LDA was chosen for topic modeling, using an LDA library in Python, applying standard text preprocessing, and adding slang-word removal to handle non-standard words and abbreviations in the chat logs. The topic models were evaluated with a human-in-the-loop word-intrusion task administered to Indonesian-language experts. The best experimental result was obtained by segmenting the conversation into 10-minute documents and merging them with reply chats in the WhatsApp group conversation, which proved an effective way to improve topic modeling with the Latent Dirichlet Allocation (LDA) algorithm, yielding a precision of 0.9294, a recall of 0.7900, and an f-measure of 0.8541
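The time-windowed document construction described above can be sketched in Python. This is an illustrative reconstruction, not the paper's code: the `SLANG` dictionary and the sample messages are hypothetical stand-ins for the study's slang-word list and WhatsApp chat logs.

```python
from datetime import datetime
from collections import defaultdict

# Hypothetical slang dictionary mapping Indonesian slang to standard forms.
SLANG = {"gak": "tidak", "udah": "sudah", "yg": "yang"}

def preprocess(text):
    """Lowercase, strip punctuation, and normalize slang words."""
    tokens = [t.strip(".,!?") for t in text.lower().split()]
    return [SLANG.get(t, t) for t in tokens if t]

def window_documents(messages, minutes=10):
    """Group (timestamp, text) chat messages into fixed time windows,
    so each window becomes one 'document' for topic modeling."""
    docs = defaultdict(list)
    for ts, text in messages:
        bucket = ts.replace(second=0, microsecond=0,
                            minute=(ts.minute // minutes) * minutes)
        docs[bucket].extend(preprocess(text))
    return dict(sorted(docs.items()))

msgs = [
    (datetime(2024, 1, 1, 9, 2), "Jadwal ujian udah keluar gak?"),
    (datetime(2024, 1, 1, 9, 7), "Udah, cek grup yg lain."),
    (datetime(2024, 1, 1, 9, 15), "Besok kumpul tugas."),
]
docs = window_documents(msgs)
for start, tokens in docs.items():
    print(start.time(), tokens)
```

Each resulting token list would then be fed to an LDA implementation as one document; merging reply chats into the same window follows the same grouping idea.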

    Discovering a Domain Knowledge Representation for Image Grouping: Multimodal Data Modeling, Fusion, and Interactive Learning

    In visually-oriented specialized medical domains such as dermatology and radiology, physicians explore interesting image cases from medical image repositories for comparative case studies to aid clinical diagnoses, educate medical trainees, and support medical research. However, general image classification and retrieval approaches fail in grouping medical images from the physicians' viewpoint. This is because fully-automated learning techniques cannot yet bridge the gap between image features and domain-specific content in the absence of expert knowledge. Understanding how experts get information from medical images is therefore an important research topic. As a prior study, we conducted data elicitation experiments, where physicians were instructed to inspect each medical image towards a diagnosis while describing image content to a student seated nearby. Experts' eye movements and their verbal descriptions of the image content were recorded to capture various aspects of expert image understanding. This dissertation aims at an intuitive approach to extracting expert knowledge, which is to find patterns in expert data elicited from image-based diagnoses. These patterns are useful to understand both the characteristics of the medical images and the experts' cognitive reasoning processes. The transformation from the viewed raw image features to interpretation as domain-specific concepts requires experts' domain knowledge and cognitive reasoning. This dissertation also approximates this transformation using a matrix factorization-based framework, which helps project multiple expert-derived data modalities to high-level abstractions. To combine additional expert interventions with computational processing capabilities, an interactive machine learning paradigm is developed to treat experts as an integral part of the learning process. 
Specifically, experts refine medical image groups presented by the learned model locally, to incrementally re-learn the model globally. This paradigm avoids the onerous expert annotations for model training, while aligning the learned model with experts' sense-making
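A matrix factorization of this general kind can be illustrated with a small non-negative matrix factorization (NMF) via multiplicative updates. This is a generic sketch, not the dissertation's actual framework, and the toy "expert data" matrix standing in for fused gaze and verbal modalities is invented for illustration.

```python
import numpy as np

def nmf(X, k, iters=500, eps=1e-9):
    """Non-negative matrix factorization via multiplicative updates:
    approximate X (n x m) as W (n x k) @ H (k x m), W, H >= 0."""
    rng = np.random.default_rng(0)
    n, m = X.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)  # update topic-to-feature map
        W *= (X @ H.T) / (W @ H @ H.T + eps)  # update item-to-topic weights
    return W, H

# Toy multimodal matrix: rows = image regions, columns = features from
# two modalities (e.g. gaze dwell times and word counts) stacked side by side.
X = np.array([[3., 0., 2., 0.],
              [0., 4., 0., 3.],
              [3., 0., 2., 0.]])
W, H = nmf(X, k=2)
print(np.round(W @ H, 1))
```

The rows of `H` play the role of high-level abstractions shared across modalities, while `W` assigns each image region to them.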

    Real-Time Topic and Sentiment Analysis in Human-Robot Conversation

    Socially interactive robots, especially those designed for entertainment and companionship, must be able to hold conversations with users that feel natural and engaging for humans. Two important components of such conversations include adherence to the topic of conversation and inclusion of affective expressions. Most previous approaches have concentrated on topic detection or sentiment analysis alone, and approaches that attempt to address both are limited by domain and by type of reply. This thesis presents a new approach, implemented on a humanoid robot interface, that detects the topic and sentiment of a user's utterances from text-transcribed speech. It also generates domain-independent, topically relevant verbal replies and appropriate positive and negative emotional expressions in real time. The front end of the system is a smartphone app that functions as the robot's face. It displays emotionally expressive eyes, transcribes verbal input as text, and synthesizes spoken replies. The back end of the system is implemented on the robot's onboard computer. It connects with the app via Bluetooth, receives and processes the transcribed input, and returns verbal replies and sentiment scores. The back end consists of a topic-detection subsystem and a sentiment-analysis subsystem. The topic-detection subsystem uses a Latent Semantic Indexing model of a conversation corpus, followed by a search in the online database ConceptNet 5, in order to generate a topically relevant reply. The sentiment-analysis subsystem disambiguates the input words, obtains their sentiment scores from SentiWordNet, and returns the averaged sum of the scores as the overall sentiment score. The system was hypothesized to engage users more with both subsystems working together than with either subsystem alone, and each subsystem alone was hypothesized to engage users more than a random control. In computational evaluations, each subsystem performed weakly but positively. 
In user evaluations, users reported a higher level of topical relevance and emotional appropriateness in conversations in which the subsystems were working together, and they reported higher engagement especially in conversations in which the topic-detection system was working. It is concluded that the system partially fulfills its goals, and suggestions for future work are presented
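The averaged-sum sentiment scoring can be sketched as follows. The mini-lexicon here is a hypothetical stand-in for SentiWordNet's (positivity, negativity) scores, and the word-sense disambiguation step described above is omitted for brevity.

```python
# Hypothetical mini-lexicon standing in for SentiWordNet: each word maps
# to a (positivity, negativity) pair of scores in [0, 1].
LEXICON = {
    "great": (0.75, 0.0),
    "fun":   (0.6, 0.0),
    "sad":   (0.0, 0.7),
    "robot": (0.0, 0.0),
}

def sentiment_score(utterance):
    """Average the (pos - neg) score over all words of the utterance,
    mirroring the averaged-sum approach described above; unknown
    words contribute a neutral (0, 0) score."""
    scores = [p - n for w in utterance.lower().split()
              for p, n in [LEXICON.get(w.strip(".,!?"), (0.0, 0.0))]]
    return sum(scores) / len(scores) if scores else 0.0

print(sentiment_score("This robot is great fun!"))  # positive overall score
```

A positive result would drive the happy eye expression on the app, a negative one the sad expression.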

    A study of topic and topic change in conversational threads

    This thesis applies Latent Dirichlet Allocation (LDA) to the problem of topic and topic change in conversational threads using e-mail. We demonstrate that LDA can be used to successfully associate raw e-mail messages with the threads to which they belong, and compare the results with those for processed threads, where quoted and reply text have been removed. Raw thread classification performs better, but processed threads show promise. We then present two new, unsupervised techniques for identifying topic change in e-mail. The first is a keyword clustering approach using LDA and DBSCAN to identify clusters of topics and transition points between them. The second is a sliding window technique which assesses the current topic for every window, identifying transition points. The keyword clustering performs better than the sliding window approach. Both can be used as a baseline for future work.
    http://archive.org/details/astudyoftopicndt109454576
    NASA Ames Research Center author (civilian). Approved for public release; distribution is unlimited
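The sliding-window idea can be sketched with plain bag-of-words cosine similarity: a transition is flagged where the word distribution before a point diverges sharply from the distribution after it. The thread, window size, and threshold below are illustrative choices, not the thesis's parameters (which compare LDA topic assignments rather than raw word counts).

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two word-count bags."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def topic_shifts(messages, window=2, threshold=0.1):
    """Slide over the thread; flag message index i as a transition point
    when the `window` messages before i look unlike the `window` after."""
    bags = [Counter(m.lower().split()) for m in messages]
    shifts = []
    for i in range(window, len(bags) - window + 1):
        before = sum(bags[i - window:i], Counter())
        after = sum(bags[i:i + window], Counter())
        if cosine(before, after) < threshold:
            shifts.append(i)
    return shifts

thread = [
    "the budget report is due friday",
    "please review the budget numbers",
    "budget looks fine to me",
    "anyone up for lunch today",
    "lunch at noon works for me",
]
print(topic_shifts(thread))  # transition at message index 3
```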

    Wide-Scale Automatic Analysis of 20 Years of ITS Research

    The analysis of literature within a research domain can provide significant value during preliminary research. While literature reviews may provide an in-depth understanding of current studies within an area, they are limited by the number of studies they take into account. Importantly, while publications in hot areas abound, it is not feasible for an individual or team to analyse a large volume of publications within a reasonable amount of time. Additionally, major publications which have gained a large number of citations are more likely to be included in a review, with recent or fringe publications receiving less attention. We thus provide an automatic methodology for the large-scale analysis of literature within the Intelligent Tutoring Systems (ITS) domain, with the aim of identifying trends and areas of research from a corpus of publications significantly larger than is typically covered in conventional literature reviews. We illustrate this with a novel analysis of 20 years of ITS research. The resulting analysis indicates a significant shift in the status quo of research in recent years with the advent of novel neural network architectures and the introduction of MOOCs

    Mining scientific trends based on topics in conference call for papers

    Since analyzing scientific topics and the evolution of technology has become vital for researchers, academics, funding institutes, and research administration departments, there is a crucial need to mine scientific trends more rigorously. In this paper, we procured a novel Call for Papers (CFPs) dataset in order to analyze the scientific evolution and prestige of conferences that set scientific trends, using scientific publications indexed in DBLP. Using the ACM CSS, 1.3 million publications that appear in 146 data mining conferences are mapped into different thematic areas by matching the terms that appear in publication titles with the ACM CSS. In recent years, an effort termed Topic Detection and Tracking (TDT) [1] has attempted to solve the problem of staying well-informed about such dynamic data. Conference rankings have been produced by different forums on the basis of mixed indicators; for example, ERA ranks Australia's higher education research institutions. The major contributions of this paper are as follows: (i) compilation of the CFPs dataset, (ii) identification of topics and keywords from the CFP corpus, and (iii) measurement of the impact of the extracted hot topics from CFPs
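The title-to-area mapping by term matching can be sketched as below. The `AREAS` taxonomy is a tiny hypothetical stand-in for the ACM classification terms; the real mapping would use the full classification vocabulary.

```python
# Hypothetical mini-taxonomy standing in for the ACM classification terms.
AREAS = {
    "machine learning": {"learning", "neural", "classification"},
    "information retrieval": {"retrieval", "search", "ranking"},
    "data mining": {"mining", "clustering", "patterns"},
}

def classify_title(title):
    """Assign a publication title to the thematic area whose term set
    overlaps the title's words most; None if nothing matches."""
    words = set(title.lower().split())
    best, score = None, 0
    for area, terms in AREAS.items():
        overlap = len(words & terms)
        if overlap > score:
            best, score = area, overlap
    return best

print(classify_title("Frequent patterns mining in large graphs"))
```

Aggregating these area labels per conference and per year is what would surface trends across the corpus.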

    Modeling Human Group Behavior In Virtual Worlds

    Virtual worlds and massively-multiplayer online games are rich sources of information about large-scale teams and groups, offering the tantalizing possibility of harvesting data about group formation, social networks, and network evolution. They provide new outlets for human social interaction that differ from both face-to-face interactions and non-physically-embodied social networking tools such as Facebook and Twitter. We aim to study group dynamics in these virtual worlds by collecting and analyzing public conversational patterns of users grouped in close physical proximity. To do this, we created a set of tools for monitoring, partitioning, and analyzing unstructured conversations between changing groups of participants in Second Life, a massively multi-player online user-constructed environment that allows users to construct and inhabit their own 3D world. Although there are some cues in the dialog, determining social interactions from unstructured chat data alone is a difficult problem, since these environments lack many of the cues that facilitate natural language processing in other conversational settings and different types of social media. Public chat data often features players who speak simultaneously, use jargon and emoticons, and only erratically adhere to conversational norms. Humans are adept social animals capable of identifying friendship groups from a combination of linguistic cues and social network patterns. But what is more important, the content of what people say or their history of social interactions? Moreover, is it possible to identify whether people are part of a group with changing membership merely from general network properties, such as measures of centrality and latent communities? These are the questions that we aim to answer in this thesis. 
The contributions of this thesis include: 1) a link prediction algorithm for identifying friendship relationships from unstructured chat data, and 2) a method for identifying social groups based on the results of community detection and topic analysis. The outputs of these two algorithms (links and group membership) are useful for studying a variety of research questions about human behavior in virtual worlds. To demonstrate this, we have performed a longitudinal analysis of human groups in different regions of the Second Life virtual world. We believe that studies performed with our tools in virtual worlds will be a useful stepping stone toward creating a rich computational model of human group dynamics
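One simple baseline for link prediction from such chat data is to score user pairs by how often they converse in the same session. This is a hedged sketch of that general idea, not the thesis's algorithm; the session data and threshold are invented for illustration.

```python
from collections import Counter
from itertools import combinations

def predict_links(sessions, min_count=2):
    """Score user pairs by how often they speak in the same chat session;
    pairs at or above `min_count` co-occurrences are predicted links."""
    pair_counts = Counter()
    for speakers in sessions:
        for a, b in combinations(sorted(set(speakers)), 2):
            pair_counts[(a, b)] += 1
    return {pair for pair, c in pair_counts.items() if c >= min_count}

# Toy sessions: each list holds the speakers active in one conversation.
sessions = [
    ["ann", "bob", "cat"],
    ["ann", "bob"],
    ["cat", "dan"],
    ["ann", "bob", "dan"],
]
print(sorted(predict_links(sessions)))
```

A richer model would weight this network evidence against linguistic cues from the chat content, which is the trade-off the thesis investigates.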

    A Systematic Review of Computational Methods in and Research Taxonomy of Homophily in Information Systems

    Homophily is both a principle for social group formation with like-minded people and a mechanism for social interactions. Recent years have seen a growing body of management research on homophily, particularly on large-scale social media and digital platforms. However, the predominant traditional qualitative and quantitative methods employed face validity issues and/or are not well suited to big social data. There are scant guidelines for applying computational methods to specific research domains concerning descriptive patterns, explanatory mechanisms, or predictive indicators of homophily. To fill this research gap, this paper offers a structured review of the emerging literature on computational social science approaches to homophily, with a particular emphasis on their relevance, appropriateness, and importance to information systems research. We derive a research taxonomy for homophily and offer methodological reflections and recommendations to help inform future research

    Behavioral Task Modeling for Entity Recommendation

    Our everyday tasks involve interactions with a wide range of information. The information that we manage is often associated with a task context. However, current computer systems do not organize information in this way and do not help the user find information in task context; instead they require explicit user actions such as searching and information seeking. We explore the use of task context to guide the delivery of information to the user proactively, that is, to have the right information easily available at the right time. In this thesis, we used two types of novel contextual information for task modeling: 24/7 behavioral recordings and spoken conversations. The task context is created by monitoring the user's information behavior from temporal, social, and topical aspects; it can be characterized by several entities such as applications, documents, people, time, and various keywords determining the task. By tracking the associations among the entities, we can infer the user's task context, predict future information access, and proactively retrieve relevant information for the task at hand. The approach is validated with a series of field studies, in which altogether 47 participants voluntarily installed a screen monitoring system on their laptops 24/7 to collect available digital activities, and their spoken conversations were recorded. Different aspects of the data were considered to train the models. In the evaluation, we treated information sourced from several applications, spoken conversations, and various aspects of the data as different kinds of influence on the prediction performance. The combined influences of multiple data sources and aspects were also considered in the models. Our findings revealed that task information could be found in a variety of applications and spoken conversations. 
In addition, we found that task context models that consider behavioral information captured from the computer screen and spoken conversations could yield a promising improvement in recommendation quality compared to the conventional modeling approach that considered only pre-determined interaction logs, such as query logs or Web browsing history. We also showed how a task context model could support the users' work performance, reducing their effort in searching by ranking and suggesting relevant information. Our results and findings have direct implications for information personalization and recommendation systems that leverage contextual information to predict and proactively present personalized information to the user to improve the interaction experience with the computer systems
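The idea of tracking associations among entities and using them to retrieve relevant information proactively can be sketched with a simple co-occurrence model. This is a generic illustration, not the thesis's model; the snapshot data and entity names are invented.

```python
from collections import defaultdict
from itertools import combinations

def build_cooccurrence(snapshots):
    """Count how often pairs of entities (apps, documents, people, keywords)
    appear together in the same monitoring snapshot."""
    co = defaultdict(lambda: defaultdict(int))
    for entities in snapshots:
        for a, b in combinations(sorted(set(entities)), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co

def recommend(co, context, k=2):
    """Rank entities outside the current task context by their total
    co-occurrence with the entities inside it (ties broken by name)."""
    scores = defaultdict(int)
    for e in context:
        for other, count in co[e].items():
            if other not in context:
                scores[other] += count
    return sorted(scores, key=lambda x: (-scores[x], x))[:k]

# Toy screen-monitoring snapshots: entities visible at the same moment.
snapshots = [
    {"editor", "report.docx", "alice"},
    {"editor", "report.docx", "budget.xlsx"},
    {"browser", "news"},
    {"editor", "alice", "budget.xlsx"},
]
print(recommend(build_cooccurrence(snapshots), {"editor", "report.docx"}))
```

The same counting scheme extends to entities extracted from spoken conversations, which is the fusion step the thesis evaluates.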