22 research outputs found
Response Ranking with Deep Matching Networks and External Knowledge in Information-seeking Conversation Systems
Intelligent personal assistant systems with either text-based or voice-based
conversational interfaces are becoming increasingly popular around the world.
Retrieval-based conversation models have the advantages of returning fluent and
informative responses. Most existing studies in this area are on open domain
"chit-chat" conversations or task / transaction oriented conversations. More
research is needed for information-seeking conversations. There is also a lack
of modeling external knowledge beyond the dialog utterances among current
conversational models. In this paper, we propose a learning framework on top of
deep neural matching networks that leverages external knowledge for
response ranking in information-seeking conversation systems. We incorporate
external knowledge into deep neural models with pseudo-relevance feedback and
QA correspondence knowledge distillation. Extensive experiments with three
information-seeking conversation data sets including both open benchmarks and
commercial data show that our methods outperform various baseline methods
including several deep text matching models and the state-of-the-art method on
response selection in multi-turn conversations. We also perform analysis over
different response types, model variations and ranking examples. Our models and
research findings provide new insights on how to utilize external knowledge
with deep neural models for response selection and have implications for the
design of the next generation of information-seeking conversation systems.
Comment: Accepted by the 41st International ACM SIGIR Conference on Research
and Development in Information Retrieval (SIGIR 2018), Ann Arbor, Michigan,
U.S.A., July 8-12, 2018 (Full Oral Paper)
Evaluating the Effectiveness of tutorial dialogue instruction in an Exploratory learning context
[Proceedings of] ITS 2006, 8th International Conference on Intelligent Tutoring Systems, 26-30 June 2006, Jhongli, Taoyuan County, Taiwan. In this paper we evaluate the instructional effectiveness of tutorial dialogue agents in an exploratory learning setting. We hypothesize that the creative nature of an exploratory learning environment creates an opportunity for the benefits of tutorial dialogue to be more clearly evidenced than in previously published studies. In a previous study we showed an advantage for tutorial dialogue support in an exploratory learning environment where that support was administered by human tutors [9]. Here, using a similar experimental setup and materials, we evaluate the effectiveness of tutorial dialogue agents modeled after the human tutors from that study. The results from this study provide evidence of a significant learning benefit of the dialogue agents. This project is supported by ONR Cognitive and Neural Sciences Division, Grant number N000140410107.
Response Retrieval in Information-seeking Conversations
The increasing popularity of the mobile Internet has led to several crucial changes in the way that people use search engines compared with traditional Web search on desktops. On the one hand, output bandwidth is limited by the small screen sizes of most mobile devices, and mobile Internet users prefer direct answers on the search engine result page (SERP). On the other hand, voice-based and text-based conversational interfaces are becoming increasingly popular, as shown by the wide adoption of intelligent assistant services and devices such as Amazon Echo, Microsoft Cortana and Google Assistant around the world. These important changes have triggered several new challenges that search engines have had to adapt to in order to better satisfy the information needs of mobile Internet users. In this dissertation, we investigate several aspects of single-turn answer retrieval and multi-turn information-seeking conversations to handle the new challenges of search on the mobile Internet.
We start from the research on single-turn answer retrieval and analyze the weaknesses of existing deep learning architectures for answer ranking. Then we propose an attention based neural matching model with a value-shared weighting scheme and attention mechanism to improve existing deep neural answer ranking models. Our proposed model achieves state-of-the-art performance for answer sentence retrieval compared with both feature engineering based methods and other neural models.
Then we move on to study response retrieval in multi-turn information-seeking conversations beyond single-turn interactions. Much research on response selection in conversation systems is modeling the matching patterns between user input message (either with context or not) and response candidates, which ignores external knowledge beyond the dialog utterances. We propose a learning framework on top of deep neural matching networks that leverages external knowledge with pseudo-relevance feedback and QA correspondence knowledge distillation for response retrieval. We also study how to integrate user intent modeling into neural ranking models to improve response retrieval performance. Finally, hybrid models of response retrieval and generation are investigated in order to combine the merits of these two different paradigms of conversation models.
Our goal is to develop effective learning models for answer retrieval and information-seeking conversations, in order to improve the effectiveness and user experience when accessing information with a touch screen interface or a conversational interface, as commonly adopted by millions of mobile Internet devices.
StoryNet: A 5W1H-based knowledge graph to connect stories
Title from PDF of title page viewed January 19, 2022. Thesis advisor: Yugyung Lee. Vita. Includes bibliographical references (pages 149-164). Thesis (M.S.)--School of Computing and Engineering. University of Missouri--Kansas City, 2021.
Stories are a powerful medium through which the human community has exchanged information since the dawn of the information age. They have taken multiple forms such as articles, movies, books, plays, short films, magazines, mythologies, etc. With the ever-growing complexity of information representation, exchange, and interaction, it has become highly important to find ways to convey stories more effectively. With a world that is diverging more and more, it is harder to draw parallels and connect information from all around the globe. Even though there have been efforts to consolidate information on a large scale, such as Wikipedia and Wikidata, they are devoid of any real-time happenings. Building on recent advances in Natural Language Processing (NLP), we propose a framework to connect these stories together, making it easier to find, understand, and explore the links between stories and the possibilities that revolve around them.
Our framework is based on the 5W + 1H (What, Who, Where, When, Why, and How) format, which represents stories in a form that is both easily understandable by humans and accurately generated by deep learning models. We used 311 calls and cyber security datasets as case studies, applying NLP techniques such as classification, topic modelling, question answering, and question generation along with the 5W1H framework to segregate the stories into clusters. This is a generic framework and can be applied to any field. We evaluated two approaches for generating results: training-based and rule-based. For the rule-based approach, we used Stanford NLP parsers to identify patterns for the 5W + 1H terms; for the training-based approach, BERT embeddings were used. Both were compared using an ensemble score (average of CoLA, SST-2, MRPC, QQP, STS-B, MNLI, QNLI, and RTE) along with BLEU and ROUGE scores. Several models were studied for the training-based analysis (BERT, RoBERTa, XLNet, ALBERT, ELECTRA, and AllenNLP Transformer QA) with the datasets CVE, NVD, SQuAD v1.1, and SQuAD v2.0, and compared with custom annotations for identifying 5W + 1H. We present the performance and accuracy of both approaches in the results section. Our method boosted the score from 30% (baseline) to 91% when trained on the 5W+1H annotations.
Introduction -- Related work -- The 5W1H Framework and the models included -- StoryNet Application: Evaluation and Results -- Conclusion and Future Work
Temporal Information in Data Science: An Integrated Framework and its Applications
Data science is a well-known buzzword that is in fact composed of two distinct keywords, i.e., data and science. Data itself is of great importance: each analysis task begins from a set of examples. Based on such a consideration, the present work starts with the analysis of a real case scenario, by considering the development of a data warehouse-based decision support system for an Italian contact center company. Then, relying on the information collected in the developed system, a set of machine learning-based analysis tasks has been developed to answer specific business questions, such as employee work anomaly detection and automatic call classification. Although such initial applications rely on already available algorithms, as we shall see, some clever analysis workflows also had to be developed. Afterwards, continuously driven by real data and real-world applications, we turned to the question of how to handle temporal information within classical decision tree models. Our research led to the development of J48SS, a decision tree induction algorithm based on Quinlan's C4.5 learner, which is capable of dealing with temporal (e.g., sequential and time series) as well as atemporal (such as numerical and categorical) data during the same execution cycle. The decision tree has been applied to several real-world analysis tasks, proving its worthiness. A key characteristic of J48SS is its interpretability, an aspect that we specifically addressed through the study of an evolutionary-based decision tree pruning technique. Next, since a lot of work concerning the management of temporal information has already been done in the automated reasoning and formal verification fields, a natural direction in which to proceed was that of investigating how such solutions may be combined with machine learning, following two main tracks.
First, we show, through the development of an enriched decision tree capable of encoding temporal information by means of interval temporal logic formulas, how a machine learning algorithm can successfully exploit temporal logic to perform data analysis. Then, we focus on the opposite direction, i.e., that of employing machine learning techniques to generate temporal logic formulas, considering a natural language processing scenario. Finally, as a conclusive development, the architecture of a system is proposed, in which formal methods and machine learning techniques are seamlessly combined to perform anomaly detection and predictive maintenance tasks. Such an integration represents an original, thrilling research direction that may open up new ways of dealing with complex, real-world problems.
On the Combination of Game-Theoretic Learning and Multi Model Adaptive Filters
This paper casts coordination of a team of robots within the framework of game theoretic learning algorithms. In particular a novel variant of fictitious play is proposed, by considering multi-model adaptive filters as a method to estimate other players’ strategies. The proposed algorithm can be used as a coordination mechanism between players when they should take decisions under uncertainty. Each player chooses an action after taking into account the actions of the other players and also the uncertainty. Uncertainty can occur either in terms of noisy observations or various types of other players. In addition, in contrast to other game-theoretic and heuristic algorithms for distributed optimisation, it is not necessary to find the optimal parameters a priori. Various parameter values can be used initially as inputs to different models. Therefore, the resulting decisions will be aggregate results of all the parameter values. Simulations are used to test the performance of the proposed methodology against other game-theoretic learning algorithms.
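For readers unfamiliar with the baseline that this abstract builds on, the following is a minimal sketch of classical fictitious play in a two-player coordination game. It is not the paper's variant: the multi-model adaptive filters are replaced here by plain empirical frequency counts, and the payoff matrix, function names, and initial counts are all hypothetical choices made for illustration.

```python
# Hypothetical symmetric 2x2 coordination game: PAYOFF[my_action][other_action].
# Both players prefer to match; matching on action 0 pays more than on action 1.
PAYOFF = [[2.0, 0.0],
          [0.0, 1.0]]


def best_response(opponent_counts):
    """Best-respond to the empirical frequency of the opponent's past actions."""
    total = sum(opponent_counts)
    beliefs = [count / total for count in opponent_counts]
    expected = [
        sum(PAYOFF[action][other] * beliefs[other] for other in range(2))
        for action in range(2)
    ]
    return max(range(2), key=lambda action: expected[action])


def fictitious_play(rounds=50, initial_counts=((1, 1), (1, 1))):
    """Run classical fictitious play and return the list of joint actions.

    counts[i][b] is how often player i believes the other player chose action b;
    the initial counts act as a prior belief (an assumption of this sketch).
    """
    counts = [list(initial_counts[0]), list(initial_counts[1])]
    history = []
    for _ in range(rounds):
        a0 = best_response(counts[0])
        a1 = best_response(counts[1])
        counts[0][a1] += 1  # player 0 observes player 1's action
        counts[1][a0] += 1  # player 1 observes player 0's action
        history.append((a0, a1))
    return history
```

With a uniform prior, both players settle on the higher-paying joint action; a prior skewed towards action 1 leads them to coordinate on action 1 instead, which illustrates why the quality of the estimate of other players' strategies matters.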
MELex: a new lexicon for sentiment analysis in mining public opinion of Malaysia affordable housing projects
Sentiment analysis has potential as an analytical tool for understanding the preferences of the public. It has become one of the most active and increasingly popular areas in information retrieval and text mining. However, in the Malaysian context, sentiment analysis is still limited due to the lack of a sentiment lexicon. Thus, the focus of this study is to construct a new lexicon and enhance the classification accuracy of sentiment analysis in mining public opinion on Malaysia's affordable housing projects. The new lexicon is constructed using a bilingual and domain-specific sentiment lexicon approach. A detailed review of existing approaches has been conducted and a new bilingual sentiment lexicon known as MELex (Malay-English Lexicon) has been generated. The developed approach is able to analyze text in the two most widely used languages in Malaysia, Malay and English, with better accuracy. The process of constructing MELex involves three activities: seed word selection, polarity assignment and synonym expansion; four different experiments have been implemented. It is evaluated through experimentation and case studies, with PR1MA and PPAM selected as case projects. Based on comparative results over 2,230 test data items, the study reveals that classification using MELex outperforms the existing approaches, achieving accuracies of 90.02% and 89.17% for the PR1MA and PPAM projects, respectively. This indicates the capability of MELex in classifying public sentiment towards the PR1MA and PPAM housing projects. The study has shown promising and better results in the property domain compared to previous research. Hence, the lexicon-based approach implemented in this study can reflect the reliability of the sentiment lexicon in classifying public sentiments.
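The three construction activities named in this abstract (seed word selection, polarity assignment, synonym expansion) can be sketched roughly as follows. This is only an illustration of a generic lexicon-based classifier: the seed words, polarities, synonym table, and function names are hypothetical and are not taken from MELex itself.

```python
import re

# Hypothetical bilingual (Malay-English) seed words with assigned polarity.
SEED_LEXICON = {
    "good": 1, "affordable": 1, "bagus": 1,
    "bad": -1, "expensive": -1, "mahal": -1,
}

# Hypothetical synonym table used for the expansion step.
SYNONYMS = {
    "good": ["great"],
    "affordable": ["cheap", "murah"],
    "expensive": ["costly"],
}


def expand_lexicon(seeds, synonyms):
    """Copy each seed's polarity to its synonyms without overwriting seeds."""
    lexicon = dict(seeds)
    for word, polarity in seeds.items():
        for synonym in synonyms.get(word, []):
            lexicon.setdefault(synonym, polarity)
    return lexicon


def classify(text, lexicon):
    """Sum the polarities of known tokens and map the total to a label."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    score = sum(lexicon.get(token, 0) for token in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


lexicon = expand_lexicon(SEED_LEXICON, SYNONYMS)
print(classify("Rumah ini murah dan bagus", lexicon))
print(classify("The house is too costly", lexicon))
```

A simple sum of polarities ignores negation and intensity; it is used here only to keep the sketch short.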
Learning to represent, categorise and rank in community question answering
The task of Question Answering (QA) is arguably one of the oldest tasks in Natural Language Processing, attracting high levels of interest from both industry and academia. However, most research has focused on factoid questions, e.g. Who is the president of Ireland? In contrast, research on answering non-factoid questions, such as manner, reason, difference and opinion questions, has been rather piecemeal.
This was largely due to the absence of available labelled data for the task. This is changing, however, with the growing popularity of Community Question Answering (CQA) websites, such as Quora, Yahoo! Answers and the Stack Exchange family of forums. These websites provide natural labelled data allowing us to apply machine learning techniques.
Most previous state-of-the-art approaches to the tasks of CQA-based question answering involved handcrafted features in combination with linear models. In this thesis we hypothesise that the use of handcrafted features can be avoided and the tasks can be approached with representation learning techniques, specifically deep learning.
In the first part of this thesis we give an overview of deep learning in natural language processing and empirically evaluate our hypothesis on the task of detecting semantically equivalent questions, i.e. predicting if two questions can be answered by the same answer.
In the second part of the thesis we address the task of answer ranking, i.e. determining how suitable an answer is for a given question. In order to determine the suitability of representation learning for the task of answer ranking, we provide a rigorous experimental evaluation of various neural architectures, based on feedforward, recurrent and convolutional neural networks, as well as their combinations.
This thesis shows that deep learning is a very suitable approach to CQA-based QA, achieving state-of-the-art results on the two tasks we addressed.