14,963 research outputs found

    Combining information seeking services into a meta supply chain of facts

    The World Wide Web has become a vital supplier of information that allows organizations to carry on such tasks as business intelligence, security monitoring, and risk assessments. Having a quick and reliable supply of correct facts from perspective is often mission critical. By following design science guidelines, we have explored ways to recombine facts from multiple sources, each with possibly different levels of responsiveness and accuracy, into one robust supply chain. Inspired by prior research on keyword-based meta-search engines (e.g., metacrawler.com), we have adapted the existing question answering algorithms for the task of analysis and triangulation of facts. We present a first prototype for a meta approach to fact seeking. Our meta engine sends a user's question to several fact seeking services that are publicly available on the Web (e.g., ask.com, brainboost.com, answerbus.com, NSIR) and analyzes the returned results jointly to identify and present to the user those that are most likely to be factually correct. The results of our evaluation on the standard test sets widely used in prior research support the following: 1) the added value of the meta approach: its performance surpasses the performance of each supplier, 2) the importance of using fact seeking services as suppliers to the meta engine rather than keyword-driven search portals, and 3) the resilience of the meta approach: eliminating a single service does not noticeably impact the overall performance. We show that these properties make the meta approach a more reliable supplier of facts than any of the currently available stand-alone services.
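
    The joint analysis described above can be illustrated with a minimal sketch. It is not the authors' algorithm, which the abstract does not specify: it assumes each service returns a ranked list of short answer strings and combines them with simple reciprocal-rank voting. The service names, the `normalize` helper, and the scoring rule are all hypothetical.

```python
from collections import defaultdict


def normalize(answer: str) -> str:
    """Lower-case and collapse whitespace so equivalent answers share a key."""
    return " ".join(answer.lower().split())


def triangulate(candidates_by_service: dict[str, list[str]]) -> list[tuple[str, float]]:
    """Rank answers by summed reciprocal rank across services (assumed scoring rule)."""
    scores: dict[str, float] = defaultdict(float)
    for candidates in candidates_by_service.values():
        seen = set()  # count each answer at most once per service
        for rank, answer in enumerate(candidates, start=1):
            key = normalize(answer)
            if key in seen:
                continue
            seen.add(key)
            scores[key] += 1.0 / rank
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical ranked answers from three fact seeking services.
results = {
    "service_a": ["Canberra", "Sydney"],
    "service_b": ["Sydney", "canberra"],
    "service_c": ["Canberra"],
}
ranked = triangulate(results)
print(ranked[0])  # the answer most services agree on
```

    An answer that several services rank highly accumulates more votes than one favored by a single service, which is the intuition behind triangulating across suppliers.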

    An embodied conversational agent for intelligent web interaction on pandemic crisis communication

    In times of crisis, an effective communication mechanism is paramount in providing accurate and timely information to the community. In this paper we study the use of an intelligent embodied conversational agent (ECA) as the front-end interface with the public for a Crisis Communication Network Portal (CCNet). The proposed system, CCNet, is an integration of the intelligent conversation agent, AINI, and an Automated Knowledge Extraction Agent (AKEA). AKEA retrieves first-hand information from relevant sources such as government departments and news channels. In this paper, we compare the interaction of AINI against two popular search engines, two question answering systems and two conversational systems.

    Toward Entity-Aware Search

    As the Web has evolved into a data-rich repository, with the standard "page view," current search engines are becoming increasingly inadequate for a wide range of query tasks. While we often search for various data "entities" (e.g., phone number, paper PDF, date), today's engines only take us indirectly to pages. In my Ph.D. study, we focus on a novel type of Web search that is aware of data entities inside pages, a significant departure from traditional document retrieval. We study the various essential aspects of supporting entity-aware Web search. To begin with, we tackle the core challenge of ranking entities by distilling its underlying conceptual model, the Impression Model, and developing a probabilistic ranking framework, EntityRank, that is able to seamlessly integrate both local and global information in ranking. We also report a prototype system built to show the initial promise of the proposal. Then, we aim at distilling and abstracting the essential computation requirements of entity search. From the dual views of reasoning (entity as input and entity as output), we propose a dual-inversion framework, with two indexing and partition schemes, towards efficient and scalable query processing. Further, to recognize more entity instances, we study the problem of entity synonym discovery through mining query log data. The results we have obtained so far show clear promise for entity-aware search in its usefulness, effectiveness, efficiency and scalability.

    ‘MyQuestion’ Inquiry System for UTP Students and Staff

    Due to the lack of a single comprehensive system, both students and staff are dissatisfied with the current handling of inquiries within administrative departments, which is time-consuming and relies on manual effort. As students struggle to get timely responses to their inquiries from administrative departments, the need for an automated question answering system becomes more pressing. There is a need for a system that allows a student to ask a question in everyday language and receive an answer quickly and succinctly, with sufficient context to validate the answer. Hence, the main objective of the project is to develop a system prototype based on the concepts of Automated Email Response and Question-Answering Systems using an Iterative Development approach. This paper further elaborates the problem statement and scope of the project. In-depth analyses of the development requirements have been carried out to better support the building of the proposed system.

    Large Scale Question Paraphrase Retrieval with Smoothed Deep Metric Learning

    The goal of a Question Paraphrase Retrieval (QPR) system is to retrieve equivalent questions that result in the same answer as the original question. Such a system can be used to understand and answer rare and noisy reformulations of common questions by mapping them to a set of canonical forms. This has large-scale applications for community Question Answering (cQA) and open-domain spoken language question answering systems. In this paper we describe a new QPR system implemented as a Neural Information Retrieval (NIR) system consisting of a neural network sentence encoder and an approximate k-Nearest Neighbour index for efficient vector retrieval. We also describe our mechanism to generate an annotated dataset for question paraphrase retrieval experiments automatically from question-answer logs via distant supervision. We show that the standard loss function in NIR, triplet loss, does not perform well with noisy labels. We propose smoothed deep metric loss (SDML) and, with our experiments on two QPR datasets, show that it significantly outperforms triplet loss in the noisy label setting.
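
    For reference, the triplet loss baseline named above can be sketched in a few lines. This is the textbook formulation, max(0, d(a, p) - d(a, n) + margin); the Euclidean distance, the margin of 1.0, and the toy 2-D "embeddings" are assumptions made for illustration, and the abstract does not give the form of SDML, so it is not shown.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: zero once the positive is closer than the
    negative by at least `margin`, otherwise the size of the violation."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy 2-D embeddings (hypothetical values).
anchor = (0.0, 0.0)
near = (0.0, 1.0)   # distance 1 from the anchor
far = (3.0, 4.0)    # distance 5 from the anchor

clean = triplet_loss(anchor, positive=near, negative=far)   # correctly labelled triplet
noisy = triplet_loss(anchor, positive=far, negative=near)   # labels swapped
print(clean, noisy)
```

    With the labels swapped, the loss jumps from zero to a large value, so a mislabelled triplet pushes the encoder in the wrong direction; this is one way to see why noisy labels are a concern for this loss.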

    Social Bots: Human-Like by Means of Human Control?

    Social bots are currently regarded as an influential but also somewhat mysterious factor in public discourse and opinion making. They are considered to be capable of massively distributing propaganda in social and online media, and their application is even suspected to be partly responsible for recent election results. Astonishingly, the term 'Social Bot' is not well defined and different scientific disciplines use divergent definitions. This work starts with a balanced definition attempt, before providing an overview of how social bots actually work (taking the example of Twitter) and what their current technical limitations are. Despite recent research progress in Deep Learning and Big Data, there are many activities bots cannot handle well. We then discuss how bot capabilities can be extended and controlled by integrating humans into the process, and reason that this is currently the most promising way to realize effective interactions with other humans. Comment: 36 pages, 13 figures

    Hybrid Spam Filtering for Mobile Communication

    Spam messages are an increasing threat to mobile communication. Several mitigation techniques have been proposed, including white and black listing, challenge-response and content-based filtering. However, none are perfect and it makes sense to use a combination rather than just one. We propose an anti-spam framework based on a hybrid of content-based filtering and challenge-response. There is a trade-off between the accuracy of anti-spam classifiers and the communication overhead. Experimental results show how, depending on the proportion of spam messages, different filtering parameters should be set. Comment: 6 pages, 5 figures, 1 table
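
    One way to picture a hybrid of content-based filtering and challenge-response is a two-threshold router. This is a sketch under stated assumptions, not the framework from the paper: it assumes a content classifier that outputs a spam score in [0, 1], and the function name, thresholds, and default values are hypothetical.

```python
def route_message(spam_score: float, deliver_below: float = 0.3, block_above: float = 0.8) -> str:
    """Route a message using a content-based spam score (assumed to lie in [0, 1]).

    Confident scores are handled by the classifier alone; uncertain ones
    fall back to a challenge-response round trip with the sender.
    """
    if not 0.0 <= spam_score <= 1.0:
        raise ValueError("spam_score must be between 0 and 1")
    if spam_score < deliver_below:
        return "deliver"      # classifier is confident the message is legitimate
    if spam_score > block_above:
        return "block"        # classifier is confident the message is spam
    return "challenge"        # uncertain: costs an extra exchange with the sender


decisions = [route_message(s) for s in (0.1, 0.5, 0.95)]
print(decisions)
```

    Narrowing the band between the two thresholds sends fewer challenges and so lowers communication overhead, at the cost of relying more on the classifier's accuracy; that is the kind of trade-off the filtering parameters control.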