
    Harvesting and summarizing user-generated content for advanced speech-based human-computer interaction

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 155-164). There are many assistant applications on mobile devices that can help people obtain rich Web content such as user-generated data (e.g., reviews, posts, blogs, and tweets). However, online communities and social networks are expanding rapidly, and it is impossible for people to browse and digest all the information via a simple search interface. To help users obtain information more efficiently, both the interface for data access and the information representation need to be improved. An intuitive and personalized interface, such as a dialogue system, could be an ideal assistant, which engages a user in a continuous dialogue to garner the user's interest and capture the user's intent, and assists the user via speech-navigated interactions. In addition, there is a great need for a type of application that can harvest data from the Web, summarize the information in a concise manner, and present it in an aggregated yet natural way such as direct human dialogue. This thesis, therefore, aims to conduct research on a universal framework for developing speech-based interfaces that can aggregate user-generated Web content and present the summarized information via speech-based human-computer interaction. To accomplish this goal, several challenges must be met. Firstly, how can users' intentions be interpreted correctly from their spoken input? Secondly, how can the semantics and sentiment of user-generated data be interpreted and aggregated into structured yet concise summaries? Lastly, how can a dialogue modeling mechanism be developed to handle discourse and present the highlighted information via natural language? This thesis explores plausible approaches to tackle these challenges.
We will explore a lexicon modeling approach for semantic tagging to improve spoken language understanding and query interpretation. We will investigate a parse-and-paraphrase paradigm and a sentiment scoring mechanism for information extraction from unstructured user-generated data. We will also explore sentiment-involved dialogue modeling and corpus-based language generation approaches for dialogue and discourse. Multilingual prototype systems in multiple domains have been implemented for demonstration. by Jingjing Liu. Ph.D.

    An Introduction to Social Semantic Web Mining & Big Data Analytics for Political Attitudes and Mentalities Research

    The social web has become a major repository of social and behavioral data that is of exceptional interest to the social science and humanities research community. Computer science has only recently developed various technologies and techniques that allow for harvesting, organizing and analyzing such data and provide knowledge and insights into the structure and behavior of people online. Some of these techniques include social web mining, conceptual and social network analysis and modeling, tag clouds, topic maps, folksonomies, complex network visualizations, modeling of processes on networks, agent-based models of social network emergence, speech recognition, computer vision, natural language processing, opinion mining and sentiment analysis, recommender systems, user profiling and semantic wikis. All of these techniques are briefly introduced, example studies are given, and ideas as well as possible directions in the field of political attitudes and mentalities are presented. Finally, challenges for future studies are discussed.

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on the information provided by European projects and national initiatives related to multimedia search, as well as domain experts that participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art related to multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective we inventory the impact and legal consequences of these technical advances and point out future directions of research.

    Argumentation Mining in User-Generated Web Discourse

    The goal of argumentation mining, an evolving research field in computational linguistics, is to design methods capable of analyzing people's argumentation. In this article, we go beyond the state of the art in several ways. (i) We deal with actual Web data and take up the challenges given by the variety of registers, multiple domains, and unrestricted noisy user-generated Web discourse. (ii) We bridge the gap between normative argumentation theories and argumentation phenomena encountered in actual data by adapting an argumentation model tested in an extensive annotation study. (iii) We create a new gold standard corpus (90k tokens in 340 documents) and experiment with several machine learning methods to identify argument components. We offer the data, source code, and annotation guidelines to the community under free licenses. Our findings show that argumentation mining in user-generated Web discourse is a feasible but challenging task. Comment: Cite as: Habernal, I. & Gurevych, I. (2017). Argumentation Mining in User-Generated Web Discourse. Computational Linguistics 43(1), pp. 125-17

    Digital Twins for Industry 4.0 in the 6G Era

    With the Fifth Generation (5G) mobile communication system recently rolled out in many countries, the wireless community is now setting its eyes on the next era of Sixth Generation (6G). Inheriting from 5G its focus on industrial use cases, 6G is envisaged to become the infrastructural backbone of future intelligent industry. In particular, a combination of 6G and the emerging technologies of Digital Twins (DT) will give impetus to the next evolution of Industry 4.0 (I4.0) systems. This article provides a survey of the research area of 6G-empowered industrial DT systems. With a novel vision of a 6G industrial DT ecosystem, this survey discusses the ambitions and potential applications of industrial DT in the 6G era, identifying the emerging challenges as well as the key enabling technologies. The introduced ecosystem is intended to bridge the gaps between humans, machines, and the data infrastructure, and thereby enable numerous novel application scenarios. Comment: Accepted for publication in IEEE Open Journal of Vehicular Technology

    Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation

    This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past decade or so, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures adopted in which such tasks are organised; (b) highlight a number of relatively recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; (c) draw attention to the challenges in NLG evaluation, relating them to similar challenges faced in other areas of Natural Language Processing, with an emphasis on different evaluation methods and the relationships between them. Comment: Published in Journal of AI Research (JAIR), volume 61, pp 75-170. 118 pages, 8 figures, 1 table

    Making sense of text: artificial intelligence-enabled content analysis

    Purpose: The purpose of this paper is to introduce, apply and compare how artificial intelligence (AI), and specifically the IBM Watson system, can be used for content analysis in marketing research relative to manual and computer-aided (non-AI) approaches to content analysis. Design/methodology/approach: To illustrate the use of AI-enabled content analysis, this paper examines the text of leadership speeches, content related to organizational brand. The process and results of using AI are compared to manual and computer-aided approaches by using three performance factors for content analysis: reliability, validity and efficiency. Findings: Relative to manual and computer-aided approaches, AI-enabled content analysis provides clear advantages with high reliability, high validity and moderate efficiency. Research limitations/implications: This paper offers three contributions. First, it highlights the continued importance of the content analysis research method, particularly with the explosive growth of natural language-based user-generated content. Second, it provides a road map of how to use AI-enabled content analysis. Third, it applies and compares AI-enabled content analysis to manual and computer-aided approaches, using leadership speeches. Practical implications: For each of the three approaches, nine steps are outlined and described to allow for replicability of this study. The advantages and disadvantages of using AI for content analysis are discussed. Together these are intended to motivate and guide researchers to apply and develop AI-enabled content analysis for research in marketing and other disciplines. Originality/value: To the best of the authors' knowledge, this paper is among the first to introduce, apply and compare how AI can be used for content analysis.