574 research outputs found

    PersoNER: Persian named-entity recognition

    Full text link
    Š 1963-2018 ACL. Named-Entity Recognition (NER) is still a challenging task for languages with low digital resources. The main difficulties arise from the scarcity of annotated corpora and the consequent problematic training of an effective NER pipeline. To abridge this gap, in this paper we target the Persian language that is spoken by a population of over a hundred million people world-wide. We first present and provide ArmanPerosNERCorpus, the first manually-annotated Persian NER corpus. Then, we introduce PersoNER, an NER pipeline for Persian that leverages a word embedding and a sequential max-margin classifier. The experimental results show that the proposed approach is capable of achieving interesting MUC7 and CoNNL scores while outperforming two alternatives based on a CRF and a recurrent neural network

    Access to recorded interviews: A research agenda

    Get PDF
    Recorded interviews form a rich basis for scholarly inquiry. Examples include oral histories, community memory projects, and interviews conducted for broadcast media. Emerging technologies offer the potential to radically transform the way in which recorded interviews are made accessible, but this vision will demand substantial investments from a broad range of research communities. This article reviews the present state of practice for making recorded interviews available and the state-of-the-art for key component technologies. A large number of important research issues are identified, and from that set of issues, a coherent research agenda is proposed

    Harvesting and summarizing user-generated content for advanced speech-based human-computer interaction

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012.Cataloged from PDF version of thesis.Includes bibliographical references (p. 155-164).There have been many assistant applications on mobile devices, which could help people obtain rich Web content such as user-generated data (e.g., reviews, posts, blogs, and tweets). However, online communities and social networks are expanding rapidly and it is impossible for people to browse and digest all the information via simple search interface. To help users obtain information more efficiently, both the interface for data access and the information representation need to be improved. An intuitive and personalized interface, such as a dialogue system, could be an ideal assistant, which engages a user in a continuous dialogue to garner the user's interest and capture the user's intent, and assists the user via speech-navigated interactions. In addition, there is a great need for a type of application that can harvest data from the Web, summarize the information in a concise manner, and present it in an aggregated yet natural way such as direct human dialogue. This thesis, therefore, aims to conduct research on a universal framework for developing speech-based interface that can aggregate user-generated Web content and present the summarized information via speech-based human-computer interaction. To accomplish this goal, several challenges must be met. Firstly, how to interpret users' intention from their spoken input correctly? Secondly, how to interpret the semantics and sentiment of user-generated data and aggregate them into structured yet concise summaries? Lastly, how to develop a dialogue modeling mechanism to handle discourse and present the highlighted information via natural language? This thesis explores plausible approaches to tackle these challenges. We will explore a lexicon modeling approach for semantic tagging to improve spoken language understanding and query interpretation. We will investigate a parse-and-paraphrase paradigm and a sentiment scoring mechanism for information extraction from unstructured user-generated data. We will also explore sentiment-involved dialogue modeling and corpus-based language generation approaches for dialogue and discourse. Multilingual prototype systems in multiple domains have been implemented for demonstration.by Jingjing Liu.Ph.D

    A Review on Human-Computer Interaction and Intelligent Robots

    Get PDF
    In the field of artificial intelligence, human–computer interaction (HCI) technology and its related intelligent robot technologies are essential and interesting contents of research. From the perspective of software algorithm and hardware system, these above-mentioned technologies study and try to build a natural HCI environment. The purpose of this research is to provide an overview of HCI and intelligent robots. This research highlights the existing technologies of listening, speaking, reading, writing, and other senses, which are widely used in human interaction. Based on these same technologies, this research introduces some intelligent robot systems and platforms. This paper also forecasts some vital challenges of researching HCI and intelligent robots. The authors hope that this work will help researchers in the field to acquire the necessary information and technologies to further conduct more advanced research
    • …
    corecore