133 research outputs found

    Role of sentiment classification in sentiment analysis: a survey

    This survey reviews the role of sentiment classification in sentiment analysis and identifies the research challenges involved in tackling sentiment classification. A total of 68 articles published from 2015 to 2017 were reviewed along six dimensions: sentiment classification, feature extraction, cross-lingual sentiment classification, cross-domain sentiment classification, lexica and corpora creation, and multi-label sentiment classification. The study discusses the prominence and effects of sentiment classification in sentiment evaluation and concludes that considerable further research is needed to obtain productive results.

    Suomenkielisen puhepohjaisen dialogijärjestelmän kehitys koulutusrobottiin (Development of a Finnish-language spoken dialogue system for an educational robot)

    Spoken dialog systems are becoming part of everyday life, for example in personal assistants such as Apple's Siri. However, spoken dialog systems could also be used in a much wider range of products. In this thesis, a spoken dialog system prototype was developed for use in an educational robot. The main problem in an educational robot is recognizing children's speech. Children's speech varies significantly between speakers, which makes it difficult to recognize with a single acoustic model. The thesis focuses mainly on speech recognition and acoustic model adaptation. The acoustic model is trained with data gathered from adults and then adapted with data from children so that it gives better results on children's speech. The adaptation is done in two ways: for each speaker separately and as an average child adaptation. The results are compared to the commercial speech recognizer developed by Google Inc. The experiments show that adapting the adult model with data from each speaker separately decreases the word error rate (WER) from 8.1 % to 2.4 %, and the average child adaptation decreases it to 3.1 %. These results were obtained with vocal tract length normalization (VTLN) and constrained maximum likelihood linear regression (CMLLR) combined; the two techniques were also applied separately. In comparison, the word error rate of the commercial recognizer is 7.4 %.
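The abstract above reports its results as word error rate (WER). As a rough illustration of what that figure measures, here is a minimal sketch that computes WER as the word-level edit distance divided by the reference length, plus the relative reduction implied by the reported percentages. The function names and the example sentences are assumptions made for illustration; the thesis does not describe its scoring tool.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Illustrative helper only; whitespace tokenization is an assumption.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only one previous row in memory.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which an error rate dropped relative to the baseline."""
    return (baseline - improved) / baseline


if __name__ == "__main__":
    # Made-up sentences: one substitution and one deletion over six words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
    # WER figures quoted in the abstract (percent).
    print(relative_reduction(8.1, 2.4))  # speaker-specific adaptation
    print(relative_reduction(8.1, 3.1))  # average child adaptation
```

With the figures quoted in the abstract, the speaker-specific adaptation corresponds to a relative WER reduction of about 70 %, and the average child adaptation to about 62 %.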

    Text Analytics: the convergence of Big Data and Artificial Intelligence

    The analysis of the text content in emails, blogs, tweets, forums and other forms of textual communication constitutes what we call text analytics. Text analytics is applicable to most industries: it can help analyze millions of emails, analyze customers’ comments and questions in forums, and perform sentiment analysis by measuring positive or negative perceptions of a company, brand, or product. Text Analytics has also been called text mining, and is a subcategory of the Natural Language Processing (NLP) field, one of the founding branches of Artificial Intelligence, dating back to the 1950s, when an interest in understanding text originally developed. Currently Text Analytics is often considered the next step in Big Data analysis. Text Analytics has a number of subdivisions: Information Extraction, Named Entity Recognition, Semantic Web annotated domain representation, and many more. Several techniques are currently in use, and some, such as Machine Learning, have gained a lot of attention for the semi-supervised enhancement of systems they offer; however, they also have limitations, so they are not always the only or the best choice. We conclude with current and near-future applications of Text Analytics.

    On the Development of Adaptive and User-Centred Interactive Multimodal Interfaces

    Multimodal systems have attracted increased attention in recent years, which has made possible important improvements in the technologies for recognition, processing, and generation of multimodal information. However, many issues related to multimodality remain unclear, for example the principles that would allow such systems to resemble human-human multimodal communication. This chapter focuses on some of the most important challenges that researchers have recently envisioned for future multimodal interfaces. It also describes current efforts to develop intelligent, adaptive, proactive, portable and affective multimodal interfaces.

    Searching Spontaneous Conversational Speech: Proceedings of ACM SIGIR Workshop (SSCS2008)


    Harvesting and summarizing user-generated content for advanced speech-based human-computer interaction

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012, by Jingjing Liu. Cataloged from PDF version of thesis. Includes bibliographical references (p. 155-164). Many assistant applications on mobile devices can help people obtain rich Web content such as user-generated data (e.g., reviews, posts, blogs, and tweets). However, online communities and social networks are expanding rapidly, and it is impossible for people to browse and digest all the information via a simple search interface. To help users obtain information more efficiently, both the interface for data access and the information representation need to be improved. An intuitive and personalized interface, such as a dialogue system, could be an ideal assistant: it engages a user in a continuous dialogue to garner the user's interest and capture the user's intent, and assists the user via speech-navigated interactions. In addition, there is a great need for a type of application that can harvest data from the Web, summarize the information in a concise manner, and present it in an aggregated yet natural way such as direct human dialogue. This thesis therefore aims to conduct research on a universal framework for developing a speech-based interface that can aggregate user-generated Web content and present the summarized information via speech-based human-computer interaction. To accomplish this goal, several challenges must be met. First, how can users' intentions be interpreted correctly from their spoken input? Second, how can the semantics and sentiment of user-generated data be interpreted and aggregated into structured yet concise summaries? Last, how can a dialogue modeling mechanism be developed to handle discourse and present the highlighted information via natural language? This thesis explores plausible approaches to tackle these challenges. We will explore a lexicon modeling approach for semantic tagging to improve spoken language understanding and query interpretation. We will investigate a parse-and-paraphrase paradigm and a sentiment scoring mechanism for information extraction from unstructured user-generated data. We will also explore sentiment-involved dialogue modeling and corpus-based language generation approaches for dialogue and discourse. Multilingual prototype systems in multiple domains have been implemented for demonstration.

    Development of a text mining approach to disease network discovery

    Scientific literature is one of the major sources of knowledge for systems biology, in the form of papers, patents and other types of written reports. Text mining methods aim at automatically extracting relevant information from the literature. The hypothesis of this thesis was that biological systems could be elucidated by the development of text mining solutions that can automatically extract relevant information from documents. The first objective consisted in developing software components to recognize biomedical entities in text, which is the first step to generate a network about a biological system. To this end, a machine learning solution was developed, which can be trained for specific biological entities using an annotated dataset, obtaining high-quality results. Additionally, a rule-based solution was developed, which can be easily adapted to various types of entities. The second objective consisted in developing an automatic approach to link the recognized entities to a reference knowledge base. A solution based on the PageRank algorithm was developed in order to match the entities to the concepts that most contribute to the overall coherence. The third objective consisted in automatically extracting relations between entities, to generate knowledge graphs about biological systems. Due to the lack of annotated datasets available for this task, distant supervision was employed to train a relation classifier on a corpus of documents and a knowledge base. The applicability of this approach was demonstrated in two case studies: microRNA-gene relations for cystic fibrosis, obtaining a network of 27 relations using the abstracts of 51 recently published papers; and cell-cytokine relations for tolerogenic cell therapies, obtaining a network of 647 relations from 3264 abstracts. Through a manual evaluation, the information contained in these networks was determined to be relevant. Additionally, a solution combining deep learning techniques with ontology information was developed, to take advantage of the domain knowledge provided by ontologies. This thesis contributed several solutions that demonstrate the usefulness of text mining methods to systems biology by extracting domain-specific information from the literature. These solutions make it easier to integrate various areas of research, leading to a better understanding of biological systems.
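The abstract above mentions a PageRank-based step that links recognized entities to the knowledge-base concepts that contribute most to overall coherence. The following is a minimal sketch of that general idea, not the thesis's method: the graph construction, the function names `pagerank` and `link_entities`, and the toy mentions, concepts and relations are all assumptions made for illustration.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a dict {node: set_of_neighbours}."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            neighbours = graph[node]
            if neighbours:
                share = damping * rank[node] / len(neighbours)
                for other in neighbours:
                    new_rank[other] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for other in nodes:
                    new_rank[other] += damping * rank[node] / n
        rank = new_rank
    return rank


def link_entities(candidates, related):
    """Pick, for each mention, the candidate concept with the highest PageRank.

    candidates: {mention: [candidate concept ids]}   (hypothetical input shape)
    related:    iterable of (concept, concept) pairs known to be related
    """
    graph = {c: set() for cands in candidates.values() for c in cands}
    for a, b in related:
        if a in graph and b in graph:
            graph[a].add(b)
            graph[b].add(a)
    scores = pagerank(graph)
    return {m: max(cands, key=lambda c: scores[c]) for m, cands in candidates.items()}


if __name__ == "__main__":
    # Toy data, invented for this sketch.
    candidates = {
        "CF": ["cystic_fibrosis", "cardiac_failure"],
        "CFTR": ["CFTR_gene"],
        "miR-145": ["hsa-miR-145"],
    }
    related = [("cystic_fibrosis", "CFTR_gene"), ("CFTR_gene", "hsa-miR-145")]
    print(link_entities(candidates, related))
```

In this toy graph the ambiguous mention "CF" resolves to the concept that is connected to the other candidates in the document, which is the coherence intuition the abstract describes. A real system would need a principled way to build and weight the edges.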

    A Spoken Dialogue Analysis Platform for Effective Counselling

    This paper proposes a spoken dialogue analysis platform (SDAP) that could assist counsellors in person-to-person counselling by analysing counselling conversations and providing key information that could enhance the counsellors' understanding of the counselees' conditions and situations. The proposed platform has two main modules, a speech recognition module and a text analysis module, both built specifically for the Korean language. The speech recognition module uses the NAVER CLOVA Speech service to convert voice recordings of counselling dialogues into text. The Korean text analysis environment of the text analysis module was built using the NLTK, KoNLPy and scikit-learn libraries, and, for now, the module provides two types of text analysis: keyword analysis and sentiment analysis. The results of the text analyses, which provide keywords and an analysis of customers' emotional state, can help counsellors give appropriate feedback to the counselees more easily and quickly, making the counselling fast and effective and reducing the counselees' waiting time. In the experiments, the process of building the text analysis module is described in detail, and the usefulness of the proposed SDAP is illustrated by case studies on actual counselling conversations at a dental clinic and a fitness centre.

    Multimedia Retrieval
