
    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS Think-Tank and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives that measure the performance of multimedia search engines. From a socio-economic perspective, we inventory the impact and legal consequences of these technical advances and point out future directions of research.

    Emotion-aware voice interfaces based on speech signal processing

    Voice interfaces (VIs) will become increasingly widespread in daily life as AI techniques progress. VIs can be incorporated into smart devices like smartphones, as well as integrated into cars, home automation systems, computer operating systems, and home appliances, among other things. Current speech interfaces, however, are unaware of users' emotional states and hence cannot support natural communication. To overcome this limitation, future VIs need emotional awareness. This thesis focuses on how speech signal processing (SSP) and speech emotion recognition (SER) can enable VIs to gain emotional awareness. Following an explanation of what emotion is and how neural networks are implemented, the thesis presents the results of several user studies and surveys. Emotions are complicated; they are typically characterized using categorical and dimensional models and can be expressed verbally or nonverbally. Although existing voice interfaces cannot perceive users' emotional states, future VIs could infer them from speech using SSP. One section of this thesis, based on SSP, investigates mental restorative effects on humans and how to measure them from speech signals. SSP is less intrusive and more accessible than traditional measures such as attention scales or response tests, and it can provide a reliable assessment of attention and mental restoration. SSP can be implemented in future VIs and used in future HCI user research. The thesis then presents a novel attention neural network based on sparse correlation features. Its accuracy in detecting emotions in continuous speech was demonstrated, with promising results, in a user study using recordings from a real classroom. In SER research, it is unknown whether existing emotion detection methods detect acted emotions or the genuine emotion of the speaker.
Another section of this thesis is concerned with humans' ability to act out emotions. In a user study, participants were instructed to imitate five fundamental emotions. The results revealed that they struggled with this task; nevertheless, certain emotions were easier to replicate than others. A further research question is how VIs should respond to users' emotions once SER techniques are implemented in VIs and can recognize them. The thesis therefore includes research on ways of dealing with users' emotions. In a user study, participants were instructed to make sad, angry, and frightened VI avatars happy, and were asked whether they would like to be treated the same way if the situation were reversed. According to the results, the majority of participants tended to respond to these unpleasant emotions with a neutral emotion, but there is a difference between genders in emotion selection. For a human-centered design approach, it is important to understand users' preferences for future VIs. A questionnaire-based survey on users' attitudes towards and preferences for emotion-aware VIs was conducted in three distinct cultures. It found almost no gender differences. Cluster analysis identified three fundamental user types that exist in all cultures: Enthusiasts, Pragmatists, and Sceptics. Future VI development should therefore consider diverse sorts of users. In conclusion, future VI systems should be designed for various sorts of users and should be able to detect users' disguised or actual emotions using SER and SSP technologies. Furthermore, many other applications, such as restorative-effects assessment, can be included in a VI system.
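    The abstract names an "attention neural network based on sparse correlation features" but does not specify it. As a rough illustration of the general mechanism such SER models build on (not the thesis's actual architecture; the attention vector `w` and feature dimensions are hypothetical), the sketch below pools frame-level speech features into one utterance-level embedding by attention weighting:

    ```python
    import numpy as np

    def softmax(x):
        # numerically stable softmax
        e = np.exp(x - x.max())
        return e / e.sum()

    def attention_pool(frames, w):
        # frames: (T, D) frame-level speech features; w: (D,) learned attention vector
        scores = frames @ w       # (T,) relevance score per frame
        alpha = softmax(scores)   # attention weights, sum to 1
        return alpha @ frames     # (D,) weighted utterance-level embedding

    rng = np.random.default_rng(0)
    frames = rng.normal(size=(50, 8))  # e.g. 50 frames of 8-dim features
    w = rng.normal(size=8)
    emb = attention_pool(frames, w)
    print(emb.shape)  # (8,)
    ```

    In a trained model, `w` would be learned so that emotionally salient frames receive higher weight; here it is random, purely to show the shape of the computation.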

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    After addressing the state of the art during the first year of CHORUS and establishing the existing landscape in multimedia search engines, we identified and analyzed gaps in the European research effort during our second year. In this period we focused on three directions, namely technological issues, user-centred issues and use-cases, and socio-economic and legal aspects. These were assessed in two central studies: firstly, a concerted vision of the functional breakdown of a generic multimedia search engine, and secondly, representative use-case descriptions with the related discussion of requirements for technological challenges. Both studies were carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations at international conferences, and surveys addressed to EU project coordinators as well as national initiative coordinators. Based on the feedback obtained, we identified two types of gaps: core technological gaps that involve research challenges, and "enablers", which are not necessarily technical research challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal challenges.

    ACII 2009: Affective Computing and Intelligent Interaction. Proceedings of the Doctoral Consortium 2009


    Automatic Classification of Irony Using Figurative Language and Contextual Markers

    Master's thesis, Seoul National University Graduate School, Department of Linguistics, August 2014 (advisor: Hyopil Shin).This thesis proposes a linguistics-based irony detection method which uses frequently co-occurring figurative languages to identify areas where irony is likely to occur. The detection and proper interpretation of irony and other figurative languages represents an important area of research for Computational Linguistics. Since figurative languages typically convey meanings which differ from their literal interpretations, interpreting such utterances at face value is likely to give incorrect results. Irony in particular represents a special challenge: unlike some figurative languages, such as hyperbole or understatement, which express sentiments more-or-less in line with their literal interpretation and differ only in intensity, ironic utterances convey intended meanings incongruent with (or even the exact opposite of) their literal interpretation. Compounding the need for effective irony detection is irony's near-ubiquitous use in online writings and computer-mediated communications, both of which are commonly used in Computational Linguistics experiments. While irony in spoken contexts tends to be denoted using prosody, irony in written contexts is much harder to detect. One of the major difficulties is that irony typically does not present with any explicit clues such as punctuation marks or verbal inflections. Instead, irony tends to be denoted using paralinguistic, contextual, or pragmatic cues. Among these are co-occurring figurative languages such as hyperbole, understatement, rhetorical questions, tag questions, or other ironic utterances, which alert the listener that the speaker does not expect to be interpreted literally.
This thesis introduces a divide-and-conquer approach to irony detection in which co-occurring figurative languages are identified independently and then fed into an overall irony detector. Experiments on both short-form Twitter tweets and longer-form Amazon product reviews show not only that co-textual figurative languages are useful in the automatic classification of irony, but also that identifying these co-occurring figurative languages separately yields better overall irony detection by resolving conflicts between competing features, such as those for hyperbole and understatement. This thesis also introduces detection methods for hyperbole and understatement in general contexts by adapting existing approaches to irony detection. Before this point, hyperbole detection had focused only on specialized contexts, while understatement detection had been largely ignored. Experiments show that these proposed automated hyperbole and understatement detection methods outperformed methods which rely on fixed vocabularies.
1 Introduction
1.1 What is Irony?
1.2 Irony and Co-textual Markers
1.2.1 Hyperbole
1.2.2 Understatement
1.2.3 Rhetorical Questions
1.2.4 Tag Questions
2 Previous Works
2.1 Irony Detection
2.2 Detection of Co-textual Markers
3 Data Collection
3.1 Twitter Data
3.1.1 Twitter Irony Corpus
3.1.2 Twitter Hyperbole Corpus
3.1.3 Twitter Understatement Corpus
3.2 Amazon Data
4 Experimental Set-up
4.1 Hyperbole Detection
4.2 Understatement Detection
4.3 Rhetorical Question Detection
4.4 Tag Question Detection
4.5 Irony Detection
4.5.1 Twitter Data
4.5.2 Amazon Product Review Data
5 Results and Discussion
5.1 Hyperbole
5.2 Understatement
5.3 Irony
5.3.1 Twitter
5.3.2 Amazon Product Reviews
6 Conclusions and Future Work
7 References
Appendix 1 Hyperbole Word List
Appendix 2 Hedge Word List
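The divide-and-conquer pipeline can be sketched roughly as follows. This is not the thesis's actual system: the cue lexicons, the rule-based marker detectors, and the marker-counting combiner below are hypothetical stand-ins for its trained classifiers and feature sets, purely to show the shape of "detect each co-textual marker independently, then combine":

```python
import re

# Hypothetical cue lexicons for illustration only; the thesis uses
# trained detectors and much richer word lists (see Appendices 1-2).
HYPERBOLE_CUES = {"totally", "absolutely", "best", "worst", "ever"}
UNDERSTATEMENT_CUES = {"slightly", "somewhat", "a bit", "not bad"}

def detect_hyperbole(text):
    words = set(re.findall(r"[a-z']+", text.lower()))
    return bool(words & HYPERBOLE_CUES)

def detect_understatement(text):
    t = text.lower()
    return any(cue in t for cue in UNDERSTATEMENT_CUES)

def detect_rhetorical_question(text):
    # crude proxy: a wh-question mark-final sentence
    t = text.strip().lower()
    return t.endswith("?") and t.startswith(("who", "what", "why"))

def detect_tag_question(text):
    # crude proxy: a trailing ", isn't it?"-style tag
    pattern = r",\s*(isn't|doesn't|won't|right)( (it|he|she|you))?\??$"
    return bool(re.search(pattern, text.strip().lower()))

def irony_features(text):
    # each co-textual marker is detected independently
    return [detect_hyperbole(text), detect_understatement(text),
            detect_rhetorical_question(text), detect_tag_question(text)]

def predict_irony(text, threshold=1):
    # toy combiner: flag irony when enough markers co-occur;
    # the thesis instead feeds these features into a trained detector
    return sum(irony_features(text)) >= threshold

print(predict_irony("This is absolutely the best rain ever, isn't it?"))  # True
```

Because each marker is detected separately, a downstream combiner can weigh conflicting evidence (e.g. hyperbole versus understatement cues in the same utterance) instead of collapsing everything into one feature space, which is the conflict-resolution benefit the abstract describes.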