
47 research outputs found

Data-Driven Audio Feature Space Clustering for Automatic Sound Recognition in Radio Broadcast News

Author: Alexandros Lazaridis
Casey M.
Dempster A. P.
Eyben F.
Iosif Mporas
Nikos Fakotakis
Perperis T.
Theodoros Theodorou
Wollmer M.
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 23/12/2016

This is an Open Access article published by World Scientific Publishing Company. It is distributed under the terms of the Creative Commons Attribution 4.0 (CC-BY) License. Further distribution of this work is permitted, provided the original work is properly cited. T. Theodorou, I. Mporas, A. Lazaridis, N. Fakotakis, 'Data-Driven Audio Feature Space Clustering for Automatic Sound Recognition in Radio Broadcast News', International Journal on Artificial Intelligence Tools, Vol. 26 (2), April 2017, 1750005 (13 pages), DOI: 10.1142/S021821301750005. © The Author(s).
In this paper we describe an automatic sound recognition scheme for radio broadcast news based on principal component clustering with respect to the discrimination ability of the principal components. Specifically, streams of broadcast news transmissions, labeled by audio event, are decomposed using a large set of audio descriptors and projected into the principal component space. A data-driven algorithm clusters the components according to their relevance. The resulting component subspaces are used by a sound type classifier. With this methodology, the k-nearest neighbor and the artificial intelligent network provide good results. Discarding unnecessary dimensions also works in favor of the outcome, as it hardly deteriorates the effectiveness of the algorithms. Peer reviewed

Crossref

University of Hertfordshire Research Archive

SEWA DB: A rich database for audio-visual emotion and sentiment research in the wild

Author: Hajiyev E
Han J
Kossaifi J
Panagakis Y
Pandit V
Pantic M
Ringeval F
Schmitt M
Schuller BW
Shen J
Star K
Toisoul A
Walecki R
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/10/2019

Natural human-computer interaction and audio-visual human behaviour sensing systems that achieve robust performance in the wild are needed more than ever as digital devices become an increasingly indispensable part of our lives. Accurately annotated real-world data are the crux in devising such systems. However, existing databases usually consider controlled settings, low demographic variability, and a single task. In this paper, we introduce the SEWA database of more than 2000 minutes of audio-visual data of 398 people coming from six cultures, 50% female, and uniformly spanning the age range of 18 to 65 years old. Subjects were recorded in two different contexts: while watching adverts and while discussing adverts in a video chat. The database includes rich annotations of the recordings in terms of facial landmarks, facial action units (FAU), various vocalisations, mirroring, and continuously valued valence, arousal, liking, agreement, and prototypic examples of (dis)liking. This database aims to be an extremely valuable resource for researchers in affective computing and automatic human sensing and is expected to push forward the research in human behaviour analysis, including cultural studies. Along with the database, we provide extensive baseline experiments for automatic FAU detection and automatic valence, arousal and (dis)liking intensity estimation.

arXiv.org e-Print Archive

Spiral - Imperial College Digital Repository

Interactive Technologies for the Public Sphere Toward a Theory of Critical Creative Technology

Author: Jennings Pamela Lynnette
Publication venue: 'University of Plymouth'
Publication date: 01/01/2006

Digital media cultural practices continue to address the social, cultural and aesthetic contexts of the global information economy, perhaps better called ecology, by inventing new methods and genres that encourage interactive engagement, collaboration, exploration and learning. The theoretical framework for creative critical technology evolved from the confluence of the arts, human computer interaction, and critical theories of technology. Molding this nascent theoretical framework from these seemingly disparate disciplines was a reflexive process where the influence of each component on each other spiraled into the theory and practice as illustrated through the Constructed Narratives project. Research that evolves from an arts perspective encourages experimental processes of making as a method for defining research principles. The traditional reductionist approach to research requires that all confounding variables are eliminated or silenced using methods of statistics. However, that noise in the data, those confounding variables provide the rich context, media, and processes by which creative practices thrive. As research in the arts gains recognition for its contributions of new knowledge, the traditional reductive practice in search of general principles will be respectfully joined by methodologies for defining living principles that celebrate and build from the confounding variables, the data noise. The movement to develop research methodologies from the noisy edges of human interaction has been explored in the research and practices of ludic design and ambiguity (Gaver, 2003); affective gap (Sengers et al., 2005b; 2006); embodied interaction (Dourish, 2001); the felt life (McCarthy & Wright, 2004); and reflective HCI (Dourish, et al., 2004).
The theory of critical creative technology examines the relationships between critical theories of technology, society and aesthetics, information technologies and contemporary practices in interaction design and creative digital media. The theory of critical creative technology is aligned with theories and practices in social navigation (Dourish, 1999) and community-based interactive systems (Stathis, 1999) in the development of smart appliances and network systems that support people in engaging in social activities, promoting communication and enhancing the potential for learning in a community-based environment. The theory of critical creative technology amends these community-based and collaborative design theories by emphasizing methods to facilitate face-to-face dialogical interaction when the exchange of ideas, observations, dreams, concerns, and celebrations may be silenced by societal norms about how to engage others in public spaces. The Constructed Narratives project is an experiment in the design of a critical creative technology that emphasizes the collaborative construction of new knowledge about one's lived world through computer-supported collaborative play (CSCP). To construct is to creatively invent one's world by engaging in creative decision-making, problem solving and acts of negotiation. The metaphor of construction is used to demonstrate how a simple artefact - a building block - can provide an interactive platform to support discourse between collaborating participants. The technical goal for this project was the development of a software and hardware platform for the design of critical creative technology applications that can process a dynamic flow of logistical and profile data from multiple users to be used in applications that facilitate dialogue between people in a real-time playful interactive experience

Plymouth Electronic Archive and Research Library

OpenGrey Repository

Recent Advances in Social Data and Artificial Intelligence 2019

Publication venue: 'MDPI AG'
Publication date: 12/08/2022

The importance and usefulness of subjects and topics involving social data and artificial intelligence are becoming widely recognized. This book contains invited review, expository, and original research articles dealing with, and presenting state-of-the-art accounts of, the recent advances in the subjects of social data and artificial intelligence, and potentially their links to Cyberspace.

Directory of Open Access Books (DOAB)

Decoding the Mental States of Focus and Distraction in a Real Life Setting of Tibetan Monastic Debates Using EEG and Machine Learning

Author: Kaushik Pallavi
Roy Partha Pratim
van Vugt Marieke
Publication venue: Applied Cognitive Science Lab, Penn State
Publication date: 01/01/2020

ARTS repository - University of Groningen

Grafting Acoustic Instruments and Signal Processing: Creative Control and Augmented Expressivity

Author: Freed Adrian
Overholt Daniel
Publication date: 01/01/2013

VBN

Robust multi-stream keyword and non-linguistic vocalization detection for computationally intelligent virtual agents

Author: Marchi Erik
Schuller Björn
Squartini Stefano
Wöllmer Martin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 04/05/2020

OPUS Augsburg

Proceedings of the 7th Sound and Music Computing Conference

Author: Emilia Gómez
Perfecto Herrera
Rafael Ramirez
Publication venue: SMC Network
Publication date: 25/07/2010

Proceedings of the SMC2010 - 7th Sound and Music Computing Conference, July 21st - July 24th 2010

ZENODO

Robust Multi-stream Keyword and Non-linguistic Vocalization Detection for Computationally Intelligent Virtual Agents

Author: Marchi E.
Schuller B.
Squartini S.
Woellmer M.
Publication venue: Springer Verlag, Tiergartenstrasse 17, D 69121 Heidelberg, Germany (http://www.springer.de)

Systems for keyword and non-linguistic vocalization detection in conversational agent applications need to be robust with respect to background noise and different speaking styles. Focussing on the Sensitive Artificial Listener (SAL) scenario which involves spontaneous, emotionally colored speech, this paper proposes a multi-stream model that applies the principle of Long Short-Term Memory to generate context-sensitive phoneme predictions which can be used for keyword detection. Further, we investigate the incorporation of noisy training material in order to create noise robust acoustic models. We show that both strategies can improve recognition performance when evaluated on spontaneous human-machine conversations as contained in the SEMAINE database

IRIS Università Politecnica delle Marche