Active inference, selective attention, and the cocktail party problem
In this paper, we introduce a new generative model for an active inference account of preparatory and selective attention, in the context of a classic 'cocktail party' paradigm. In this setup, pairs of words are presented simultaneously to the left and right ears, and an instructive spatial cue directs attention to the left or right. We use this generative model to test competing hypotheses about the way that human listeners direct preparatory and selective attention. We show that assigning low precision to words at attended, relative to unattended, locations can explain why a listener reports words from a competing sentence. Under this model, temporal changes in sensory precision were not needed to account for faster reaction times with longer cue-target intervals, but were necessary to explain ramping effects on event-related potentials (ERPs), resembling the contingent negative variation (CNV), during the preparatory interval. These simulations reveal that different processes are likely to underlie the improvement in reaction times and the ramping of ERPs that are associated with spatial cueing.
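The precision-weighting idea in this abstract can be sketched as a toy computation. To be clear, the paper's actual generative model is a full active inference scheme; the function and values below are purely illustrative. Lowering the precision assigned to the attended stream makes words from the competing stream more likely to be reported:

```python
import math

def report_probability(evidence_attended, evidence_unattended,
                       precision_attended, precision_unattended):
    """Softmax over precision-weighted evidence: the stream whose evidence
    carries higher precision is more likely to drive the listener's report."""
    a = precision_attended * evidence_attended
    u = precision_unattended * evidence_unattended
    za, zu = math.exp(a), math.exp(u)
    return za / (za + zu)  # probability of reporting the attended word

# With equal sensory evidence on both sides, high attended precision yields
# correct reports; pathologically low attended precision lets the competing
# sentence intrude, as in the simulated listening errors described above.
p_normal = report_probability(1.0, 1.0, 4.0, 1.0)
p_low    = report_probability(1.0, 1.0, 0.5, 1.0)
```

Here the report flips toward the unattended stream purely because of the precision assigned to each location, not because the evidence itself changed.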
Timbre from Sound Synthesis and High-level Control Perspectives
Exploring the many surprising facets of timbre through sound manipulations has been a common practice among composers and instrument makers of all times. The digital era radically changed the approach to sounds, thanks to the unlimited possibilities offered by computers, which made it possible to investigate sounds without physical constraints. In this chapter we describe investigations of timbre based on the analysis-by-synthesis approach, which consists in using digital synthesis algorithms to reproduce sounds and then modifying the parameters of the algorithms to investigate their perceptual relevance. In the first part of the chapter, timbre is investigated in a musical context. An examination of the sound quality of different wood species for xylophone making is presented first. Then the influence of instrumental control on timbre is described in the case of clarinet and cello performances. In the second part of the chapter, we focus on the identification of sound morphologies, the so-called invariant sound structures responsible for the evocations induced by environmental sounds, by relating basic signal descriptors and timbre descriptors to evocations in the case of car door noises, motor noises, solid objects, and their interactions.
The dynamics of reading development in L2 English for academic purposes
In a mixed-methods approach, this study investigates the complex and dynamic developmental trajectories of the English academic reading ability of 27 Chinese Chemistry-major undergraduates. Twelve parallel tests were designed, validated, and used weekly during one semester. The analyses included a group pre-post design to measure academic reading gains, a regression analysis to predict the beginning reading score with English proficiency and Chemistry knowledge as predictors, individual longitudinal case studies to measure variability and phase shifts, and a cluster analysis to discover (un)common developmental patterns. Finally, a qualitative study used interviews to discover difficulties in reading and strategies to overcome them. English proficiency predicted the initial reading score, and the group gained significantly in academic reading. Each learner showed different non-linear patterns, and the cluster analysis revealed few similar patterns among learners. The high gainers showed relatively more variability over time and used a wider variety of more sophisticated learning and reading strategies to improve.
Assessing the quality of audio and video components in desktop multimedia conferencing
This thesis seeks to address the HCI (Human-Computer Interaction) research problem of how to establish the level of audio and video quality that end users require to successfully perform tasks via networked desktop videoconferencing. There are currently no established HCI methods of assessing the perceived quality of audio and video delivered in desktop videoconferencing. The transport of real-time speech and video across new digital networks causes degradations, problems, and issues that are novel and different from those common in the traditional telecommunications areas (telephone and television). Traditional assessment methods involve the use of very short test samples, are conducted outside a task-based environment, and focus on whether a degradation is noticed or not. These methods cannot help establish what audio-visual quality users require to perform tasks successfully, with the minimum of user cost, in interactive conferencing environments. This thesis addresses this research gap by investigating and developing a battery of assessment methods for networked videoconferencing, suitable for use in both field trials and laboratory-based studies. The development and use of these new methods helps identify the most critical variables (and levels of those variables) that affect perceived quality, and means by which network designers and HCI practitioners can address these problems are suggested. The thesis therefore contributes both methodological knowledge (i.e. new rating scales and data-gathering methods) and substantive knowledge (i.e. explicit knowledge about quality requirements for certain tasks) to the HCI and networking research communities on the subjective quality requirements of real-time interaction in networked videoconferencing environments.
Exploratory research is carried out through an interleaved series of field trials and controlled studies, advancing substantive and methodological knowledge in an incremental fashion. Initial studies use the ITU-recommended assessment methods, but these are found to be unsuitable for assessing networked speech and video quality for a number of reasons. Later studies therefore investigate and establish a novel polar rating scale, which can be used both as a static rating scale and as a dynamic continuous slider. These methods, and further developments of them in future lab-based and real conferencing environments, will enable subjective quality requirements and guidelines for different videoconferencing tasks to be established.
Topology of spatial texture in the acousmatic medium
This research explores the dynamic fabric of experienced space in acousmatic music. The topology of spatial texture is a network of concepts treating music as a flexible, textural space, which deforms, shapes, and transforms in time. A comprehensive terminology is introduced, along with five fixed-media electroacoustic compositions, which exemplify a manifestation of spatial texture in composition and musical thinking.
The theory draws from research on the cross-modality of texture perception, philosophical discourse on embodied meaning, physics, psychology of visual art, and discourse on space in acousmatic music. Several different structural perspectives are discussed, which reveal how spatial texture incorporates lower sound-structural levels, materiality, states and processes, motion, global networks and terrains, and relationships between space and time. Emphasis is put on visual and physical connections with spatiality in the acousmatic experience: cogency in spatial structure and dynamics reinforces links among modalities.
The concepts and terminology are intended as a contribution to theory in the acousmatic medium, relevant to composition, analysis, and listening. The music represents an aesthetic orientation which emphasises materiality and morphology in texture, transformative processes, spatial design, and spatiotemporal polyvalence.
Basic emotions in read Estonian speech: acoustic analysis and modelling
The electronic version of this dissertation does not include the publications.
The present doctoral dissertation had two major purposes: (a) to find out and describe the acoustic expression of three basic emotions – joy, sadness and anger – in read Estonian speech, and (b) to create, based on the resulting description, acoustic models of emotional speech, designed to help parametric synthesis of Estonian speech recognizably express the above emotions.
Since synthetic speech has many applications in different fields, such as human-machine interaction, multimedia, or aids for the disabled, it is vital that synthetic speech sound natural, that is, as human-like as possible. One way to achieve naturalness is to add emotions to the synthetic speech by means of models feeding the synthesiser with the combinations of acoustic parameter values necessary for emotional expression.
In order to create such models of emotional speech, it is first necessary to know in detail how emotions are vocally expressed in human speech. For that purpose I investigated to what extent, if any, and in what direction emotions influence the values of acoustic speech parameters (e.g., fundamental frequency, intensity, and speech rate), and which parameters enable emotions to be discriminated from each other and from neutral speech. The results provided material for creating acoustic models of the emotions*, which were presented to evaluators, who were asked to decide which of the models helped to produce synthetic speech with recognisable emotions. The experiment showed that with models based on the acoustic analysis results, an Estonian speech synthesiser can satisfactorily express sadness and anger, while joy was not as well recognised by listeners.
This doctoral dissertation describes one of the possible ways for the vocal expression of joy, sadness and anger in Estonian speech and presents some models enabling addition of emotions to Estonian synthetic speech. The study serves as a starting point for the future development of acoustic models for Estonian emotional synthetic speech.
* Recorded examples of emotional speech synthesised using the test models can be accessed at https://www.eki.ee/heli/index.php?option=com_content&view=article&id=7&Itemid=494
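The modelling approach described above, feeding a parametric synthesiser per-emotion combinations of acoustic parameter values, can be illustrated with a minimal sketch. The offsets below are invented placeholders for illustration only, not the dissertation's fitted values:

```python
# Hypothetical per-emotion offsets to a neutral baseline (illustrative only):
EMOTION_MODELS = {
    "sadness": {"f0_hz": -20.0, "intensity_db": -3.0, "rate_factor": 0.85},
    "anger":   {"f0_hz": +15.0, "intensity_db": +4.0, "rate_factor": 1.10},
    "joy":     {"f0_hz": +25.0, "intensity_db": +2.0, "rate_factor": 1.05},
}

def apply_emotion(baseline, emotion):
    """Return synthesiser settings with the emotion model's offsets applied
    to the neutral baseline (additive for F0/intensity, multiplicative for rate)."""
    m = EMOTION_MODELS[emotion]
    return {
        "f0_hz": baseline["f0_hz"] + m["f0_hz"],
        "intensity_db": baseline["intensity_db"] + m["intensity_db"],
        "rate_syll_per_s": baseline["rate_syll_per_s"] * m["rate_factor"],
    }

neutral = {"f0_hz": 180.0, "intensity_db": 65.0, "rate_syll_per_s": 5.0}
sad = apply_emotion(neutral, "sadness")  # lower, quieter, slower than neutral
```

The point of the sketch is the structure, not the numbers: each emotion is a fixed combination of parameter adjustments that the synthesiser applies to otherwise neutral speech.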
Musical timbre: bridging perception with semantics
Musical timbre is a complex and multidimensional entity which provides information regarding the properties of a sound source (size, material, etc.). When it comes to music, however, timbre does not merely carry environmental information; it also conveys aesthetic meaning. In this sense, semantic description of musical tones is used to express perceptual concepts related to artistic intention. Recent advances in sound processing and synthesis technology have enabled the production of unique timbral qualities which cannot be easily associated with a familiar musical instrument. Verbal description of these qualities therefore facilitates communication between musicians, composers, producers, audio engineers, etc. The development of a common semantic framework for musical timbre description could be exploited by intuitive sound synthesis and processing systems and could even influence the way in which music is consumed.
This work investigates the relationship between musical timbre perception and its semantics. A set of listening experiments, in which participants from two different language groups (Greek and English) rated isolated musical tones on semantic scales, tested the semantic universality of musical timbre. The results suggested that the salient semantic dimensions of timbre, namely luminance, texture, and mass, are indeed largely common between these two languages. The relationship between semantics and perception was further examined by comparing the previously identified semantic space with a perceptual timbre space (resulting from pairwise dissimilarity ratings of the same stimuli). The two spaces shared a substantial amount of common variance, suggesting that semantic description can largely capture timbre perception. Additionally, the acoustic correlates of the semantic and perceptual dimensions were investigated. The work concludes by introducing the concept of a partial timbre through a listening experiment that demonstrates the influence of background white noise on the perception of musical tones. The results show that timbre is a relative percept which is influenced by the auditory environment.
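Comparing a semantic space with a perceptual (dissimilarity-based) space, as described above, amounts to asking how well the inter-stimulus distances of the two configurations agree. Below is a minimal sketch of that comparison with toy coordinates, not the thesis's data:

```python
import math

def pairwise_distances(points):
    """Euclidean distances between all pairs of configuration points."""
    out = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            out.append(math.dist(points[i], points[j]))
    return out

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = math.sqrt(sum((a - mx) ** 2 for a in x))
    vy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (vx * vy)

# Toy 2-D coordinates for four tones in a semantic space and in a
# perceptual (e.g. MDS-derived) space; correlated pairwise distances
# indicate shared variance between the two spaces.
semantic   = [(0, 0), (1, 0), (0, 1), (1, 1)]
perceptual = [(0, 0), (2, 0), (0, 2), (2, 2)]
r = pearson(pairwise_distances(semantic), pairwise_distances(perceptual))
```

A correlation near 1 means the two spaces order the stimuli in essentially the same way; with real configurations, a permutation (Mantel) test is one common way to assess the significance of such distance correlations.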
Effects of hearing loss on attentional effort during listening tasks
The purpose of this study was to determine the effects of listening task difficulty on attentional effort for normal hearing and hearing impaired individuals. The study utilized a dual-task paradigm involving an auditory task (primary task) of listening to speech in noise and a memory task (secondary task) of digit recall. The memory task was used to measure changes in attentional effort as the difficulty of the auditory task was changed.
The results of the study identified two main strategies for listening to speech in noise. The attentional effort strategy used by most subjects in this study was one of borrowing from the memory task as the auditory task became more difficult. The second attentional effort strategy used by a minority of subjects in this study was one of minimal borrowing from the memory task as the auditory task became more difficult. The group using the minimal borrowing strategy only demonstrated borrowing at the easiest auditory task condition. They did not show an increase in attentional effort as the task became more difficult.
Within the two main strategies, three sub-strategies were found to be used when the auditory task became very difficult or impossible. At this point, the majority of those using the borrowing strategy decreased attentional effort to the auditory task. This sub-strategy was labeled as punting. About one-third of those using the borrowing strategy continued to increase attentional effort to the auditory task, even when the auditory task became impossible. This sub-strategy was labeled as trying. One minimal borrowing subject used the punting sub-strategy when the auditory task became impossible. One minimal borrowing subject did not increase attentional effort to the auditory task at any difficulty level. This sub-strategy was labeled as constant attention to memory.
Hearing impaired and normal hearing subjects were about equally likely to use the borrowing strategy. Hearing impaired subjects were slightly more likely to use the punting sub-strategy.
Composition portfolio: producing techno grooves
PhD Thesis
Additional music to be consulted in Robinson Library only.
PhD submission consisting of a portfolio of recordings presented both in their published form on five twelve-inch vinyl records and on two compact discs. The portfolio is accompanied by a commentary intended to facilitate access to the aesthetic statement presented in the portfolio.
Following Charles Keil's distinction between 'embodied meaning' and 'engendered feeling', this project investigates approaches to the creation of Techno music in terms of the significance of specific sounds, techniques and technologies in generating subsyntactic value. The concept of the groove and the importance of microtiming, as they operate in my practice, are discussed in the commentary and demonstrated through the production of a series of Techno records, including both original compositions and remixes.
Gestural interaction in music for instruments and electroacoustic sounds
Doctorate in Music.
This dissertation presents some aspects of how the phenomenon of musical gesture can be understood in the perception of musical interaction in music for instruments and electroacoustic sounds. Through analytical examples and the classification and categorization of different kinds of gesture relationships between instruments and electroacoustic sounds, the aim is to establish specific models of interaction that can be applied as an analytical method as well as in musical composition. The research departs from a variety of previous approaches to gesture in music in general, and in contemporary music and electroacoustic music in particular, in order to include the relations between two sound events with different characteristics: the electroacoustic and the instrumental. It focuses on relations between musical gestures through the analysis of several characteristics (pitch, rhythm, timbre, dynamics, and contrapuntal, spectromorphological, semantic, and spatial features). The results of the theoretical research served as the basis for the composition of several works, in which these aspects are explored from the point of view of musical creation.