112 research outputs found

    Learning the hidden structure of speech: from communicative functions to prosody

    This article introduces a new method, driven by modelling and by interaction with behavioural data, for generating prosodic patterns from metalinguistic information. We refer here to the general ability of intonation to demarcate speech units and to convey information about the propositional and interactional functions of those units in discourse. Our strong hypotheses are that (1) these functions are directly implemented as prototypical prosodic contours that are coextensive with the units to which they apply, and (2) the prosodic pattern of the message is obtained by superposing and adding all the elementary contours (Aubergé & Bailly, 1995). We describe here an analysis-by-synthesis scheme that consists of identifying these prototypical contours and separating their respective contributions to the prosodic contours of the training data. The scheme is applied to databases designed to highlight various intonational functions. Experimental results show that the model generates adequate prosodic contours with very few prototypical movements

    Intonation in a text-to-speech conversion system

    CLiFF Notes: Research In Natural Language Processing at the University of Pennsylvania

    The Computational Linguistics Feedback Forum (CLIFF) is a group of students and faculty who gather once a week to discuss the members' current research. As the word feedback suggests, the group's purpose is the sharing of ideas. The group also promotes interdisciplinary contacts between researchers who share an interest in Cognitive Science. There is no single theme describing the research in Natural Language Processing at Penn. There is work done in CCG, Tree adjoining grammars, intonation, statistical methods, plan inference, instruction understanding, incremental interpretation, language acquisition, syntactic parsing, causal reasoning, free word order languages, ... and many other areas. With this in mind, rather than trying to summarize the varied work currently underway here at Penn, we suggest reading the following abstracts to see how the students and faculty themselves describe their work. Their abstracts illustrate the diversity of interests among the researchers, explain the areas of common interest, and describe some very interesting work in Cognitive Science. This report is a collection of abstracts from both faculty and graduate students in Computer Science, Psychology and Linguistics. We pride ourselves on the close working relations between these groups, as we believe that the communication among the different departments and the ongoing inter-departmental research not only improves the quality of our work, but makes much of that work possible

    Suivi temporel de stimuli dynamiques interférants par marquage du plan temps-fréquence utilisant une statistique de passages par zéro

    Within the framework of Computational Auditory Scene Analysis (CASA), this paper presents a model that labels the time-frequency plane by detecting harmonicity. The originality of the model lies in its use of a zero-crossing statistic of the time-domain signal for the labelling; this statistic provides a measure of the labelling's reliability through the standard deviation of the lengths of first-order intervals between zero crossings. After presenting the model and its behaviour, we show that it can be used to track dynamic stimuli exhibiting strong prosodic variations

    Decorative Timbre: Integrating characteristics of Spectral and Dastgah music

    Decorative Timbre is a portfolio of original compositions and an accompanying written dissertation. In this thesis, I propose a new musical language synthesising the expressive elements of Western Spectral and Persian Dastgah music via the marriage of timbre and ornamentation. Persian and Spectral music are two fundamentally distinct musical approaches derived from different philosophies and traditions, each possessing a particular value and aesthetic. However, in researching mutual characteristics and modalities, I draw connections between the two forms of music under the concept of decorative timbre. I discuss approaches to 'converting a melody to timbre and vice versa' and offer a new compositional technique of 'excessive multilayering' that is inspired by commonalities between the two traditions. The portfolio comprises four works that explore the application of excessive multilayering: Abalfazl, War is Peace, Let me Tune, and Beautifully Untuned Mind. The centrepiece of my creative portfolio, Panbe Zan (the cotton beater), is an experimental electroacoustic opera that recreates and recontextualizes the forgotten sounds of an obsolete profession 'Panbe Zani (Cotton Beating).' Featuring a redesigned bow-shaped instrument together with live musicians, pre-recorded and manipulated sounds, and staging, the work portrays this nostalgic scene in a modern context

    Negative vaccine voices in Swedish social media

    Vaccinations are one of the most significant public health interventions, but vaccine hesitancy creates concerns for a portion of the population in many countries, including Sweden. Since discussions on vaccine hesitancy often take place on social networking sites, data from Swedish social media are used to study and quantify the sentiment among the discussants on the vaccination-or-not topic during phases of the COVID-19 pandemic. Of all the posts analyzed, a majority showed a stronger negative sentiment, which prevailed throughout the examined period, with some spikes distinguishable in the results that coincide with certain vaccine-related events. Sentiment analysis can be a valuable tool to track public opinions regarding the use, efficacy, safety, and importance of vaccination

    Jeddah Arabic intonation : an autosegmental-metrical approach

    PhD Thesis. This thesis is a theoretical and instrumental investigation of intonation in Jeddah Arabic, an urban Arabic variety spoken in western Saudi Arabia. The study is carried out in an attempt to establish the dialect's prosodic properties and to widen the scope and volume of the literature on Arabic prosody, which would in turn aid the cross-dialectal comparison of prosodic and intonational patterns. The investigation is carried out in light of the Autosegmental-Metrical (AM) theory of intonation, a theory that has been reported to account for the intonational patterns of many languages. In AM theory, intonation is manifested via prominent F0 behaviour in interaction with phonological structure, and hence maintains a close relationship between accent distribution and phonological/metrical structure. This F0 behaviour is examined acoustically through pitch level, range and excursion size, in the form of increased peak height and excursion, pitch compression or absence thereof to mark intonational structure. In addition to pitch, other acoustic correlates such as duration and amplitude are examined as well. The thesis includes the examination of the different tunes, postlexical phrasing, and accent categories (contour shapes) that occur in the dialect. Moreover, and as an integral part of AM analysis, the thesis closely examines both theoretically and acoustically the concepts of tonal alignment and accentuation and information structure in this Arabic dialect. Data for the study were collected from 20 native male and female speakers of Jeddah Arabic. Data were then semiautomatically segmented and manually transcribed using a modified ToBI system for Arabic. It is found that JA speakers rely on both qualitative and quantitative detail to enhance intonationally important material that is conveyed prosodically. 
The results also indicate that JA is a stress-accent language that, although similar to other languages in this group, contributes differently to the general cross-language prosodic variation. The dialect demonstrates prominent pitch accents that faithfully associate and align with stressed syllables and are distributed in two intonational levels above the prosodic word: the intermediate phrase and the intonational phrase. Those two intonational levels are found to be marked by both tonal and non-tonal correlates. Experimental evidence shows that, contrary to the typically reported correlates of those prosodic constituents, in JA intermediate phrase boundaries demonstrate longer pre-boundary units than intonational phrases. This non-tonal pattern in intermediate phrase boundaries correlates with later alignment of the tone with respect to the onset of the stressed syllable

    The Perception of Emotion from Acoustic Cues in Natural Speech

    Knowledge of human perception of emotional speech is imperative for the development of emotion in speech recognition systems and emotional speech synthesis. Owing to the fact that there is a growing trend towards research on spontaneous, real-life data, the aim of the present thesis is to examine human perception of emotion in naturalistic speech. Although there are many available emotional speech corpora, most contain simulated expressions. Therefore, there remains a compelling need to obtain naturalistic speech corpora that are appropriate and freely available for research. In that regard, our initial aim was to acquire suitable naturalistic material and examine its emotional content based on listener perceptions. A web-based listening tool was developed to accumulate ratings based on large-scale listening groups. The emotional content present in the speech material was demonstrated by performing perception tests on conveyed levels of Activation and Evaluation. As a result, labels were determined that signified the emotional content, and thus contribute to the construction of a naturalistic emotional speech corpus. In line with the literature, the ratings obtained from the perception tests suggested that Evaluation (or hedonic valence) is not identified as reliably as Activation is. Emotional valence can be conveyed through both semantic and prosodic information, for which the meaning of one may serve to facilitate, modify, or conflict with the meaning of the other—particularly with naturalistic speech. The subsequent experiments aimed to investigate this concept by comparing ratings from perception tests of non-verbal speech with verbal speech. The method used to render non-verbal speech was low-pass filtering, and for this, suitable filtering conditions were determined by carrying out preliminary perception tests. The results suggested that nonverbal naturalistic speech provides sufficiently discernible levels of Activation and Evaluation. 
It appears that the perception of Activation and Evaluation is affected by low-pass filtering, but that the effect is relatively small. Moreover, the results suggest that there is a similar trend in agreement levels between verbal and non-verbal speech. To date it still remains difficult to determine unique acoustical patterns for hedonic valence of emotion, which may be due to inadequate labels or the incorrect selection of acoustic parameters. This study has implications for the labelling of emotional speech data and the determination of salient acoustic correlates of emotion