467 research outputs found
PERCEVAL
La station PERCEVAL permet de tester jusqu'à 16 sujets simultanément. Elle a été développée pour répondre à la grande demande de test d'intelligibilité.Conçue autour d'un PC dans l'environnement WINDOWS, elle permet de répondre rapidement à des contrats (C. Cavé et al).ANDRÉ, C.; GHIO, A.; CAVÉ, C.; TESTON, B. PERCEVAL: a Computer-Driven System for Experimentation on Auditory and Visual Perception. Proceedings of XVth ICPhS (2003 August 3-9 : Barcelona, Spain). 2003, p.1421-1424.http://hal.archives-ouvertes.fr/hal-00142980GHIO, A.; ANDRÉ, C.; TESTON, B.; CAVÉ, C. PERCEVAL: une station automatisée de tests de PERCeption et d'EVALuation auditive et visuelle. Travaux Interdisciplinaires du Laboratoire Parole et Langage d'Aix-en-Provence (TIPA), vol. 22. 2003, p. 115-133. ISSN 1621-0360.http://hal.archives-ouvertes.fr/hal-0013419
Resource discovery in heterogeneous digital content environments
The concept of 'resource discovery' is central to our understanding of how users explore, navigate, locate and retrieve information resources. This submission for a PhD by Published Works examines a series of 11 related works which explore topics pertaining to resource discovery, each demonstrating heterogeneity in their digital discovery context. The assembled works are prefaced by nine chapters which seek to review and critically analyse the contribution of each work, as well as provide contextualization within the wider body of research literature. A series of conceptual sub-themes is used to organize and structure the works and the accompanying critical commentary. The thesis first begins by examining issues in distributed discovery contexts by studying collection level metadata (CLM), its application in 'information landscaping' techniques, and its relationship to the efficacy of federated item-level search tools. This research narrative continues but expands in the later works and commentary to consider the application of Knowledge Organization Systems (KOS), particularly within Semantic Web and machine interface contexts, with investigations of semantically aware terminology services in distributed discovery. The necessary modelling of data structures to support resource discovery - and its associated functionalities within digital libraries and repositories - is then considered within the novel context of technology-supported curriculum design repositories, where questions of human-computer interaction (HCI) are also examined. The final works studied as part of the thesis are those which investigate and evaluate the efficacy of open repositories in exposing knowledge commons to resource discovery via web search agents. Through the analysis of the collected works it is possible to identify a unifying theory of resource discovery, with the proposed concept of (meta)data alignment described and presented with a visual model. This analysis assists in the identification of a number of research topics worthy of further research; but it also highlights an incremental transition by the present author, from using research to inform the development of technologies designed to support or facilitate resource discovery, particularly at a 'meta' level, to the application of specific technologies to address resource discovery issues in a local context. Despite this variation the research narrative has remained focussed on topics surrounding resource discovery in heterogeneous digital content environments and is noted as having generated a coherent body of work. Separate chapters are used to consider the methodological approaches adopted in each work and the contribution made to research knowledge and professional practice.The concept of 'resource discovery' is central to our understanding of how users explore, navigate, locate and retrieve information resources. This submission for a PhD by Published Works examines a series of 11 related works which explore topics pertaining to resource discovery, each demonstrating heterogeneity in their digital discovery context. The assembled works are prefaced by nine chapters which seek to review and critically analyse the contribution of each work, as well as provide contextualization within the wider body of research literature. A series of conceptual sub-themes is used to organize and structure the works and the accompanying critical commentary. The thesis first begins by examining issues in distributed discovery contexts by studying collection level metadata (CLM), its application in 'information landscaping' techniques, and its relationship to the efficacy of federated item-level search tools. This research narrative continues but expands in the later works and commentary to consider the application of Knowledge Organization Systems (KOS), particularly within Semantic Web and machine interface contexts, with investigations of semantically aware terminology services in distributed discovery. The necessary modelling of data structures to support resource discovery - and its associated functionalities within digital libraries and repositories - is then considered within the novel context of technology-supported curriculum design repositories, where questions of human-computer interaction (HCI) are also examined. The final works studied as part of the thesis are those which investigate and evaluate the efficacy of open repositories in exposing knowledge commons to resource discovery via web search agents. Through the analysis of the collected works it is possible to identify a unifying theory of resource discovery, with the proposed concept of (meta)data alignment described and presented with a visual model. This analysis assists in the identification of a number of research topics worthy of further research; but it also highlights an incremental transition by the present author, from using research to inform the development of technologies designed to support or facilitate resource discovery, particularly at a 'meta' level, to the application of specific technologies to address resource discovery issues in a local context. Despite this variation the research narrative has remained focussed on topics surrounding resource discovery in heterogeneous digital content environments and is noted as having generated a coherent body of work. Separate chapters are used to consider the methodological approaches adopted in each work and the contribution made to research knowledge and professional practice
Concatenative speech synthesis: a Framework for Reducing Perceived Distortion when using the TD-PSOLA Algorithm
This thesis presents the design and evaluation of an approach to concatenative speech synthesis using the Titne-Domain Pitch-Synchronous OverLap-Add (I'D-PSOLA) signal processing algorithm. Concatenative synthesis systems make use of pre-recorded speech segments stored in a speech corpus. At synthesis time, the `best' segments available to synthesise the new utterances are chosen from the corpus using a process known as unit selection. During the synthesis process, the pitch and duration of these segments may be modified to generate the desired prosody. The
TD-PSOLA algorithm provides an efficient and essentially successful solution to perform these modifications, although some perceptible distortion, in the form of `buzzyness', may be introduced into the speech signal.
Despite the popularity of the TD-PSOLA algorithm, little formal research has been undertaken to address this recognised problem of distortion. The approach in the thesis has been developed towards reducing the perceived distortion that is introduced when TD-PSOLA is applied to
speech. To investigate the occurrence of this distortion, a psychoacoustic evaluation of the effect of pitch modification using the TD-PSOLA algorithm is presented. Subjective experiments in the form of a set of listening tests were undertaken using word-level stimuli that had been manipulated using TD-PSOLA. The data collected from these experiments were analysed for patterns of co-
occurrence or correlations to investigate where this distortion may occur. From this, parameters were identified which may have contributed to increased distortion. These
parameters were concerned with the relationship between the spectral content of individual phonemes, the extent of pitch manipulation, and aspects of the original recordings.
Based on these results, a framework was designed for use in conjunction with TD-PSOLA to minimise the possible causes of distortion. The framework consisted of a novel speech corpus design, a signal processing distortion measure, and a selection process for especially problematic phonemes. Rather than phonetically balanced, the corpus is balanced to the needs of the signal processing algorithm, containing more of the adversely affected phonemes. The aim is to reduce the potential extent of pitch modification of such segments, and hence produce synthetic speech with less perceptible distortion. The signal processingdistortion measure was developed to allow the prediction of perceptible distortion in pitch-modified speech. Different weightings were estimated for individual phonemes,trained using the experimental data collected during the listening tests.The potential benefit of such a measure for existing unit selection processes in a corpus-based system using
TD-PSOLA is illustrated. Finally, the special-case selection process was developed for highly problematic voiced fricative phonemes to minimise the occurrence of perceived distortion in these segments. The success of the framework, in terms of generating synthetic speech with reduced distortion, was evaluated. A listening test showed that the TD-PSOLA balanced speech corpus may be capable of generating pitch-modified synthetic sentences with significantly less distortion than those generated using a typical phonetically balanced corpus. The voiced fricative selection process was also shown to produce pitch-modified versions of these phonemes with less perceived distortion than a standard selection process. The listening test then indicated that the signal processing distortion measure was able to predict the resulting amount of distortion at the
sentence-level after the application of TD-PSOLA, suggesting that it may be beneficial to include such a measure in existing unit selection processes. The framework was found to be capable of producing speech with reduced perceptible distortion in certain situations, although the effects seen at the sentence-level were less than those seen in the previous investigative experiments that made use of word-level stimuli. This suggeststhat the effect of the TD-PSOLA algorithm cannot always be easily anticipated due to the highly dynamic nature of speech, and that the reduction of perceptible distortion in TD-PSOLA-modified speech remains a challenge to the speech community
Towards Reliable and Inclusive Natural Language Generation
Natural language generation (NLG) is an important subfield of natural language processing (NLP) that produces natural language output. Despite notable advancements made by large-scale pre-trained language models in NLG, there remain several unresolved challenges. This thesis aims to enhance NLG from two significant aspects: reliability and inclusiveness. For reliability, on the one hand, we introduce novel training objectives that improve the alignment of language generation models with desired model behaviors. To improve the answerability of model-generated questions, we use a question answering model to provide additional rewards to a question generation model, encouraging the production of more answerable questions. In addition, we propose to train language models with a mixture of forward and reverse cross-entropies, demonstrating that the resulting models yield better generated text without complex decoding strategies. On the other hand, we propose novel evaluation methods to assess the performance of NLG models accurately and comprehensively. By combining human and automatic evaluations, we strike a balance between reliability and reproducibility. We delve into the unexplored issue of unfaithfulness in extractive summaries and conclude that extractive summarization does not guarantee faithfulness. For inclusiveness, we extend the coverage of NLG techniques to low-resource or endangered languages. We develop the first machine translation system for supporting translation between Cherokee, an endangered Native American language, and English, and we propose a roadmap for utilizing NLP to support language revitalization efforts. Additionally, we investigate the underrepresentation of low-resource languages during multilingual tokenization, a crucial data preprocessing step in training multilingual NLG models, and we present best practices for training multilingual tokenizers. Overall, this thesis works towards enhancing the trustworthiness of NLG models in practice and facilitating support for a more diverse range of languages worldwide.Doctor of Philosoph
Negative vaccine voices in Swedish social media
Vaccinations are one of the most significant interventions to public health, but vaccine hesitancy creates concerns for a portion of the population in many countries, including Sweden. Since discussions on vaccine hesitancy are often taken on social networking sites, data from Swedish social media are used to study and quantify the sentiment among the discussants on the vaccination-or-not topic during phases of the COVID-19 pandemic. Out of all the posts analyzed a majority showed a stronger negative sentiment, prevailing throughout the whole of the examined period, with some spikes or jumps due to the occurrence of certain vaccine-related events distinguishable in the results. Sentiment analysis can be a valuable tool to track public opinions regarding the use, efficacy, safety, and importance of vaccination
Concatenative speech synthesis : a framework for reducing perceived distortion when using the TD-PSOLA algorithm
This thesis presents the design and evaluation of an approach to concatenative speech synthesis using the Titne-Domain Pitch-Synchronous OverLap-Add (I'D-PSOLA) signal processing algorithm. Concatenative synthesis systems make use of pre-recorded speech segments stored in a speech corpus. At synthesis time, the `best' segments available to synthesise the new utterances are chosen from the corpus using a process known as unit selection. During the synthesis process, the pitch and duration of these segments may be modified to generate the desired prosody. The TD-PSOLA algorithm provides an efficient and essentially successful solution to perform these modifications, although some perceptible distortion, in the form of `buzzyness', may be introduced into the speech signal. Despite the popularity of the TD-PSOLA algorithm, little formal research has been undertaken to address this recognised problem of distortion. The approach in the thesis has been developed towards reducing the perceived distortion that is introduced when TD-PSOLA is applied to speech. To investigate the occurrence of this distortion, a psychoacoustic evaluation of the effect of pitch modification using the TD-PSOLA algorithm is presented. Subjective experiments in the form of a set of listening tests were undertaken using word-level stimuli that had been manipulated using TD-PSOLA. The data collected from these experiments were analysed for patterns of co- occurrence or correlations to investigate where this distortion may occur. From this, parameters were identified which may have contributed to increased distortion. These parameters were concerned with the relationship between the spectral content of individual phonemes, the extent of pitch manipulation, and aspects of the original recordings. Based on these results, a framework was designed for use in conjunction with TD-PSOLA to minimise the possible causes of distortion. The framework consisted of a novel speech corpus design, a signal processing distortion measure, and a selection process for especially problematic phonemes. Rather than phonetically balanced, the corpus is balanced to the needs of the signal processing algorithm, containing more of the adversely affected phonemes. The aim is to reduce the potential extent of pitch modification of such segments, and hence produce synthetic speech with less perceptible distortion. The signal processingdistortion measure was developed to allow the prediction of perceptible distortion in pitch-modified speech. Different weightings were estimated for individual phonemes,trained using the experimental data collected during the listening tests.The potential benefit of such a measure for existing unit selection processes in a corpus-based system using TD-PSOLA is illustrated. Finally, the special-case selection process was developed for highly problematic voiced fricative phonemes to minimise the occurrence of perceived distortion in these segments. The success of the framework, in terms of generating synthetic speech with reduced distortion, was evaluated. A listening test showed that the TD-PSOLA balanced speech corpus may be capable of generating pitch-modified synthetic sentences with significantly less distortion than those generated using a typical phonetically balanced corpus. The voiced fricative selection process was also shown to produce pitch-modified versions of these phonemes with less perceived distortion than a standard selection process. The listening test then indicated that the signal processing distortion measure was able to predict the resulting amount of distortion at the sentence-level after the application of TD-PSOLA, suggesting that it may be beneficial to include such a measure in existing unit selection processes. The framework was found to be capable of producing speech with reduced perceptible distortion in certain situations, although the effects seen at the sentence-level were less than those seen in the previous investigative experiments that made use of word-level stimuli. This suggeststhat the effect of the TD-PSOLA algorithm cannot always be easily anticipated due to the highly dynamic nature of speech, and that the reduction of perceptible distortion in TD-PSOLA-modified speech remains a challenge to the speech community.EThOS - Electronic Theses Online ServiceGBUnited Kingdo
Proceedings of the 17th Annual Conference of the European Association for Machine Translation
Proceedings of the 17th Annual Conference of the European Association for Machine Translation (EAMT
Learning Representations of Social Media Users
User representations are routinely used in recommendation systems by platform
developers, targeted advertisements by marketers, and by public policy
researchers to gauge public opinion across demographic groups. Computer
scientists consider the problem of inferring user representations more
abstractly; how does one extract a stable user representation - effective for
many downstream tasks - from a medium as noisy and complicated as social media?
The quality of a user representation is ultimately task-dependent (e.g. does
it improve classifier performance, make more accurate recommendations in a
recommendation system) but there are proxies that are less sensitive to the
specific task. Is the representation predictive of latent properties such as a
person's demographic features, socioeconomic class, or mental health state? Is
it predictive of the user's future behavior?
In this thesis, we begin by showing how user representations can be learned
from multiple types of user behavior on social media. We apply several
extensions of generalized canonical correlation analysis to learn these
representations and evaluate them at three tasks: predicting future hashtag
mentions, friending behavior, and demographic features. We then show how user
features can be employed as distant supervision to improve topic model fit.
Finally, we show how user features can be integrated into and improve existing
classifiers in the multitask learning framework. We treat user representations
- ground truth gender and mental health features - as auxiliary tasks to
improve mental health state prediction. We also use distributed user
representations learned in the first chapter to improve tweet-level stance
classifiers, showing that distant user information can inform classification
tasks at the granularity of a single message.Comment: PhD thesi
Products and Services
Today’s global economy offers more opportunities, but is also more complex and competitive than ever before. This fact leads to a wide range of research activity in different fields of interest, especially in the so-called high-tech sectors. This book is a result of widespread research and development activity from many researchers worldwide, covering the aspects of development activities in general, as well as various aspects of the practical application of knowledge
- …