3,696 research outputs found
On the effect of SNR and superdirective beamforming in speaker diarisation in meetings
This paper examines the effect of sensor performance on speaker diarisation in meetings and investigates the use of more advanced beamforming techniques, beyond the typically employed delay-sum beamformer, for mitigating the effects of poorer sensor performance. We present superdirective beamforming and investigate how different time difference of arrival (TDOA) smoothing and beamforming techniques influence the performance of state-of-the-art diarisation systems. We produced and transcribed a new corpus of meetings recorded in the instrumented meeting room using a high SNR analogue and a newly developed low SNR digital MEMS microphone array (DMMA.2). This research demonstrates that TDOA smoothing has a significant effect on the diarisation error rate and that simple noise reduction and beamforming schemes suffice to overcome audio signal degradation due to the lower SNR of modern MEMS microphones. Index Terms â Speaker diarisation in meetings, digital MEMS microphone array, time difference of arrival (TDOA), superdirective beamforming 1
Towards using web-crawled data for domain adaptation in statistical machine translation
This paper reports on the ongoing work focused on domain adaptation of statistical machine translation using domain-speciïŹc data obtained by domain-focused web crawling. We present a strategy for crawling monolingual and parallel data and their exploitation for testing, language modelling, and system tuning in a phrase--based machine translation framework. The proposed approach is evaluated on the domains of Natural Environment and Labour Legislation and two language
pairs: EnglishâFrench and EnglishâGreek
Pro-active Meeting Assistants: Attention Please!
This paper gives an overview of pro-active meeting assistants, what they are and when they can be useful. We explain how to develop such assistants with respect to requirement definitions and elaborate on a set of Wizard of Oz experiments, aiming to find out in which form a meeting assistant should operate to be accepted by participants and whether the meeting effectiveness and efficiency can be improved by an assistant at all. This paper gives an overview of pro-active meeting assistants, what they are and when they can be useful. We explain how to develop such assistants with respect to requirement definitions and elaborate on a set of Wizard of Oz experiments, aiming to find out in which form a meeting assistant should operate to be accepted by participants and whether the meeting effectiveness and efficiency can be improved by an assistant at all
An Analysis of Rhythmic Staccato-Vocalization Based on Frequency Demodulation for Laughter Detection in Conversational Meetings
Human laugh is able to convey various kinds of meanings in human
communications. There exists various kinds of human laugh signal, for example:
vocalized laugh and non vocalized laugh. Following the theories of psychology,
among all the vocalized laugh type, rhythmic staccato-vocalization
significantly evokes the positive responses in the interactions. In this paper
we attempt to exploit this observation to detect human laugh occurrences, i.e.,
the laughter, in multiparty conversations from the AMI meeting corpus. First,
we separate the high energy frames from speech, leaving out the low energy
frames through power spectral density estimation. We borrow the algorithm of
rhythm detection from the area of music analysis to use that on the high energy
frames. Finally, we detect rhythmic laugh frames, analyzing the candidate
rhythmic frames using statistics. This novel approach for detection of
`positive' rhythmic human laughter performs better than the standard laughter
classification baseline.Comment: 5 pages, 1 figure, conference pape
Evaluation campaigns and TRECVid
The TREC Video Retrieval Evaluation (TRECVid) is an
international benchmarking activity to encourage research
in video information retrieval by providing a large test collection, uniform scoring procedures, and a forum for organizations interested in comparing their results. TRECVid completed its fifth annual cycle at the end of 2005 and in 2006 TRECVid will involve almost 70 research organizations, universities and other consortia. Throughout its existence, TRECVid has benchmarked both interactive and automatic/manual searching for shots from within a video
corpus, automatic detection of a variety of semantic and
low-level video features, shot boundary detection and the
detection of story boundaries in broadcast TV news. This
paper will give an introduction to information retrieval (IR) evaluation from both a user and a system perspective, highlighting that system evaluation is by far the most prevalent type of evaluation carried out. We also include a summary of TRECVid as an example of a system evaluation benchmarking campaign and this allows us to discuss whether
such campaigns are a good thing or a bad thing. There are
arguments for and against these campaigns and we present
some of them in the paper concluding that on balance they
have had a very positive impact on research progress
- âŠ