3,696 research outputs found

    On the effect of SNR and superdirective beamforming in speaker diarisation in meetings

    This paper examines the effect of sensor performance on speaker diarisation in meetings and investigates the use of more advanced beamforming techniques, beyond the typically employed delay-sum beamformer, for mitigating the effects of poorer sensor performance. We present superdirective beamforming and investigate how different time difference of arrival (TDOA) smoothing and beamforming techniques influence the performance of state-of-the-art diarisation systems. We produced and transcribed a new corpus of meetings recorded in the instrumented meeting room using a high SNR analogue and a newly developed low SNR digital MEMS microphone array (DMMA.2). This research demonstrates that TDOA smoothing has a significant effect on the diarisation error rate and that simple noise reduction and beamforming schemes suffice to overcome audio signal degradation due to the lower SNR of modern MEMS microphones. Index Terms: speaker diarisation in meetings, digital MEMS microphone array, time difference of arrival (TDOA), superdirective beamforming.

    Towards using web-crawled data for domain adaptation in statistical machine translation

    This paper reports on ongoing work focused on domain adaptation of statistical machine translation using domain-specific data obtained by domain-focused web crawling. We present a strategy for crawling monolingual and parallel data and their exploitation for testing, language modelling, and system tuning in a phrase-based machine translation framework. The proposed approach is evaluated on the domains of Natural Environment and Labour Legislation and two language pairs: English–French and English–Greek.

    Pro-active Meeting Assistants: Attention Please!

    This paper gives an overview of pro-active meeting assistants, what they are and when they can be useful. We explain how to develop such assistants with respect to requirement definitions and elaborate on a set of Wizard of Oz experiments, aiming to find out in which form a meeting assistant should operate to be accepted by participants and whether the meeting effectiveness and efficiency can be improved by an assistant at all.

    An Analysis of Rhythmic Staccato-Vocalization Based on Frequency Demodulation for Laughter Detection in Conversational Meetings

    Human laughter can convey various kinds of meaning in human communication. There are various kinds of human laughter signals, for example vocalized and non-vocalized laughter. According to theories from psychology, among all the vocalized laughter types, rhythmic staccato-vocalization significantly evokes positive responses in interactions. In this paper we attempt to exploit this observation to detect occurrences of human laughter in multiparty conversations from the AMI meeting corpus. First, we separate the high-energy frames from the speech, leaving out the low-energy frames, through power spectral density estimation. We then borrow a rhythm detection algorithm from music analysis and apply it to the high-energy frames. Finally, we detect rhythmic laughter frames by analyzing the candidate rhythmic frames using statistics. This novel approach to detecting 'positive' rhythmic human laughter performs better than the standard laughter classification baseline. Comment: 5 pages, 1 figure, conference paper

    Evaluation campaigns and TRECVid

    The TREC Video Retrieval Evaluation (TRECVid) is an international benchmarking activity to encourage research in video information retrieval by providing a large test collection, uniform scoring procedures, and a forum for organizations interested in comparing their results. TRECVid completed its fifth annual cycle at the end of 2005, and in 2006 TRECVid will involve almost 70 research organizations, universities and other consortia. Throughout its existence, TRECVid has benchmarked both interactive and automatic/manual searching for shots from within a video corpus, automatic detection of a variety of semantic and low-level video features, shot boundary detection and the detection of story boundaries in broadcast TV news. This paper gives an introduction to information retrieval (IR) evaluation from both a user and a system perspective, highlighting that system evaluation is by far the most prevalent type of evaluation carried out. We also include a summary of TRECVid as an example of a system evaluation benchmarking campaign, which allows us to discuss whether such campaigns are a good thing or a bad thing. There are arguments for and against these campaigns and we present some of them in the paper, concluding that on balance they have had a very positive impact on research progress.