107 research outputs found

    Robust language recognition via adaptive language factor extraction

    This paper presents a technique to adapt an acoustically based language classifier to the background conditions and speaker accents. This adaptation improves language classification on a broad spectrum of TV broadcasts. The core of the system consists of an iVector-based setup in which language and channel variabilities are modeled separately. The subsequent language classifier (the backend) operates on the language factors, i.e. those features in the extracted iVectors that explain the observed language variability. The proposed technique adapts the language variability model to the background conditions and to the speaker accents present in the audio. The effect of the adaptation is evaluated on a 28-hour corpus composed of documentaries and monolingual as well as multilingual broadcast news shows. Consistent improvements in the automatic identification of Flemish (Belgian Dutch), English and French are demonstrated for all broadcast types.

    Measurement-based analysis of delay-Doppler characteristics in an indoor environment

    An analysis of delay-Doppler characteristics in the presence of moving people is presented for short-range communication in an indoor environment. Channel-sounding measurements have been carried out at 3.6 GHz in a crowded university hall during several short and long breaks between courses. During three consecutive days, the measurements were repeated with different positions for the transmit and receive antennas. In this study, the behavior of the maximum Doppler shift and the Doppler spread was analyzed in the time-delay domain as a function of the occupation of the hall, the polarizations of the 2 x 2 MIMO antennas, and their positions in the hall. The measurements reveal a clear distinction between the Doppler spread of the short and long breaks in the campaign, indicating a distinctive power distribution of their Doppler spectra. In addition, there is a significant contrast between the Doppler characteristics of the co- and cross-polarizations. Measurements at several positions reveal the importance of characterizing multipaths and show that the Doppler effect depends on the position of the antennas in the environment. This work also shows that the Doppler spectrum can be accurately modeled by a Cauchy distribution, allowing for the generation of parameters to describe Doppler characteristics.
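
The Cauchy model mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the parameterization or the fitting procedure, so the function name and the parameters `nu0` (centre shift) and `gamma` (half width at half maximum) below are assumptions made for illustration only.

```python
import math


def cauchy_doppler_spectrum(nu, nu0=0.0, gamma=1.0):
    """Normalized Cauchy (Lorentzian) density at Doppler shift nu (Hz).

    nu0 and gamma are illustrative names, not taken from the paper:
    nu0 is the centre of the spectrum, gamma its half width at half maximum.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return gamma / (math.pi * ((nu - nu0) ** 2 + gamma ** 2))


# The density falls to half of its peak value at nu0 +/- gamma.
peak = cauchy_doppler_spectrum(0.0, nu0=0.0, gamma=2.0)
half = cauchy_doppler_spectrum(2.0, nu0=0.0, gamma=2.0)
```

With this form, two numbers (`nu0`, `gamma`) summarize a measured spectrum, which is one plausible reading of "generation of parameters" in the abstract.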

    Measurement-based analysis of Doppler characteristics for ultra-wideband radio channels in an office environment

    In this work, an analysis of the Doppler characteristics for Ultra-Wideband indoor communication is presented. Channel sounding measurements ranging from 3.1 to 10.6 GHz were performed over the course of several days in an occupied office environment, with the help of a network analyzer. Based on these measurements, we analyze the behavior of both the Doppler spread and RMS Doppler spread in the Ultra-Wideband frequency band. Our measurements indicate a frequency-dependent behavior for both parameters, with values that remain consistent over the time of observation.
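
For readers unfamiliar with the quantity, the RMS Doppler spread is commonly defined as the power-weighted standard deviation of the Doppler shift. The sketch below uses that textbook definition on a sampled spectrum; it is not taken from the paper, and the function and argument names are hypothetical.

```python
import numpy as np


def rms_doppler_spread(nu, power):
    """Power-weighted standard deviation of the Doppler shift (Hz).

    nu    : Doppler shifts of the spectrum samples (Hz)
    power : non-negative power at each shift (linear scale, not dB)
    Assumes the samples lie on a uniform grid, so plain sums stand in
    for the integrals of the continuous definition.
    """
    nu = np.asarray(nu, dtype=float)
    power = np.asarray(power, dtype=float)
    total = power.sum()
    if total <= 0:
        raise ValueError("spectrum must contain some power")
    mean_shift = (nu * power).sum() / total
    variance = (((nu - mean_shift) ** 2) * power).sum() / total
    return float(np.sqrt(variance))


# Two equal-power components at -5 Hz and +5 Hz give a spread of 5 Hz.
example_spread = rms_doppler_spread([-5.0, 5.0], [1.0, 1.0])
```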

    A hybrid genetic algorithm for solving a layout problem in the fashion industry.

    Many success stories already exist of powerful genetic algorithms (GAs) in the field of constraint optimisation. In this paper, a hybrid, intelligent genetic algorithm is developed for solving a cutting layout problem in the Belgian fashion industry. First, an existing LP formulation of the cutting problem is briefly summarised; it serves as the core design of our GA in the sections that follow. An initial attempt to make the algorithm as universal as possible showed that a threefold genetic enhancement was needed to reduce the size of the active solution space. The GA is therefore rebuilt using intelligent genetic operators, carrying out a local optimisation and applying a heuristic feasibility operator. Strong computational results are achieved for a variety of problem cases, outperforming any existing LP model developed so far.
    Keywords: Fashion; Industry

    Adaptive speaker diarization of broadcast news based on factor analysis

    The introduction of factor analysis techniques in a speaker diarization system enhances its performance by facilitating the use of speaker specific information, by improving the suppression of nuisance factors such as phonetic content, and by facilitating various forms of adaptation. This paper describes a state-of-the-art iVector-based diarization system which employs factor analysis and adaptation on all levels. The diarization modules relevant for this work are: the speaker segmentation which searches for speaker boundaries and the speaker clustering which aims at grouping speech segments of the same speaker. The speaker segmentation relies on speaker factors which are extracted on a frame-by-frame basis using eigenvoices. We incorporate soft voice activity detection in this extraction process as the speaker change detection should be based on speaker information only and we want it to disregard the non-speech frames by applying speech posteriors. Potential speaker boundaries are inserted at positions where rapid changes in speaker factors are witnessed. By employing Mahalanobis distances, the effect of the phonetic content can be further reduced, which results in more accurate speaker boundaries. This iVector-based segmentation significantly outperforms more common segmentation methods based on the Bayesian Information Criterion (BIC) or speech activity marks. The speaker clustering employs two-step Agglomerative Hierarchical Clustering (AHC): after initial BIC clustering, the second cluster stage is realized by either an iVector Probabilistic Linear Discriminant Analysis (PLDA) system or Cosine Distance Scoring (CDS) of extracted speaker factors. The segmentation system is made adaptive on a file-by-file basis by iterating the diarization process using eigenvoice matrices adapted (unsupervised) on the output of the previous iteration. 
    Assuming that for most use cases material similar to the recording in question is readily available, unsupervised domain adaptation of the speaker clustering is possible as well. We obtain this by expanding the eigenvoice matrix used during speaker factor extraction for the CDS clustering stage with a small set of new eigenvoices that, in combination with the initial generic eigenvoices, models the recurring speakers and acoustic conditions more accurately. Experiments on the COST278 multilingual broadcast news database show that adaptive speaker segmentation generates significantly more accurate speaker boundaries, which also results in more accurate clustering. The obtained speaker error rate (SER) can be further reduced by another 13% relative, to 7.4%, via domain adaptation of the CDS clustering. (C) 2017 Elsevier Ltd. All rights reserved.
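
Cosine Distance Scoring, as used in the second clustering stage, compares two speaker-factor vectors by the cosine of the angle between them. The sketch below shows only that scoring step under simple assumptions: the vectors are plain NumPy arrays, and the function name is hypothetical. The paper's extraction, adaptation, and merge criteria are not reproduced here.

```python
import numpy as np


def cosine_score(x, y):
    """Cosine similarity between two speaker-factor vectors.

    Values near 1 suggest the two clusters share a speaker;
    values near 0 or below suggest different speakers.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    if denom == 0:
        raise ValueError("cannot score a zero vector")
    return float(np.dot(x, y) / denom)


# Toy vectors: the first pair points the same way, the second is orthogonal.
same_direction = cosine_score([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_score([1.0, 0.0], [0.0, 1.0])
```

Because the score ignores vector length, it depends only on the direction of the speaker factors, which is one reason it is a common lightweight alternative to PLDA scoring.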

    Factor analysis for speaker segmentation and improved speaker diarization

    Speaker diarization includes two steps: speaker segmentation and speaker clustering. Speaker segmentation searches for speaker boundaries, whereas speaker clustering aims at grouping speech segments of the same speaker. In this work, the segmentation is improved by replacing the Bayesian Information Criterion (BIC) with a new iVector-based approach. Unlike BIC-based methods, which trigger on any acoustic dissimilarities, the proposed method suppresses phonetic variations and accentuates speaker differences. More specifically, our method generates boundaries based on the distance between two speaker factor vectors that are extracted on a frame-by-frame basis. The extraction relies on an eigenvoice matrix so that large differences between speaker factor vectors indicate a different speaker. A Mahalanobis-based distance measure, in which the covariance matrix compensates for the remaining and detrimental phonetic variability, is shown to generate accurate boundaries. The detected segments are clustered by a state-of-the-art iVector Probabilistic Linear Discriminant Analysis system. Experiments on the COST278 multilingual broadcast news database show relative reductions of 50% in boundary detection errors. The speaker error rate is reduced by 8% relative.
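
The boundary-detection idea in this abstract can be sketched as follows: compute a Mahalanobis distance between speaker-factor vectors a few frames apart and mark peaks in the resulting curve. This is a toy illustration, not the authors' implementation. The lag, the threshold, the peak-picking rule, and all function names are assumptions, and the covariance matrix is simply passed in rather than estimated from phonetic variability.

```python
import numpy as np


def mahalanobis_distance(a, b, cov_inv):
    """Mahalanobis distance between vectors a and b for a given inverse covariance."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt(diff @ cov_inv @ diff))


def boundary_scores(factors, cov, lag=1):
    """Distance between speaker-factor vectors that are `lag` frames apart."""
    cov_inv = np.linalg.inv(cov)
    return np.array([
        mahalanobis_distance(factors[t], factors[t + lag], cov_inv)
        for t in range(len(factors) - lag)
    ])


def pick_boundaries(scores, threshold):
    """Indices where the score is a local maximum above the threshold."""
    return [
        t for t in range(1, len(scores) - 1)
        if scores[t] > threshold
        and scores[t] >= scores[t - 1]
        and scores[t] > scores[t + 1]
    ]


# Toy data: five frames of one "speaker", then five frames of another.
toy_factors = np.array([[0.0, 0.0]] * 5 + [[3.0, 3.0]] * 5)
toy_scores = boundary_scores(toy_factors, cov=np.eye(2))
toy_boundaries = pick_boundaries(toy_scores, threshold=1.0)
```

In the toy data the only large distance occurs between frames 4 and 5, so a single boundary is reported at index 4. With a non-identity covariance, directions of high (for example phonetic) variance contribute less to the distance, which is the compensation effect the abstract describes.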

    Polarization properties of specular and dense multipath components in a large industrial hall

    This paper presents an analysis of the polarization characteristics of specular and dense multipath components (SMC & DMC) in a large industrial hall based on frequency-domain channel sounding experiments at 1.3 GHz with 22 MHz bandwidth. The RiMAX maximum-likelihood estimator is used to extract the full polarimetric SMC and DMC from the measurement data by taking into account the polarimetric radiation patterns of the dual-polarized antennas. Cross-polar discrimination (XPD) values are presented for the measured channels and for the SMC and DMC separately.
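
As a reminder of what an XPD value expresses, the sketch below uses one common convention: the ratio of co-polar to cross-polar received power, in decibels. The abstract does not state which convention the paper follows, so this definition and the function name are assumptions.

```python
import math


def xpd_db(p_co, p_cross):
    """Cross-polar discrimination in dB from linear-scale powers.

    p_co    : power received in the co-polarized channel
    p_cross : power received in the cross-polarized channel
    Assumed convention: XPD = 10 * log10(p_co / p_cross).
    """
    if p_co <= 0 or p_cross <= 0:
        raise ValueError("powers must be positive")
    return 10.0 * math.log10(p_co / p_cross)


# A co-polar power ten times the cross-polar power corresponds to 10 dB.
example_xpd = xpd_db(10.0, 1.0)
```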