
1,522 research outputs found

The Sheffield Wargames Corpus.

Author: Fox C.W.
Hain T.
Liu Y.
Zwyssig E.
Publication date: 01/01/2013

Recognition of speech in natural environments is a challenging task, even more so if this involves conversations between several speakers. Work on meeting recognition has addressed some of the significant challenges, mostly targeting formal, business-style meetings where people are mostly in a static position in a room. Only limited data is available that contains high-quality near- and far-field data from real interactions between participants. In this paper we present a new corpus for research on speech recognition, speaker tracking and diarisation, based on recordings of native speakers of English playing a table-top wargame. The Sheffield Wargames Corpus comprises 7 hours of data from 10 recording sessions, obtained from 96 microphones, 3 video cameras and, most importantly, 3D location data provided by a sensor tracking system. The corpus represents a unique resource that provides, for the first time, location tracks (1.3Hz) of speakers that are constantly moving and talking. The corpus is available for research purposes, and includes annotated development and evaluation test sets. Baseline results for close-talking and far-field sets are included in this paper.

CiteSeerX

Edinburgh Research Explorer

White Rose Research Online

Requirements for tracking radar for falling spheres

Author: Brockman W. E.
Hain J. L.

Error analysis on radar tracking of falling sphere

NASA Technical Reports Server

Source-filter Separation of Speech Signal in the Phase Domain

Author: Barker J.
Hain T.
Loweimi E.
Publication venue: 'The International Fiscal Association of Korea'
Publication date: 06/09/2015

Deconvolution of the speech excitation (source) and vocal tract (filter) components through log-magnitude spectral processing is well-established and has led to the well-known cepstral features used in a multitude of speech processing tasks. This paper presents a novel source-filter decomposition based on processing in the phase domain. We show that separation between source and filter in the log-magnitude spectra is far from perfect, leading to loss of vital vocal tract information. It is demonstrated that the same task can be better performed by trend and fluctuation analysis of the phase spectrum of the minimum-phase component of speech, which can be computed via the Hilbert transform. Trend and fluctuation can be separated through low-pass filtering of the phase, using additivity of vocal tract and source in the phase domain. This results in separated signals which have a clear relation to the vocal tract and excitation components. The effectiveness of the method is put to the test in a speech recognition task. The vocal tract component extracted in this way is used as the basis of a feature extraction algorithm for speech recognition on the Aurora-2 database. The recognition results show up to 8.5% absolute improvement in comparison with MFCC features on average (0-20 dB).
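The decomposition described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the minimum-phase phase spectrum is obtained here by folding the real cepstrum (a standard equivalent of the Hilbert-transform relation between log-magnitude and phase), and the low-pass filter is assumed to be a simple moving average along the frequency axis. The function names, `n_fft` and `width` are hypothetical choices made for illustration.

```python
import numpy as np

def min_phase_spectrum_phase(frame, n_fft=1024):
    """Phase of the minimum-phase counterpart of `frame` on bins 0..n_fft/2."""
    mag = np.abs(np.fft.fft(frame, n_fft))
    log_mag = np.log(np.maximum(mag, 1e-10))   # floor avoids log(0)
    cep = np.fft.ifft(log_mag).real            # real cepstrum
    # Fold the cepstrum onto non-negative quefrencies (causal cepstrum).
    fold = np.zeros(n_fft)
    fold[0] = 1.0
    fold[1:n_fft // 2] = 2.0
    fold[n_fft // 2] = 1.0
    log_spectrum = np.fft.fft(cep * fold)      # log|X| + j * arg(X_min)
    return log_spectrum.imag[: n_fft // 2 + 1]

def trend_and_fluctuation(phase, width=31):
    """Split a phase spectrum into a smooth trend and the residual fluctuation.

    Assumption: the low-pass filter is a moving average of odd length `width`.
    """
    kernel = np.ones(width) / width
    padded = np.pad(phase, width // 2, mode="edge")
    trend = np.convolve(padded, kernel, mode="valid")
    return trend, phase - trend
```

Under the additivity stated in the abstract, the trend would presumably carry the slowly varying vocal tract contribution and the fluctuation the excitation; by construction the two parts sum back to the input phase.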

White Rose Research Online

Learning temporal clusters using capsule routing for speech emotion recognition

Author: Hain T.
Jalal M.A.
Loweimi E.
Moore R.K.
Publication venue: 'International Speech Communication Association'
Publication date: 15/09/2019

Emotion recognition from speech plays a significant role in adding emotional intelligence to machines and making human-machine interaction more natural. One of the key challenges from a machine learning standpoint is to extract patterns which bear maximum correlation with the emotion information encoded in this signal while being as insensitive as possible to other types of information carried by speech. In this paper, we propose a novel temporal modelling framework for robust emotion classification using a bidirectional long short-term memory network (BLSTM), CNN and Capsule networks. The BLSTM deals with the temporal dynamics of the speech signal by effectively representing forward/backward contextual information, while the CNN along with the dynamic routing of the Capsule net learns temporal clusters; together these provide a state-of-the-art technique for classifying the extracted patterns. The proposed approach was compared with a wide range of architectures on the FAU-Aibo and RAVDESS corpora, and remarkable gains over state-of-the-art systems were obtained. For FAU-Aibo and RAVDESS, 77.6% and 56.2% accuracy was achieved, respectively, which is 3% and 14% (absolute) higher than the best-reported result for the respective tasks.

Crossref

White Rose Research Online

Klaus Betz on Karl-E. Hain: Rundfunkfreiheit und Rundfunkordnung (Broadcasting Freedom and Broadcasting Regulation)

Author: Hain Karl-E.
Publication venue: Philipps-Universität Marburg
Publication date: 01/01/1994

Publikations- und Dokumentenserver der Universitätsbibliothek Marburg

Low-temperature statistical mechanics of the Quantizer problem: fast quenching and equilibrium cooling of the three-dimensional Voronoi Liquid

Author: Hain Tobias M.
Klatt Michael A.
Schröder-Turk Gerd E.
Publication venue: 'AIP Publishing'
Publication date: 01/01/2020

The Quantizer problem is a tessellation optimisation problem where point configurations are identified such that the Voronoi cells minimise the second moment of the volume distribution. While the ground state (optimal state) in 3D is almost certainly the body-centered cubic lattice, disordered and effectively hyperuniform states with energies very close to the ground state exist that result as stable states in an evolution through the geometric Lloyd's algorithm [Klatt et al. Nat. Commun., 10, 811 (2019)]. When considered as a statistical mechanics problem at finite temperature, the same system has been termed the 'Voronoi Liquid' by [Ruscher et al. EPL 112, 66003 (2015)]. Here we investigate the cooling behaviour of the Voronoi liquid with a particular view to the stability of the effectively hyperuniform disordered state. As a confirmation of the results by Ruscher et al., we observe, by both molecular dynamics and Monte Carlo simulations, that upon slow quasi-static equilibrium cooling, the Voronoi liquid crystallises from a disordered configuration into the body-centered cubic configuration. By contrast, upon sufficiently fast non-equilibrium cooling (and not just in the limit of a maximally fast quench) the Voronoi liquid adopts states similar to the effectively hyperuniform inherent structures identified by Klatt et al. and prevents the ordering transition into a BCC ordered structure. This result is in line with the geometric intuition that the geometric Lloyd's algorithm corresponds to a type of fast quench. Comment: 11 pages, 6 figures.
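As an illustration of the Lloyd iteration mentioned in this abstract, the following is a minimal sketch rather than the method used in the paper. It assumes a periodic unit cube and approximates each Voronoi cell by Monte Carlo sample points instead of computing exact cells; the energy is estimated as the mean squared distance from a uniformly sampled point to its nearest generator, and the normalisation used in the paper may differ. All function names are hypothetical.

```python
import numpy as np

def min_image(d):
    """Map displacement vectors to their nearest periodic image (unit box)."""
    return d - np.round(d)

def assign(points, samples):
    """For each sample: index of nearest point, displacement to it, squared distance."""
    d = min_image(samples[:, None, :] - points[None, :, :])  # shape (S, N, 3)
    dist2 = np.sum(d * d, axis=2)
    owner = np.argmin(dist2, axis=1)
    idx = np.arange(len(samples))
    return owner, d[idx, owner], dist2[idx, owner]

def quantizer_energy(points, samples):
    """Monte Carlo estimate of the summed second moments of the Voronoi cells."""
    _, _, dist2 = assign(points, samples)
    return dist2.mean()

def lloyd_step(points, samples):
    """Move every point to the (sampled) centroid of its Voronoi cell."""
    owner, disp, _ = assign(points, samples)
    new_points = points.copy()
    for i in range(len(points)):
        mask = owner == i
        if mask.any():  # leave a point in place if no sample fell in its cell
            new_points[i] += disp[mask].mean(axis=0)
    return new_points % 1.0  # wrap back into the periodic unit box
```

With a fixed sample set, repeated calls to `lloyd_step` cannot increase `quantizer_energy`, which is consistent with reading the algorithm as a rapid descent towards a nearby local minimum.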

arXiv.org e-Print Archive

Copenhagen University Research Information System

Research Repository

The impact of atmospheric pCO2 on carbon isotope ratios of the atmosphere and ocean

Author: Bianchi D.
Galbraith E.D.
Hain M.P.
Kwon E.-Y.
Sarmiento J.L.
Publication venue: 'Wiley'
Publication date: 01/03/2015

It is well known that the equilibration timescale for the isotopic ratios 13C/12C and 14C/12C in the ocean mixed layer is on the order of a decade, two orders of magnitude slower than for oxygen. Less widely appreciated is the fact that the equilibration timescale is quite sensitive to the speciation of dissolved inorganic carbon (DIC) in the mixed layer, scaling linearly with the ratio DIC/CO2, which varies inversely with atmospheric pCO2. Although this effect is included in models that resolve the role of carbon speciation in air-sea exchange, its role is often unrecognized, and it is not commonly considered in the interpretation of carbon isotope observations. Here we use a global three-dimensional ocean model to estimate the redistribution of the carbon isotopic ratios between the atmosphere and ocean due solely to variations in atmospheric pCO2. Under Last Glacial Maximum (LGM) pCO2, atmospheric Δ14C is increased by ~30‰ due to the speciation change, all else being equal, raising the surface reservoir age by about 250 years throughout most of the ocean. For 13C, enhanced surface disequilibrium under LGM pCO2 causes the upper ocean, atmosphere, and North Atlantic Deep Water δ13C to become at least 0.2‰ higher relative to deep waters ventilated by the Southern Ocean. Conversely, under high pCO2, rapid equilibration greatly decreases isotopic disequilibrium. As a result, during geological periods of high pCO2, vertical δ13C gradients may have been greatly weakened as a direct chemical consequence of the high pCO2, masquerading as very well ventilated or biologically dead Strangelove Oceans. The ongoing anthropogenic rise of pCO2 is accelerating the equilibration of the carbon isotopes in the ocean, lowering atmospheric Δ14C and weakening δ13C gradients within the ocean to a degree that is similar to the traditional fossil fuel “Suess” effect.

Southampton (e-Prints Soton)

Computer Mapping of Seasonal Groundwater Fluctuations for Two Differing Southern New Jersey Swamp Forests I

Author: Hain Daniel C.
Maurer John R.
Parrott William R.
Reynolds Phillip E.
Publication venue: 'Purdue University (bepress)'
Publication date: 01/01/1981

Computer-generated maps (SYMAP, Harvard) of seasonal groundwater fluctuations for two New Jersey swamp forests, a red maple (Acer rubrum) swamp and an Atlantic white cedar (Chamaecyparis thyoides) swamp, are presented. Notable differences exist in water table behavior for the two swamp forests and are best accounted for by topographic differences. Other factors examined that might contribute to the hydrologic differences include vegetation and subsurface geology.

Purdue E-Pubs

On the usefulness of the speech phase spectrum for pitch extraction

Author: Barker J.
Hain T.
Loweimi E.
Publication venue: 'International Speech Communication Association'
Publication date: 01/01/2018

Most frequency domain techniques for pitch extraction such as cepstrum, harmonic product spectrum (HPS) and summation residual harmonics (SRH) operate on the magnitude spectrum and turn it into a function in which the fundamental frequency emerges as argmax. In this paper, we investigate the extension of these three techniques to the phase and group delay (GD) domains. Our extensions exploit the observation that the bin at which F(magnitude) becomes maximum, for some monotonically increasing function F, is equivalent to the bin at which F(phase) has maximum negative slope and F(groupdelay) has the maximum value. To extract the pitch track from the speech phase spectrum, these techniques were coupled with the source-filter model in the phase domain that we proposed in earlier publications and a novel voicing detection algorithm proposed here. The accuracy and robustness of the phase-based pitch extraction techniques are illustrated and compared with their magnitude-based counterparts using six pitch evaluation metrics. On average, it is observed that the phase spectrum can be successfully employed in pitch tracking with comparable accuracy and robustness to the speech magnitude spectrum.
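The "fundamental frequency emerges as argmax" idea can be illustrated with the conventional magnitude-domain harmonic product spectrum. This is a minimal sketch of the classical baseline only, not the phase or group delay extension proposed in the paper; the function name, the Hann window, and the defaults for `n_fft`, `n_harmonics`, `fmin` and `fmax` are assumptions made for illustration.

```python
import numpy as np

def hps_pitch(frame, fs, n_fft=4096, n_harmonics=4, fmin=50.0, fmax=400.0):
    """Estimate f0 (Hz) of one frame with the harmonic product spectrum."""
    windowed = frame * np.hanning(len(frame))
    mag = np.abs(np.fft.rfft(windowed, n_fft))
    n_bins = len(mag) // n_harmonics
    hps = np.ones(n_bins)
    # Multiply the spectrum by copies of itself decimated by 1, 2, ..., n_harmonics,
    # so that the harmonics of f0 line up at the f0 bin.
    for m in range(1, n_harmonics + 1):
        hps *= mag[::m][:n_bins]
    freqs = np.arange(n_bins) * fs / n_fft
    band = (freqs >= fmin) & (freqs <= fmax)   # restrict to a plausible pitch range
    return freqs[band][np.argmax(hps[band])]

# Usage on a synthetic harmonic tone (200 Hz fundamental, five harmonics).
fs = 16000
t = np.arange(1024) / fs
tone = sum(np.sin(2 * np.pi * 200.0 * h * t) for h in range(1, 6))
f0_estimate = hps_pitch(tone, fs)
```

The estimate is quantised to the FFT bin spacing `fs / n_fft`, so a finer grid or peak interpolation would be needed for higher precision.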

Crossref

White Rose Research Online