1,531 research outputs found
Sequential labeling with structural SVM under an average precision loss
© Springer International Publishing AG 2016. The average precision (AP) is an important and widely adopted performance measure for information retrieval and classification systems. However, owing to its relatively complex formulation, very few approaches have been proposed to learn a classifier by maximising its average precision over a given training set. Moreover, most of the existing work is restricted to i.i.d. data and does not extend to sequential data. For this reason, we propose a structural SVM learning algorithm for sequential labeling that maximises an average precision measure. A further contribution of this paper is an algorithm that computes the average precision of a sequential classifier at test time, making it possible to assess sequential labeling under this measure. Experimental results over challenging datasets which depict human actions in kitchen scenarios (i.e., TUM Kitchen and CMU Multimodal Activity) show that the proposed approach leads to an average precision improvement of up to 4.2 and 5.7 percentage points against the runner-up, respectively.
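For orientation, the measure discussed in this abstract can be illustrated with the standard i.i.d. definition of average precision: the mean of precision@k over the ranks k at which a positive item appears. The sketch below is only that textbook definition; the function name and argument layout are illustrative assumptions, and it is not the sequential AP algorithm the paper contributes.

```python
def average_precision(scores, labels):
    """AP of a ranking: mean of precision@k over the ranks k that hold a positive.

    scores: classifier scores, higher means "more likely positive"
    labels: ground-truth labels, 1 for positive and 0 for negative
    """
    # Rank items by decreasing score; sorted() is stable, so ties keep input order.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precisions = []
    for k, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / k)  # precision at rank k
    if not precisions:
        raise ValueError("average precision is undefined without positive labels")
    return sum(precisions) / len(precisions)
```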
Bernoulli HMMs at subword level for handwritten word recognition
This paper presents a handwritten word recogniser based on HMMs at subword level (characters) in which state-emission probabilities are governed by multivariate Bernoulli probability functions. This recogniser works directly with raw binary pixels of the image, instead of conventional, real-valued local features. Detailed experiments have been carried out by varying the number of states and comparing the results with those from a conventional system based on continuous (Gaussian) densities. From these experiments, it becomes clear that the proposed recogniser is much better than the conventional system.
Work supported by the EC (FEDER) and the Spanish MEC under the MIPRCV “Consolider Ingenio 2010” research programme (CSD2007-00018), the iTransDoc research project (TIN2006-15694-CO2-01), and the FPU grant AP2005-1840.
Giménez Pastor, A.; Juan, A. (2009). Bernoulli HMMs at subword level for handwritten word recognition. In Pattern Recognition and Image Analysis. Springer Verlag (Germany). 497-504. https://doi.org/10.1007/978-3-642-02172-5_64
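As a rough illustration of the emission model named in the abstract above, a multivariate Bernoulli assigns each binary pixel d its own probability p_d of being 1 and treats pixels as independent, so log p(x) = sum_d [x_d log p_d + (1 - x_d) log(1 - p_d)]. The sketch below assumes exactly that form; the function name and the clamping constant `eps` are illustrative assumptions, not details taken from the paper.

```python
import math


def bernoulli_log_prob(pixels, probs, eps=1e-9):
    """Log-probability of a binary vector under independent per-pixel Bernoullis.

    pixels: sequence of 0/1 values (e.g. one column of a binarised word image)
    probs:  per-pixel probabilities of observing a 1 in a given HMM state
    eps:    assumed clamping constant that keeps log() away from zero
    """
    total = 0.0
    for x, p in zip(pixels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += math.log(p) if x == 1 else math.log(1.0 - p)
    return total
```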
Audio-based event detection for sports video
In this paper, we present an audio-based event detection approach shown to be effective when applied to sports broadcast data. The main benefit of this approach is the ability to recognise patterns that indicate high levels of crowd response, which can be correlated to key events. By applying Hidden Markov Model-based classifiers, where the predefined content classes are parameterised using Mel-Frequency Cepstral Coefficients, we were able to eliminate the need for defining a heuristic set of rules to determine event detection, thus avoiding a two-class approach shown not to be suitable for this problem. Experimentation indicated that this is an effective method for classifying crowd response in soccer matches, thus providing a basis for automatic indexing and summarisation.
Is a multiple excitation of a single atom equivalent to a single excitation of an ensemble of atoms?
Recent technological advances have made it possible to isolate, control and
measure the properties of a single atom, and thus to gather statistics on the
behavior of single quantum systems. These experiments allow us to address a
question which was previously out of reach: Are the statistics of exciting a
single atom N times equivalent to those of a single excitation of an ensemble
of N atoms? We present a new method to analyze quantum measurements which
leads us to postulate that the answer is most probably no. We discuss the
merits of the analysis and its conclusion.
Comment: 3 pages, 3 figures
Summed Parallel Infinite Impulse Response (SPIIR) Filters For Low-Latency Gravitational Wave Detection
With the upgrade of current gravitational wave detectors, the first detection
of gravitational wave signals is expected to occur in the next decade.
Low-latency gravitational wave triggers will be necessary to make fast
follow-up electromagnetic observations of events related to their source, e.g.,
prompt optical emission associated with short gamma-ray bursts. In this paper
we present a new time-domain low-latency algorithm for identifying the presence
of gravitational waves produced by compact binary coalescence events in noisy
detector data. Our method calculates the signal-to-noise ratio from the
summation of a bank of parallel infinite impulse response (IIR) filters. We
show that our summed parallel infinite impulse response (SPIIR) method can
retrieve the signal-to-noise ratio to greater than 99% of that produced from
the optimal matched filter. We emphasise the benefits of the SPIIR method for
advanced detectors, which will require larger template banks.
Comment: 9 pages, 6 figures, for PR
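The idea of summing a bank of parallel IIR filters can be sketched as follows. The sketch assumes each filter is a first-order recursion y[k] = a1 * y[k-1] + b0 * x[k - delay] with possibly complex coefficients; that recursion, the `(a1, b0, delay)` tuple layout and the function name are assumptions made for illustration. How the coefficients are derived from a waveform template, and how the sum is normalised into a signal-to-noise ratio, is not shown.

```python
def spiir_output(data, filters):
    """Sum the outputs of first-order IIR filters run in parallel over `data`.

    data:    sequence of real or complex samples x[0..n-1]
    filters: iterable of (a1, b0, delay) tuples; each filter follows
             y[k] = a1 * y[k-1] + b0 * x[k - delay], with x[j] = 0 for j < 0
    """
    total = [0j] * len(data)
    for a1, b0, delay in filters:
        y_prev = 0j  # filter state, reset for every filter in the bank
        for k in range(len(data)):
            x = data[k - delay] if k - delay >= 0 else 0.0
            y_prev = a1 * y_prev + b0 * x  # one recursive update per sample
            total[k] += y_prev
    return total
```

Each sample costs one multiply-add per filter, which is why a recursive bank of this kind can run sample by sample as data arrive.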
Towards low-latency real-time detection of gravitational waves from compact binary coalescences in the era of advanced detectors
Electromagnetic (EM) follow-up observations of gravitational wave (GW) events
will help shed light on the nature of the sources, and more can be learned if
the EM follow-ups can start as soon as the GW event becomes observable. In this
paper, we propose a computationally efficient time-domain algorithm capable of
detecting gravitational waves (GWs) from coalescing binaries of compact objects
with nearly zero time delay. When the signal is strong enough, our
algorithm also has the flexibility to trigger EM observation before the merger.
The key to the efficiency of our algorithm arises from the use of chains of
so-called Infinite Impulse Response (IIR) filters, which filter time-series
data recursively. Computational cost is further reduced by a template
interpolation technique that requires filtering to be done only for a much
coarser template bank than otherwise required to sufficiently recover optimal
signal-to-noise ratio. For future detectors with sensitivity extending to
lower frequencies, our algorithm's computational cost is shown to increase
rather insignificantly compared to the conventional time-domain correlation
method. Moreover, at latencies of less than hundreds to thousands of seconds,
this method is expected to be computationally more efficient than the
straightforward frequency-domain method.
Comment: 19 pages, 6 figures, for PR
Hierarchical multi-stream posterior based speech recognition system
In this paper, we present initial results towards boosting posterior based speech recognition systems by estimating more informative posteriors using multiple streams of features and taking into account acoustic context (e.g., as available in the whole utterance), as well as possible prior information (such as topological constraints). These posteriors are estimated based on the “state gamma posterior” definition (typically used in standard HMM training) extended to the case of multi-stream HMMs. This approach provides a new, principled, theoretical framework for hierarchical estimation/use of posteriors, multi-stream feature combination, and integrating appropriate context and prior knowledge in posterior estimates. In the present work, we used the resulting gamma posteriors as features for a standard HMM/GMM layer. On the OGI Digits database and on a reduced vocabulary version (1000 words) of the DARPA Conversational Telephone Speech-to-text (CTS) task, this resulted in significant performance improvement, compared to the state-of-the-art Tandem systems.
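The “state gamma posterior” mentioned above is, in standard HMM training, the probability of being in state i at time t given the whole observation sequence, obtained from the forward and backward recursions. The sketch below covers only that standard single-stream case; the function name and input layout are illustrative assumptions, the multi-stream extension described in the abstract is not reproduced, and no scaling is applied, so it would underflow on long utterances.

```python
def state_posteriors(init, trans, emit_lik):
    """Return gamma[t][i] = P(state_t = i | all observations) for one HMM.

    init[i]        : initial state probabilities
    trans[i][j]    : transition probability from state i to state j
    emit_lik[t][i] : likelihood of the observation at time t under state i
    """
    T, N = len(emit_lik), len(init)
    # Forward pass: alpha[t][i] = p(o_1..o_t, state_t = i)
    alpha = [[init[i] * emit_lik[0][i] for i in range(N)]]
    for t in range(1, T):
        alpha.append([
            emit_lik[t][j] * sum(alpha[t - 1][i] * trans[i][j] for i in range(N))
            for j in range(N)
        ])
    # Backward pass: beta[t][i] = p(o_{t+1}..o_T | state_t = i)
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t][i] = sum(
                trans[i][j] * emit_lik[t + 1][j] * beta[t + 1][j] for j in range(N)
            )
    # Combine both passes and normalise each frame so it sums to one
    gamma = []
    for t in range(T):
        joint = [alpha[t][i] * beta[t][i] for i in range(N)]
        norm = sum(joint)
        gamma.append([v / norm for v in joint])
    return gamma
```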
Automatic Detection of Laryngeal Pathology on Sustained Vowels Using Short-Term Cepstral Parameters: Analysis of Performance and Theoretical Justification
The majority of speech signal analysis procedures for automatic detection of laryngeal pathologies rely on parameters extracted from time-domain processing. Moreover, calculation of these parameters often requires prior pitch period estimation; therefore, their validity heavily depends on the robustness of pitch detection. Within this paper, an alternative approach based on cepstral-domain processing is presented which has the advantage of not requiring pitch estimation, thus providing a gain in both simplicity and robustness. While the proposed scheme is similar to solutions based on Mel-frequency cepstral parameters, already present in the literature, it has an easier physical interpretation while achieving similar performance standards.
Marimba: A tool for verifying properties of hidden Markov models
The formal verification of properties of Hidden Markov Models (HMMs) is
highly desirable for gaining confidence in the correctness of the model and the
corresponding system. A significant step towards HMM verification was the
development by Zhang et al. of a family of logics for verifying HMMs, called
POCTL*, and its model checking algorithm. As far as we know, the verification
tool we present here is the first one based on Zhang et al.'s approach. As an
example of its effective application, we verify properties of a handover task
in the context of human-robot interaction. Our tool was implemented in Haskell,
and the experimental evaluation was performed using the humanoid robot Bert2.
Comment: Tool paper accepted in the 13th International Symposium on Automated
Technology for Verification and Analysis (ATVA 2015)
Segmental K-Means Learning with Mixture Distribution for HMM Based Handwriting Recognition
This paper investigates the performance of hidden Markov models (HMMs) for handwriting recognition. The Segmental K-Means algorithm is used for updating the transition and observation probabilities, instead of the Baum-Welch algorithm. Observation probabilities are modelled as multi-variate Gaussian mixture distributions. A deterministic clustering technique is used to estimate the initial parameters of an HMM. Bayesian information criterion (BIC) is used to select the topology of the model. The wavelet transform is used to extract features from a grey-scale image, and avoids binarization of the image.
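The BIC-based topology selection mentioned above can be sketched with the common form BIC = k ln(n) - 2 ln(L), where k is the number of free parameters, n the number of training samples and L the maximised likelihood. The sign convention (lower is better), the function names and the candidate labels used below are assumptions for illustration; the paper may define or apply the criterion differently.

```python
import math


def bic(log_likelihood, num_params, num_samples):
    """BIC = k * ln(n) - 2 * ln(L); lower is better under this sign convention."""
    return num_params * math.log(num_samples) - 2.0 * log_likelihood


def select_topology(candidates, num_samples):
    """Return the name of the candidate with the lowest BIC.

    candidates: dict mapping a hypothetical topology name to a
                (log_likelihood, num_params) pair from a trained model
    """
    return min(candidates, key=lambda name: bic(*candidates[name], num_samples))
```

Under this form a larger model is preferred only if its gain in log-likelihood outweighs the ln(n) penalty paid for each extra parameter.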