1,531 research outputs found

    Sequential labeling with structural SVM under an average precision loss

    © Springer International Publishing AG 2016. The average precision (AP) is an important and widely adopted performance measure for information retrieval and classification systems. However, owing to its relatively complex formulation, very few approaches have been proposed to learn a classifier by maximising its average precision over a given training set. Moreover, most of the existing work is restricted to i.i.d. data and does not extend to sequential data. For this reason, we propose a structural SVM learning algorithm for sequential labeling that maximises an average precision measure. A further contribution of this paper is an algorithm that computes the average precision of a sequential classifier at test time, making it possible to assess sequential labeling under this measure. Experimental results on challenging datasets that depict human actions in kitchen scenarios (i.e., TUM Kitchen and CMU Multimodal Activity) show that the proposed approach leads to an average precision improvement of up to 4.2 and 5.7 percentage points against the runner-up, respectively.
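    For orientation, the measure named in this abstract can be sketched in its standard ranking form. This is only the usual i.i.d. definition of average precision; it is not the paper's sequential variant or its structural SVM loss, and the function name `average_precision` is a hypothetical choice made for this illustration.

```python
from typing import Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Standard (i.i.d.) average precision for a ranked list.

    Items are ranked by decreasing score; AP is the mean of precision@k
    taken over the ranks k at which a positive item (label == 1) appears.
    Returns 0.0 when there are no positives (an assumption of this sketch).
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0
```

    With scores `[0.9, 0.8, 0.7, 0.6]` and labels `[1, 0, 1, 0]`, the positives sit at ranks 1 and 3, so the value is the mean of 1/1 and 2/3.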

    Bernoulli HMMs at subword level for handwritten word recognition

    This paper presents a handwritten word recogniser based on HMMs at subword level (characters) in which state-emission probabilities are governed by multivariate Bernoulli probability functions. This recogniser works directly with the raw binary pixels of the image instead of conventional, real-valued local features. Detailed experiments were carried out by varying the number of states and comparing the results with those from a conventional system based on continuous (Gaussian) densities. These experiments show that the proposed recogniser is much better than the conventional system.
    Work supported by the EC (FEDER) and the Spanish MEC under the MIPRCV “Consolider Ingenio 2010” research programme (CSD2007-00018), the iTransDoc research project (TIN2006-15694-CO2-01), and the FPU grant AP2005-1840.
    Giménez Pastor, A.; Juan, A. (2009). Bernoulli HMMs at subword level for handwritten word recognition. In Pattern Recognition and Image Analysis. Springer Verlag (Germany). 497-504. https://doi.org/10.1007/978-3-642-02172-5_64

    Audio-based event detection for sports video

    In this paper, we present an audio-based event detection approach shown to be effective when applied to sports broadcast data. The main benefit of this approach is the ability to recognise patterns that indicate high levels of crowd response, which can be correlated to key events. By applying Hidden Markov Model-based classifiers, where the predefined content classes are parameterised using Mel-Frequency Cepstral Coefficients, we were able to eliminate the need to define a heuristic set of rules for event detection, thus avoiding a two-class approach shown not to be suitable for this problem. Experimentation indicated that this is an effective method for classifying crowd response in soccer matches, thus providing a basis for automatic indexing and summarisation.

    Is a multiple excitation of a single atom equivalent to a single excitation of an ensemble of atoms?

    Recent technological advances have made it possible to isolate, control and measure the properties of a single atom, and therefore to gather statistics on the behavior of single quantum systems. These experiments make it possible to examine a question that was previously out of reach: Are the statistics of exciting a single atom N times equivalent to those of a single excitation of an ensemble of N atoms? We present a new method to analyze quantum measurements which leads to the postulation that the answer is most probably no. We discuss the merits of the analysis and its conclusion. Comment: 3 pages, 3 figures

    Summed Parallel Infinite Impulse Response (SPIIR) Filters For Low-Latency Gravitational Wave Detection

    With the upgrade of current gravitational wave detectors, the first detection of gravitational wave signals is expected to occur in the next decade. Low-latency gravitational wave triggers will be necessary to make fast follow-up electromagnetic observations of events related to their source, e.g., prompt optical emission associated with short gamma-ray bursts. In this paper we present a new time-domain low-latency algorithm for identifying the presence of gravitational waves produced by compact binary coalescence events in noisy detector data. Our method calculates the signal-to-noise ratio from the summation of a bank of parallel infinite impulse response (IIR) filters. We show that our summed parallel infinite impulse response (SPIIR) method can retrieve more than 99% of the signal-to-noise ratio produced by the optimal matched filter. We emphasise the benefits of the SPIIR method for advanced detectors, which will require larger template banks. Comment: 9 pages, 6 figures, for PR
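    The idea of summing a bank of parallel IIR filters can be illustrated with a minimal sketch. The abstract does not give the filter form, so the first-order recursion, the function name `summed_parallel_iir`, and the coefficient lists `a` and `b` below are assumptions made purely for illustration; the paper's filter design, template approximation, and signal-to-noise normalisation are not reproduced here.

```python
from typing import List, Sequence


def summed_parallel_iir(
    x: Sequence[complex],
    a: Sequence[complex],
    b: Sequence[complex],
) -> List[complex]:
    """Run a bank of first-order IIR filters in parallel and sum their outputs.

    Filter k follows the recursion y_k[n] = a[k] * y_k[n-1] + b[k] * x[n],
    starting from y_k[-1] = 0. The returned sequence is sum_k y_k[n].
    The coefficients are hypothetical inputs, not values from the paper.
    """
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    state = [0j] * len(a)  # previous output of each filter
    summed: List[complex] = []
    for sample in x:
        total = 0j
        for k in range(len(a)):
            state[k] = a[k] * state[k] + b[k] * sample
            total += state[k]
        summed.append(total)
    return summed
```

    Each filter keeps only its previous output, so every new input sample costs a fixed amount of work per filter, which is what makes a recursive time-domain scheme of this kind attractive for low-latency processing.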

    Towards low-latency real-time detection of gravitational waves from compact binary coalescences in the era of advanced detectors

    Electromagnetic (EM) follow-up observations of gravitational wave (GW) events will help shed light on the nature of the sources, and more can be learned if the EM follow-ups can start as soon as the GW event becomes observable. In this paper, we propose a computationally efficient time-domain algorithm capable of detecting GWs from coalescing binaries of compact objects with nearly zero time delay. When the signal is strong enough, our algorithm also has the flexibility to trigger EM observation before the merger. The efficiency of our algorithm arises from the use of chains of so-called Infinite Impulse Response (IIR) filters, which filter time-series data recursively. Computational cost is further reduced by a template interpolation technique that requires filtering to be done only for a much coarser template bank than otherwise required to sufficiently recover the optimal signal-to-noise ratio. For future detectors with sensitivity extending to lower frequencies, our algorithm's computational cost is shown to increase rather insignificantly compared to the conventional time-domain correlation method. Moreover, at latencies of less than hundreds to thousands of seconds, this method is expected to be computationally more efficient than the straightforward frequency-domain method. Comment: 19 pages, 6 figures, for PR

    Hierarchical multi-stream posterior based speech recognition system

    In this paper, we present initial results towards boosting posterior based speech recognition systems by estimating more informative posteriors using multiple streams of features and taking into account acoustic context (e.g., as available in the whole utterance), as well as possible prior information (such as topological constraints). These posteriors are estimated based on the “state gamma posterior” definition (typically used in standard HMM training) extended to the case of multi-stream HMMs. This approach provides a new, principled, theoretical framework for hierarchical estimation/use of posteriors, multi-stream feature combination, and integrating appropriate context and prior knowledge in posterior estimates. In the present work, we used the resulting gamma posteriors as features for a standard HMM/GMM layer. On the OGI Digits database and on a reduced vocabulary version (1000 words) of the DARPA Conversational Telephone Speech-to-text (CTS) task, this resulted in significant performance improvement compared to the state-of-the-art Tandem systems.

    Automatic Detection of Laryngeal Pathology on Sustained Vowels Using Short-Term Cepstral Parameters: Analysis of Performance and Theoretical Justification

    Most speech signal analysis procedures for automatic detection of laryngeal pathologies rely mainly on parameters extracted from time-domain processing. Moreover, calculation of these parameters often requires prior pitch period estimation; therefore, their validity heavily depends on the robustness of pitch detection. In this paper, an alternative approach based on cepstral-domain processing is presented which has the advantage of not requiring pitch estimation, thus providing a gain in both simplicity and robustness. While the proposed scheme is similar to solutions based on Mel-frequency cepstral parameters already present in the literature, it has an easier physical interpretation while achieving similar performance.

    Marimba: A tool for verifying properties of hidden Markov models

    The formal verification of properties of Hidden Markov Models (HMMs) is highly desirable for gaining confidence in the correctness of the model and the corresponding system. A significant step towards HMM verification was the development by Zhang et al. of a family of logics for verifying HMMs, called POCTL*, and its model checking algorithm. As far as we know, the verification tool we present here is the first one based on Zhang et al.'s approach. As an example of its effective application, we verify properties of a handover task in the context of human-robot interaction. Our tool was implemented in Haskell, and the experimental evaluation was performed using the humanoid robot Bert2. Comment: Tool paper accepted in the 13th International Symposium on Automated Technology for Verification and Analysis (ATVA 2015)

    Segmental K-Means Learning with Mixture Distribution for HMM Based Handwriting Recognition

    This paper investigates the performance of hidden Markov models (HMMs) for handwriting recognition. The Segmental K-Means algorithm is used for updating the transition and observation probabilities, instead of the Baum-Welch algorithm. Observation probabilities are modelled as multivariate Gaussian mixture distributions. A deterministic clustering technique is used to estimate the initial parameters of an HMM. The Bayesian information criterion (BIC) is used to select the topology of the model. The wavelet transform is used to extract features from a grey-scale image, which avoids binarization of the image.