Nonuniform Markov models
A statistical language model assigns probability to strings of arbitrary
length. Unfortunately, it is not possible to gather reliable statistics on
strings of arbitrary length from a finite corpus. Therefore, a statistical
language model must decide that each symbol in a string depends on at most a
small, finite number of other symbols in the string. In this report we propose
a new way to model conditional independence in Markov models. The central
feature of our nonuniform Markov model is that it makes predictions of varying
lengths using contexts of varying lengths. Experiments on the Wall Street
Journal reveal that the nonuniform model performs slightly better than the
classic interpolated Markov model. This result is somewhat remarkable because
both models contain identical numbers of parameters whose values are estimated
in a similar manner. The only difference between the two models is how they
combine the statistics of longer and shorter strings.
Keywords: nonuniform Markov model, interpolated Markov model, conditional
independence, statistical language model, discrete time series. Comment: 17 pages
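The report's nonuniform model is not detailed in this abstract, but the classic interpolated Markov model it is compared against can be sketched as a fixed linear mixture of n-gram estimates of different orders. The sketch below is a minimal illustration under assumed names (`train_counts`, `interpolated_prob`); real implementations estimate the mixture weights from held-out data (e.g. by deleted interpolation) rather than fixing them.

```python
from collections import Counter

def train_counts(corpus, max_order=3):
    """Count n-grams of orders 1..max_order from a token sequence."""
    counts = {n: Counter() for n in range(1, max_order + 1)}
    for i in range(len(corpus)):
        for n in range(1, max_order + 1):
            if i + n <= len(corpus):
                counts[n][tuple(corpus[i:i + n])] += 1
    return counts

def interpolated_prob(word, context, counts, lambdas):
    """P(word | context) as a fixed linear mix of order-1..k estimates,
    combining the statistics of longer and shorter strings."""
    total = sum(counts[1].values())
    prob = 0.0
    for n, lam in enumerate(lambdas, start=1):
        ctx = tuple(context[-(n - 1):]) if n > 1 else ()
        ngram = ctx + (word,)
        denom = counts[n - 1][ctx] if n > 1 else total
        if denom > 0:
            prob += lam * counts[n][ngram] / denom
    return prob
```

With weights (0.5, 0.5) the prediction is the average of the unigram and bigram relative frequencies, which is exactly the "combination of longer and shorter strings" the abstract contrasts with the nonuniform scheme.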
Twofold Video Hashing with Automatic Synchronization
Video hashing finds a wide array of applications in content authentication,
robust retrieval and anti-piracy search. While much of the existing research
has focused on extracting robust and secure content descriptors, a significant
open challenge remains: most existing video hashing methods are vulnerable
to temporal desynchronization. That is, when the query video is derived from
the reference video by deleting or inserting frames, most existing methods
assume the positions of the deleted (or inserted) frames are either perfectly
known or reliably estimated. This assumption may hold under typical
transcoding and frame-rate changes, but it is highly inappropriate in
adversarial scenarios such as anti-piracy video search. For example, an
illegal uploader may try to bypass the 'piracy check' mechanisms of YouTube,
Dailymotion, and similar services by performing a cleverly designed
non-uniform resampling of the video. We
present a new solution based on dynamic time warping (DTW), which can implement
automatic synchronization and can be used together with existing video hashing
methods. The second contribution of this paper is to propose a new robust
feature extraction method called flow hashing (FH), based on frame averaging
and optical flow descriptors. Finally, a fusion mechanism called distance
boosting is proposed to combine the information extracted by DTW and FH.
Experiments on real video collections show that such a hash extraction and
comparison enables unprecedented robustness under both spatial and temporal
attacks. Comment: submitted to the 2014 21st IEEE International Conference on
Image Processing (ICIP)
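The paper's synchronization stage builds on dynamic time warping. A minimal sketch of the DTW core follows, aligning two 1-D feature sequences (e.g. per-frame hash features); the function name and the absolute-difference local cost are assumptions, not the paper's actual descriptors.

```python
import numpy as np

def dtw_align(x, y):
    """Classic DTW: return the minimal cumulative cost of aligning two
    1-D sequences, allowing one element of x to absorb several of y
    (and vice versa), which is what tolerates inserted/deleted frames."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            # best of: delete from x, insert into x, or match
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

A duplicated frame adds no cost when its feature repeats: `dtw_align([1, 2, 3], [1, 2, 2, 3])` is 0, which is why DTW can resynchronize a non-uniformly resampled query before the hash comparison.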
GraFIX: a semiautomatic approach for parsing low- and high-quality eye-tracking data
Fixation durations (FD) have been used widely as a measurement of information processing and attention. However, issues like data quality can seriously influence the accuracy of fixation detection methods and, thus, affect the validity of our results (Holmqvist, Nyström, & Mulvey, 2012). This is crucial when studying special populations such as infants, where common issues with testing (e.g., a high degree of movement, unreliable eye detection, low spatial precision) result in highly variable data quality and render existing FD detection approaches highly time-consuming (hand-coding) or imprecise (automatic detection). To address this problem, we present GraFIX, a novel semiautomatic method consisting of a two-step process in which eye-tracking data are initially parsed using velocity-based algorithms whose input parameters are adapted by the user and then manipulated using the graphical interface, allowing accurate and rapid adjustments of the algorithms' outcome. The present algorithms (1) smooth the raw data, (2) interpolate missing data points, and (3) apply a number of criteria to automatically evaluate and remove artifactual fixations. The input parameters (e.g., velocity threshold, interpolation latency) can easily be adapted manually to fit each participant. Furthermore, the present application includes visualization tools that facilitate the manual coding of fixations. We assessed this method by performing an intercoder reliability analysis in two groups of infants presenting low- and high-quality data and compared it with previous methods. Results revealed that our two-step approach with adaptable FD detection criteria gives rise to more reliable and stable measures in low- and high-quality data.
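The velocity-based first step can be illustrated with a bare-bones velocity-threshold (I-VT style) parser. This is only a sketch of the general technique, not GraFIX itself: GraFIX additionally smooths, interpolates, and filters artifactual fixations, and the parameter names here (`velocity_threshold`, `min_duration`) are assumptions.

```python
import numpy as np

def detect_fixations(gaze, sample_rate, velocity_threshold=30.0, min_duration=0.1):
    """Velocity-threshold fixation parsing: consecutive samples whose
    point-to-point velocity stays below the threshold form a fixation.
    gaze: (N, 2) array of positions in degrees of visual angle;
    returns a list of (start_index, end_index) pairs."""
    gaze = np.asarray(gaze, dtype=float)
    # point-to-point velocity in deg/s
    vel = np.linalg.norm(np.diff(gaze, axis=0), axis=1) * sample_rate
    below = np.concatenate([[False], vel < velocity_threshold])
    fixations, start = [], None
    for i, b in enumerate(below):
        if b and start is None:
            start = i - 1            # the fixation spans the previous sample too
        elif not b and start is not None:
            if (i - start) / sample_rate >= min_duration:
                fixations.append((start, i))
            start = None
    if start is not None and (len(gaze) - start) / sample_rate >= min_duration:
        fixations.append((start, len(gaze)))
    return fixations
```

Making `velocity_threshold` and the minimum duration per-participant parameters, as the abstract describes, is what lets a parser like this cope with the variable data quality of infant recordings.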
A DEIM Induced CUR Factorization
We derive a CUR matrix factorization based on the Discrete Empirical
Interpolation Method (DEIM). For a given matrix A, such a factorization
provides a low-rank approximate decomposition of the form A ≈ CUR,
where C and R are subsets of the columns and rows of A, and U is
constructed to make CUR a good approximation. Given a low-rank singular value
decomposition A ≈ VSW^T, the DEIM procedure uses V and W to
select the columns and rows of A that form C and R. Through an error
analysis applicable to a general class of CUR factorizations, we show that the
accuracy tracks the optimal approximation error within a factor that depends on
the conditioning of submatrices of V and W. For large-scale problems,
V and W can be approximated using an incremental QR algorithm that makes one
pass through A. Numerical examples illustrate the favorable performance of
the DEIM-CUR method, compared to CUR approximations based on leverage scores.
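The DEIM selection step admits a compact sketch: greedily pick one row index per singular vector, each time interpolating the next vector at the already-chosen rows and selecting where the residual is largest. The small-scale demo below uses a full SVD for simplicity (the abstract's incremental one-pass QR is the large-scale substitute); function names are assumptions.

```python
import numpy as np

def deim_indices(V):
    """DEIM point selection: given k (approximate) singular vectors as
    the columns of V, greedily choose k row indices of V."""
    n, k = V.shape
    p = [int(np.argmax(np.abs(V[:, 0])))]
    for j in range(1, k):
        # interpolate v_j at the selected rows, then take the largest residual
        c = np.linalg.solve(V[np.ix_(p, range(j))], V[p, j])
        r = V[:, j] - V[:, :j] @ c
        p.append(int(np.argmax(np.abs(r))))
    return p

def deim_cur(A, k):
    """Rank-k CUR: DEIM on the leading left/right singular vectors picks
    the rows p and columns q; U makes C U R a good approximation."""
    U_svd, _, Vt = np.linalg.svd(A, full_matrices=False)
    p = deim_indices(U_svd[:, :k])   # row indices
    q = deim_indices(Vt[:k].T)       # column indices
    C, R = A[:, q], A[p, :]
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, U, R
```

For a matrix of exact rank k, the DEIM-selected submatrices of the singular vectors are nonsingular, so C and R span the column and row spaces of A and the factorization reproduces A exactly.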
Sequential Bayesian inference for static parameters in dynamic state space models
A method for sequential Bayesian inference of the static parameters of a
dynamic state space model is proposed. The method is based on the observation
that many dynamic state space models have a relatively small number of static
parameters (or hyper-parameters), so that in principle the posterior can be
computed and stored on a discrete grid of practical size which can be tracked
dynamically. Furthermore, the approach can use any existing methodology that
computes the filtering and prediction distributions of the state process; in
this paper the Kalman filter and its extensions to non-linear/non-Gaussian
settings are used. The method is illustrated on several applications: a linear
Gaussian model, a Binomial model, a stochastic volatility model, and the
extremely non-linear univariate non-stationary growth model. Performance is
compared to existing on-line and off-line methods.
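The grid idea can be sketched for the simplest case the abstract mentions, a linear Gaussian (local-level) model with one static parameter, the observation-noise variance r. One Kalman filter runs per grid point, and its one-step predictive likelihood updates the posterior weights recursively; the model, the flat grid prior, and the function name are illustrative assumptions.

```python
import numpy as np

def grid_posterior(ys, r_grid, q=0.5):
    """Sequential grid-based posterior over a static parameter r in the
    local-level model  x_t = x_{t-1} + w_t (w_t ~ N(0, q)),
                       y_t = x_t + v_t   (v_t ~ N(0, r)).
    One Kalman filter per grid point; the filter's predictive likelihood
    p(y_t | y_{1:t-1}, r) updates the grid weights at each step."""
    r = np.asarray(r_grid, dtype=float)
    log_w = np.zeros(len(r))        # flat prior on the grid
    m = np.zeros(len(r))            # filtered state mean, per grid point
    P = np.full(len(r), 10.0)       # filtered state variance (diffuse start)
    for y in ys:
        P_pred = P + q              # Kalman prediction step
        S = P_pred + r              # innovation variance under each r
        log_w += -0.5 * (np.log(2 * np.pi * S) + (y - m) ** 2 / S)
        K = P_pred / S              # Kalman gain; update step
        m = m + K * (y - m)
        P = (1 - K) * P_pred
    w = np.exp(log_w - log_w.max())
    return w / w.sum()
```

Because the static parameter is low-dimensional, a modest grid suffices, and the whole posterior is tracked on-line with one filter pass per grid point.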
Memory-Based Learning: Using Similarity for Smoothing
This paper analyses the relation between the use of similarity in
Memory-Based Learning and the notion of backed-off smoothing in statistical
language modeling. We show that the two approaches are closely related, and we
argue that feature weighting methods in the Memory-Based paradigm can offer the
advantage of automatically specifying a suitable domain-specific hierarchy
between most specific and most general conditioning information without the
need for a large number of parameters. We report two applications of this
approach: PP-attachment and POS-tagging. Our method achieves state-of-the-art
performance in both domains, and allows the easy integration of diverse
information sources, such as rich lexical representations. Comment: 8 pages, uses aclap.sty, to appear in Proc. ACL/EACL 9
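The link between feature weighting and a back-off hierarchy can be sketched with weighted-overlap nearest-neighbour classification, where feature weights come from information gain. This is a generic illustration of the paradigm, not the paper's exact system; all names are assumptions.

```python
import math
from collections import Counter

def information_gain(examples, labels, feature):
    """Weight a feature by the information gain of its values w.r.t. the class."""
    def entropy(ls):
        n = len(ls)
        return -sum(c / n * math.log2(c / n) for c in Counter(ls).values())
    by_value = {}
    for x, y in zip(examples, labels):
        by_value.setdefault(x[feature], []).append(y)
    return entropy(labels) - sum(
        len(ls) / len(labels) * entropy(ls) for ls in by_value.values())

def classify(query, examples, labels, weights):
    """Memory-based (1-NN) classification with weighted overlap distance:
    mismatching a high-gain feature costs more, so the most informative
    (most specific) conditioning features dominate -- the analogue of
    preferring the most specific context before backing off."""
    def dist(x):
        return sum(w for w, a, b in zip(weights, x, query) if a != b)
    best = min(range(len(examples)), key=lambda i: dist(examples[i]))
    return labels[best]
```

The weights induce an implicit specificity ordering over feature combinations automatically, which is the paper's point: no per-context parameters need to be estimated, unlike the interpolation weights of backed-off language models.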