134 research outputs found

A Spectral Algorithm for Learning Hidden Markov Models

Authors: Hsu Daniel; Kakade Sham M.; Zhang Tong
Publication date: 01/01/2012

Hidden Markov Models (HMMs) are one of the most fundamental and widely used statistical tools for modeling discrete time series. In general, learning HMMs from data is computationally hard (under cryptographic assumptions), and practitioners typically resort to search heuristics that suffer from the usual local optima issues. We prove that under a natural separation condition (bounds on the smallest singular value of the HMM parameters), there is an efficient and provably correct algorithm for learning HMMs. The sample complexity of the algorithm does not explicitly depend on the number of distinct (discrete) observations; it depends on this quantity implicitly, through spectral properties of the underlying HMM. This makes the algorithm particularly applicable to settings with a large number of observations, such as those in natural language processing, where the space of observations is sometimes the words in a language. The algorithm is also simple, employing only a singular value decomposition and matrix multiplications.

Comment: Published in JCSS Special Issue "Learning Theory 2009".
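The "singular value decomposition and matrix multiplications" mentioned in the abstract can be illustrated with a minimal sketch of an observable-operator construction in the style of this paper. The function names, the toy HMM, and the use of exact population moments (rather than moments estimated from sampled observation triples, as one would have in practice) are assumptions made for this illustration, not details taken from the listing.

```python
import numpy as np

def spectral_hmm(P1, P21, P3x1, m):
    """Build observable operators from low-order moments.

    P1[i]         = Pr[x1 = i]
    P21[i, j]     = Pr[x2 = i, x1 = j]
    P3x1[x][i, j] = Pr[x3 = i, x2 = x, x1 = j]
    m             = assumed number of hidden states
    """
    U = np.linalg.svd(P21)[0][:, :m]           # top-m left singular vectors
    b1 = U.T @ P1
    binf = np.linalg.pinv(P21.T @ U) @ P1
    pinv_UP21 = np.linalg.pinv(U.T @ P21)
    B = [U.T @ P3x1[x] @ pinv_UP21 for x in range(len(P1))]
    return b1, binf, B

def joint_prob(seq, b1, binf, B):
    """Estimate Pr[x1, ..., xt] as binf^T B[xt] ... B[x1] b1."""
    state = b1
    for x in seq:
        state = B[x] @ state
    return float(binf @ state)

def exact_moments(T, O, pi):
    """Population moments of a known HMM (for checking only).

    T[i, j] = Pr[h' = i | h = j], O[x, j] = Pr[x | h = j], pi = initial dist.
    """
    P1 = O @ pi
    P21 = O @ T @ np.diag(pi) @ O.T
    P3x1 = np.stack([O @ T @ np.diag(O[x]) @ T @ np.diag(pi) @ O.T
                     for x in range(O.shape[0])])
    return P1, P21, P3x1

def forward_prob(seq, T, O, pi):
    """Reference joint probability via the standard forward recursion."""
    alpha = pi
    for x in seq:
        alpha = T @ (O[x] * alpha)
    return float(alpha.sum())

# Hypothetical toy HMM: 2 hidden states, 3 observation symbols.
T = np.array([[0.7, 0.2], [0.3, 0.8]])
O = np.array([[0.6, 0.1], [0.3, 0.2], [0.1, 0.7]])
pi = np.array([0.5, 0.5])

b1, binf, B = spectral_hmm(*exact_moments(T, O, pi), m=2)
seq = [0, 2, 1, 1]
print(joint_prob(seq, b1, binf, B), forward_prob(seq, T, O, pi))
```

With exact moments and full-rank `T` and `O`, the two printed values should agree up to floating-point error; with moments estimated from data, the spectral value is only an approximation whose quality depends on the sample size and on the spectral properties the abstract refers to.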

arXiv.org e-Print Archive

CiteSeerX

Elsevier - Publisher Connector

ScholarlyCommons@Penn

Policy Misuse Detection in Communication Networks with Hidden Markov Models

Author: Tosun Umut
Publication venue: Published by Elsevier B.V.
Publication date: 01/01/2002

With the recent advances in computer networking applications, Intrusion Detection Systems (IDS) are widely used to detect malicious connections in computer networks. IDS provide a high level of security between organizations while preventing misuse and intrusions in data communication over the internet or any other network. Adherence to network usage policies is crucial, since a system or network administrator needs to know whether information has been compromised, whether resources are being used appropriately, and whether an attacker is exploiting a compromised service. Server flow authentication via protocol detection analyzes penetrations into a communication network. Generally, port numbers in the packet headers are used to detect the protocols. However, it is easy to re-map port numbers via proxies or to change them via compromised host services, so relying on port numbers may mislead a system administrator trying to understand the natural flow of communications through the network. It is also difficult to understand user behavior when the traffic is encrypted, since only packet-level information is available. In this paper, we present a novel approach that uses Hidden Markov Models to detect user behavior in network traffic. We perform the detection process on timing measures of packets. The results are promising: we obtained classification accuracies between 70% and 100%.

Elsevier - Publisher Connector

Crossref

Boston University Institutional Repository (OpenBU)

A Unifying Review of Linear Gaussian Models

Authors: Ghahramani Zoubin; Roweis Sam
Publication venue: MIT Press - Journals
Publication date: 01/01/1999

Factor analysis, principal component analysis, mixtures of Gaussian clusters, vector quantization, Kalman filter models, and hidden Markov models can all be unified as variations of unsupervised learning under a single basic generative model. This is achieved by collecting together disparate observations and derivations made by many previous authors and introducing a new way of linking discrete and continuous state models using a simple nonlinearity. Through the use of other nonlinearities, we show how independent component analysis is also a variation of the same basic generative model. We show that factor analysis and mixtures of Gaussians can be implemented in autoencoder neural networks and learned using squared error plus the same regularization term. We introduce a new model for static data, known as sensible principal component analysis, as well as a novel concept of spatially adaptive observation noise. We also review some of the literature involving global and local mixtures of the basic models and provide pseudocode for inference and learning for all the basic models.
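For orientation, the "single basic generative model" is usually written as a linear Gaussian state-space model. The notation below is an assumption for this listing (the abstract does not fix any symbols): a hidden state x, an observation y, matrices A and C, noise covariances Q and R, and a winner-take-all nonlinearity WTA standing in for the "simple nonlinearity" that links discrete and continuous state models.

```latex
\begin{align*}
% Dynamic model (continuous state): Kalman filter models
x_{t+1} &= A x_t + w_t, & w_t &\sim \mathcal{N}(0, Q) \\
y_t     &= C x_t + v_t, & v_t &\sim \mathcal{N}(0, R) \\[4pt]
% Static model (A = 0): factor analysis, PCA, sensible PCA
x &\sim \mathcal{N}(0, Q), & y &= C x + v, \quad v \sim \mathcal{N}(0, R) \\[4pt]
% Discrete state: mixtures of Gaussians (static), hidden Markov models (dynamic)
x_{t+1} &= \mathrm{WTA}\,[A x_t + w_t]
\end{align*}
```

Under this reading, the static models differ mainly in the constraint placed on R: diagonal for factor analysis, a multiple of the identity for sensible principal component analysis, and the zero-noise limit of that case for principal component analysis.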

CiteSeerX

Caltech Authors

Trip destination prediction based on past GPS log using a Hidden Markov Model

Authors: González Abril Luis; Ortega Ramírez Juan Antonio; Velasco Morente Francisco; Álvarez García Juan Antonio
Publication venue: Elsevier BV
Publication date: 01/01/2010

In this paper, a system based on the generation of a Hidden Markov Model from the past GPS log and current location is presented to predict a user’s destination when beginning a new trip. This approach drastically reduces the number of points supplied by the GPS device and permits a "support-map" to be generated in which the main characteristics of the trips for each user are taken into account. Hence, in contrast with other similar approaches, total independence from a street-map database is achieved.

Funding: Ministerio de Educación y Ciencia TSI2006–13390-C02–02; Junta de Andalucia TIC214.

idUS. Depósito de Investigación Universidad de Sevilla

Predicting Daily Probability Distributions Of S&P500 Returns

Authors: Shi Shanming; Weigend Andreas S.
Publication venue: Stern School of Business, New York University
Publication date: 01/08/1998

Most approaches in forecasting merely try to predict the next value of the time series. In contrast, this paper presents a framework to predict the full probability distribution. It is expressed as a mixture model: the dynamics of the individual states are modeled with so-called "experts" (potentially nonlinear neural networks), and the dynamics between the states are modeled using a hidden Markov approach. The full density predictions are obtained by a weighted superposition of the individual densities of each expert. This model class is called "hidden Markov experts". Results are presented for daily S&P500 data. While the predictive accuracy of the mean does not improve over simpler models, evaluating the prediction of the full density shows a clear out-of-sample improvement both over a simple GARCH(1,1) model (which assumes Gaussian distributed returns) and over a "gated experts" model (which expresses the weighting for each state non-recursively as a function of external inputs). Several interpretations are given: the blending of supervised and unsupervised learning, the discovery of hidden states, the combination of forecasts, the specialization of experts, the removal of outliers, and the persistence of volatility.

Information Systems Working Papers Series
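The "weighted superposition of the individual densities" can be sketched as follows. This is an illustrative simplification, not the authors' implementation: each expert is assumed to output a Gaussian with a fixed mean and standard deviation (in the paper the experts may be nonlinear neural networks), the transition matrix and all numbers are hypothetical, and the step that updates the state probabilities after each new observation is omitted.

```python
import math

def gaussian_pdf(y, mean, std):
    """Density of a normal distribution N(mean, std^2) at y."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def predict_state_weights(prev_weights, transition):
    """Propagate state probabilities one step through the Markov chain.

    transition[i][j] = Pr[next state = j | current state = i]
    """
    k = len(prev_weights)
    return [sum(prev_weights[i] * transition[i][j] for i in range(k))
            for j in range(k)]

def mixture_density(y, weights, means, stds):
    """Predictive density: state-weighted sum of the expert densities."""
    return sum(w * gaussian_pdf(y, m, s)
               for w, m, s in zip(weights, means, stds))

# Hypothetical two-state example: a calm and a volatile regime.
transition = [[0.9, 0.1], [0.2, 0.8]]
weights = predict_state_weights([1.0, 0.0], transition)
density_at_zero = mixture_density(0.0, weights, means=[0.0, 0.0], stds=[0.5, 2.0])
print(weights, density_at_zero)
```

Because the weights are recomputed recursively from the previous state probabilities, the predicted density can widen or narrow over time, which is one way to read the abstract's remark on the persistence of volatility.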

New York University Faculty Digital Archive