12 research outputs found

    Improved estimation of hidden Markov model parameters from multiple observation sequences

    The huge popularity of hidden Markov models (HMMs) in pattern recognition is due to the ability to 'learn' model parameters from an observation sequence through Baum-Welch and other re-estimation procedures. When HMM parameters are estimated from an ensemble of observation sequences rather than a single sequence, we require techniques for finding the parameters that maximize the likelihood of the estimated model given the entire set of observation sequences. The importance of this study is that HMMs with parameters estimated from multiple observations are shown to be many orders of magnitude more probable than HMMs learned from any single observation sequence; thus the effectiveness of HMM 'learning' is greatly enhanced. In this paper, we present techniques that usually find models significantly more likely than Rabiner's well-known method on both seen and unseen sequences.
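The quantity this abstract refers to, the likelihood of a model given an entire set of observation sequences, can be illustrated with a minimal sketch. It assumes a discrete-output HMM and statistically independent sequences, so the ensemble log-likelihood is the sum of per-sequence log-likelihoods computed by the scaled forward algorithm. The function names and the parameter layout (`pi`, `A`, `B` as nested lists) are illustrative choices, not taken from the paper.

```python
import math

def sequence_log_likelihood(pi, A, B, obs):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM.

    pi[i]   : initial probability of state i
    A[i][j] : transition probability from state i to state j
    B[i][k] : probability of emitting symbol k in state i
    obs     : non-empty list of symbol indices
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step and weight by the emission probability.
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                for j in range(n)
            ]
        # Rescale to avoid underflow; the scales multiply to P(obs | model).
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik

def ensemble_log_likelihood(pi, A, B, sequences):
    """Assumes independent sequences, so log-likelihoods add."""
    return sum(sequence_log_likelihood(pi, A, B, s) for s in sequences)
```

A candidate model can then be scored on seen or unseen sequences by passing the corresponding list to `ensemble_log_likelihood`; comparing two models amounts to comparing these sums.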

    An efficient hidden Markov model training scheme for anomaly intrusion detection of server applications based on system calls

    Recently, the hidden Markov model (HMM) has proved to be a good tool for modelling the normal behaviour of privileged processes in anomaly intrusion detection based on system calls. However, one major problem with this approach is that HMM training demands excessive computing resources, which makes it inefficient for practical intrusion detection systems. In this paper, a simple and efficient HMM training scheme is proposed that integrates multiple-observation training with incremental HMM training. The proposed scheme first divides the long observation sequence into multiple subsets of sequences. Each subset is then used to infer one sub-model, which is incrementally merged into the final HMM. Our experimental results show that the scheme reduces training time by about 60% compared with conventional batch training. The results also show that our HMM-based detection model detects all denial-of-service attacks embedded in the testing traces.
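The abstract does not state how a sub-model is merged into the final model, so the following is only a sketch under a loudly stated assumption: the merge is taken to be a running average of the sub-models' parameter matrices, with all sub-models sharing the same state ordering (for example, because they were trained from a common initial model). Training each sub-model (for instance with Baum-Welch) is outside the sketch. `merge_rows` and `incremental_merge` are hypothetical names.

```python
def merge_rows(current, new, weight):
    """Convex combination of two row-stochastic matrices (rows still sum to 1)."""
    return [
        [(1.0 - weight) * c + weight * x for c, x in zip(c_row, n_row)]
        for c_row, n_row in zip(current, new)
    ]

def incremental_merge(sub_models):
    """Fold sub-models into one model, one at a time.

    Each sub-model is a dict mapping a parameter name (e.g. "A", "B")
    to a row-stochastic matrix given as a list of lists.
    ASSUMPTION: equal-weight running average; the paper's rule may differ.
    """
    merged = None
    for k, model in enumerate(sub_models, start=1):
        if merged is None:
            # The first sub-model initialises the final model (deep copy).
            merged = {name: [row[:] for row in mat] for name, mat in model.items()}
        else:
            # A weight of 1/k keeps `merged` equal to the mean of the first k models.
            merged = {
                name: merge_rows(merged[name], model[name], 1.0 / k)
                for name in merged
            }
    return merged
```

Because only the running model and the current sub-model are held at any time, a scheme of this shape never needs the complete training set in memory at once.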

    Comparing and Evaluating HMM Ensemble Training Algorithms Using Train and Test and Condition Number Criteria

    Hidden Markov Models have many applications in signal processing and pattern recognition, but their convergence-based training algorithms are known to suffer from over-sensitivity to the initial random model choice. This paper describes the boundary between regions in which ensemble learning is superior to Rabiner's multiple-sequence Baum-Welch training method, and proposes techniques for determining the best method in any given situation. It also studies the suitability of the training methods using the condition number, a recently proposed diagnostic tool for testing the quality of the model. A new method for training Hidden Markov Models, called the Viterbi Path Counting algorithm, is introduced and is found to produce significantly better performance than current methods in a range of trials.

    Investigation of Training Algorithms for Hidden Markov Models Applied to Automatic Speech Recognition

    The work presented in this thesis focuses on simulating a speech recognizer that is trained by different people with different speaking styles, and investigates how sensitive the training and recognition processes are to variations in the training data. There are four main parts to this work. The first is an experiment on weighting methods for training with multiple observation sequences. The second tests different initial parameters. The third part includes the first experiment involving training with multiple observation sequences. The model's sensitivity to variations in the training data was evaluated by comparing cases with different amounts of variation. The final part varied the observation vectors, with the variation restricted to only one of the eight positions in the sequence. The experiment was repeated for each of the eight positions in the observation sequence, and the effect on recognition was evaluated.

    Learning discrete Hidden Markov Models from state distribution vectors

    Hidden Markov Models (HMMs) are probabilistic models that have been widely applied in a number of fields since their inception in the late 1960s. Computational Biology, Image Processing, and Signal Processing are but a few of the application areas of HMMs. In this dissertation, we develop several new efficient algorithms for learning HMM parameters. First, we propose a new polynomial-time algorithm for supervised learning of the parameters of a first-order HMM from a state probability distribution (SD) oracle. The SD oracle provides the learner with the state distribution vector corresponding to a query string. We prove the correctness of the algorithm and establish the conditions under which it is guaranteed to construct a model that exactly matches the oracle's target HMM. We also conduct a simulation experiment to test the viability of the algorithm. Furthermore, the SD oracle is proven to be necessary for polynomial-time learning, in the sense that the consistency problem for HMMs, where a training set of state distribution vectors such as those provided by the SD oracle is used but without the ability to query on arbitrary strings, is NP-complete. Next, we define helpful distributions on an instance set of strings for which polynomial-time HMM learning from state distribution vectors is feasible in the absence of an SD oracle, and propose a new PAC-learning algorithm for HMM parameters under helpful distributions. The PAC-learning algorithm ensures with high probability that HMM parameters can be learned from training examples without asking queries. Finally, we propose a hybrid learning algorithm for approximating HMM parameters from a dataset composed of strings and their corresponding state distribution vectors, and provide supporting experimental data indicating that our hybrid algorithm produces more accurate approximations than the existing method.

    Improved Estimation of Hidden Markov Model Parameters from Multiple Observation Sequences

    No full text
    The huge popularity of hidden Markov models (HMMs) in pattern recognition is due to the ability to "learn" model parameters from an observation sequence through Baum-Welch and other re-estimation procedures. When HMM parameters are estimated from an ensemble of observation sequences rather than a single sequence, we require techniques for finding the parameters that maximize the likelihood of the estimated model given the entire set of observation sequences. The importance of this study is that HMMs with parameters estimated from multiple observations are shown to be many orders of magnitude more probable than HMMs learned from any single observation sequence; thus the effectiveness of HMM "learning" is greatly enhanced. In this paper, we present techniques that usually find models significantly more likely than Rabiner's well-known method on both seen and unseen sequences.

    A review and application of hidden Markov models and double chain Markov models

    A Dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Master of Science. Johannesburg, 2016. Hidden Markov models (HMMs) and double chain Markov models (DCMMs) are classical Markov model extensions used in a range of applications in the literature. This dissertation provides a comprehensive review of these models with a focus on i) providing detailed mathematical derivations of key results, some of which, at the time of writing, were not found elsewhere in the literature, ii) discussing estimation techniques for unknown model parameters and the hidden state sequence, and iii) discussing considerations which practitioners of these models would typically take into account. Simulation studies are performed to measure statistical properties of the estimated model parameters and the estimated hidden state path, derived using the Baum-Welch algorithm (BWA) and the Viterbi algorithm (VA) respectively. The effectiveness of the BWA and the VA is also compared between the HMM and the DCMM. Selected HMM and DCMM applications are reviewed and assessed in light of the conclusions drawn from the simulation study. Attention is given to applications in the field of Credit Risk.

    E-commerce security enhancement and anomaly intrusion detection using machine learning techniques

    With the fast growth of the Internet and the World Wide Web, security has become a major concern of many organizations, enterprises and users. Criminal attacks and intrusions into computer and information systems are spreading quickly and can come from anywhere on the globe. Intrusion prevention measures, such as user authentication, firewalls and cryptography, have been used as the first line of defence to protect computer and information systems from intrusions. As intrusion prevention alone may not be sufficient in a highly dynamic environment such as the Internet, intrusion detection has been used as the second line of defence against intrusions. However, existing cryptography-based intrusion prevention measures implemented in software have problems with the protection of long-term private keys and the degradation of system performance. Moreover, the security of these software-based intrusion prevention measures depends on the security of the underlying operating system, and they are therefore vulnerable to threats caused by its security flaws. On the other hand, existing anomaly intrusion detection approaches usually produce excessive false alarms. They also lack efficiency due to high construction and maintenance costs. In our approach, we employ the "defence in depth" principle to develop a solution to these problems. Our solution consists of two lines of defence: preventing intrusions at the first line, and detecting intrusions at the second line if the prevention measures of the first line have been penetrated. At the first line of defence, our goal is to develop an encryption model that enhances communication and end-system security, and improves the performance of web-based E-commerce systems. We have developed a hardware-based RSA encryption model to address the above-mentioned problems of existing software-based intrusion prevention measures.
The proposed hardware-based encryption model is based on the integration of an existing web-based client/server model and embedded hardware-based RSA encryption modules. DSP embedded hardware is selected to develop the proposed encryption model because of its advanced security features and high processing capability. The experimental results showed that, for large RSA encryption keys, the proposed DSP hardware-based RSA encryption model outperformed the software-based RSA implementation running on Pentium 4 machines whose clock speed is almost double that of the DSP. At the second line of defence, our goal is to develop an anomaly intrusion detection model that improves the detection accuracy, efficiency and adaptability of existing anomaly detection approaches. Existing anomaly detection systems are not effective, as they usually produce excessive false alarms. In addition, several anomaly detection approaches suffer a serious efficiency problem due to the high construction costs of the detection profiles. High construction costs will eventually reduce the applicability of these approaches in practice. Furthermore, existing anomaly detection systems lack adaptability because they provide no mechanism to update their detection profiles dynamically in order to adapt to changes in the behaviour of monitored objects. We have developed a model for program anomaly intrusion detection to address these problems. The proposed detection model uses a hidden Markov model (HMM) to characterize normal program behaviour using system calls. In order to increase the detection rate and to reduce the false alarm rate, we propose two detection schemes: a two-layer detection scheme and a fuzzy-based detection scheme. The two-layer detection scheme aims at reducing false alarms by applying a double-layer test to each sequence of test traces of system calls.
On the other hand, the fuzzy-based detection scheme focuses on further improving the detection rate, as well as reducing false alarms. It employs fuzzy inference to combine information from multiple sequences to correctly determine the sequence status. The experimental results showed that the proposed detection schemes reduced false alarms by approximately 48% compared to the normal database scheme. In addition, our detection schemes generated strong anomaly signals for all tested traces, which in turn improves the detection rate. We propose an HMM incremental training scheme with optimal initialization to address the efficiency problem by reducing the construction costs, in terms of model training time and storage demand. Unlike the HMM batch training scheme, which updates the HMM using the complete training set, our HMM incremental training scheme incrementally updates the HMM using one training subset at a time, until convergence. The experimental results showed that the proposed HMM incremental training scheme reduced training time four-fold compared to HMM batch training based on the well-known Baum-Welch algorithm. The proposed training scheme also reduced storage demand substantially, as the size of each training subset is significantly smaller than the size of the complete training set. We also describe our complete model for program anomaly detection using system calls in Chapter 8. The complete model consists of two development stages: a training stage and a testing stage. In the training stage, an HMM and a normal database are constructed to represent normal program behaviour. In addition, fuzzy sets and rules are defined to represent the space and combined conditions of the sequence parameters. In the testing stage, the HMM and the normal database are used to generate the sequence parameters, which are used as the input to the fuzzy inference engine to evaluate each sequence of system calls for anomalies and possible intrusions.
The proposed detection model also provides a mechanism to update its detection profile (the HMM and the normal database) using online training data. This keeps the proposed detection model up to date and therefore maintains its detection accuracy.