643 research outputs found
Duration and Interval Hidden Markov Model for Sequential Data Analysis
Analysis of sequential event data has been recognized as one of the essential
tools in data modeling and analysis field. In this paper, after the examination
of its technical requirements and issues to model complex but practical
situation, we propose a new sequential data model, dubbed Duration and Interval
Hidden Markov Model (DI-HMM), that efficiently represents "state duration" and
"state interval" of data events. This has significant implications to play an
important role in representing practical time-series sequential data. This
eventually provides an efficient and flexible sequential data retrieval.
Numerical experiments on synthetic and real data demonstrate the efficiency and
accuracy of the proposed DI-HMM
Hidden Quantum Markov Models and Open Quantum Systems with Instantaneous Feedback
Hidden Markov Models are widely used in classical computer science to model
stochastic processes with a wide range of applications. This paper concerns the
quantum analogues of these machines --- so-called Hidden Quantum Markov Models
(HQMMs). Using the properties of Quantum Physics, HQMMs are able to generate
more complex random output sequences than their classical counterparts, even
when using the same number of internal states. They are therefore expected to
find applications as quantum simulators of stochastic processes. Here, we
emphasise that open quantum systems with instantaneous feedback are examples of
HQMMs, thereby identifying a novel application of quantum feedback control.Comment: 10 Pages, proceedings for the Interdisciplinary Symposium on Complex
Systems in Florence, September 2014, minor correction
On-Line Handwritten Formula Recognition using Hidden Markov Models and Context Dependent Graph Grammars
This paper presents an approach for the recognition of on-line handwritten mathematical expressions. The Hidden Markov Model (HMM) based system makes use of simultaneous segmentation and recognition capabilities, avoiding a crucial segmentation during pre-processing. With the segmentation and recognition results, obtained from the HMMrecognizer, it is possible to analyze and interpret the spatial two-dimensional arrangement of the symbols. We use a graph grammar approach for the structure recognition, also used in off-line recognition process, resulting in a general tree-structure of the underlying input-expression. The resulting constructed tree can be translated to any desired syntax (for example: Lisp, LaTeX, OpenMath . . . )
拡張隠れセミマルコフモデルによる複数系列データモデリングとデータ収集・管理手法
In recent years, with the development of devices and the development of data aggregation methods, data to be analyzed and aggregating methods have been changed. Regarding the environment of Internet of Things (IoT), sensors or devices are connected to the communication terminal as access point or mobile phone and the terminal aggregate the sensing data and upload them to the cloud server. From the viewpoint of analysis, the aggregated data are sequential data and the grouped sequence is a meaningful set of sequences because the group represents the owner\u27s information. However, most of the researches for sequential data analysis are specialized for the target data, and not focusing on the "grouped" sequences. In addition from the viewpoint of aggregation, it needs to prepare the special terminals as an access point. The preparation of the equipment takes labor and cost. To analyze the "grouped" sequence and aggregate them without any preparation, this paper aims to realize the analysis method for grouped sequences and to realize the aggregation environment virtually. For analysis of grouped sequential data, we firstly analyze the grouped sequential data focusing on the event sequences and extract the requirements for their modeling. The requirements are (1) the order of events, (2) the duration of the event, (3) the interval between two events, and (4) the overlap of the event. To satisfy all requirements, this paper focuses on the Hidden Semi Markov Model (HSMM) as a base model because it can model the order of events and the duration of event. Then, we consider how to model these sequences with HSMM and propose its extensions. For the former consideration, we propose two models; duration and interval hidden semi-Markov model and interval state hidden-semi Markov model to satisfy both the duration of event and the interval between events simultaneously. For the latter consideration, we consider how to satisfy all requirements including the overlap of the events and propose a new modeling methodology, over-lapped state hidden semi-Markov model. The performance of each method are shown compared with HSMM from the view point of the training and recognition time, the decoding performance, and the recognition performance in the simulation experiment. In the evaluation, practical application data are also used in the simulation and it shows the effectiveness. For the data aggregation, most of conventional approaches for aggregating the grouped data are limited using pre-allocated access points or terminals. It can obtain the grouped data stably, but it needs to additional cost to allocate such terminals in order to aggregate a new group of sequences. Therefore, this paper focus on "area based information" as a target of the grouped sequences, and propose an extraordinary method to store such information using the storage of the terminals that exist in the area. It realize the temporary area based storage virtually by relaying the information with existing terminals in the area. In this approach, it is necessary to restrict the labor of terminals and also store the information as long as possible. To control optimally while the trade-off, we propose methods to control the relay timing and the size of the target storage area in ad hoc dynamically. Simulators are established as practical environment to evaluate the performance of both controlling method. The results show the effectiveness of our method compared with flooding based relay control. As a result of above proposal and evaluation, methods for the grouped sequential data modeling and its aggregation are appeared. Finally, we summarize the research with applicable examples.電気通信大学201
Recommended from our members
A high level approach to Arabic sentence recognition
The aim of this work is to develop sentence recognition system inspired by the human reading process. Cognitive studies observed that the human tended to read a word as a whole at a time. He considers the global word shapes and uses contextual knowledge to infer and discriminate a word among other possible words. The sentence recognition system is a fully integrated system; a word level recogniser (baseline system) integrated with linguistic knowledge post-processing module. The presented baseline system is holistic word-based recognition approach characterised as probabilistic ranked task. The output of the system is multiple recognition hypotheses (N-best word lattice). The basic unit is the word rather than the character; it does not rely on any segmentation or require baseline detection. The considered linguistic knowledge to re-rank the output of the existing baseline system is the standard n-gram Statistical Language Models (SLMs). The candidates are re-ranked through exploiting phrase perplexity score. The system is an OCR system that depends on HMM models utilizing the HTK Toolkit. The baseline system supported by global transformation features extracted from binary word images. The adopted features' extraction technique is the block-based Discrete Cosine Transform (DCT) applied to the whole word image. Feature vectors extracted using block-based DCT with non-overlapping sub-block of size 8x8 pixels. The applied HMMs to the task are mono-model discrete one-dimensional HMMs (Bakis Model). A balanced actual scanned and synthetic database of word-image has been constructed to ensure an even distribution of word samples. The Arabic words are typewritten in five fonts having a size 14 points in a plain style. The statistical language models and lexicon words are extracted from The Holy Qur‟an. The systems are applied on word images with no overlap between the training and testing datasets. The actual scanned database is used to evaluate the word recogniser. The synthetic database is a large amount of data acquired for a reliable training of sentence recognition systems. This word recogniser evaluated in mono-font and multi-font contexts. The two types of word recogniser have been used to achieve a final recognition accuracy of99.30% and 73.47% in mono-font and multi-font, respectively. The achieved average accuracy by the sentence recogniser is 67.24% improved to 78.35% on average when using 5-gram post-processing. The complexity and accuracy of the post-processing module are evaluated and found that 4-gram is more suitable than 5-gram; it is much faster at an average improvement of 76.89%
Bernoulli HMMs for Handwritten Text Recognition
In last years Hidden Markov Models (HMMs) have received significant attention in the
task off-line handwritten text recognition (HTR). As in automatic speech recognition (ASR),
HMMs are used to model the probability of an observation sequence, given its corresponding
text transcription. However, in contrast to what happens in ASR, in HTR there is no standard
set of local features being used by most of the proposed systems. In this thesis we propose the
use of raw binary pixels as features, in conjunction with models that deal more directly with
the binary data. In particular, we propose the use of Bernoulli HMMs (BHMMs), that is, conventional
HMMs in which Gaussian (mixture) distributions have been replaced by Bernoulli
(mixture) probability functions. The objective is twofold: on the one hand, this allows us
to better modeling the binary nature of text images (foreground/background) using BHMMs.
On the other hand, this guarantees that no discriminative information is filtered out during
feature extraction (most HTR available datasets can be easily binarized without a relevant
loss of information).
In this thesis, all the HMM theory required to develop a HMM based HTR toolkit is
reviewed and adapted to the case of BHMMs. Specifically, we begin by defining a simple
classifier based on BHMMs with Bernoulli probability functions at the states, and we end
with an embedded Bernoulli mixture HMM recognizer for continuous HTR. Regarding the
binary features, we propose a simple binary feature extraction process without significant
loss of information. All input images are scaled and binarized, in order to easily reinterpret
them as sequences of binary feature vectors. Two extensions are proposed to this basic feature
extraction method: the use of a sliding window in order to better capture the context,
and a repositioning method in order to better deal with vertical distortions. Competitive results
were obtained when BHMMs and proposed methods were applied to well-known HTR
databases. In particular, we ranked first at the Arabic Handwriting Recognition Competition
organized during the 12th International Conference on Frontiers in Handwriting Recognition
(ICFHR 2010), and at the Arabic Recognition Competition: Multi-font Multi-size Digitally
Represented Text organized during the 11th International Conference on Document Analysis
and Recognition (ICDAR 2011).
In the last part of this thesis we propose a method for training BHMM classifiers using In last years Hidden Markov Models (HMMs) have received significant attention in the
task off-line handwritten text recognition (HTR). As in automatic speech recognition (ASR),
HMMs are used to model the probability of an observation sequence, given its corresponding
text transcription. However, in contrast to what happens in ASR, in HTR there is no standard
set of local features being used by most of the proposed systems. In this thesis we propose the
use of raw binary pixels as features, in conjunction with models that deal more directly with
the binary data. In particular, we propose the use of Bernoulli HMMs (BHMMs), that is, conventional
HMMs in which Gaussian (mixture) distributions have been replaced by Bernoulli
(mixture) probability functions. The objective is twofold: on the one hand, this allows us
to better modeling the binary nature of text images (foreground/background) using BHMMs.
On the other hand, this guarantees that no discriminative information is filtered out during
feature extraction (most HTR available datasets can be easily binarized without a relevant
loss of information).
In this thesis, all the HMM theory required to develop a HMM based HTR toolkit is
reviewed and adapted to the case of BHMMs. Specifically, we begin by defining a simple
classifier based on BHMMs with Bernoulli probability functions at the states, and we end
with an embedded Bernoulli mixture HMM recognizer for continuous HTR. Regarding the
binary features, we propose a simple binary feature extraction process without significant
loss of information. All input images are scaled and binarized, in order to easily reinterpret
them as sequences of binary feature vectors. Two extensions are proposed to this basic feature
extraction method: the use of a sliding window in order to better capture the context,
and a repositioning method in order to better deal with vertical distortions. Competitive results
were obtained when BHMMs and proposed methods were applied to well-known HTR
databases. In particular, we ranked first at the Arabic Handwriting Recognition Competition
organized during the 12th International Conference on Frontiers in Handwriting Recognition
(ICFHR 2010), and at the Arabic Recognition Competition: Multi-font Multi-size Digitally
Represented Text organized during the 11th International Conference on Document Analysis
and Recognition (ICDAR 2011).
In the last part of this thesis we propose a method for training BHMM classifiers using In last years Hidden Markov Models (HMMs) have received significant attention in the
task off-line handwritten text recognition (HTR). As in automatic speech recognition (ASR),
HMMs are used to model the probability of an observation sequence, given its corresponding
text transcription. However, in contrast to what happens in ASR, in HTR there is no standard
set of local features being used by most of the proposed systems. In this thesis we propose the
use of raw binary pixels as features, in conjunction with models that deal more directly with
the binary data. In particular, we propose the use of Bernoulli HMMs (BHMMs), that is, conventional
HMMs in which Gaussian (mixture) distributions have been replaced by Bernoulli
(mixture) probability functions. The objective is twofold: on the one hand, this allows us
to better modeling the binary nature of text images (foreground/background) using BHMMs.
On the other hand, this guarantees that no discriminative information is filtered out during
feature extraction (most HTR available datasets can be easily binarized without a relevant
loss of information).
In this thesis, all the HMM theory required to develop a HMM based HTR toolkit is
reviewed and adapted to the case of BHMMs. Specifically, we begin by defining a simple
classifier based on BHMMs with Bernoulli probability functions at the states, and we end
with an embedded Bernoulli mixture HMM recognizer for continuous HTR. Regarding the
binary features, we propose a simple binary feature extraction process without significant
loss of information. All input images are scaled and binarized, in order to easily reinterpret
them as sequences of binary feature vectors. Two extensions are proposed to this basic feature
extraction method: the use of a sliding window in order to better capture the context,
and a repositioning method in order to better deal with vertical distortions. Competitive results
were obtained when BHMMs and proposed methods were applied to well-known HTR
databases. In particular, we ranked first at the Arabic Handwriting Recognition Competition
organized during the 12th International Conference on Frontiers in Handwriting Recognition
(ICFHR 2010), and at the Arabic Recognition Competition: Multi-font Multi-size Digitally
Represented Text organized during the 11th International Conference on Document Analysis
and Recognition (ICDAR 2011).
In the last part of this thesis we propose a method for training BHMM classifiers using discriminative training criteria, instead of the conventionalMaximum Likelihood Estimation
(MLE). Specifically, we propose a log-linear classifier for binary data based on the BHMM
classifier. Parameter estimation of this model can be carried out using discriminative training
criteria for log-linear models. In particular, we show the formulae for several MMI based
criteria. Finally, we prove the equivalence between both classifiers, hence, discriminative
training of a BHMM classifier can be carried out by obtaining its equivalent log-linear classifier.
Reported results show that discriminative BHMMs clearly outperform conventional
generative BHMMs.Giménez Pastor, A. (2014). Bernoulli HMMs for Handwritten Text Recognition [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/37978TESI
Predicting user behavior using data profiling and hidden Markov model
Mental health disorders affect many aspects of patient’s lives, including emotions, cognition, and especially behaviors. E-health technology helps to collect information wealth in a non-invasive manner, which represents a promising opportunity to construct health behavior markers. Combining such user behavior data can provide a more comprehensive and contextual view than questionnaire data. Due to behavioral data, we can train machine learning models to understand the data pattern and also use prediction algorithms to know the next state of a person’s behavior. The remaining challenges for this issue are how to apply mathematical formulations to textual datasets and find metadata that aids to identify the person’s life pattern and also predict the next state of his comportment. The main idea of this work is to use a hidden Markov model (HMM) to predict user behavior from social media applications by analyzing and detecting states and symbols from the user behavior dataset. To achieve this goal, we need to analyze and detect the states and symbols from the user behavior dataset, then convert the textual data to mathematical and numerical matrices. Finally, apply the HMM model to predict the hidden user behavior states. We tested our program and identified that the log-likelihood was higher and better when the model fits the data. In any case, the results of the study indicated that the program was suitable for the purpose and yielded valuable data
- …