Temporal and Spatial Data Mining with Second-Order Hidden Models
While designing a knowledge discovery system, we have developed
stochastic models based on high-order hidden Markov models. These models are
capable of mapping sequences of data into a Markov chain in which the transitions
between the states depend on the n previous states, where n is the
order of the model. We study the process of extracting information
from spatial and temporal data by means of an unsupervised classification. We
therefore use a French national database related to the land use of a region,
named Teruti, which describes the land use both in the spatial and temporal
domain. Land-use categories (wheat, corn, forest, ...) are logged every year on
each site regularly spaced in the region. They constitute a temporal sequence
of images in which we look for spatial and temporal dependencies. The temporal
segmentation of the data is done by means of a second-order Hidden Markov Model
(HMM2) that appears to have very good capabilities to locate stationary
segments, as shown in our previous work in speech recognition. The spatial
classification is performed by defining a fractal scanning of the images with
the help of a Hilbert-Peano curve that introduces a total order on the sites,
preserving the neighborhood relation between the sites. We show that the
HMM2 performs a classification that is meaningful for the agronomists. Spatial
and temporal classification may be achieved simultaneously by means of a
two-level HMM2 that measures the a posteriori probability of mapping a temporal
sequence of images onto a set of hidden classes.
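The fractal scanning described in this abstract can be illustrated with a small sketch. The abstract does not give the construction, so the code below assumes a square grid whose side is a power of two and uses the standard index-to-coordinate conversion for the Hilbert curve. The names `hilbert_d2xy`, `hilbert_order` and `scan_image` are hypothetical and are not taken from the paper.

```python
def hilbert_d2xy(n, d):
    """Map position d along a Hilbert curve to (x, y) on an n x n grid.

    Assumes n is a power of two and 0 <= d < n * n.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the current sub-square so the pieces join up.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_order(n):
    """Return every site of the n x n grid in Hilbert-curve order."""
    return [hilbert_d2xy(n, d) for d in range(n * n)]


def scan_image(grid):
    """Flatten a square 2-D grid (list of rows) into a 1-D sequence.

    The result is the kind of totally ordered sequence that could be
    fed to a sequence model; grid[y][x] is the label at site (x, y).
    """
    n = len(grid)
    return [grid[y][x] for x, y in hilbert_order(n)]


if __name__ == "__main__":
    # Toy 4 x 4 land-use map with made-up labels.
    land_use = [
        ["wheat", "wheat", "corn", "corn"],
        ["wheat", "forest", "corn", "corn"],
        ["forest", "forest", "forest", "corn"],
        ["forest", "forest", "wheat", "wheat"],
    ]
    print(scan_image(land_use))
```

In this ordering, two sites that follow each other in the sequence are always adjacent on the grid, which is one way to read the neighborhood-preservation property mentioned above. The converse does not hold for every pair of adjacent sites.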
Learning Tree Distributions by Hidden Markov Models
Hidden tree Markov models allow learning distributions for tree structured
data while being interpretable as nondeterministic automata. We provide a
concise summary of the main approaches in the literature, focusing in particular on
the causality assumptions introduced by the choice of a specific tree visit
direction. We then sketch a novel non-parametric generalization of the
bottom-up hidden tree Markov model with its interpretation as a
nondeterministic tree automaton with infinite states. Comment: Accepted in LearnAut2018 workshop
Information Extraction, Data Integration, and Uncertain Data Management: The State of The Art
Information extraction, data integration, and uncertain data management are different areas of research that have received considerable attention in the last two decades. Many studies have tackled these areas individually. However, information extraction systems should be integrated with data integration methods to make use of the extracted information. Handling uncertainty in the extraction and integration process is important for enhancing the quality of the data in such integrated systems. This article presents the state of the art in these areas of research, identifies their common ground, and shows how to integrate information extraction and data integration under the cover of uncertainty management.
Assessing Crash Risks on Curves
In Queensland, curve-related crashes contributed to 63.44% of fatalities, and 25.17% required hospitalisation. In addition, 51.1% of run-off-road crashes occurred on obscured or open-view road curves (Queensland Transport, 2006). This paper presents a conceptual framework for an in-vehicle system that assesses crash risk when a driver is manoeuvring on a curve. Our approach uses Intelligent Transport Systems (ITS) to collect information about the driving context. The driving context corresponds to information about the environment, driver, and vehicle gathered from sensor technology. Sensors are useful for detecting high-risk situations for drivers, such as curves, fog, driver fatigue or slippery roads. However, sensors can be unreliable, and the information gathered from them can therefore be incomplete or inaccurate. To improve accuracy, a system is built to perform information fusion from past and current driving information. The integrated information is analysed using ubiquitous data mining techniques, and the results are then used in a Coupled Hidden Markov Model (CHMM) to learn and classify the information into different risk categories. The CHMM is used to predict the probability of a crash on curves. Based on the risk assessment, our system provides an appropriate intervention to the driver. This approach could give the driver sufficient time to react promptly. Hence, it could potentially promote safe driving and decrease curve-related injuries and fatalities.
Sequence Classification Restricted Boltzmann Machines With Gated Units
For the classification of sequential data, dynamic Bayesian networks and recurrent neural networks (RNNs) are the preferred models. The former can explicitly model the temporal dependences between the variables, while the latter have the capability of learning representations. The recurrent temporal restricted Boltzmann machine (RTRBM) is a model that combines these two features. However, learning and inference in RTRBMs can be difficult because of the exponential nature of their gradient computations when maximizing log likelihoods. In this article, first, we address this intractability by optimizing a conditional rather than a joint probability distribution when performing sequence classification. This results in the "sequence classification restricted Boltzmann machine" (SCRBM). Second, we introduce gated SCRBMs (gSCRBMs), which use an information processing gate, as an integration of SCRBMs with long short-term memory (LSTM) models. In the experiments reported in this article, we evaluate the proposed models on optical character recognition, chunking, and multiresident activity recognition in smart homes. The experimental results show that gSCRBMs achieve performance comparable to that of the state of the art in all three tasks. gSCRBMs require far fewer parameters than other recurrent networks with memory gates, in particular LSTMs and gated recurrent units (GRUs).
kLog: A Language for Logical and Relational Learning with Kernels
We introduce kLog, a novel approach to statistical relational learning.
Unlike standard approaches, kLog does not represent a probability distribution
directly. It is rather a language to perform kernel-based learning on
expressive logical and relational representations. kLog allows users to specify
learning problems declaratively. It builds on simple but powerful concepts:
learning from interpretations, entity/relationship data modeling, logic
programming, and deductive databases. Access by the kernel to the rich
representation is mediated by a technique we call graphicalization: the
relational representation is first transformed into a graph --- in particular,
a grounded entity/relationship diagram. Subsequently, a choice of graph kernel
defines the feature space. kLog supports mixed numerical and symbolic data, as
well as background knowledge in the form of Prolog or Datalog programs as in
inductive logic programming systems. The kLog framework can be applied to
tackle the same range of tasks that has made statistical relational learning so
popular, including classification, regression, multitask learning, and
collective classification. We also report on empirical comparisons, showing
that kLog can be either more accurate, or much faster at the same level of
accuracy, than Tilde and Alchemy. kLog is GPLv3 licensed and is available at
http://klog.dinfo.unifi.it along with tutorials.