33,203 research outputs found
Exploring Interpretable LSTM Neural Networks over Multi-Variable Data
For recurrent neural networks trained on time series with target and
exogenous variables, in addition to accurate prediction, it is also desired to
provide interpretable insights into the data. In this paper, we explore the
structure of LSTM recurrent neural networks to learn variable-wise hidden
states, with the aim to capture different dynamics in multi-variable time
series and distinguish the contribution of variables to the prediction. With
these variable-wise hidden states, a mixture attention mechanism is proposed to
model the generative process of the target. Then we develop associated training
methods to jointly learn network parameters, variable and temporal importance
w.r.t the prediction of the target variable. Extensive experiments on real
datasets demonstrate enhanced prediction performance by capturing the dynamics
of different variables. Meanwhile, we evaluate the interpretation results both
qualitatively and quantitatively. It exhibits the prospect as an end-to-end
framework for both forecasting and knowledge extraction over multi-variable
data.Comment: Accepted to International Conference on Machine Learning (ICML), 201
Inferring causal relations from multivariate time series : a fast method for large-scale gene expression data
Various multivariate time series analysis techniques have been developed with the aim of inferring causal relations between time series. Previously, these techniques have proved their effectiveness on economic and neurophysiological data, which normally consist of hundreds of samples. However, in their applications to gene regulatory inference, the small sample size of gene expression time series poses an obstacle. In this paper, we describe some of the most commonly used multivariate inference techniques and show the potential challenge related to gene expression analysis. In response, we propose a directed partial correlation (DPC) algorithm as an efficient and effective solution to causal/regulatory relations inference on small sample gene expression data. Comparative evaluations on the existing techniques and the proposed method are presented. To draw reliable conclusions, a comprehensive benchmarking on data sets of various setups is essential. Three experiments are designed to assess these methods in a coherent manner. Detailed analysis of experimental results not only reveals good accuracy of the proposed DPC method in large-scale prediction, but also gives much insight into all methods under evaluation
Data-driven design of intelligent wireless networks: an overview and tutorial
Data science or "data-driven research" is a research approach that uses real-life data to gain insight about the behavior of systems. It enables the analysis of small, simple as well as large and more complex systems in order to assess whether they function according to the intended design and as seen in simulation. Data science approaches have been successfully applied to analyze networked interactions in several research areas such as large-scale social networks, advanced business and healthcare processes. Wireless networks can exhibit unpredictable interactions between algorithms from multiple protocol layers, interactions between multiple devices, and hardware specific influences. These interactions can lead to a difference between real-world functioning and design time functioning. Data science methods can help to detect the actual behavior and possibly help to correct it. Data science is increasingly used in wireless research. To support data-driven research in wireless networks, this paper illustrates the step-by-step methodology that has to be applied to extract knowledge from raw data traces. To this end, the paper (i) clarifies when, why and how to use data science in wireless network research; (ii) provides a generic framework for applying data science in wireless networks; (iii) gives an overview of existing research papers that utilized data science approaches in wireless networks; (iv) illustrates the overall knowledge discovery process through an extensive example in which device types are identified based on their traffic patterns; (v) provides the reader the necessary datasets and scripts to go through the tutorial steps themselves
Real-valued feature selection for process approximation and prediction
The selection of features for classification, clustering and approximation is an important task in pattern recognition, data mining and soft computing. For real-valued features, this contribution shows how feature selection for a high number of features can be implemented using mutual in-formation. Especially, the common problem for mutual information computation of computing joint probabilities for many dimensions using only a few samples is treated by using the Rènyi mutual information of order two as computational base. For this, the Grassberger-Takens corre-lation integral is used which was developed for estimating probability densities in chaos theory. Additionally, an adaptive procedure for computing the hypercube size is introduced and for real world applications, the treatment of missing values is included. The computation procedure is accelerated by exploiting the ranking of the set of real feature values especially for the example of time series. As example, a small blackbox-glassbox example shows how the relevant features and their time lags are determined in the time series even if the input feature time series determine nonlinearly the output. A more realistic example from chemical industry shows that this enables a better ap-proximation of the input-output mapping than the best neural network approach developed for an international contest. By the computationally efficient implementation, mutual information becomes an attractive tool for feature selection even for a high number of real-valued features
- …