
    Design-time Models for Resiliency

    Resiliency in process-aware information systems depends on the availability of recovery flows and alternative data for coping with missing data. In this paper, we discuss an approach to process and information modeling that supports the specification of recovery flows and alternative data. In particular, we focus on processes that use sensor data from different sources. The proposed model can be adopted to specify resiliency levels of information systems, based on event-based and temporal constraints.

    Energy-based temporal neural networks for imputing missing values

    Imputing missing values in high-dimensional time series is a difficult problem. There have been some approaches to the problem [11,8] where neural architectures were trained as probabilistic models of the data. However, we argue that this approach is not optimal. We propose to view temporal neural networks with latent variables as energy-based models and train them for missing value recovery directly. In this paper we introduce two energy-based models. The first model is based on a one-dimensional convolution and the second model utilizes a recurrent neural network. We demonstrate how ideas from the energy-based learning framework can be used to train these models to recover missing values. The models are evaluated on a motion capture dataset.

    Sparsity-Based Spatial Interpolation in Wireless Sensor Networks

    In wireless sensor networks, due to environmental limitations or bad wireless channel conditions, not all sensor samples can be successfully gathered at the sink. In this paper, we try to recover these missing samples without retransmission. The missing-sample estimation problem is mathematically formulated as a 2-D spatial interpolation. Assuming the 2-D sensor data can be sparsely represented by a dictionary, we propose a sparsity-based recovery approach that solves an l1-norm minimization problem. It is shown that these missing samples can be reasonably recovered based on the null space property of the dictionary. This property also indicates how to choose an appropriate sparsifying dictionary to further reduce the recovery errors. The simulation results on synthetic and real data demonstrate that the proposed approach can recover the missing data reasonably well and that it outperforms the weighted average interpolation methods when the data change relatively fast or blocks of samples are lost. In addition, there exists a range of missing rates over which the proposed approach is robust to missing block sizes.
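For orientation, the weighted average interpolation that the abstract uses as a point of comparison can be sketched as follows. This is a minimal illustration, not the paper's baseline or its l1-based method: the inverse-distance weighting scheme, the function name `idw_fill`, and the `power` parameter are all assumptions made here.

```python
def idw_fill(grid, power=2.0):
    """Fill None cells with an inverse-distance weighted average of known cells.

    Assumption: weights are 1 / distance**power on the grid coordinates.
    The abstract does not say which weighting its baseline uses.
    """
    known = [(r, c, v)
             for r, row in enumerate(grid)
             for c, v in enumerate(row)
             if v is not None]
    if not known:
        raise ValueError("grid has no observed samples")

    out = [list(row) for row in grid]
    for r, row in enumerate(grid):
        for c, v in enumerate(row):
            if v is not None:
                continue  # observed sample, keep as is
            # Squared distance is never zero here: a missing cell is not a known cell.
            weights = [((r - kr) ** 2 + (c - kc) ** 2) ** (-power / 2.0)
                       for kr, kc, _ in known]
            total = sum(w * kv for w, (_, _, kv) in zip(weights, known))
            out[r][c] = total / sum(weights)
    return out


row_filled = idw_fill([[1.0, None, 3.0]])
flat_filled = idw_fill([[4.0, None], [None, 4.0]])
```

A scheme like this smooths over neighbours, which is consistent with the abstract's remark that such baselines do worse when the data change quickly or whole blocks are lost.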

    A Pseudo Nearest-Neighbor Approach for Missing Data Recovery on Gaussian Random Data Sets

    Missing data handling is an important preparation step for most data discrimination or mining tasks. Inappropriate treatment of missing data may cause large errors or false results. In this paper, we study the effect of a missing data recovery method, namely the pseudo-nearest neighbor substitution approach, on Gaussian distributed data sets that represent typical cases in data discrimination and data mining applications. The error rate of the proposed recovery method is evaluated by comparing the clustering results of the recovered data sets to the clustering results obtained on the originally complete data sets. The results are also compared with those obtained by applying two other missing data handling methods: constant default value substitution and ignoring the missing data (non-substitution). The experimental results provide valuable insight into improving the accuracy of data discrimination and knowledge discovery on large data sets containing missing values.
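As a rough illustration of substitution-based recovery, the sketch below fills each incomplete record from its nearest complete record. It is plain nearest-neighbor substitution, not the paper's pseudo-nearest neighbor method, whose details the abstract does not give. The function name `nn_impute` and the choice of Euclidean distance over the observed coordinates are assumptions.

```python
import math


def nn_impute(rows):
    """Fill None entries in each row from the nearest complete row.

    Assumption: distance is Euclidean over the coordinates that the
    incomplete row actually has.
    """
    complete = [r for r in rows if None not in r]
    if not complete:
        raise ValueError("need at least one complete row to borrow from")

    filled = []
    for row in rows:
        if None not in row:
            filled.append(list(row))
            continue
        observed = [i for i, v in enumerate(row) if v is not None]

        def dist(candidate):
            return math.sqrt(sum((row[i] - candidate[i]) ** 2 for i in observed))

        donor = min(complete, key=dist)
        # Keep observed values, borrow the donor's value where one is missing.
        filled.append([donor[i] if v is None else v for i, v in enumerate(row)])
    return filled


data = [[1.0, 2.0], [1.1, None], [5.0, 6.0], [None, 5.9]]
recovered = nn_impute(data)
```

The two alternatives named in the abstract would correspond to replacing each `None` with a fixed constant, or to dropping the incomplete rows before clustering.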

    Analysing mark-recapture-recovery data in the presence of missing covariate data via multiple imputation

    We consider mark–recapture–recovery data with additional individual time-varying continuous covariate data. For such data it is common to specify the model parameters, and in particular the survival probabilities, as a function of these covariates to incorporate individual heterogeneity. However, an issue arises in relation to missing covariate values, for (at least) the times when an individual is not observed, leading to an analytically intractable likelihood. We propose a two-step multiple imputation approach to obtain estimates of the demographic parameters. Firstly, a model is fitted to only the observed covariate values. Conditional on the fitted covariate model, multiple “complete” datasets are generated (i.e. all missing covariate values are imputed). Secondly, for each complete dataset, a closed-form complete data likelihood can be maximised to obtain estimates of the model parameters, which are subsequently combined to obtain an overall estimate of the parameters. Associated standard errors and 95% confidence intervals are obtained using a non-parametric bootstrap. A simulation study is undertaken to assess the performance of the proposed two-step approach. We apply the method to data collected on a well-studied population of Soay sheep and compare the results with a Bayesian data augmentation approach. Supplementary materials accompanying this paper appear on-line.
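The two-step structure described above (fit a covariate model to the observed values, impute several complete datasets, estimate on each, then combine) can be sketched on a toy problem. Everything specific here is an assumption for illustration: the normal covariate model, the simple averaging of estimates, and the names `impute_and_pool`, `estimator`, `m`, and `seed`. The paper's actual covariate model and mark-recapture-recovery likelihood are not reproduced.

```python
import random
import statistics


def impute_and_pool(values, estimator, m=20, seed=0):
    """Toy two-step multiple imputation for one covariate with gaps (None).

    Step 1: fit a model to the observed values only (assumed normal here).
    Step 2: draw m completed datasets, apply `estimator` to each,
            and average the m estimates.
    """
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    mu = statistics.mean(observed)
    sigma = statistics.stdev(observed)  # needs at least two observed values

    estimates = []
    for _ in range(m):
        completed = [rng.gauss(mu, sigma) if v is None else v for v in values]
        estimates.append(estimator(completed))
    return statistics.mean(estimates)


covariate = [4.2, None, 5.1, 4.8, None, 5.5]
pooled = impute_and_pool(covariate, statistics.mean, m=50, seed=1)
```

In the abstract, standard errors come from a non-parametric bootstrap; in a sketch like this that would mean resampling the records and rerunning `impute_and_pool` on each resample.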