21,865 research outputs found

    MDL Denoising Revisited

    Full text link
    We refine and extend an earlier MDL denoising criterion for wavelet-based denoising. We start by showing that the denoising problem can be reformulated as a clustering problem, where the goal is to obtain separate clusters for informative and non-informative wavelet coefficients, respectively. This suggests two refinements, adding a code-length for the model index, and extending the model in order to account for subband-dependent coefficient distributions. A third refinement is derivation of soft thresholding inspired by predictive universal coding with weighted mixtures. We propose a practical method incorporating all three refinements, which is shown to achieve good performance and robustness in denoising both artificial and natural signals.Comment: Submitted to IEEE Transactions on Information Theory, June 200

    SMART: Unique splitting-while-merging framework for gene clustering

    Get PDF
    Copyright @ 2014 Fa et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.Successful clustering algorithms are highly dependent on parameter settings. The clustering performance degrades significantly unless parameters are properly set, and yet, it is difficult to set these parameters a priori. To address this issue, in this paper, we propose a unique splitting-while-merging clustering framework, named “splitting merging awareness tactics” (SMART), which does not require any a priori knowledge of either the number of clusters or even the possible range of this number. Unlike existing self-splitting algorithms, which over-cluster the dataset to a large number of clusters and then merge some similar clusters, our framework has the ability to split and merge clusters automatically during the process and produces the the most reliable clustering results, by intrinsically integrating many clustering techniques and tasks. The SMART framework is implemented with two distinct clustering paradigms in two algorithms: competitive learning and finite mixture model. Nevertheless, within the proposed SMART framework, many other algorithms can be derived for different clustering paradigms. The minimum message length algorithm is integrated into the framework as the clustering selection criterion. The usefulness of the SMART framework and its algorithms is tested in demonstration datasets and simulated gene expression datasets. Moreover, two real microarray gene expression datasets are studied using this approach. Based on the performance of many metrics, all numerical results show that SMART is superior to compared existing self-splitting algorithms and traditional algorithms. Three main properties of the proposed SMART framework are summarized as: (1) needing no parameters dependent on the respective dataset or a priori knowledge about the datasets, (2) extendible to many different applications, (3) offering superior performance compared with counterpart algorithms.National Institute for Health Researc

    An integrated search-based approach for automatic testing from extended finite state machine (EFSM) models

    Get PDF
    This is the post-print version of the Article - Copyright @ 2011 ElsevierThe extended finite state machine (EFSM) is a modelling approach that has been used to represent a wide range of systems. When testing from an EFSM, it is normal to use a test criterion such as transition coverage. Such test criteria are often expressed in terms of transition paths (TPs) through an EFSM. Despite the popularity of EFSMs, testing from an EFSM is difficult for two main reasons: path feasibility and path input sequence generation. The path feasibility problem concerns generating paths that are feasible whereas the path input sequence generation problem is to find an input sequence that can traverse a feasible path. While search-based approaches have been used in test automation, there has been relatively little work that uses them when testing from an EFSM. In this paper, we propose an integrated search-based approach to automate testing from an EFSM. The approach has two phases, the aim of the first phase being to produce a feasible TP (FTP) while the second phase searches for an input sequence to trigger this TP. The first phase uses a Genetic Algorithm whose fitness function is a TP feasibility metric based on dataflow dependence. The second phase uses a Genetic Algorithm whose fitness function is based on a combination of a branch distance function and approach level. Experimental results using five EFSMs found the first phase to be effective in generating FTPs with a success rate of approximately 96.6%. Furthermore, the proposed input sequence generator could trigger all the generated feasible TPs (success rate = 100%). The results derived from the experiment demonstrate that the proposed approach is effective in automating testing from an EFSM
    • …
    corecore