43 research outputs found

    Blind image separation based on exponentiated transmuted Weibull distribution

    In recent years, blind image separation has been investigated. As a result, a number of feature extraction algorithms for direct application to such image structures have been developed. For example, a crime scene may yield a mixture of two or more fingerprints, which must be separated before they can be used for identification. In this paper, we propose a new technique for separating multiple mixed images based on the exponentiated transmuted Weibull distribution. To adaptively estimate the parameters of such score functions, an efficient method based on maximum likelihood and a genetic algorithm is used. We also calculate the accuracy of the proposed distribution and compare the algorithmic performance of this efficient approach with other previous generalized distributions. The numerical results show that the proposed distribution is flexible and gives efficient results. Comment: 14 pages, 12 figures, 4 tables. International Journal of Computer Science and Information Security (IJCSIS), Vol. 14, No. 3, March 2016 (pp. 423-433)

    Centroid-Based Clustering with ab-Divergences

    Centroid-based clustering is a widely used technique within unsupervised learning algorithms in many research fields. The success of any centroid-based clustering relies on the choice of the similarity measure under use. In recent years, most studies focused on including several divergence measures in the traditional hard k-means algorithm. In this article, we consider the problem of centroid-based clustering using the family of ab-divergences, which is governed by two parameters, a and b. We propose a new iterative algorithm, ab-k-means, giving closed-form solutions for the computation of the sided centroids. The algorithm can be fine-tuned by means of this pair of values, yielding a wide range of the most frequently used divergences. Moreover, it is guaranteed to converge to local minima for a wide range of values of the pair (a, b). Our theoretical contribution has been validated by several experiments performed with synthetic and real data and exploring the (a, b) plane. The numerical results obtained confirm the quality of the algorithm and its suitability to be used in several practical applications. MINECO TEC2017-82807-

    Centroid-Based Clustering with αβ-Divergences

    Article number 196. Centroid-based clustering is a widely used technique within unsupervised learning algorithms in many research fields. The success of any centroid-based clustering relies on the choice of the similarity measure under use. In recent years, most studies focused on including several divergence measures in the traditional hard k-means algorithm. In this article, we consider the problem of centroid-based clustering using the family of αβ-divergences, which is governed by two parameters, α and β. We propose a new iterative algorithm, αβ-k-means, giving closed-form solutions for the computation of the sided centroids. The algorithm can be fine-tuned by means of this pair of values, yielding a wide range of the most frequently used divergences. Moreover, it is guaranteed to converge to local minima for a wide range of values of the pair (α, β). Our theoretical contribution has been validated by several experiments performed with synthetic and real data and exploring the (α, β) plane. The numerical results obtained confirm the quality of the algorithm and its suitability to be used in several practical applications. Ministerio de Economía y Competitividad de España (MINECO) TEC2017-82807-
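
The assignment step of a divergence-based hard k-means can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the abstract does not give the divergence formula or the closed-form centroid updates, so the sketch assumes the generic-case expression usually quoted for the AB-divergence (valid only when α, β and α + β are all nonzero) and omits the centroid step entirely. The function names `ab_divergence` and `assign` are hypothetical.

```python
def ab_divergence(p, q, alpha, beta):
    """AB-divergence between two positive vectors (generic case only).

    ASSUMPTION: uses the commonly quoted generic-case formula
        D(p||q) = -1/(alpha*beta) * sum(p^alpha * q^beta
                  - alpha/(alpha+beta) * p^(alpha+beta)
                  - beta/(alpha+beta) * q^(alpha+beta)),
    which requires alpha != 0, beta != 0 and alpha + beta != 0.
    Limiting cases (e.g. Kullback-Leibler) are not handled here.
    """
    if alpha == 0 or beta == 0 or alpha + beta == 0:
        raise ValueError("only the generic case is sketched here")
    s = alpha + beta
    total = 0.0
    for pi, qi in zip(p, q):
        total += (pi ** alpha) * (qi ** beta)
        total -= (alpha / s) * pi ** s
        total -= (beta / s) * qi ** s
    return -total / (alpha * beta)


def assign(points, centroids, alpha, beta):
    """Hard assignment: index of the closest centroid for each point."""
    labels = []
    for x in points:
        costs = [ab_divergence(x, c, alpha, beta) for c in centroids]
        labels.append(costs.index(min(costs)))
    return labels


if __name__ == "__main__":
    # With alpha = beta = 1 the divergence is half the squared Euclidean distance.
    print(ab_divergence([1.0, 2.0], [3.0, 1.0], 1, 1))  # 2.5
    print(assign([[1.0, 1.0], [5.0, 5.0]], [[1.2, 0.9], [4.8, 5.1]], 1, 1))
```

Setting α = β = 1 recovers the familiar squared-Euclidean k-means assignment, which is a convenient sanity check before exploring other points of the (α, β) plane.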

    ICA and Sparse ICA for Biomedical Signals

    Biomedical signals, or bio-signals, are a wide range of signals obtained from the human body at the cell, organ, or sub-atomic level. Electromyogram refers to electrical activity from muscle sound signals, electroencephalogram to electrical activity from the encephalon, electrocardiogram to electrical activity from the heart, electroretinogram to electrical activity from the eye, and so on. Monitoring and observing changes in these signals assists physicians whose work is related to this branch of medicine in covering, predicting, and curing various diseases. It can also assist physicians in examining, prognosticating, and curing numerous condition

    An investigation of the utility of monaural sound source separation via nonnegative matrix factorization applied to acoustic echo and reverberation mitigation for hands-free telephony

    In this thesis we investigate the applicability and utility of Monaural Sound Source Separation (MSSS) via Nonnegative Matrix Factorization (NMF) for various problems related to audio for hands-free telephony. We first investigate MSSS via NMF as an alternative acoustic echo reduction approach to existing approaches such as Acoustic Echo Cancellation (AEC). To this end, we present the single-channel acoustic echo problem as an MSSS problem, in which the objective is to extract the user's signal from a mixture also containing acoustic echo and noise. To perform separation, NMF is used to decompose the near-end microphone signal onto the union of two nonnegative bases in the magnitude Short Time Fourier Transform (STFT) domain. One of these bases is for the spectral energy of the acoustic echo signal, and is formed from the incoming far-end user’s speech, while the other basis is for the spectral energy of the near-end speaker, and is trained with speech data a priori. In comparison to AEC, the speaker extraction approach obviates Double-Talk Detection (DTD), and is demonstrated to attain its maximal echo mitigation performance immediately upon initiation and to maintain that performance during and after room changes for similar computational requirements. Speaker extraction is also shown to introduce distortion of the near-end speech signal during double-talk, which is quantified by means of a speech distortion measure and compared to that of AEC. Subsequently, we address Double-Talk Detection (DTD) for block-based AEC algorithms. We propose a novel block-based DTD algorithm that uses the available signals and the estimate of the echo signal that is produced by NMF-based speaker extraction to compute a suitably normalized correlation-based decision variable, which is compared to a fixed threshold to decide on doubletalk.
Using a standard evaluation technique, the proposed algorithm is shown to have comparable detection performance to an existing conventional block-based DTD algorithm. It is also demonstrated to inherit the room change insensitivity of speaker extraction, with the proposed DTD algorithm generating minimal false doubletalk indications upon initiation and in response to room changes in comparison to the existing conventional DTD. We also show that this property allows its paired AEC to converge at a rate close to the optimum. Another focus of this thesis is the problem of inverting a single measurement of a non-minimum phase Room Impulse Response (RIR). We describe the process by which perceptually detrimental all-pass phase distortion arises in reverberant speech filtered by the inverse of the minimum phase component of the RIR; in short, such distortion arises from inverting the magnitude response of the high-Q maximum phase zeros of the RIR. We then propose two novel partial inversion schemes that precisely mitigate this distortion. One of these schemes employs NMF-based MSSS to separate the all-pass phase distortion from the target speech in the magnitude STFT domain, while the other approach modifies the inverse minimum phase filter such that the magnitude response of the maximum phase zeros of the RIR is not fully compensated. Subjective listening tests reveal that the proposed schemes generally produce better quality output speech than a comparable inversion technique
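
The idea of decomposing a mixture onto the union of two fixed nonnegative bases can be sketched in a few lines. This is a rough illustration under stated assumptions, not the thesis's implementation: both bases are simply passed in as inputs (the thesis forms the echo basis from far-end speech and trains the speaker basis beforehand), the activations are fitted with the standard multiplicative update for the Euclidean cost, and the final soft-mask reconstruction is an assumption of this sketch. All function names are hypothetical, and matrices are plain lists of rows (frequency bins by frames).

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    return [list(r) for r in zip(*m)]


def fit_activations(v, w, iters=200, eps=1e-12):
    """Fit H >= 0 in V ~ W H with W held fixed (Euclidean multiplicative rule)."""
    n_comp, n_frames = len(w[0]), len(v[0])
    h = [[1.0] * n_frames for _ in range(n_comp)]
    wt = transpose(w)
    wtv = matmul(wt, v)   # constant numerator term W^T V
    wtw = matmul(wt, w)   # constant Gram matrix W^T W
    for _ in range(iters):
        denom = matmul(wtw, h)
        h = [[h[k][t] * wtv[k][t] / (denom[k][t] + eps)
              for t in range(n_frames)] for k in range(n_comp)]
    return h


def extract_speaker(v, w_echo, w_speech, iters=200, eps=1e-12):
    """Estimate the near-end speech magnitude from the mixture magnitude v."""
    # Union of bases: concatenate the columns of the echo and speech bases.
    w = [row_e + row_s for row_e, row_s in zip(w_echo, w_speech)]
    h = fit_activations(v, w, iters, eps)
    h_speech = h[len(w_echo[0]):]          # activations of the speech basis
    v_speech = matmul(w_speech, h_speech)  # speech part of the model
    v_total = matmul(w, h)                 # full model of the mixture
    # ASSUMPTION: soft-mask the mixture by the modelled speech fraction.
    return [[v[f][t] * v_speech[f][t] / (v_total[f][t] + eps)
             for t in range(len(v[0]))] for f in range(len(v))]


if __name__ == "__main__":
    w_echo = [[1.0], [0.0], [0.0]]
    w_speech = [[0.0], [1.0], [1.0]]
    # Frames: echo only, speech only, both at once (double-talk).
    v = [[2.0, 0.0, 1.0], [0.0, 3.0, 2.0], [0.0, 3.0, 2.0]]
    for row in extract_speaker(v, w_echo, w_speech):
        print([round(x, 6) for x in row])
```

Because only the activations are updated, nothing in the sketch has to adapt to the room, which is consistent with the abstract's remark that performance is reached immediately upon initiation.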

    Motion-capture-based hand gesture recognition for computing and control

    This dissertation focuses on the study and development of algorithms that enable the analysis and recognition of hand gestures in a motion capture environment. Central to this work is the study of unlabeled point sets in a more abstract sense. Evaluations of proposed methods focus on examining their generalization to users not encountered during system training. In an initial exploratory study, we compare various classification algorithms based upon multiple interpretations and feature transformations of point sets, including those based upon aggregate features (e.g. mean) and a pseudo-rasterization of the capture space. We find aggregate feature classifiers to be balanced across multiple users but relatively limited in maximum achievable accuracy. Certain classifiers based upon the pseudo-rasterization performed best among tested classification algorithms. We follow this study with targeted examinations of certain subproblems. For the first subproblem, we introduce the a fortiori expectation-maximization (AFEM) algorithm for computing the parameters of a distribution from which unlabeled, correlated point sets are presumed to be generated. Each unlabeled point is assumed to correspond to a target with independent probability of appearance but correlated positions. We propose replacing the expectation phase of the algorithm with a Kalman filter modified within a Bayesian framework to account for the unknown point labels which manifest as uncertain measurement matrices. We also propose a mechanism to reorder the measurements in order to improve parameter estimates. In addition, we use a state-of-the-art Markov chain Monte Carlo sampler to efficiently sample measurement matrices. In the process, we indirectly propose a constrained k-means clustering algorithm. Simulations verify the utility of AFEM against a traditional expectation-maximization algorithm in a variety of scenarios. 
In the second subproblem, we consider the application of positive definite kernels and the earth mover's distance (EMD) to our work. Positive definite kernels are an important tool in machine learning that enable efficient solutions to otherwise difficult or intractable problems by implicitly linearizing the problem geometry. We develop a set-theoretic interpretation of EMD and propose earth mover's intersection (EMI), a positive definite analog to EMD. We offer proof of EMD's negative definiteness and provide necessary and sufficient conditions for EMD to be conditionally negative definite, including approximations that guarantee negative definiteness. In particular, we show that EMD is related to various min-like kernels. We also present a positive definite preserving transformation that can be applied to any kernel and can be used to derive positive definite EMD-based kernels, and we show that the Jaccard index is simply the result of this transformation applied to set intersection. Finally, we evaluate kernels based on EMI and the proposed transformation versus EMD in various computer vision tasks and show that EMD is generally inferior even with indefinite kernel techniques. Finally, we apply deep learning to our problem. We propose neural network architectures for hand posture and gesture recognition from unlabeled marker sets in a coordinate system local to the hand. As a means of ensuring data integrity, we also propose an extended Kalman filter for tracking the rigid pattern of markers on which the local coordinate system is based. We consider fixed- and variable-size architectures including convolutional and recurrent neural networks that accept unlabeled marker input. We also consider a data-driven approach to labeling markers with a neural network and a collection of Kalman filters.
Experimental evaluations with posture and gesture datasets show promising results for the proposed architectures with unlabeled markers, which outperform the alternative data-driven labeling method
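
The remark that the Jaccard index results from a kernel transformation applied to set intersection can be illustrated with a small sketch. The abstract does not state the transformation, so the form used here, K'(x, y) = K(x, y) / (K(x, x) + K(y, y) - K(x, y)), is an assumption chosen because it reproduces the Jaccard index exactly when K counts shared elements. The names `intersection_kernel`, `min_kernel` and `ratio_transform` are hypothetical.

```python
def intersection_kernel(a, b):
    """Set-intersection kernel: number of elements shared by a and b."""
    return float(len(set(a) & set(b)))


def min_kernel(h1, h2):
    """A min-like kernel on nonnegative histograms (histogram intersection)."""
    return float(sum(min(x, y) for x, y in zip(h1, h2)))


def ratio_transform(kernel):
    """Wrap a kernel K as K'(x, y) = K(x, y) / (K(x, x) + K(y, y) - K(x, y)).

    ASSUMPTION: this form is not given in the abstract; it is used here
    only because applying it to set intersection yields the Jaccard index.
    """
    def transformed(x, y):
        kxy = kernel(x, y)
        denom = kernel(x, x) + kernel(y, y) - kxy
        return kxy / denom if denom else 0.0
    return transformed


jaccard = ratio_transform(intersection_kernel)

if __name__ == "__main__":
    # |{2, 3}| / |{1, 2, 3, 4}| = 2 / 4
    print(jaccard({1, 2, 3}, {2, 3, 4}))  # 0.5
    print(ratio_transform(min_kernel)([1, 2, 0], [0, 2, 2]))
```

The same wrapper can be applied to any kernel function with this two-argument signature, which mirrors the abstract's claim that the transformation is not tied to one particular kernel.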

    Group-structured and independent subspace based dictionary learning

    Thanks to several successful applications, sparse signal representation has become one of the most actively studied research areas in mathematics. However, in the traditional sparse coding problem the dictionary used for representation is assumed to be known. In spite of the popularity of sparsity and its recently emerged structured sparse extension, interestingly, very few works have focused on the problem of learning dictionaries for these codes. In the first part of the paper, we develop a dictionary learning method which is (i) online, (ii) enables overlapping group structures with (iii) non-convex sparsity-inducing regularization and (iv) handles the partially observable case. To the best of our knowledge, current methods can exhibit two of these four desirable properties at most. We also investigate several interesting special cases of our framework and demonstrate its applicability in inpainting of natural signals, structured sparse non-negative matrix factorization of faces and collaborative filtering. Complementing the sparse direction, we formulate a novel component-wise acting, epsilon-sparse coding scheme in reproducing kernel Hilbert spaces and show its equivalence to a generalized class of support vector machines. Moreover, we embed support vector machines into multilayer perceptrons and show that for this novel kernel based approximation approach the backpropagation procedure of multilayer perceptrons can be generalized. In the second part of the paper, we focus on dictionary learning making use of the independent subspace assumption instead of structured sparsity. The corresponding problem is called independent subspace analysis (ISA), or independent component analysis (ICA) if all the hidden, independent sources are one-dimensional. One of the most fundamental results of this research field is the ISA separation principle, which states that the ISA problem can be solved by traditional ICA up to permutation.
This principle (i) forms the basis of the state-of-the-art ISA solvers and (ii) enables one to estimate the unknown number and the dimensions of the sources efficiently. We (i) extend the ISA problem to several new directions including the controlled, the partially observed, the complex valued and the nonparametric case and (ii) derive separation principle based solution techniques for the generalizations. This solution approach (i) makes it possible to apply state-of-the-art algorithms for the obtained subproblems (in the ISA example ICA and clustering) and (ii) handles the case of unknown dimensional sources. Our extensive numerical experiments demonstrate the robustness and efficiency of our approach