2,724 research outputs found

    Feature Selection via Binary Simultaneous Perturbation Stochastic Approximation

    Feature selection (FS) has become an indispensable task in dealing with today's highly complex pattern recognition problems with massive numbers of features. In this study, we propose a new wrapper approach for FS based on binary simultaneous perturbation stochastic approximation (BSPSA). This pseudo-gradient descent stochastic algorithm starts with an initial feature vector and moves toward the optimal feature vector via successive iterations. In each iteration, the current feature vector's individual components are perturbed simultaneously by random offsets from a qualified probability distribution. We present computational experiments on datasets with numbers of features ranging from a few dozen to thousands using three widely used classifiers as wrappers: nearest neighbor, decision tree, and linear support vector machine. We compare our methodology against the full set of features as well as a binary genetic algorithm and sequential FS methods, using cross-validated classification error rate and AUC as the performance criteria. Our results indicate that features selected by BSPSA compare favorably to alternative methods in general, and that BSPSA can yield superior feature sets for datasets with tens of thousands of features by examining an extremely small fraction of the solution space. We are not aware of any other wrapper FS methods that are computationally feasible with good convergence properties for such large datasets. Comment: This is the Istanbul Sehir University Technical Report #SHR-ISE-2016.01. A short version of this report has been accepted for publication at Pattern Recognition Letters.
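The iteration described in this abstract can be illustrated with a minimal sketch. It assumes a textbook SPSA update applied to continuous inclusion scores in [0, 1] that are rounded to a binary feature mask before each evaluation; it is not the authors' exact algorithm. The name `bspsa_sketch`, the constant `gain`, the perturbation size `c`, and the caller-supplied `loss` (standing in for a cross-validated wrapper error) are all hypothetical.

```python
import random

def bspsa_sketch(loss, n_features, n_iter=200, gain=0.05, c=0.05, seed=0):
    """Illustrative SPSA-style search over binary feature masks (assumed recipe)."""
    rng = random.Random(seed)
    clip = lambda x: min(1.0, max(0.0, x))
    to_mask = lambda scores: [s >= 0.5 for s in scores]

    w = [0.5] * n_features                 # continuous inclusion scores
    best_mask = to_mask(w)
    best_loss = loss(best_mask)

    for _ in range(n_iter):
        # Perturb every component at once with a random +/-1 offset.
        delta = [rng.choice((-1.0, 1.0)) for _ in range(n_features)]
        plus = to_mask([clip(x + c * d) for x, d in zip(w, delta)])
        minus = to_mask([clip(x - c * d) for x, d in zip(w, delta)])
        y_plus, y_minus = loss(plus), loss(minus)

        # Remember the best mask evaluated so far.
        for mask, y in ((plus, y_plus), (minus, y_minus)):
            if y < best_loss:
                best_mask, best_loss = mask, y

        # Pseudo-gradient from only two loss evaluations, then a clipped step.
        scale = (y_plus - y_minus) / (2.0 * c)
        w = [clip(x - gain * scale / d) for x, d in zip(w, delta)]

    return best_mask, best_loss
```

Each iteration costs two loss evaluations regardless of the number of features, which is why a search of this kind visits only a small fraction of the possible feature subsets.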

    Copasetic analysis: a framework for the blind analysis of microarray imagery

    From its conception, bioinformatics has been a multidisciplinary field which blends domain expert knowledge with new and existing processing techniques, all of which are focused on a common goal. Typically, these techniques have focused on the direct analysis of raw microarray image data. Unfortunately, this fails to utilise the image's full potential, and in practice the lab technician has to guide the analysis algorithms. This paper presents a dynamic framework that aims to automate the process of microarray image analysis using a variety of techniques. An overview of the entire framework process is presented, the robustness of which is challenged throughout with a selection of real examples containing varying degrees of noise. The results show the potential of the proposed framework in its ability to determine slide layout accurately and perform analysis without prior structural knowledge. The algorithm improves peak signal-to-noise ratio by approximately 1 to 3 dB compared to conventional processing techniques, such as those implemented in GenePix®, when used by a trained operator. As far as the authors are aware, this is the first time such a comprehensive framework concept has been directly applied to the area of microarray image analysis.
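For reference, the peak signal-to-noise ratio quoted in this abstract is conventionally computed from the mean squared error between a reference image and a processed image. The sketch below uses that standard definition. The flat-list image representation and the default `max_value` of 65535 (a 16-bit scan) are assumptions, not details taken from the paper.

```python
import math

def psnr(reference, test, max_value=65535.0):
    """PSNR in dB between two equal-length sequences of pixel values."""
    n = len(reference)
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / n
    if mse == 0:
        return math.inf                    # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

On this scale a 3 dB gain corresponds to roughly halving the mean squared error, since 10·log10(2) is about 3.01.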

    Machine learning methods for histopathological image analysis

    Abundant accumulation of digital histopathological images has led to increased demand for their analysis, such as computer-aided diagnosis using machine learning techniques. However, digital pathological images and related tasks have some issues to be considered. In this mini-review, we introduce the application of digital pathological image analysis using machine learning algorithms, address some problems specific to such analysis, and propose possible solutions. Comment: 23 pages, 4 figures

    Multiple Feature Fuzzy c-means Clustering Algorithm for Segmentation of Microarray Images

    Microarray technology allows the simultaneous monitoring of thousands of genes. Based on gene expression measurements, microarray technology has proven powerful in gene expression profiling for discovering new types of diseases and for predicting the type of a disease. Gridding, segmentation and intensity extraction are the three important steps in microarray image analysis. Clustering algorithms have been used for microarray image segmentation, with the advantage that they are not restricted to a particular spot shape or size. Instead of a single-feature clustering algorithm, this paper presents a multiple-feature clustering algorithm with three features for each pixel: pixel intensity, distance from the center of the spot, and median of the surrounding pixels. In traditional clustering algorithms, the number of clusters and the initial centroids are randomly selected or specified by the user. In this paper, a new algorithm based on empirical mode decomposition of the histogram of the input image generates the number of clusters and the initial centroids required for clustering. It overcomes the shortcomings of random initialization in traditional clustering and achieves high computational speed by reducing the number of iterations. The experimental results show that multiple-feature fuzzy c-means segments the microarray image more accurately than other algorithms.
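As a rough illustration of clustering pixels on several features at once, the sketch below runs the standard fuzzy c-means updates on three-feature vectors (intensity, distance from the spot center, local median). It assumes the features are already scaled to comparable ranges and takes the initial centroids as an argument. The empirical-mode-decomposition step that the paper uses to choose them is not reproduced here, and `fuzzy_c_means` and its parameters are hypothetical names.

```python
def fuzzy_c_means(points, centroids, m=2.0, n_iter=50, eps=1e-9):
    """Standard fuzzy c-means on feature vectors with given initial centroids."""
    centroids = [list(c) for c in centroids]
    n_feat = len(points[0])
    power = 2.0 / (m - 1.0)

    def memberships_for(cents):
        rows = []
        for p in points:
            # Euclidean distance to each centroid, floored to avoid division by zero.
            d = [max(eps, sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5)
                 for c in cents]
            rows.append([1.0 / sum((dj / dk) ** power for dk in d) for dj in d])
        return rows

    for _ in range(n_iter):
        u = memberships_for(centroids)
        for j in range(len(centroids)):
            # Each centroid is the membership-weighted mean of all points.
            weights = [row[j] ** m for row in u]
            total = sum(weights)
            centroids[j] = [
                sum(w * p[f] for w, p in zip(weights, points)) / total
                for f in range(n_feat)
            ]

    return memberships_for(centroids), centroids
```

A hard segmentation can then be read off by assigning each pixel to the cluster with its largest membership value.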

    Automatic gridding of DNA microarray images.

    Microarray (DNA chip) technology is having a significant impact on genomic studies. Many fields, including drug discovery and toxicological research, will certainly benefit from the use of DNA microarray technology. In agricultural and medical sciences, microarray analysis is replacing traditional biological assays based on gels, filters and purification columns with small glass chips containing tens of thousands of DNA and protein sequences. Microarrays function like biological microprocessors, enabling the rapid and quantitative analysis of gene expression patterns, patient genotypes, drug mechanisms, and disease onset and progression on a genomic scale. Image analysis and statistical analysis are two important aspects of microarray technology. Gridding is necessary to accurately identify the location of each spot while extracting spot intensities from the microarray images, and automating this procedure permits high-throughput analysis. Because of deficiencies in the equipment used to print the arrays, as well as rotations, misalignments, and heavy contamination with noise and artifacts, solving the grid segmentation problem in an automatic system is not trivial. The existing techniques for automatic grid segmentation cover only limited aspects of this challenging problem and require the user to specify or make assumptions about the spot size, the rows and columns in the grid, and boundary conditions. An automatic gridding and spot quantification technique is proposed, which takes a matrix of pixels or a microarray image as input, makes no assumptions about the spot size or the rows and columns in the grid, and is found to be effective on datasets from GEO, Stanford genomic laboratories and on images obtained from private repositories. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2004 .V53. Source: Masters Abstracts International, Volume: 43-03, page: 0891. Adviser: Luis Rueda.
Thesis (M.Sc.)--University of Windsor (Canada), 2004

    A New Method of Gridding for Spot Detection in Microarray Images

    A deoxyribonucleic acid (DNA) microarray is a collection of microscopic DNA spots attached to a solid surface, such as a glass, plastic or silicon chip, forming an array. The analysis of DNA microarray images allows the identification of gene expressions to draw biological conclusions for applications ranging from genetic profiling to diagnosis of cancer. DNA microarray image analysis includes three tasks: gridding, segmentation and intensity extraction. The gridding process is usually divided into two main steps: sub-gridding and spot detection. In this paper, a fully automatic approach to detect the location of spots is proposed. Each spot is associated with a gene and contains the pixels that indicate the level of expression of that particular gene. After gridding, the image is segmented using the fuzzy c-means clustering algorithm to separate the spots from the background pixels. The experimental results show that the method presented in this paper is accurate and fully automatic, requiring neither human intervention nor parameter presetting. Keywords: Microarray Image, Mathematical Morphology, Image Processing