54 research outputs found

    A fully automatic gridding method for cDNA microarray images

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Processing cDNA microarray images is a crucial step in gene expression analysis, since any errors in early stages affect subsequent steps, leading to possibly erroneous biological conclusions. When processing the underlying images, accurately separating the sub-grids and spots is extremely important for subsequent steps that include segmentation, quantification, normalization and clustering.</p> <p>Results</p> <p>We propose a parameterless and fully automatic approach that first detects the sub-grids given the entire microarray image, and then detects the locations of the spots in each sub-grid. The approach, first, detects and corrects rotations in the images by applying an affine transformation, followed by a polynomial-time optimal multi-level thresholding algorithm used to find the positions of the sub-grids in the image and the positions of the spots in each sub-grid. Additionally, a new validity index is proposed in order to find the correct number of sub-grids in the image, and the correct number of spots in each sub-grid. Moreover, a refinement procedure is used to correct possible misalignments and increase the accuracy of the method.</p> <p>Conclusions</p> <p>Extensive experiments on real-life microarray images and a comparison to other methods show that the proposed method performs these tasks fully automatically and with a very high degree of accuracy. Moreover, unlike previous methods, the proposed approach can be used in various type of microarray images with different resolutions and spot sizes and does not need any parameter to be adjusted.</p

    M3G: Maximum Margin Microarray Gridding

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Complementary DNA (cDNA) microarrays are a well established technology for studying gene expression. A microarray image is obtained by laser scanning a hybridized cDNA microarray, which consists of thousands of spots representing chains of cDNA sequences, arranged in a two-dimensional array. The separation of the spots into distinct cells is widely known as microarray image gridding.</p> <p>Methods</p> <p>In this paper we propose M<sup>3</sup>G, a novel method for automatic gridding of cDNA microarray images based on the maximization of the margin between the rows and the columns of the spots. Initially the microarray image rotation is estimated and then a pre-processing algorithm is applied for a rough spot detection. In order to diminish the effect of artefacts, only a subset of the detected spots is selected by matching the distribution of the spot sizes to the normal distribution. Then, a set of grid lines is placed on the image in order to separate each pair of consecutive rows and columns of the selected spots. The optimal positioning of the lines is determined by maximizing the margin between these rows and columns by using a maximum margin linear classifier, effectively facilitating the localization of the spots.</p> <p>Results</p> <p>The experimental evaluation was based on a reference set of microarray images containing more than two million spots in total. The results show that M<sup>3</sup>G outperforms state of the art methods, demonstrating robustness in the presence of noise and artefacts. More than 98% of the spots reside completely inside their respective grid cells, whereas the mean distance between the spot center and the grid cell center is 1.2 pixels.</p> <p>Conclusions</p> <p>The proposed method performs highly accurate gridding in the presence of noise and artefacts, while taking into account the input image rotation. Thus, it provides the potential of achieving perfect gridding for the vast majority of the spots.</p

    Bioinformatics framework for genotyping microarray data analysis

    Get PDF
    Functional genomics is a flourishing science enabled by recent technological breakthroughs in high-throughput instrumentation and microarray data analysis. Genotyping microarrays establish the genotypes of DNA sequences containing single nucleotide polymorphisms (SNPs), and can help biologists probe the functions of different genes and/or construct complex gene interaction networks. The enormous amount of data from these experiments makes it infeasible to perform manual processing to obtain accurate and reliable results in daily routines. Advanced algorithms as well as an integrated software toolkit are needed to help perform reliable and fast data analysis. The author developed a MatlabTM based software package, called TIMDA (a Toolkit for Integrated Genotyping Microarray Data Analysis), for fully automatic, accurate and reliable genotyping microarray data analysis. The author also developed new algorithms for image processing and genotype-calling. The modular design of TIMDA allows satisfactory extensibility and maintainability. TIMDA is open source (URL: http://timda.SF.net and can be easily customized by users to meet their particular needs. The quality and reproducibility of results in image processing and genotype-calling and the ease of customization indicate that TIMDA is a useful package for genomics research

    Automatic gridding of DNA microarray images.

    Get PDF
    Microarray (DNA chip) technology is having a significant impact on genomic studies. Many fields, including drug discovery and toxicological research, will certainly benefit from the use of DNA microarray technology. Microarray analysis is replacing traditional biological assays based on gels, filters and purification columns with small glass chips containing tens of thousands of DNA and protein sequences in agricultural and medical sciences. Microarray functions like biological microprocessors, enabling the rapid and quantitative analysis of gene expression patterns, patient genotypes, drug mechanisms and disease onset and progression on a genomic scale. Image analysis and statistical analysis are two important aspects of microarray technology. Gridding is necessary to accurately identify the location of each of the spots while extracting spot intensities from the microarray images and automating this procedure permits high-throughput analysis. Due to the deficiencies of the equipment that is used to print the arrays, rotations, misalignments, high contaminations with noise and artifacts, solving the grid segmentation problem in an automatic system is not trivial. The existing techniques to solve the automatic grid segmentation problem cover only limited aspect of this challenging problem and requires the user to specify or make assumptions about the spotsize, rows and columns in the grid and boundary conditions. An automatic gridding and spot quantification technique is proposed, which takes a matrix of pixels or a microarray image as input and makes no assumptions about the spotsize, rows and columns in the grid and is found to effective on datasets from GEO, Stanford genomic laboratories and on images obtained from private repositories. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2004 .V53. Source: Masters Abstracts International, Volume: 43-03, page: 0891. Adviser: Luis Rueda. Thesis (M.Sc.)--University of Windsor (Canada), 2004

    MASQOT: a method for cDNA microarray spot quality control

    Get PDF
    BACKGROUND: cDNA microarray technology has emerged as a major player in the parallel detection of biomolecules, but still suffers from fundamental technical problems. Identifying and removing unreliable data is crucial to prevent the risk of receiving illusive analysis results. Visual assessment of spot quality is still a common procedure, despite the time-consuming work of manually inspecting spots in the range of hundreds of thousands or more. RESULTS: A novel methodology for cDNA microarray spot quality control is outlined. Multivariate discriminant analysis was used to assess spot quality based on existing and novel descriptors. The presented methodology displays high reproducibility and was found superior in identifying unreliable data compared to other evaluated methodologies. CONCLUSION: The proposed methodology for cDNA microarray spot quality control generates non-discrete values of spot quality which can be utilized as weights in subsequent analysis procedures as well as to discard spots of undesired quality using the suggested threshold values. The MASQOT approach provides a consistent assessment of spot quality and can be considered an alternative to the labor-intensive manual quality assessment process

    Novel pattern recognition approaches for transcriptomics data analysis

    Get PDF
    We proposed a family of methods for transcriptomics and genomics data analysis based on multi-level thresholding approach, such as OMTG for sub-grid and spot detection in DNA microarrays, and OMT for detecting significant regions based on next generation sequencing data. Extensive experiments on real-life datasets and a comparison to other methods show that the proposed methods perform these tasks fully automatically and with a very high degree of accuracy. Moreover, unlike previous methods, the proposed approaches can be used in various types of transcriptome analysis problems such as microarray image gridding with different resolutions and spot sizes as well as finding the interacting regions of DNA with a protein of interest using ChIP-Seq data without any need for parameter adjustment. We also developed constrained multi-level thresholding (CMT), an algorithm used to detect enriched regions on ChIP-Seq data with the ability of targeting regions within a specific range. We show that CMT has higher accuracy in detecting enriched regions (peaks) by objectively assessing its performance relative to other previously proposed peak finders. This is shown by testing three algorithms on the well-known FoxA1 Data set, four transcription factors (with a total of six antibodies) for Drosophila melanogaster and the H3K4ac antibody dataset. Finally, we propose a tree-based approach that conducts gene selection and builds a classifier simultaneously, in order to select the minimal number of genes that would reliably predict a given breast cancer subtype. Our results support that this modified approach to gene selection yields a small subset of genes that can predict subtypes with greater than 95%overall accuracy. In addition to providing a valuable list of targets for diagnostic purposes, the gene ontologies of the selected genes suggest that these methods have isolated a number of potential genes involved in breast cancer biology, etiology and potentially novel therapeutics

    Segmentation and intensity estimation for microarray images with saturated pixels

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Microarray image analysis processes scanned digital images of hybridized arrays to produce the input spot-level data for downstream analysis, so it can have a potentially large impact on those and subsequent analysis. Signal saturation is an optical effect that occurs when some pixel values for highly expressed genes or peptides exceed the upper detection threshold of the scanner software (2<sup>16 </sup>- 1 = 65, 535 for 16-bit images). In practice, spots with a sizable number of saturated pixels are often flagged and discarded. Alternatively, the saturated values are used without adjustments for estimating spot intensities. The resulting expression data tend to be biased downwards and can distort high-level analysis that relies on these data. Hence, it is crucial to effectively correct for signal saturation.</p> <p>Results</p> <p>We developed a flexible mixture model-based segmentation and spot intensity estimation procedure that accounts for saturated pixels by incorporating a censored component in the mixture model. As demonstrated with biological data and simulation, our method extends the dynamic range of expression data beyond the saturation threshold and is effective in correcting saturation-induced bias when the lost information is not tremendous. We further illustrate the impact of image processing on downstream classification, showing that the proposed method can increase diagnostic accuracy using data from a lymphoma cancer diagnosis study.</p> <p>Conclusions</p> <p>The presented method adjusts for signal saturation at the segmentation stage that identifies a pixel as part of the foreground, background or other. The cluster membership of a pixel can be altered versus treating saturated values as truly observed. Thus, the resulting spot intensity estimates may be more accurate than those obtained from existing methods that correct for saturation based on already segmented data. As a model-based segmentation method, our procedure is able to identify inner holes, fuzzy edges and blank spots that are common in microarray images. The approach is independent of microarray platform and applicable to both single- and dual-channel microarrays.</p

    Gene Expression Analysis Methods on Microarray Data a A Review

    Get PDF
    In recent years a new type of experiments are changing the way that biologists and other specialists analyze many problems. These are called high throughput experiments and the main difference with those that were performed some years ago is mainly in the quantity of the data obtained from them. Thanks to the technology known generically as microarrays, it is possible to study nowadays in a single experiment the behavior of all the genes of an organism under different conditions. The data generated by these experiments may consist from thousands to millions of variables and they pose many challenges to the scientists who have to analyze them. Many of these are of statistical nature and will be the center of this review. There are many types of microarrays which have been developed to answer different biological questions and some of them will be explained later. For the sake of simplicity we start with the most well known ones: expression microarrays

    Microarray image processing : a novel neural network framework

    Get PDF
    Due to the vast success of bioengineering techniques, a series of large-scale analysis tools has been developed to discover the functional organization of cells. Among them, cDNA microarray has emerged as a powerful technology that enables biologists to cDNA microarray technology has enabled biologists to study thousands of genes simultaneously within an entire organism, and thus obtain a better understanding of the gene interaction and regulation mechanisms involved. Although microarray technology has been developed so as to offer high tolerances, there exists high signal irregularity through the surface of the microarray image. The imperfection in the microarray image generation process causes noises of many types, which contaminate the resulting image. These errors and noises will propagate down through, and can significantly affect, all subsequent processing and analysis. Therefore, to realize the potential of such technology it is crucial to obtain high quality image data that would indeed reflect the underlying biology in the samples. One of the key steps in extracting information from a microarray image is segmentation: identifying which pixels within an image represent which gene. This area of spotted microarray image analysis has received relatively little attention relative to the advances in proceeding analysis stages. But, the lack of advanced image analysis, including the segmentation, results in sub-optimal data being used in all downstream analysis methods. Although there is recently much research on microarray image analysis with many methods have been proposed, some methods produce better results than others. In general, the most effective approaches require considerable run time (processing) power to process an entire image. Furthermore, there has been little progress on developing sufficiently fast yet efficient and effective algorithms the segmentation of the microarray image by using a highly sophisticated framework such as Cellular Neural Networks (CNNs). It is, therefore, the aim of this thesis to investigate and develop novel methods processing microarray images. The goal is to produce results that outperform the currently available approaches in terms of PSNR, k-means and ICC measurements.EThOS - Electronic Theses Online ServiceAleppo University, SyriaGBUnited Kingdo
    corecore