77 research outputs found

    Using iterative cluster merging with improved gap statistics to perform online phenotype discovery in the context of high-throughput RNAi screens

    Background: The recent emergence of high-throughput automated image acquisition technologies has forever changed how cell biologists collect and analyze data. Historically, the interpretation of cellular phenotypes in different experimental conditions has depended on the expert opinions of well-trained biologists. Such qualitative analysis is particularly effective in detecting subtle, but important, deviations in phenotypes. However, while the rapid and continuing development of automated microscope-based technologies now facilitates the acquisition of trillions of cells in thousands of diverse experimental conditions, such as in the context of RNA interference (RNAi) or small-molecule screens, the massive size of these datasets precludes human analysis. Thus, the development of automated methods that identify novel and biologically relevant phenotypes online is one of the major challenges in high-throughput image-based screening. Ideally, phenotype discovery methods should be designed to utilize prior/existing information and tackle three challenging tasks: restoring pre-defined biologically meaningful phenotypes, differentiating novel phenotypes from known ones, and distinguishing novel phenotypes from each other. Arbitrarily extracted information causes biased analysis, while combining the complete existing datasets with each new image is intractable in high-throughput screens. Results: Here we present the design and implementation of a novel and robust online phenotype discovery method with broad applicability that can be used in diverse experimental contexts, especially high-throughput RNAi screens. This method features phenotype modelling and iterative cluster merging using improved gap statistics. A Gaussian Mixture Model (GMM) is employed to estimate the distribution of each existing phenotype and is then used as the reference distribution in gap statistics. This method is broadly applicable to many types of image-based datasets derived from a wide spectrum of experimental conditions and is suitable for adaptively processing new images that are continuously added to existing datasets. Validations were carried out on different datasets, including a published RNAi screen using Drosophila embryos [Additional files 1, 2], a dataset for cell cycle phase identification using HeLa cells [Additional files 1, 3, 4] and a synthetic dataset using polygons; our method tackled the three aforementioned tasks effectively, with accuracy in the range of 85%–90%. When our method is implemented in the context of a Drosophila genome-scale RNAi image-based screen of cultured cells aimed at identifying the contribution of individual genes towards the regulation of cell shape, it efficiently discovers meaningful new phenotypes and provides novel biological insight. We also propose a two-step procedure to modify the novelty detection method based on one-class SVM, so that it can be used for online phenotype discovery. Under different conditions, we compared the SVM-based method with our method using various datasets, and our method consistently outperformed the SVM-based method in at least two of the three tasks by 2% to 5%. These results demonstrate that our method can be used to better identify novel phenotypes in image-based datasets from a wide range of conditions and organisms. Conclusion: We demonstrate that our method can detect various novel phenotypes effectively in complex datasets. Experimental results also validate that our method performs consistently under different orders of image input, variations of starting conditions including the number and composition of existing phenotypes, and datasets from different screens. Based on our findings, the proposed method is suitable for online phenotype discovery in diverse high-throughput image-based genetic and chemical screens.

    Time-Resolved Quantification of Centrosomes by Automated Image Analysis Suggests Limiting Component to Set Centrosome Size in C. Elegans Embryos

    The centrosome is a dynamic organelle found in all animal cells that serves as a microtubule organizing center during cell division. Most of the centrosome components have been identified by genetic screens over the last decade, but little is known about how these components interact with each other to form a functional centrosome. Towards a better understanding of the molecular organization of the centrosome, we investigated the mechanism that regulates the size of the centrosome in the early C. elegans embryo. For this, we monitored fluorescently labeled centrosomes in living embryos and developed a suite of image analysis algorithms to quantify the centrosomes in the resulting 3D time-lapse images. In particular, we developed a novel algorithm involving a two-stage linking process for tracking centrosomes, which is a multi-object tracking task. This fully automated analysis pipeline enabled us to acquire time-resolved data of centrosome growth in a large number of embryos and could detect subtle phenotypes that were missed by previous assays based on manual image analysis. In a first set of experiments, we quantified centrosome size over development in wild-type embryos and made three essential observations. First, centrosome volume scales proportionately with cell volume. Second, beginning at the 4-cell stage, when cells are small, centrosome size plateaus during the cell cycle. Third, the total centrosome volume the embryo gives rise to in any one cell stage is approximately constant. Based on our observations, we propose a ‘limiting component’ model in which centrosome size is limited by the amounts of maternally derived centrosome components. In a second set of experiments, we tested our hypothesis by varying cell size, centrosome number and microtubule-mediated pulling forces. We then manipulated the amounts of several centrosomal proteins and found that the conserved centriolar and pericentriolar material protein SPD-2 is one such component that determines centrosome size.

    Novel image markers for non-small cell lung cancer classification and survival prediction

    BACKGROUND: Non-small cell lung cancer (NSCLC), the most common type of lung cancer, is a serious disease that causes death in both men and women. Computer-aided diagnosis and survival prediction of NSCLC are of great importance in assisting diagnosis and personalizing therapy planning for lung cancer patients. RESULTS: In this paper we propose an integrated framework for NSCLC computer-aided diagnosis and survival analysis using novel image markers. The entire biomedical imaging informatics framework consists of cell detection, segmentation, classification, discovery of image markers, and survival analysis. A robust seed detection-guided cell segmentation algorithm is proposed to accurately segment each individual cell in digital images. Based on the cell segmentation results, an extensive set of cellular morphological features is extracted using efficient feature descriptors. Next, eight different classification techniques that can handle high-dimensional data are evaluated and compared for computer-aided diagnosis. The results show that random forest and AdaBoost offer the best classification performance for NSCLC. Finally, a Cox proportional hazards model is fitted by component-wise likelihood-based boosting. Significant image markers are discovered using bootstrap analysis, and the survival prediction performance of the model is also evaluated. CONCLUSIONS: The proposed model has been applied to a lung cancer dataset that contains 122 cases with complete clinical information. The classification performance exhibits high correlations between the discovered image markers and the subtypes of NSCLC. The survival analysis demonstrates strong prediction power of the statistical model built from the discovered image markers.

    Advances in quantitative microscopy

    Microscopy allows us to peer into the complex, deeply shrouded world in which the cells of our body grow and thrive. With the emergence of automated digital microscopes and software for analysing and processing the large numbers of images that they produce, quantitative microscopy approaches now allow us to answer ever larger and more complex biological questions. In this thesis I explore two trends. Firstly, I use quantitative microscopy to perform unbiased screens; the advances made here include developing strategies to handle imaging data captured from physiological models, and unsupervised analysis of screening data to derive unbiased biological insights. Secondly, I develop software for analysing live cell imaging data, which can now be captured at greater rates than ever before, and use it to help answer key questions about how cells make the decision to arrest or proliferate in response to DNA damage. Together, this thesis represents a view of the current state of the art in high-throughput quantitative microscopy and details where the field is heading as machine learning approaches become ever more sophisticated.

    LEVEL SETS AND COMPUTATIONAL INTELLIGENCE ALGORITHMS FOR MEDICAL IMAGE ANALYSIS IN THE E-MEDICUS SYSTEM

    This work implements methods for analyzing and segmenting medical images using topological and statistical algorithms and artificial intelligence techniques. The solution presents the architecture of a system for collecting and analyzing data. Algorithms based on the level set method (LSM) were developed for piecewise constant image segmentation. These algorithms are needed to identify an arbitrary number of phases in the segmentation problem. Image segmentation refers to the process of partitioning a digital image into multiple regions; it is typically used to locate objects and boundaries in images. An algorithm for analyzing medical images using an MLP neural network is also presented.

    Automatic Segmentation of Cells of Different Types in Fluorescence Microscopy Images

    Recognition of different cell compartments, types of cells, and their interactions is a critical aspect of quantitative cell biology. It provides valuable insight into cellular and subcellular interactions and the mechanisms of biological processes such as cancer cell dissemination, organ development and wound healing. Quantitative analysis of cell images is also the mainstay of numerous clinical diagnostic and grading procedures, for example in cancer, immunological, infectious, heart and lung disease. Automating the quantification of cellular biological samples requires segmenting different cellular and sub-cellular structures in microscopy images. However, automating this task has proven to be non-trivial: it requires solving multi-class image segmentation tasks that are challenging owing to the high similarity of objects from different classes and to irregularly shaped structures. This thesis focuses on the development and application of probabilistic graphical models to multi-class cell segmentation. Graphical models can improve segmentation accuracy through their ability to exploit prior knowledge and model inter-class dependencies. Directed acyclic graphs, such as trees, have been widely used to model top-down statistical dependencies as a prior for improved image segmentation. However, trees can capture only a few inter-class constraints. To overcome this limitation, this thesis proposes polytree graphical models that capture label proximity relations more naturally than tree-based approaches. Polytrees can effectively impose prior knowledge on the inclusion of different classes by capturing both same-level and across-level dependencies. A novel recursive mechanism based on two-pass message passing is developed to efficiently calculate closed-form posteriors of graph nodes on polytrees. 
Furthermore, since an accurate and sufficiently large ground truth is not always available for training segmentation algorithms, a weakly supervised framework is developed that employs polytrees for multi-class segmentation and reduces the need for training by modeling prior knowledge during segmentation. After a hierarchical graph is generated for the superpixels in the image, node labels are inferred through a novel efficient message-passing algorithm and the model parameters are optimized with Expectation Maximization (EM). Evaluation on the segmentation of simulated data and multiple publicly available fluorescence microscopy datasets indicates that the proposed method outperforms state-of-the-art methods. The proposed method has also been assessed in predicting possible segmentation error and has been shown to outperform trees. This can pave the way to calculating uncertainty measures on the resulting segmentation and guiding subsequent segmentation refinement, which can be useful in the development of an interactive segmentation framework.