Automated microscopy for high-content RNAi screening
Fluorescence microscopy is one of the most powerful tools to investigate complex cellular processes such as cell division, cell motility, or intracellular trafficking. The availability of RNA interference (RNAi) technology and automated microscopy has opened the possibility to perform cellular imaging in functional genomics and other large-scale applications. Although imaging often dramatically increases the content of a screening assay, it poses new challenges to achieve accurate quantitative annotation and therefore needs to be carefully adjusted to the specific needs of individual screening applications. In this review, we discuss principles of assay design, large-scale RNAi, microscope automation, and computational data analysis. We highlight strategies for imaging-based RNAi screening adapted to different library and assay designs.
CellCognition : time-resolved phenotype annotation in high-throughput live cell imaging
Author Posting. © The Authors, 2010. This is the author's version of the work. It is posted here by permission of Nature Publishing Group for personal use, not for redistribution. The definitive version was published in Nature Methods 7 (2010): 747-754, doi:10.1038/nmeth.1486.
Fluorescence time-lapse imaging has become a powerful tool to investigate complex
dynamic processes such as cell division or intracellular trafficking. Automated
microscopes generate time-resolved imaging data at high throughput, yet tools for
quantification of large-scale movie data are largely missing. Here, we present
CellCognition, a computational framework to annotate complex cellular dynamics.
We developed a machine learning method that combines state-of-the-art classification
with hidden Markov modeling for annotation of the progression through
morphologically distinct biological states. The incorporation of time information into
the annotation scheme was essential to suppress classification noise at state
transitions, and confusion between different functional states with similar
morphology. We demonstrate generic applicability in a set of different assays and
perturbation conditions, including a candidate-based RNAi screen for mitotic exit
regulators in human cells. CellCognition is published as open source software,
enabling live imaging-based screening with assays that directly score cellular
dynamics.
Work in the Gerlich
laboratory is supported by Swiss National Science Foundation (SNF) research grant
3100A0-114120, SNF ProDoc grant PDFMP3_124904, a European Young
Investigator (EURYI) award of the European Science Foundation, an EMBO YIP
fellowship, and an MBL Summer Research Fellowship to D.W.G., an ETH TH grant, a
grant by the UBS foundation, a Roche Ph.D. fellowship to M.H.A.S., and a Mueller
fellowship of the Molecular Life Sciences Ph.D. program Zurich to M.H. M.H. and
M.H.A.S. are fellows of the Zurich Ph.D. Program in Molecular Life Sciences. B.F.
was supported by the European Commission’s Seventh Framework Programme project
Cancer Pathways. Work in the Ellenberg laboratory is supported by a European
Commission grant within the Mitocheck consortium (LSHG-CT-2004-503464). Work
in the Peter laboratory is supported by the ETHZ, Oncosuisse, SystemsX.ch (LiverX)
and the SNF.
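The CellCognition abstract above describes combining per-frame classification with hidden Markov modeling to suppress classification noise at state transitions. A minimal sketch of that idea is Viterbi decoding over per-frame class probabilities. The function name `viterbi_smooth`, the two-state example, and the use of classifier probabilities as a stand-in for emission likelihoods are assumptions made for illustration; they are not taken from the CellCognition software.

```python
import math

def viterbi_smooth(frame_probs, trans, start):
    """Return the most likely state path for one cell track.

    frame_probs: per-frame lists of classifier probabilities, one per state
                 (assumed here to stand in for HMM emission likelihoods).
    trans:       trans[i][j] is the probability of moving from state i to j.
    start:       start[i] is the probability of starting in state i.
    """
    eps = 1e-12  # floor that avoids log(0) for forbidden transitions
    n_states = len(start)

    def log(p):
        return math.log(max(p, eps))

    # Best log-score of any path ending in each state at the first frame.
    score = [log(start[s]) + log(frame_probs[0][s]) for s in range(n_states)]
    back = []  # back[t][j] = best predecessor of state j at frame t + 1
    for probs in frame_probs[1:]:
        new_score, pointers = [], []
        for j in range(n_states):
            best_i = max(range(n_states),
                         key=lambda i: score[i] + log(trans[i][j]))
            new_score.append(score[best_i] + log(trans[best_i][j]) + log(probs[j]))
            pointers.append(best_i)
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Hypothetical track: state 0 = interphase, state 1 = mitotic.
# Frame 3 (index 2) is a noisy call that a per-frame argmax would label as 1.
probs = [[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.9, 0.1],
         [0.2, 0.8], [0.1, 0.9], [0.3, 0.7]]
sticky = [[0.9, 0.1], [0.1, 0.9]]  # states tend to persist between frames
print(viterbi_smooth(probs, sticky, [0.5, 0.5]))
```

With transitions that favor staying in the same state, the isolated misclassification is absorbed into the surrounding interphase run, which is the kind of noise suppression the abstract attributes to adding time information.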
Active Learning Strategies for Phenotypic Profiling of High-Content Screens
High-content screening is a powerful method to discover new drugs and carry out basic biological research. Increasingly,
high-content screens have come to rely on supervised machine learning (SML) to perform automatic phenotypic
classification as an essential step of the analysis. However, this comes at a cost, namely, the labeled examples required to
train the predictive model. Classification performance increases with the number of labeled examples, and because labeling
examples demands time from an expert, the training process represents a significant time investment. Active learning
strategies attempt to overcome this bottleneck by presenting the most relevant examples to the annotator, thereby
achieving high accuracy while minimizing the cost of obtaining labeled data. In this article, we investigate the impact of active
learning on single-cell–based phenotype recognition, using data from three large-scale RNA interference high-content
screens representing diverse phenotypic profiling problems. We consider several combinations of active learning strategies
and popular SML methods. Our results show that active learning significantly reduces the time cost and can be used to
reveal the same phenotypic targets identified using SML. We also identify combinations of active learning strategies and
SML methods that perform better than others on the phenotypic profiling problems we studied.
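The abstract above does not name the active learning strategies it compares. One common strategy is least-confidence uncertainty sampling, sketched below: the cells whose predicted class is least certain are shown to the annotator next. The function name `least_confident` and the example probabilities are hypothetical, and this strategy is not necessarily one of those evaluated in the article.

```python
def least_confident(probabilities, batch_size):
    """Pick the unlabeled cells whose top class probability is lowest.

    probabilities: one list of predicted class probabilities per unlabeled cell.
    batch_size:    number of cells to hand to the annotator.
    Returns the indices of the selected cells, least confident first.
    """
    confidence = sorted((max(p), i) for i, p in enumerate(probabilities))
    return [i for _, i in confidence[:batch_size]]

# Hypothetical classifier output for four unlabeled cells and three phenotypes.
predicted = [
    [0.95, 0.03, 0.02],  # confident call, little to gain from a label
    [0.40, 0.35, 0.25],
    [0.70, 0.20, 0.10],
    [0.34, 0.33, 0.33],  # nearly uniform, most informative to label
]
print(least_confident(predicted, 2))
```

In a full loop, the selected cells would be labeled, added to the training set, and the classifier retrained before the next selection round.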
Enhanced CellClassifier: a multi-class classification tool for microscopy images
BACKGROUND: Light microscopy is of central importance in cell biology. The recent introduction of automated high-content screening has extended this technology to automated experiments and large-scale perturbation assays. Nevertheless, evaluation of microscopy data continues to be a bottleneck in many projects. Currently, among open source software, CellProfiler and its extension Analyst are widely used in automated image processing. Although these tools have revolutionized image analysis in current biology, some routine and many advanced tasks are either not supported or require programming skills of the researcher. This represents a significant obstacle in many biology laboratories. RESULTS: We have developed a tool, Enhanced CellClassifier, which circumvents this obstacle. Enhanced CellClassifier starts from images analyzed by CellProfiler and allows multi-class classification using a Support Vector Machine algorithm. Objects can be selected for training by clicking directly "on the microscopy image" in several intuitive training modes. Many routine tasks, such as out-of-focus exclusion and well summary, are also supported. Classification results can be integrated with other object measurements, including inter-object relationships. This makes a detailed interpretation of the image possible, allowing the differentiation of many complex phenotypes. For the generation of the output, image, well and plate data are dynamically extracted and summarized. The output can be generated as graphs, Excel files, and images with projections of the final analysis, or exported as variables. CONCLUSION: Here we describe Enhanced CellClassifier, which allows multi-class classification and helps elucidate complex phenotypes. Our tool is designed for the biologist who wants both simple and flexible analysis of images without requiring programming skills. This should facilitate the implementation of automated high-content screening.
Data-analysis strategies for image-based cell profiling
Image-based cell profiling is a high-throughput strategy for the quantification of phenotypic differences among a variety of cell populations. It paves the way to studying biological systems on a large scale by using chemical and genetic perturbations. The general workflow for this technology involves image acquisition with high-throughput microscopy systems and subsequent image processing and analysis. Here, we introduce the steps required to create high-quality image-based (i.e., morphological) profiles from a collection of microscopy images. We recommend techniques that have proven useful in each stage of the data analysis process, on the basis of the experience of 20 laboratories worldwide that are refining their image-based cell-profiling methodologies in pursuit of biological discovery. The recommended techniques cover alternatives that may suit various biological goals, experimental designs, and laboratories' preferences.
Computational Analysis of RNAi Screening Data to Identify Host Factors Involved in Viral Infection and to Characterize Protein-Protein Interactions
RNA interference (RNAi) technology, which tracks the phenotype of cells after particular genes are silenced, has facilitated the study of gene function across a variety of treatments, cell lines and organisms. In this thesis, I describe two computational approaches developed to analyze the image data from two different RNAi screens. Firstly, I developed an alternative approach to detect host factors (human proteins) that support virus growth and replication in cells infected with the Hepatitis C virus (HCV). To identify the human proteins that are crucial for efficient viral infection, several RNAi experiments on virus-infected cells have been conducted. However, the target lists from different laboratories have shown little overlap. This inconsistency might be caused not only by experimental discrepancies but also by data-analysis possibilities that have not been fully explored. Observing only viral intensity readouts from the experiments might be insufficient. In this project, I describe a new computational approach to improve the reliability of host factor identification. Our approach is based on characterizing the clustering of infected cells. The idea is that viral infection spreads through cell-cell contacts, or is at least favored by the proximity of cells. Therefore, clustering of HCV-infected cells is observed as the infection spreads. We developed a cluster detection method based on a distance-based point pattern analysis (K-function) to identify gene knockdowns in which the clusters of HCV-infected cells were reduced. The approach significantly separated positive from negative controls, and the clustering score correlated well with the intensity readouts from the experimental screens. In a comparison with another clustering approach, the K-function method was superior to Quadrat analysis.
Statistical normalization approaches were used to identify protein targets from our clustering-based approach and the experimental screens. Integrating results from our clustering method, the intensity readout analysis and a secondary screen, we finally identified five promising host factors that are suitable candidate targets for drug therapy. Secondly, a machine learning-based approach was developed to characterize protein-protein interactions (PPIs) in a signaling network. The characterization of each PPI is fundamental to our understanding of the complex signaling system of a human cell. Experiments for PPI identification, such as yeast two-hybrid and FRET analysis, are resource-intensive; computational approaches that analyze large-scale RNAi knockdown screens have therefore become important for inferring functional similarities from the phenotypic similarities of the down-regulated proteins. However, these methods did not provide a more detailed characterization of the PPIs. In this project, I developed a new computational approach based on a machine learning technique that employs the mitotic phenotypes of an RNAi screen. It enables the identification of the nature of a PPI, i.e., whether it is activating or inhibiting. We established a systematic classification using Support Vector Machines (SVMs) that was based on the phenotypic descriptors and used it to classify the interactions that activate or inhibit signal transduction. The machines performed well when integrating different sets of published descriptors with our own descriptors, calculated from fractions of specific phenotypes, linear classification of phenotypes, and phenotypic distance to distinct proteins. A comprehensive model generated from the machines was used for further predictions.
We investigated the nature of pairs of interacting proteins and generated a consistency score that enhanced the precision of the classification results. We predicted the activating or inhibiting nature of 214 PPIs in signaling pathways with high confidence and were able to identify a new subgroup of chemokine receptors. These findings might facilitate an enhanced understanding of the cellular mechanisms during inflammation and immunologic responses. In summary, two computational approaches were developed to analyze the image data of the different RNAi screens: 1) a clustering-based approach was used to identify the host factors that are crucial for HCV infection; and 2) a machine learning-based approach with various descriptors was employed to characterize PPI activities. The results from the host factor analysis revealed novel target proteins that are involved in the spread of HCV. In addition, the results of the PPI characterization led to a better understanding of the signaling pathways. The two large-scale RNAi data sets were successfully analyzed with our established approaches to obtain new insights into virus biology and cellular signaling.
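The thesis abstract above scores the clustering of infected cells with a distance-based point pattern analysis (K-function). A minimal sketch of a naive Ripley's K estimate, without edge correction, is given below. The function name `ripley_k`, the coordinates, and the field size are hypothetical, and the thesis's actual scoring and normalization are not reproduced here.

```python
import math

def ripley_k(points, radius, area):
    """Naive Ripley's K estimate (no edge correction).

    points: (x, y) positions, e.g. centroids of infected cells in one image.
    radius: distance r at which K is evaluated, in the same units as points.
    area:   area of the observation window.
    K(r) = area / (n * (n - 1)) * number of ordered pairs closer than r.
    """
    n = len(points)
    pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= radius:
                pairs += 1
    return area * pairs / (n * (n - 1))

# Hypothetical 10 x 10 field: two tight foci versus evenly spread cells.
clustered = [(1.0, 1.0), (1.2, 1.0), (1.0, 1.2), (1.2, 1.2),
             (8.0, 8.0), (8.2, 8.0), (8.0, 8.2), (8.2, 8.2)]
spread = [(1, 1), (1, 5), (1, 9), (5, 1), (5, 9), (9, 1), (9, 5), (9, 9)]

r = 0.5
random_expectation = math.pi * r ** 2  # K(r) under complete spatial randomness
print(ripley_k(clustered, r, 100.0) > random_expectation)
print(ripley_k(spread, r, 100.0) < random_expectation)
```

Values of K(r) above the random expectation indicate clustering at scale r, so a knockdown that reduces foci of infected cells would be expected to lower the estimate toward or below that expectation.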
- …