22 research outputs found

    Communicating subcellular distributions.

    No full text
    To build more accurate models of cells and tissues, the ability to incorporate information on the distributions of proteins (and other macromolecules) will become increasingly important. This review describes current progress towards determining and representing protein subcellular patterns so that the information can be used as part of systems biology efforts. Approaches to decomposing an image of the subcellular pattern of a protein give critical information about the fraction of that protein in each of a number of fundamental patterns (e.g., organelles). Methods for learning generative models from images provide a means of capturing the essential properties and variation in those properties of cell shape and organelle patterns. The combination of models of fundamental patterns and vectors specifying the fraction of a protein in each of them provides a much better means of communicating subcellular patterns than the descriptive terms that are currently used. Communicating information about subcellular patterns is important not only for systems biology simulations but also for representing results from microscopy experiments, including high content screening and imaging flow cytometry, in a transportable and generalizable manner.

    A new era in bioimage informatics.

    No full text
    Bioimage informatics arose from efforts to automate pathology and cytology tasks (Eaves, 1967). With few exceptions, much of the software developed during these early days, whether in academic or commercial institutions, was proprietary. The primary paradigm was production of hand-tuned engineered systems that could reproduce human performance, and visualization was emphasized for interpreting results or providing assistance to clinicians (Bartels and Wied, 1977; Kaman et al., 1984; van Driel-Kulker and Ploem, 1982). The computational resources available at the time were frequently limiting. Essentially, no successful commercial systems came from these efforts for many years, until the US Food and Drug Administration’s approval of automated Pap smear analysis in the mid-1990s (Patten et al., 1996).

    CellOrganizer: Image-derived models of subcellular organization and protein distribution.

    No full text
    This chapter describes approaches for learning models of subcellular organization from images. The primary utility of these models is expected to be from incorporation into complex simulations of cell behaviors. Most current cell simulations do not consider spatial organization of proteins at all, or treat each organelle type as a single, idealized compartment. The ability to build generative models for all proteins in a proteome and use them for spatially accurate simulations is expected to improve the accuracy of models of cell behaviors. A second use, of potentially equal importance, is expected to be in testing and comparing software for analyzing cell images. The complexity and sophistication of algorithms used in cell-image-based screens and assays (variously referred to as high-content screening, high-content analysis, or high-throughput microscopy) are continuously increasing, and generative models can be used to produce images for testing these algorithms in which the expected answer is known.

    An active role for machine learning in drug development.

    No full text
    Because of the complexity of biological systems, cutting-edge machine-learning methods will be critical for future drug development. In particular, machine-vision methods to extract detailed information from imaging assays and active-learning methods to guide experimentation will be required to overcome the dimensionality problem in drug development.

    A generative model of microtubule distributions, and indirect estimation of its parameters from fluorescence microscopy images.

    No full text
    The microtubule network plays critical roles in many cellular processes, and quantitative models of how its organization varies across cell types and conditions are required for understanding those roles and as input to cell simulations. High-throughput image acquisition technologies are potentially valuable for this purpose, but do not provide sufficient resolution for current analysis methods that rely on tracing of individual microtubules. We describe a parametric conditional model of microtubule distribution that can generate a microtubule network in intact cells using a persistent random walk approach. The model parameters are physically meaningful as they directly describe the spatial distribution of microtubules and include the number of microtubules as well as the mean of the length distribution. We also present an indirect method for estimating the parameters of the model from three-dimensional fluorescence microscope images of cells that relies on comparing acquired images with simulated images generated from the model. Our results show that our method can reasonably recover parameters for a given query image, and we present the distributions of parameters estimated by our method for a collection of HeLa cell images. (c) 2010 International Society for Advancement of Cytometry.
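    To illustrate what a persistent random walk looks like in general, the following is a minimal sketch and not the model from the paper: it works in two dimensions, ignores cell and nuclear boundaries, and uses a fixed step length. The function name and the parameters `step`, `max_turn`, and `seed` are assumptions introduced here for illustration. Each step keeps most of the previous heading and perturbs it by a small random angle, which is what makes the walk "persistent".

```python
import math
import random


def persistent_random_walk(n_steps, step=1.0, max_turn=0.3, seed=0):
    """Return a list of (x, y) points traced by a 2D persistent random walk.

    Illustrative only: the heading changes by at most `max_turn` radians
    per step, so the path tends to keep its direction.
    """
    rng = random.Random(seed)  # local generator so results are repeatable
    x, y = 0.0, 0.0
    heading = rng.uniform(0.0, 2.0 * math.pi)  # random initial direction
    path = [(x, y)]
    for _ in range(n_steps):
        # Small random turn relative to the previous heading.
        heading += rng.uniform(-max_turn, max_turn)
        x += step * math.cos(heading)
        y += step * math.sin(heading)
        path.append((x, y))
    return path


path = persistent_random_walk(50)
```

    A smaller `max_turn` gives straighter paths, and a value of `math.pi` reduces the walk to an ordinary uncorrelated random walk.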

    Automated estimation of microtubule model parameters from 3-D live cell microscopy images.

    No full text
    While basic principles of microtubule organization are well understood, much remains to be learned about the extent and significance of variation in that organization among cell types and conditions. Large numbers of images of microtubule distributions for many cell types can be readily obtained by high-throughput fluorescence microscopy, but direct estimation of the parameters underlying the organization is problematic because it is difficult to resolve individual microtubules present at the microtubule-organizing center or at regions of high crossover. Previously, we developed an indirect, generative model-based approach that can estimate such spatial distribution parameters as the number and mean length of microtubules. In order to validate this approach, we have applied it to 3D images of NIH 3T3 cells expressing fluorescently-tagged tubulin in the presence and absence of the microtubule depolymerizing drug nocodazole. We describe here the first application of our inverse modeling approach to live cell images and demonstrate that it yields estimates consistent with expectations.

    Quantifying the distribution of probes between subcellular locations using unsupervised pattern unmixing.

    No full text
    MOTIVATION: Proteins exhibit complex subcellular distributions, which may include localizing in more than one organelle and varying in location depending on the cell physiology. Estimating the amount of protein distributed in each subcellular location is essential for quantitative understanding and modeling of protein dynamics and how they affect cell behaviors. We have previously described automated methods using fluorescent microscope images to determine the fractions of protein fluorescence in various subcellular locations when the basic locations in which a protein can be present are known. As this set of basic locations may be unknown (especially for studies on a proteome-wide scale), we here describe unsupervised methods to identify the fundamental patterns from images of mixed patterns and estimate the fractional composition of them. METHODS: We developed two approaches to the problem, both based on identifying types of objects present in images and representing patterns by frequencies of those object types. One is a basis pursuit method (which is based on a linear mixture model), and the other is based on latent Dirichlet allocation (LDA). For testing both approaches, we used images previously acquired for testing supervised unmixing methods. These images were of cells labeled with various combinations of two organelle-specific probes that had the same fluorescent properties to simulate mixed patterns of subcellular location. RESULTS: We achieved 0.80 and 0.91 correlation between estimated and underlying fractions of the two probes (fundamental patterns) with basis pursuit and LDA approaches, respectively, indicating that our methods can unmix the complex subcellular distribution with reasonably high accuracy. AVAILABILITY: http://murphylab.web.cmu.edu/software.
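    The linear mixture model mentioned above can be illustrated with a small sketch. This is not the basis pursuit or LDA method from the paper: it assumes the two fundamental patterns are already known (the simpler, supervised setting) and that each pattern is represented as a vector of object-type frequencies. The function name and the frequency values are hypothetical. Under the model, a mixed pattern is `alpha * pattern_a + (1 - alpha) * pattern_b`, and the least-squares estimate of `alpha` has a closed form.

```python
def mixing_fraction(mixed, pattern_a, pattern_b):
    """Least-squares estimate of alpha in mixed ~ alpha*a + (1 - alpha)*b.

    The estimate is clipped to [0, 1] because it represents a fraction.
    """
    diff = [a - b for a, b in zip(pattern_a, pattern_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("patterns are identical; the fraction is not identifiable")
    # Project (mixed - b) onto the direction (a - b).
    num = sum((m - b) * d for m, b, d in zip(mixed, pattern_b, diff))
    return min(1.0, max(0.0, num / denom))


# Hypothetical object-type frequencies for two fundamental patterns.
pattern_a = [0.7, 0.2, 0.1]
pattern_b = [0.1, 0.3, 0.6]

# Synthetic mixed pattern with 30% of pattern A and 70% of pattern B.
mixed = [0.3 * a + 0.7 * b for a, b in zip(pattern_a, pattern_b)]
alpha = mixing_fraction(mixed, pattern_a, pattern_b)
```

    With noise-free synthetic input the estimate recovers the fraction used to build the mixture; the unsupervised setting described in the abstract is harder because the fundamental patterns themselves must also be learned.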

    Discriminative motif finding for predicting protein subcellular localization.

    No full text
    Many methods have been described to predict the subcellular location of proteins from sequence information. However, most of these methods either rely on global sequence properties or use a set of known protein targeting motifs to predict protein localization. Here, we develop and test a novel method that identifies potential targeting motifs using a discriminative approach based on hidden Markov models (discriminative HMMs). These models search for motifs that are present in a compartment but absent in other, nearby, compartments by utilizing a hierarchical structure that mimics the protein sorting mechanism. We show that both discriminative motif finding and the hierarchical structure improve localization prediction on a benchmark data set of yeast proteins. The motifs identified can be mapped to known targeting motifs and they are more conserved than the average protein sequence. Using our motif-based predictions, we can identify potential annotation errors in public databases for the location of some of the proteins. A software implementation and the data set described in this paper are available from http://murphylab.web.cmu.edu/software/2009_TCBB_motif/.

    Deciding when to stop: efficient experimentation to learn to predict drug-target interactions.

    No full text
    BACKGROUND: Active learning is a powerful tool for guiding an experimentation process. Instead of doing all possible experiments in a given domain, active learning can be used to pick the experiments that will add the most knowledge to the current model. Especially for drug discovery and development, active learning has been shown to reduce the number of experiments needed to obtain high-confidence predictions. However, in practice, it is crucial to have a method to evaluate the quality of the current predictions and decide when to stop the experimentation process. Only by applying reliable stopping criteria to active learning can time and costs in the experimental process actually be saved. RESULTS: We compute active learning traces on simulated drug-target matrices in order to determine a regression model for the accuracy of the active learner. By analyzing the performance of the regression model on simulated data, we design stopping criteria for previously unseen experimental matrices. We demonstrate on four previously characterized drug effect data sets that applying the stopping criteria can result in up to 40% savings of the total experiments for highly accurate predictions. CONCLUSIONS: We show that active learning accuracy can be predicted using simulated data and results in substantial savings in the number of experiments required to make accurate drug-target predictions.

    A Graphical Model to Determine the Subcellular Protein Location in Artificial Tissues.

    No full text
    Location proteomics is concerned with the systematic analysis of the subcellular location of proteins. In order to perform comprehensive analysis of all protein location patterns, automated methods are needed. With the goal of extending automated subcellular location pattern analysis methods to high resolution images of tissues, 3D confocal microscope images of polarized CaCo2 cells immunostained for various proteins were collected. A three-color staining protocol was developed that permits parallel imaging of proteins of interest as well as DNA and the actin cytoskeleton. The collection is composed of 11 to 21 images for each of the 9 proteins that depict major subcellular patterns. A classifier was trained to recognize the subcellular location pattern of segmented cells with an accuracy of 89.2%. Using the Prior Updating method allowed improvement of this accuracy to 99.6%. This study demonstrates the benefit of using a graphical model approach for improving the pattern classification in tissue images.