30 research outputs found

    Two methods to estimate protein copy number from drosophila embryo image data

    Microscopy experiments that measure gene expression usually do so indirectly, by recording the fluorescence intensity of messenger RNA or proteins tagged with fluorescent agents, which yields only semi-quantitative data. However, quantitative measurements of mRNA or protein concentrations are imperative for developing predictive models of gene regulation networks. In the absence of experimental procedures designed to calibrate the conversion from intensity to concentration, a statistical model of the intensity values may be used to estimate this relationship. In this thesis, two estimators are developed for the relationship between intensity and protein copy number. The methods were applied to a data set of time-lapse protein expression data taken from embryos of Drosophila melanogaster. Both methods assume a linear relationship between intensity and concentration. When restricted to a specific protein, the methods produce very consistent results and are in general agreement with other methods applied to similar data. The software used to generate the estimates is implemented as a series of scripts in R. The data are all drawn from FlyEx.

    American College of Rheumatology Provisional Criteria for Clinically Relevant Improvement in Children and Adolescents With Childhood-Onset Systemic Lupus Erythematosus

    DOI: 10.1002/acr.23834. Arthritis Care & Research 71(5):579-59

    Analyzing Phenotypes in High-content Screening with Machine Learning

    High-content screening (HCS) uses computational analysis on large collections of unlabeled biological image data to make discoveries in cell biology. While biological images have historically been analyzed by inspection, advances in the automation of sample preparation and delivery, coupled with advances in microscopy and data storage, have resulted in a massive increase in both the number and resolution of images produced per study. These advances have facilitated genome-scale imaging studies, which are increasingly frequent. Although the sheer volume of data involved strongly favours computational analysis, many assays continue to be scored by eye. As a scoring method, visual inspection limits the rate at which data may be analyzed, increases cost, and decreases reproducibility. In this thesis, we propose computational methods for the analysis of HCS data. We begin with feature data derived from confocal fluorescence microscopy images of yeast cell populations. We use machine learning methods trained on a small labeled subset of that feature data to robustly score each population with respect to a DNA damage focus phenotype. We then introduce a method that uses deep autoencoders, trained with a label-free objective, to perform dimensionality reduction. This allows us to model the non-linear relations between features in high-dimensional data. The computational complexity of our approach scales linearly with the number of examples, allowing us to train on a much larger number of samples. Finally, we propose an outlier detection method that applies nonparametric Bayesian clustering to the low-dimensional data to discover populations whose distributions of cellular phenotypes differ significantly from wild-type. We evaluate our methods against comparable alternatives and show that they either meet or exceed the level of top performers. Ph.D.