1,576 research outputs found

    Accurate detection of dysmorphic nuclei using dynamic programming and supervised classification

    A vast array of pathologies is typified by the presence of nuclei with an abnormal morphology. Dysmorphic nuclear phenotypes feature dramatic size changes or foldings, but also entail much subtler deviations such as nuclear protrusions called blebs. Due to their unpredictable size, shape and intensity, dysmorphic nuclei are often not accurately detected in standard image analysis routines. To enable accurate detection of dysmorphic nuclei in confocal and widefield fluorescence microscopy images, we have developed an automated segmentation algorithm, called Blebbed Nuclei Detector (BleND), which relies on two-pass thresholding for initial nuclear contour detection, and an optimal path finding algorithm, based on dynamic programming, for refining these contours. Using a robust error metric, we show that our method matches manual segmentation in terms of precision and outperforms state-of-the-art nuclear segmentation methods. Its high performance allowed for building and integrating a robust classifier that recognizes dysmorphic nuclei with an accuracy above 95%. The combined segmentation-classification routine is bound to facilitate nucleus-based diagnostics and enable real-time recognition of dysmorphic nuclei in intelligent microscopy workflows.
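    The contour-refinement step can be illustrated with a generic dynamic-programming formulation: sample an edge-cost image along radial profiles around a rough nucleus centre, then pick the minimum-cost radius at each angle subject to a smoothness constraint. The sketch below (NumPy) is a minimal illustration of this general idea under assumed inputs, not the BleND implementation; the function name, cost definition and constraints are placeholders and the published method may differ.

```python
import numpy as np

def refine_contour_dp(cost, max_step=2):
    """Minimal dynamic-programming contour refinement sketch (illustrative only).

    cost     : (n_angles, n_radii) array of edge costs sampled along rays
               from a rough nucleus centre (low cost = likely boundary).
    max_step : maximum change in radius index between adjacent angles,
               which enforces a smooth contour.
    Returns the radius index chosen at each angle.
    """
    n_angles, n_radii = cost.shape
    acc = np.full((n_angles, n_radii), np.inf)   # accumulated path cost
    back = np.zeros((n_angles, n_radii), dtype=int)
    acc[0] = cost[0]

    # Forward pass: best predecessor within the allowed radius step.
    for a in range(1, n_angles):
        for r in range(n_radii):
            lo, hi = max(0, r - max_step), min(n_radii, r + max_step + 1)
            prev = acc[a - 1, lo:hi]
            j = int(np.argmin(prev))
            acc[a, r] = cost[a, r] + prev[j]
            back[a, r] = lo + j

    # Backward pass: trace the minimum-cost path (the closure constraint
    # between the last and first angle is ignored here for simplicity).
    path = np.empty(n_angles, dtype=int)
    path[-1] = int(np.argmin(acc[-1]))
    for a in range(n_angles - 1, 0, -1):
        path[a - 1] = back[a, path[a]]
    return path
```

    In practice the cost could be, for example, the negative gradient magnitude sampled along each ray, and a closed contour can be enforced by repeating the recursion for each possible starting radius.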

    Essential guidelines for computational method benchmarking

    In computational biology and other sciences, researchers are frequently faced with a choice between several computational methods for performing data analyses. Benchmarking studies aim to rigorously compare the performance of different methods using well-characterized benchmark datasets, to determine the strengths of each method or to provide recommendations regarding suitable choices of methods for an analysis. However, benchmarking studies must be carefully designed and implemented to provide accurate, unbiased, and informative results. Here, we summarize key practical guidelines and recommendations for performing high-quality benchmarking analyses, based on our experiences in computational biology.
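    As a concrete illustration of the kind of comparison such guidelines address, the sketch below evaluates several candidate methods on several benchmark datasets under one shared protocol and reports a common metric. The dataset and method names are placeholders invented for this example, not part of the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Placeholder benchmark datasets; a real benchmark would use
# well-characterized datasets with trusted ground truth.
datasets = {
    "dataset_a": make_classification(n_samples=300, n_features=20, random_state=0),
    "dataset_b": make_classification(n_samples=300, n_features=20, flip_y=0.2, random_state=1),
}

# Candidate methods, all evaluated under identical conditions.
methods = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Same protocol for every method/dataset pair: 5-fold cross-validated accuracy.
for ds_name, (X, y) in datasets.items():
    for m_name, model in methods.items():
        scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
        print(f"{ds_name:10s} {m_name:20s} {scores.mean():.3f} +/- {scores.std():.3f}")
```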

    Charting the single-cell landscape of colorectal cancer stem cell polarisation

    The colonic epithelium is regulated by cell-intrinsic and cell-extrinsic cues, both in homeostatic tissue and in colorectal cancer (CRC), where the tumour microenvironment closely interacts with the mutated epithelium. Our understanding of how these cues polarise colonic stem cell (CSC) states remains incomplete, and charting the interaction between intrinsic and stromal cues requires a systematic study that the literature still lacks. In this work I present my efforts towards computationally studying colonic stem cell polarisation at single-cell resolution. Leveraging the scalability of organoid models, my colleagues and I dissected the heterocellular CRC organoid system presented in Qin & Cardoso Rodriguez et al. using single-cell omic analyses, resolving complex interaction and polarisation processes. First, I identified bottlenecks in common mass cytometry (MC) analysis workflows that would benefit from either increased accessibility or automation, designing the CyGNAL pipeline and developing a cell-state classifier to address these points respectively. I then used single-cell RNA sequencing (scRNA-seq) data to reveal a shared landscape of CSC polarisation, wherein stromal cues polarise the epithelium towards a slow-cycling revival CSC (revCSC) state and oncogenic mutations trap cells in a hyper-proliferative CSC (proCSC) state. I then developed a method to visualise single-cell differentiation using a novel valley-ridge (VR) score, which can generate data-driven Waddington-like landscapes that recapitulate the differentiation dynamics of the colonic epithelium. Finally, I explored an approach for holistic inter- and intra-cellular communication analysis by incorporating literature information as a directed knowledge graph (KG), showing that low-dimensional representations of the graph retain biological information and that projected cellular profiles recapitulate their transcriptomes. These results reveal a polarisation landscape in which the CRC epithelium is trapped in a proCSC state refractory to stromal cues, and they broadly show the importance of joint wet- and dry-lab collaboration, which is central to targeting gaps in the method space and to generating a comprehensive analysis of heterocellular signalling in cancer.
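    One common way to build a data-driven, Waddington-like landscape from single-cell data is to estimate cell density over a low-dimensional embedding and treat the negative log-density as a pseudo-potential, so that highly populated cell states appear as valleys. The sketch below illustrates only that generic baseline; the valley-ridge (VR) score developed in this thesis is a different, more elaborate construction, and the function name here is purely illustrative.

```python
import numpy as np
from scipy.stats import gaussian_kde

def landscape_from_embedding(embedding, grid_size=100):
    """Pseudo-potential over a 2D embedding: U = -log(density).

    embedding : (n_cells, 2) array, e.g. a UMAP or diffusion-map layout.
    Returns grid coordinates and a potential surface in which densely
    populated cell states show up as valleys.
    """
    kde = gaussian_kde(embedding.T)                      # kernel density estimate
    xs = np.linspace(embedding[:, 0].min(), embedding[:, 0].max(), grid_size)
    ys = np.linspace(embedding[:, 1].min(), embedding[:, 1].max(), grid_size)
    xx, yy = np.meshgrid(xs, ys)
    density = kde(np.vstack([xx.ravel(), yy.ravel()])).reshape(grid_size, grid_size)
    potential = -np.log(density + 1e-12)                 # avoid log(0) in empty regions
    return xx, yy, potential
```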

    Inferring Phenotypic Properties from Single-Cell Characteristics

    Flow cytometry provides multi-dimensional data at the single-cell level. Such data contain information about the cellular heterogeneity of bulk samples, making it possible to correlate single-cell features with phenotypic properties of bulk tissues. Predicting phenotypes from single-cell measurements is a difficult challenge that has not been extensively studied. The 6th Dialogue for Reverse Engineering Assessments and Methods (DREAM6) invited the research community to develop solutions to a computational challenge: classifying acute myeloid leukemia (AML) positive patients and healthy donors using flow cytometry data. DREAM6 provided flow cytometry data for 359 normal and AML samples, and the class labels for half of the samples. Researchers were asked to predict the class labels of the remaining half. This paper describes one solution that was constructed by combining three algorithms: spanning-tree progression analysis of density-normalized events (SPADE), earth mover’s distance, and a nearest-neighbor classifier called Relief. This solution was among the top-performing methods that achieved 100% prediction accuracy.
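    The distance-plus-nearest-neighbour part of such a pipeline can be sketched in a few lines: compare two samples by summing per-marker earth mover's distances, then assign each unlabelled sample the label of its nearest labelled neighbour. This is a simplified stand-in under assumed inputs, not the published solution, which computed distances over SPADE-derived cell clusters rather than raw marker distributions; the function names are illustrative.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def sample_distance(sample_a, sample_b):
    """Sum of per-marker 1D earth mover's distances between two samples.

    sample_a, sample_b : (n_cells, n_markers) flow cytometry matrices.
    """
    return sum(
        wasserstein_distance(sample_a[:, m], sample_b[:, m])
        for m in range(sample_a.shape[1])
    )

def predict_labels(train_samples, train_labels, test_samples):
    """1-nearest-neighbour classification under the EMD-based sample distance."""
    predictions = []
    for test in test_samples:
        dists = [sample_distance(test, train) for train in train_samples]
        predictions.append(train_labels[int(np.argmin(dists))])
    return predictions
```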

    Structural exploration and inference of the network

    This dissertation consists of two parts. In the first part, a learning-based method for classifying online reviews is extended to achieve better classification accuracy. Automatic sentiment classification is becoming a popular and effective way to help online users and companies process and make sense of customer reviews. The method combines two recent developments. First, valence shifters and individual opinion words are combined as bigrams for use in an ordinal margin classifier. Second, relational information between unigrams, expressed in the form of a graph, is used to constrain the parameters of the classifier. By combining these two components, it is possible to extract more of the unstructured information present in the data than previous methods such as support vector machines and random forests, and hence to gain potentially better performance. Indeed, the results show a higher classification accuracy on real data with ground truth as well as on simulated data. The second part deals with graphical models. Gaussian graphical models are useful for exploring conditional dependence relationships between random variables through estimation of the inverse covariance matrix of a multivariate normal distribution. An estimator for such models, appropriate for the analysis of multiple graphs in two groups, is developed. Under this setting, inferring the networks separately ignores their common structure, while inferring them identically masks their disparity. A generalized method that estimates multiple partial correlation matrices through linear regressions is proposed. The method pursues sparsity within each matrix, similarity among matrices within each group, and disparity between matrices across groups. This is achieved with an l1 penalty and an l2 penalty for the pursuit of sparseness and clustering, together with a metric that learns the true heterogeneity through an optimization procedure. Theoretically, the asymptotic consistency of both the constrained l0 method and the proposed method in reconstructing the structures is shown. The superior performance of the proposed method is illustrated via a number of simulated networks. An application to polychromatic flow cytometry data sets for network inference under different sets of conditions is also included.
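    The regression-based route to sparse partial correlations can be illustrated with standard neighbourhood selection: regress each variable on all the others with an l1 penalty and read the nonzero coefficients as conditional-dependence edges. The sketch below shows only this single-group baseline, with an illustrative function name and an arbitrary penalty level; the dissertation's estimator additionally includes the penalties that share structure within groups and separate it between groups.

```python
import numpy as np
from sklearn.linear_model import Lasso

def neighbourhood_selection(X, alpha=0.1):
    """Sparse conditional-dependence graph via l1-penalized regressions.

    X : (n_samples, n_variables) data matrix.
    Returns a symmetric boolean adjacency matrix, using the usual "OR" rule:
    an edge is kept if either of the two regressions selects the pair.
    """
    n, p = X.shape
    coef = np.zeros((p, p))
    for j in range(p):
        others = np.delete(np.arange(p), j)          # regress X_j on all other variables
        model = Lasso(alpha=alpha).fit(X[:, others], X[:, j])
        coef[j, others] = model.coef_
    adjacency = (coef != 0) | (coef != 0).T
    np.fill_diagonal(adjacency, False)
    return adjacency
```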