    Approximate learning of high dimensional Bayesian network structures via pruning of Candidate Parent Sets.

    Score-based algorithms that learn Bayesian Network (BN) structures provide solutions ranging from different levels of approximate learning to exact learning. Approximate solutions exist because exact learning is generally not applicable to networks of moderate or higher complexity. In general, approximate solutions tend to sacrifice accuracy for speed, where the aim is to minimise the loss in accuracy and maximise the gain in speed. While some approximate algorithms are optimised to handle thousands of variables, these algorithms may still be unable to learn such high dimensional structures. Some of the most efficient score-based algorithms cast the structure learning problem as a combinatorial optimisation of candidate parent sets. This paper explores a strategy towards pruning the size of candidate parent sets, aimed at high dimensionality problems. The results illustrate how different levels of pruning affect the learning speed relative to the loss in accuracy in terms of model fitting, and show that aggressive pruning may be required to produce approximate solutions for high complexity problems.
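
    The idea of pruning candidate parent sets, mentioned in the abstract above, can be illustrated with a minimal sketch. The abstract does not say which pruning rule the paper uses, so the top-k rule, the function name `prune_candidate_parent_sets`, and the `keep` parameter below are all assumptions made for illustration only.

```python
from typing import Dict, FrozenSet, List, Tuple

ParentSet = FrozenSet[str]
Scored = Tuple[ParentSet, float]  # (parent set, local score; higher is better)


def prune_candidate_parent_sets(
    candidates: Dict[str, List[Scored]], keep: int
) -> Dict[str, List[Scored]]:
    """Keep the `keep` highest-scoring candidate parent sets for each node.

    Hypothetical rule, not taken from the paper: the empty parent set is
    always retained so that every node keeps one option that cannot
    introduce a cycle.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    pruned: Dict[str, List[Scored]] = {}
    for node, scored in candidates.items():
        # Rank candidates from best to worst local score.
        ranked = sorted(scored, key=lambda item: item[1], reverse=True)
        kept = ranked[:keep]
        # Re-add the empty parent set if the cut-off removed it.
        if not any(len(parents) == 0 for parents, _ in kept):
            kept.extend(item for item in ranked[keep:] if len(item[0]) == 0)
        pruned[node] = kept
    return pruned


# Toy input: four scored candidate parent sets for node "C".
example = {
    "C": [
        (frozenset(), -120.0),
        (frozenset({"A"}), -100.0),
        (frozenset({"B"}), -105.0),
        (frozenset({"A", "B"}), -98.0),
    ]
}
pruned = prune_candidate_parent_sets(example, keep=2)
```

    A smaller `keep` shrinks the search space handed to the combinatorial optimiser, at the risk of discarding the parent set that the unpruned search would have chosen, which is the speed versus accuracy trade-off the abstract describes.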

    Large-scale empirical validation of Bayesian Network structure learning algorithms with noisy data.

    Numerous Bayesian Network (BN) structure learning algorithms have been proposed in the literature over the past few decades. Each publication makes an empirical or theoretical case for the algorithm proposed in that publication and results across studies are often inconsistent in their claims about which algorithm is ‘best’. This is partly because there is no agreed evaluation approach to determine their effectiveness. Moreover, each algorithm is based on a set of assumptions, such as complete data and causal sufficiency, and tends to be evaluated with data that conforms to these assumptions, however unrealistic these assumptions may be in the real world. As a result, it is widely accepted that synthetic performance overestimates real performance, although to what degree this may happen remains unknown. This paper investigates the performance of 15 state-of-the-art, well-established, or recent promising structure learning algorithms. We propose a methodology that applies the algorithms to data that incorporates synthetic noise, in an effort to better understand the performance of structure learning algorithms when applied to real data. Each algorithm is tested over multiple case studies, sample sizes, types of noise, and assessed with multiple evaluation criteria. This work involved learning approximately 10,000 graphs with a total structure learning runtime of seven months. In investigating the impact of data noise, we provide the first large scale empirical comparison of BN structure learning algorithms under different assumptions of data noise. The results suggest that traditional synthetic performance may overestimate real-world performance by anywhere between 10% and more than 50%. They also show that while score-based learning is generally superior to constraint-based learning, a higher fitting score does not necessarily imply a more accurate causal graph. The comparisons extend to other outcomes of interest, such as runtime, reliability, and resilience to noise, assessed over both small and large networks, and with both limited and big data. To facilitate comparisons with future studies, we have made all data, raw results, graphs and BN models freely available online.

    New approaches to genetic therapies for cystic fibrosis

    Gene therapy offers great promise for cystic fibrosis (CF), a promise that has never quite been fulfilled because of the challenges of delivering sufficient amounts of the CFTR gene and sustaining its expression in the lungs long enough to have any effect. Initial trials explored both viral and non-viral vectors but failed to achieve a significant breakthrough. However, in recent years, new opportunities have emerged that exploit our increased knowledge and understanding of the biology of CF and the airway epithelium. New technologies include new viral and non-viral vector approaches to delivery, but also alternative nucleic acid technologies including oligonucleotides and siRNA approaches for gene silencing and gene splicing, described in this review, as presented at the 2019 annual European CF Society Basic Science meeting (Dubrovnik, Croatia). We also briefly discuss other emerging technologies including mRNA and CRISPR gene editing that are advancing rapidly. The future prospects for genetic therapies for CF are now diverse and probably more promising than at any time since the discovery of the CF gene.

    Automated annotation and visualisation of high-resolution spatial proteomic mass spectrometry imaging data using HIT-MAP.

    Spatial proteomics has the potential to significantly advance our understanding of biology, physiology and medicine. Matrix-assisted laser desorption/ionisation mass spectrometry imaging (MALDI-MSI) is a powerful tool in the spatial proteomics field, enabling direct detection and registration of protein abundance and distribution across tissues. MALDI-MSI preserves spatial distribution and histology, allowing unbiased analysis of complex, heterogeneous tissues. However, MALDI-MSI faces the challenge of simultaneous peptide quantification and identification. To overcome this, we develop and validate HIT-MAP (High-resolution Informatics Toolbox in MALDI-MSI Proteomics), an open-source bioinformatics workflow using peptide mass fingerprint analysis and a dual scoring system to computationally assign peptide and protein annotations to high mass resolution MSI datasets and generate customisable spatial distribution maps. HIT-MAP will be a valuable resource for the spatial proteomics community for analysing newly generated and retrospective datasets, enabling robust peptide and protein annotation and visualisation in a wide array of normal and disease contexts.

    Identification of Individual Glandular Regions Using LCWT and Machine Learning Techniques

    A new approach for the segmentation of gland units in histological images is proposed with the aim of improving prostate cancer diagnosis. Clustering methods on several colour spaces are applied to each sample in order to generate a binary mask of the different tissue components. From the mask of lumen candidates, the Locally Constrained Watershed Transform (LCWT) is applied as a novel gland segmentation technique not previously used on this type of image. A total of 500 random gland candidates, both benign and pathological, are selected to evaluate the LCWT technique, yielding a Dice coefficient of 0.85. Several shape and textural descriptors, in combination with contextual features and a fractal analysis, are applied in a novel way on different colour spaces, yielding a total of 297 features to discern between artefacts and true glands. The most relevant features are then selected by an exhaustive statistical analysis in terms of independence between variables and dependence with the class. In total, 3,200 artefacts, 3,195 benign glands and 3,000 pathological glands are obtained from a data set of 1,468 images at 10x magnification. A careful strategy of data partition is implemented to robustly address the classification problem between artefacts and glands. Both linear and non-linear approaches are considered using machine learning techniques based on Support Vector Machines (SVM) and feedforward neural networks, achieving values of sensitivity, specificity and accuracy of 0.92, 0.97 and 0.95, respectively.
    This work has been funded by the Ministry of Economy, Industry and Competitiveness under the SICAP project (DPI2016-77869-C2-1-R). The work of Adrián Colomer has been supported by the Spanish FPI Grant BES-2014-067889. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research.
    García-Pardo, JG.; Colomer, A.; Naranjo Ornedo, V.; Peñaranda, F.; Sales, MÁ. (2018). Identification of Individual Glandular Regions Using LCWT and Machine Learning Techniques. In Intelligent Data Engineering and Automated Learning – IDEAL 2018. Springer. 642-650. https://doi.org/10.1007/978-3-030-03493-1_67
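
    The evaluation measures quoted in the glandular-region abstract above (Dice coefficient, sensitivity, specificity and accuracy) follow standard definitions, sketched below. The function names and the toy inputs are illustrative assumptions; they are not the authors' code, and the numbers are not the paper's data.

```python
from typing import Dict, Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly; treat that case as Dice = 1.
    return 1.0 if total == 0 else 2.0 * intersection / total


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Sensitivity, specificity and accuracy from confusion-matrix counts.

    Assumes at least one positive and one negative sample, so that no
    denominator is zero.
    """
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Toy inputs, chosen only to exercise the functions.
toy_dice = dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0])
toy_metrics = classification_metrics(tp=9, fp=1, tn=9, fn=1)
```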

    Parental ethnic-racial socialization practices and the construction of children of color's ethnic-racial identity: A research synthesis and meta-analysis

    Parental ethnic-racial socialization practices help shape the development of a strong ethnic-racial identity in children of color, which in turn contributes positively to mental health, social, and academic outcomes. Although there is a wide body of literature on the relationship between these meta-constructs, this research has not been systematically examined to either (a) determine the degree to which associations between parental ethnic-racial socialization approaches and ethnic-racial identity dimensions hold actual practical significance for parents of color or (b) estimate how these associations vary as a function of theorized mitigating factors. In response, this meta-analytic study investigated the strength of the association between parental ethnic-racial socialization practices and the construction of ethnic-racial identity, as well as factors that moderated the strength and direction of this association. Findings revealed that across 68 studies, there was a significant and substantive relationship between the global constructs of ethnic-racial socialization practices and ethnic-racial identity. Most individual practices of ethnic-racial socialization were positively associated with global ethnic-racial identity, and the strongest relationship was with pride and heritage socialization. Parental ethnic-racial socialization was also positively associated with all ethnic-racial identity dimensions tested except for public regard, with which it was negatively associated. Developmental findings showed that although ethnic-racial socialization positively predicted identity at every level of schooling, the strongest relationship was at the high school level. Finally, the association between ethnic-racial socialization and ethnic-racial identity was positive for African Americans, Latinxs, and Asian Americans alike, but the strongest relationship was among Latinxs. Implications for parenting practices and future research are discussed.

    Deep Convolutional Neural Networks for Breast Cancer Histology Image Analysis

    Breast cancer is one of the main causes of cancer death worldwide. Early diagnosis significantly increases the chances of correct treatment and survival, but this process is tedious and often leads to disagreement between pathologists. Computer-aided diagnosis systems have shown potential for improving diagnostic accuracy. In this work, we develop a computational approach based on deep convolutional neural networks for breast cancer histology image classification. The hematoxylin and eosin stained breast histology microscopy image dataset is provided as a part of the ICIAR 2018 Grand Challenge on Breast Cancer Histology Images. Our approach utilizes several deep neural network architectures and a gradient boosted trees classifier. For the 4-class classification task, we report 87.2% accuracy. For the 2-class classification task to detect carcinomas, we report 93.8% accuracy, AUC 97.3%, and sensitivity/specificity 96.5/88.0% at the high-sensitivity operating point. To our knowledge, this approach outperforms other common methods in automated histopathological image classification. The source code for our approach is made publicly available at https://github.com/alexander-rakhlin/ICIAR2018
    Comment: 8 pages, 4 figures

    Small RNA Profile in Moso Bamboo Root and Leaf Obtained by High Definition Adapters

    Moso bamboo (Phyllostachys heterocycla cv. pubescens L.) is an economically important fast-growing tree. In order to gain a better understanding of gene expression regulation in this important species, we used next generation sequencing to profile small RNAs in leaves and roots of young seedlings. Since standard kits to produce cDNA of small RNAs are biased for certain small RNAs, we used High Definition adapters that reduce ligation bias. We identified and experimentally validated five new microRNAs and a few other small non-coding RNAs that were not microRNAs. The biological implications of microRNA expression levels and the targets of microRNAs are discussed.

    Ori-Finder: A web-based system for finding oriCs in unannotated bacterial genomes

    Background: Chromosomal replication is the central event in the bacterial cell cycle. Identification of replication origins (oriCs) is necessary for almost all newly sequenced bacterial genomes. Given the increasing pace of genome sequencing, the currently available software for predicting oriCs, however, still leaves much to be desired. Therefore, the increasing availability of genome sequences calls for improved software to identify oriCs in newly sequenced and unannotated bacterial genomes. Results: We have developed Ori-Finder, an online system for finding oriCs in bacterial genomes based on an integrated method comprising the analysis of base composition asymmetry using the Z-curve method, distribution of DnaA boxes, and the occurrence of genes frequently close to oriCs. The program can also deal with unannotated genome sequences by integrating the gene-finding program ZCURVE 1.02. Output of the predicted results is exported to an HTML report, which offers convenient views on the results in both graphical and tabular formats. Conclusion: A web-based system to predict replication origins of bacterial genomes has been presented here. Based on this system, oriC regions have been predicted for the bacterial genomes currently available in GenBank. It is hoped that Ori-Finder will become a useful tool for the identification and analysis of oriCs in both bacterial and archaeal genomes.
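
    The Z-curve method named in the Ori-Finder abstract above represents a DNA sequence as cumulative base-composition differences. The sketch below computes the three standard Z-curve components for each prefix of a sequence. It illustrates the transform only, is not Ori-Finder's implementation, and does not cover the DnaA-box or gene-position analyses; the function name `z_curve` and the choice to skip ambiguous symbols are assumptions of this sketch.

```python
from typing import List, Tuple


def z_curve(sequence: str) -> List[Tuple[int, int, int]]:
    """Return the cumulative Z-curve points (x_n, y_n, z_n) of a DNA sequence.

    With A_n, C_n, G_n, T_n the base counts in the first n valid bases:
      x_n = (A_n + G_n) - (C_n + T_n)   purine vs pyrimidine
      y_n = (A_n + C_n) - (G_n + T_n)   amino vs keto
      z_n = (A_n + T_n) - (C_n + G_n)   weak vs strong hydrogen bonding
    """
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    points: List[Tuple[int, int, int]] = []
    for base in sequence.upper():
        if base not in counts:
            continue  # assumption: ambiguous symbols such as N are skipped
        counts[base] += 1
        a, c, g, t = (counts[b] for b in "ACGT")
        points.append(((a + g) - (c + t), (a + c) - (g + t), (a + t) - (c + g)))
    return points


# Toy sequence, used only to exercise the function.
toy_points = z_curve("AGCT")
```

    Asymmetry in base composition between the two replication strands appears as a change of trend in these cumulative components, which is why curves of this kind are useful when looking for candidate origin regions.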