274 research outputs found

    Role of deep learning in infant brain MRI analysis

    Deep learning algorithms, and in particular convolutional networks, have shown tremendous success in medical image analysis applications, though relatively few methods have been applied to infant MRI data due to numerous inherent challenges such as inhomogeneous tissue appearance across the image, considerable image intensity variability across the first year of life, and a low signal-to-noise setting. This paper presents methods addressing these challenges in two selected applications, specifically infant brain tissue segmentation at the isointense stage and presymptomatic disease prediction in neurodevelopmental disorders. Corresponding methods are reviewed and compared, and open issues are identified, namely low data size restrictions, class imbalance problems, and lack of interpretation of the resulting deep learning solutions. We discuss how existing solutions can be adapted to approach these issues as well as how generative models seem to be a particularly strong contender to address them.

    Improving Engagement Assessment by Model Individualization and Deep Learning

    This dissertation studies methods that improve engagement assessment for pilots. The major work addresses two challenging problems involved in the assessment: individual variation among pilots and the lack of labeled data for training assessment models. Task engagement is usually assessed by analyzing physiological measurements collected from subjects who are performing a task. However, physiological measurements such as electroencephalography (EEG) vary from subject to subject. An assessment model trained for one subject may not be applicable to other subjects. We proposed a dynamic classifier selection algorithm for model individualization and compared it with two other methods: baseline normalization and similarity-based model replacement. Experimental results showed that baseline normalization and dynamic classifier selection can significantly improve cross-subject engagement assessment. For complex tasks such as piloting an airplane, labeling engagement levels for pilots is challenging. Without enough labeled data, it is very difficult for traditional methods to train valid models for effective engagement assessment. This dissertation proposed to utilize deep learning models to address this challenge. Deep learning models are capable of learning valuable feature hierarchies by taking advantage of both labeled and unlabeled data. Our results showed that deep models are better tools for engagement assessment when label information is scarce. To further verify the power of deep learning techniques for scarce labeled data, we applied the deep learning algorithm to another small data set, the ADNI data set. The ADNI data set is a public data set containing MRI and PET scans of Alzheimer's Disease (AD) patients for AD diagnosis. We developed a robust deep learning system incorporating dropout and stability selection techniques to identify the different progression stages of AD patients. 
The experimental results showed that deep learning is very effective in AD diagnosis. In addition, we studied several imbalance learning techniques that are useful when data is highly unbalanced, i.e., when majority classes have many more training samples than minority classes. Conventional machine learning techniques usually tend to classify all data samples into majority classes and to perform poorly for minority classes. Unbalanced learning techniques can balance data sets before training and can improve learning performance.
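
The idea of balancing a data set before training can be illustrated with random oversampling, one common imbalance learning technique. The abstract does not say which techniques the dissertation studied, so the sketch below is only an assumed example; the function name `random_oversample` and its parameters are hypothetical.

```python
import random
from collections import Counter, defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    # Every class is grown to the size of the largest class.
    target = max(len(xs) for xs in by_class.values())

    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    xs, ys = random_oversample([1, 2, 3, 4, 5, 6], ["a", "a", "a", "a", "b", "b"])
    print(Counter(ys))
```

Oversampling should be applied to the training split only; duplicating samples before splitting would leak copies into the evaluation data.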

    Advances in Forensic Genetics

    The book has 25 articles about the status and new directions in forensic genetics. Approximately half of the articles are invited reviews, and the remaining articles deal with new forensic genetic methods. The articles cover aspects such as sampling DNA evidence at the scene of a crime; DNA transfer when handling evidence material and how to avoid DNA contamination of items, laboratory, etc.; identification of body fluids and tissues with RNA; forensic microbiome analysis with molecular biology methods as a supplement to the examination of human DNA; forensic DNA phenotyping for predicting visible traits such as eye, hair, and skin colour; new ancestry informative DNA markers for estimating ethnic origin; new genetic genealogy methods for identifying distant relatives that cannot be identified with conventional forensic DNA typing; sensitive DNA methods, including single-cell DNA analysis and other highly specialised and sensitive methods to examine ancient DNA from unidentified victims of war; forensic animal genetics; genetics of visible traits in dogs; statistical tools for interpreting forensic DNA analyses, including the most used IT tools for forensic STR-typing and DNA sequencing; haploid markers (Y-chromosome and mitochondrial DNA); inference of ethnic origin; a comprehensive logical framework for the interpretation of forensic genetic DNA data; and an overview of the ethical aspects of modern forensic genetics.

    acdc – Automated Contamination Detection and Confidence estimation for single-cell genome data

    Lux M, Krüger J, Rinke C, et al. acdc – Automated Contamination Detection and Confidence estimation for single-cell genome data. BMC Bioinformatics. 2016;17(1): 543. Background: A major obstacle in single-cell sequencing is sample contamination with foreign DNA. To guarantee clean genome assemblies and to prevent the introduction of contamination into public databases, considerable quality control efforts are put into post-sequencing analysis. Contamination screening generally relies on reference-based methods such as database alignment or marker gene search, which limits the set of detectable contaminants to organisms with closely related reference species. As genomic coverage in the tree of life is highly fragmented, there is an urgent need for a reference-free methodology for contaminant identification in sequence data. Results: We present acdc, a tool specifically developed to aid the quality control process of genomic sequence data. By combining supervised and unsupervised methods, it reliably detects both known and de novo contaminants. First, 16S rRNA gene prediction and the inclusion of ultrafast exact alignment techniques allow sequence classification using existing knowledge from databases. Second, reference-free inspection is enabled by the use of state-of-the-art machine learning techniques that include fast, non-linear dimensionality reduction of oligonucleotide signatures and subsequent clustering algorithms that automatically estimate the number of clusters. The latter also enables the removal of any contaminant, yielding a clean sample. Furthermore, given the data complexity and the ill-posedness of clustering, acdc employs bootstrapping techniques to provide statistically profound confidence values. Tested on a large number of samples from diverse sequencing projects, our software is able to quickly and accurately identify contamination. Results are displayed in an interactive user interface. 
acdc can be run from the web as well as a dedicated command line application, which allows easy integration into large sequencing project analysis workflows. Conclusions: acdc can reliably detect contamination in single-cell genome data. In addition to database-driven detection, it complements existing tools by its unsupervised techniques, which allow for the detection of de novo contaminants. Our contribution has the potential to drastically reduce the amount of resources put into these processes, particularly in the context of limited availability of reference species. As single-cell genome data continues to grow rapidly, acdc adds to the toolkit of crucial quality assurance tools.
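
An oligonucleotide signature of the kind mentioned above can be pictured as a vector of normalised k-mer frequencies computed per sequence. The sketch below is a minimal illustration and not acdc's implementation; the function name `kmer_signature`, the default `k=4`, and the handling of ambiguous bases are all assumptions.

```python
from itertools import product


def kmer_signature(sequence, k=4):
    """Return the normalised k-mer frequencies of a DNA sequence as a list,
    ordered lexicographically over the alphabet A, C, G, T."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)

    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        # Windows containing N or other ambiguity codes are skipped.
        if window in counts:
            counts[window] += 1

    total = sum(counts.values())
    # An all-zero vector is returned when no valid window exists.
    return [counts[m] / total if total else 0.0 for m in kmers]


if __name__ == "__main__":
    # Toy sequence, invented for illustration only.
    signature = kmer_signature("ACGTACGTNNACGT")
    print(len(signature))
```

Vectors like this one (4^k entries per sequence) are the kind of input that a dimensionality reduction and clustering step would then operate on.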

    Advanced extravehicular activity systems requirements definition study. Phase 2: Extravehicular activity at a lunar base

    The focus is on Extravehicular Activity (EVA) systems requirements definition for an advanced space mission: remote-from-main base EVA on the Moon. The lunar environment, biomedical considerations, appropriate hardware design criteria, hardware and interface requirements, and key technical issues for advanced lunar EVA were examined. Six remote EVA scenarios (three nominal operations and three contingency situations) were developed in considerable detail

    Efficient Grouping Methods for the Annotation and Sorting of Single Cells

    Lux M. Efficient Grouping Methods for the Annotation and Sorting of Single Cells. Bielefeld: Universität Bielefeld; 2018. Insights into large-scale biological data require computational methods which reliably and efficiently recognize latent structures and patterns. In many cases, it is necessary to find homogeneous subgroups of the data in order to solve complex problems and to enable the discovery of novel knowledge. Here, clustering and classification techniques are commonly employed in all fields of research. Confounding factors often complicate data analysis and require a thorough choice of methods and parameters. This thesis is focused on methods around single-cell research: I developed, evaluated, compared and adapted grouping methods for open problems from three different technologies. First, metagenomics is typically confronted with the problem of detecting clusters representing involved species in a given sample (binning). Although powerful technologies exist for the identification of known taxa, de novo binning is still in its infancy. In this context, I evaluated optimal choices of techniques and parameters regarding the integration of modern machine learning methods, such as dimensionality reduction and clustering, resulting in an automated binning pipeline. Second, in single-cell sequencing, a major problem is given by sample contamination with foreign genomic material. From a computational point of view, in both metagenomics and single-cell genome assemblies, genomes can be represented as clusters. Contrary to metagenomics, the clustering task for single cells is a fundamentally different one. Here, I developed a methodology to automatically detect contamination and estimate confidences in single-cell genome assemblies. A third challenge can be seen in the field of flow cytometry. Here, the precise identification of cell populations in a sample is crucial and requires manual, tedious, and possibly biased cell annotation. 
Automated methods exist; however, they require difficult fine-tuning of hyper-parameters to obtain the best results. To overcome this limitation, I developed a semi-supervised tool for cell population identification with few, very robust parameters that is fast, accurate and interpretable at the same time.

    Validation of resistome signatures through the application of a machine learning prediction algorithm on metagenomic data

    Integrated Master's dissertation in Veterinary Medicine, scientific area of Animal Health. ABSTRACT - Metagenomic data has been increasingly used in antimicrobial resistance (AMR) studies, but there is still a need for accurate and reliable methods for predicting the relative attribution of AMR determinants to different animal reservoirs. AMR data availability has increased exponentially over the past few years, as has global awareness of the threat that AMR poses to public health, often known as the silent pandemic. This has led to an upsurge in interest in applying machine learning to AMR data. In this study, shotgun sequences were used from fecal samples of pigs, broilers, turkeys, and veal calves, previously collected during national cross-sectional studies across Europe. The data used in this study corresponded to these samples and their associated relative abundance of AMR determinants. A random forest (RF) model was developed to investigate the relative attribution of AMR determinants to those different reservoirs. Additionally, a descriptive analysis was made to further investigate the 15 most important variables for the RF model. A principal component analysis (PCA) and all-subsets regression were performed to identify reservoir-specific AMR determinants. Ultimately, the reservoir-specific AMR determinants identified here were compared with the resistome signatures identified in a previous study. The results demonstrated that the RF model successfully classified resistomes into corresponding reservoir classes, with high accuracy and reliability. The RF model had more difficulty differentiating pig from veal and broiler from turkey, indicating the similarity of resistome composition between each of these two species. The analyses validated several AMR determinants as resistome signatures of specific animal reservoirs, such as tet(40) and sul2 of veal, tet(Q), mef(A) and cfxA2 of veal and pig, blaTEM-126 of broiler, and tet(A) of broiler and turkey. 
This study describes a reliable and accurate method for the relative attribution of AMR determinants to different animal reservoirs using metagenomic data. Such results are essential for effective surveillance and control of AMR in animal and human populations. SUMMARY - Validation of resistome signatures through the application of a machine learning prediction algorithm on metagenomic data - Metagenomic data have been increasingly used in antimicrobial resistance studies, but there is still a shortage of accurate and reliable methods for predicting the relative attribution of resistance genes to different animal species. The availability of antimicrobial resistance data has increased exponentially in recent years, as has global awareness of the threat that resistance poses to public health, commonly known as the silent pandemic. This has led to increased interest in applying machine learning methods to these data. In this study, shotgun sequences were used from fecal samples of pigs, broilers, turkeys, and veal calves, previously collected during national studies across Europe. The data used in this study corresponded to these samples and their associated FPKM values. A random forest (RF) model was developed to predict the relative attribution of resistance genes to these different species. In addition, a descriptive analysis was carried out to further investigate the 15 most important variables for the RF model. A principal component analysis (PCA) and all-subsets regression were performed to identify species-specific resistance genes. Finally, the specific genes identified here were compared with the resistome signatures identified in a previous study. Our results demonstrated that the model successfully classified the samples into the corresponding species classes, with high accuracy and reliability. 
The model had more difficulty differentiating pig from veal, and broiler from turkey, indicating a similarity of resistome composition between each of these two species. This analysis validated several genes as resistome signatures of specific animals, such as tet(40) and sul2 of veal calves, tet(Q), mef(A) and cfxA2 of veal calves and pigs, blaTEM-126 of broilers, and tet(A) of broilers and turkeys. This study describes a reliable and accurate method for the relative attribution of resistance genes to different animal reservoirs using metagenomic data. These results are essential for the surveillance and control of antimicrobial resistance in animal and human populations.
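
Which reservoir classes a classifier confuses with each other is typically read off a confusion matrix of true versus predicted labels. The sketch below shows one way to tally such pairs; it is not the study's code, the helper names `confusion_counts` and `most_confused_pairs` are hypothetical, and the labels in the usage example are invented and do not reproduce any result from the dissertation.

```python
from collections import Counter


def confusion_counts(true_labels, predicted_labels):
    """Count how often each (true, predicted) label pair occurs."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    return Counter(zip(true_labels, predicted_labels))


def most_confused_pairs(counts):
    """Return the misclassified (true, predicted) pairs with their counts,
    most frequent first."""
    errors = [(pair, n) for pair, n in counts.items() if pair[0] != pair[1]]
    return sorted(errors, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Invented labels for illustration only; not data from the study.
    true = ["pig", "pig", "veal", "broiler", "turkey", "turkey"]
    pred = ["pig", "veal", "veal", "broiler", "broiler", "broiler"]
    for pair, n in most_confused_pairs(confusion_counts(true, pred)):
        print(pair, n)
```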