
    APPLICATIONS OF MACHINE LEARNING IN MICROBIAL FORENSICS

    Microbial ecosystems are complex, with hundreds of members interacting with each other and the environment. The intricate and hidden behaviors underlying these interactions make research questions challenging, but they can be better understood through machine learning. However, most machine learning used in microbiome work is a black-box form of investigation: accurate predictions can be made, but the logic driving each prediction is hidden behind nontransparent layers of complexity. Accordingly, the goal of this dissertation is to provide an interpretable and in-depth machine learning approach to investigate microbial biogeography and to use microorganisms as novel tools to detect geospatial location and object provenance (previous known origin). These contributions are followed by a framework that allows extraction of interpretable metrics and actionable insights from microbiome-based machine learning models. The first part of this work provides an overview of machine learning in the context of microbial ecology, human microbiome studies and environmental monitoring, outlining common practice and shortcomings. The second part presents a field study demonstrating how machine learning can be used to characterize patterns in microbial biogeography globally, using microbes from ports located around the world. The third part studies the persistence and stability of natural microbial communities from the environment that have colonized objects (vessels) and stay attached as they travel through the water. Finally, the last part provides a robust framework for investigating the microbiome. This framework provides a reasonable understanding of the data being used in microbiome-based machine learning and allows researchers to better understand and interpret results.
Together, these extensive experiments support an understanding of how to carry an in silico design that characterizes candidate microbial biomarkers from real-world settings through to a rapid, field-deployable diagnostic assay. The work presented here provides evidence for the use of microbial forensics as a toolkit to expand our basic understanding of microbial biogeography, microbial community stability and persistence in complex systems, and the ability of machine learning to be applied to downstream molecular detection platforms for rapid and accurate detection.

    Vers l’anti-criminalistique en images numériques via la restauration d’images (Towards Digital Image Anti-Forensics via Image Restoration)

    Image forensics enjoys increasing popularity as a powerful image authentication tool that works in a blind, passive way, without the aid of any a priori embedded information (in contrast to fragile image watermarking). On the opposing side, image anti-forensics attacks forensic algorithms in order to drive the development of more trustworthy forensics. When image coding or processing is involved, image anti-forensics shares, to some extent, a goal with image restoration: both aim to recover the information lost during image degradation, yet image anti-forensics has one additional indispensable requirement, namely forensic undetectability. In this thesis, we open a new research line for image anti-forensics by leveraging advanced concepts and methods from image restoration while integrating anti-forensic strategies and terms. In this context, the thesis contributes to the following four aspects of JPEG compression and median filtering anti-forensics: (i) JPEG anti-forensics using Total Variation based deblocking, (ii) improved Total Variation based JPEG anti-forensics with perceptual DCT histogram smoothing based on the assignment problem, (iii) JPEG anti-forensics using JPEG image quality enhancement based on a sophisticated image prior model and calibration-based non-parametric DCT histogram smoothing, and (iv) median filtered image quality enhancement and anti-forensics via variational deconvolution. Experimental results demonstrate the effectiveness of the proposed anti-forensic methods, which achieve better forensic undetectability against existing forensic detectors as well as higher visual quality of the processed image than state-of-the-art methods.
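As a rough illustration of the quantity behind the "Total Variation based deblocking" mentioned in this abstract, the sketch below computes the anisotropic total variation of a small grayscale image. The function name `total_variation` and the choice of the anisotropic form are assumptions made for illustration; the thesis may use a different TV definition and a full optimization procedure that is not shown here.

```python
def total_variation(img):
    """Anisotropic total variation: sum of absolute differences between
    vertically and horizontally adjacent pixels.

    `img` is a list of equal-length rows of numbers (hypothetical input
    format chosen for this sketch).
    """
    height, width = len(img), len(img[0])
    tv = 0.0
    for i in range(height):
        for j in range(width):
            if i + 1 < height:  # difference with the pixel below
                tv += abs(img[i + 1][j] - img[i][j])
            if j + 1 < width:   # difference with the pixel to the right
                tv += abs(img[i][j + 1] - img[i][j])
    return tv


flat = [[5, 5, 5, 5], [5, 5, 5, 5]]
step = [[0, 0, 1, 1], [0, 0, 1, 1]]  # one sharp edge, like a block boundary
print(total_variation(flat))  # 0.0
print(total_variation(step))  # 2.0
```

A sharp discontinuity such as a JPEG block boundary raises this value, which is why penalizing total variation tends to smooth blocking artifacts.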

    Image and Video Forensics

    Nowadays, images and videos have become the main modalities of information exchanged in everyday life, and their pervasiveness has led the image forensics community to question their reliability, integrity, confidentiality, and security. Multimedia content is generated in many different ways through the use of consumer electronics and high-quality digital imaging devices, such as smartphones, digital cameras, tablets, and wearable and IoT devices. The ever-increasing convenience of image acquisition has facilitated instant distribution and sharing of digital images on social platforms, generating a large volume of exchanged data. Moreover, the pervasiveness of powerful image editing tools has allowed the manipulation of digital images for malicious or criminal ends, up to the creation of synthesized images and videos with deep learning techniques. In response to these threats, the multimedia forensics community has produced major research efforts on source identification and manipulation detection. In all cases where images and videos serve as critical evidence (e.g., forensic investigations, fake news debunking, information warfare, and cyberattacks), forensic technologies that help to determine the origin, authenticity, and integrity of multimedia content can become essential tools. This book aims to collect a diverse and complementary set of articles that demonstrate new developments and applications in image and video forensics to tackle new and serious challenges in ensuring media authenticity.

    Characterisation of xenometabolome signatures in complex biomatrices for enhanced human population phenotyping

    Metabolic phenotyping facilitates the analysis of low molecular weight compounds in complex biological samples, with the resulting metabolite profiles providing a window on endogenous processes and xenobiotic exposures. Accurate characterisation of the xenobiotic component of the metabolome (the xenometabolome) is particularly valuable when metabolic phenotyping is used for epidemiological and clinical population studies where exposure of participants to xenobiotics is unknown or difficult to control or estimate. Additionally, as metabolic phenotyping has increasingly been incorporated into toxicology and drug metabolism research, phenotyping datasets may be exploited to study xenobiotic metabolism at the population level. This thesis describes novel analytical and data-driven strategies for broadening xenometabolome coverage to allow effective partitioning of endogenous and xenobiotic metabolome signatures. The data-driven strategy was multi-faceted, involving the generation of a reference database and the application of statistical methodologies. The database contains profiles of over 100 common xenobiotics, generated using established liquid chromatography-mass spectrometry methods, and provided the basis for an empirically derived screen for human urine and blood samples. The prevalence of these xenobiotics was explored in an exemplar phenotyping dataset (ALZ; n = 650; urine), with 31 xenobiotics detected in an initial screen. Statistical methods were tailored to extract xenobiotic-related signatures and evaluated using drugs with well-characterised human metabolism. To complement the data-driven strategies for xenometabolome coverage, a more analytically based strategy was also developed. A dispersive solid phase extraction sample preparation protocol for blood products was optimised, permitting efficient removal of lipids and proteins with minimal effect on low molecular weight metabolites.
The suitability and reproducibility of this method were evaluated in two independent blood sample sets (AZstudy12, n = 171; MARS, n = 285). Finally, these analytical and statistical strategies were applied to two existing large-scale phenotyping study datasets, AIRWAVE (n = 3000 urine and n = 3000 plasma samples) and ALZ (n = 650 urine and n = 449 serum samples), and used to explore both xenobiotic and endogenous responses to triclosan and polyethylene glycol (PEG) exposure. Exposure to triclosan highlighted affected pathways relating to sulfation, whilst exposure to PEG highlighted a possible perturbation in the glutathione cycle. The analytical and statistical strategies described in this thesis allow for a more comprehensive xenometabolome characterisation and have been used to uncover previously unreported relationships between xenobiotic and endogenous metabolism.

    Genetic Algorithms for Feature Selection and Classification of Complex Chromatographic and Spectroscopic Data

    A basic methodology for analyzing large multivariate chemical data sets based on feature selection is proposed. Each chromatogram or spectrum is represented as a point in a high-dimensional measurement space. A genetic algorithm for feature selection and classification is applied to the data to identify features that optimize the separation of the classes in a plot of the two or three largest principal components of the data. A good principal component plot can only be generated using features whose variance or information is primarily about differences between classes in the data. Hence, feature subsets that maximize the ratio of between-class to within-class variance are selected by the pattern recognition genetic algorithm. Furthermore, the structure of the data set can be explored; for example, new classes can be discovered by simply tuning various parameters of the fitness function of the pattern recognition genetic algorithm. The proposed method has been validated on a wide range of data. A two-step procedure for pattern recognition analysis of spectral data has been developed. First, wavelets are used to denoise and deconvolute spectral bands by decomposing each spectrum into wavelet coefficients, which represent the sample's constituent frequencies. Second, the pattern recognition genetic algorithm is used to identify wavelet coefficients characteristic of the class. This method was employed in several studies involving spectral library searching. In one study, a search pre-filter to detect the presence of carboxylic acids from vapor phase infrared spectra, a problem that had previously eluded prominent researchers, was successfully formulated and validated. In another study, the same approach was used to develop a pattern recognition assisted infrared library searching technique to determine the model, manufacturer, and year of the vehicle from which a clear coat paint smear originated.
The pattern recognition genetic algorithm has also been used to develop a potential method to identify molds in indoor environments using volatile organic compounds. A distinct profile indicative of microbial volatile organic compounds was developed from air sampling data and could be readily differentiated from the blank for both high mold count and moderate mold count exposure samples. The utility of the pattern recognition genetic algorithm for the discovery of biomarker candidates from genomic and proteomic data sets has also been shown.
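The idea of selecting feature subsets that maximize the ratio of between-class to within-class variance can be sketched with a toy genetic algorithm. This is a simplified illustration, not the pattern recognition genetic algorithm described in the abstract: the fitness here is a plain variance ratio computed directly on the selected features (no principal component plot), and the names `fisher_ratio` and `select_features`, the population size, and the mutation rate are all assumptions chosen for the example.

```python
import random


def fisher_ratio(X, y, mask):
    """Between-class / within-class sum of squares over the selected features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset carries no class information
    between = within = 0.0
    for j in cols:
        col = [row[j] for row in X]
        grand_mean = sum(col) / len(col)
        for c in set(y):
            vals = [v for v, label in zip(col, y) if label == c]
            class_mean = sum(vals) / len(vals)
            between += len(vals) * (class_mean - grand_mean) ** 2
            within += sum((v - class_mean) ** 2 for v in vals)
    return between / (within + 1e-12)  # small constant avoids division by zero


def select_features(X, y, pop_size=30, generations=40, mutation_rate=0.05, seed=0):
    """Toy GA: truncation selection, one-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    pop[0] = [True] * n  # seed the full feature set as a baseline individual
    for _ in range(generations):
        pop.sort(key=lambda m: fisher_ratio(X, y, m), reverse=True)
        parents = pop[: pop_size // 2]  # the better half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n) if n > 1 else 0
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fisher_ratio(X, y, m))


# Feature 0 separates the two classes; feature 1 does not.
X = [[0.0, 3.0], [0.2, 1.0], [1.0, 2.9], [1.2, 1.1]]
y = [0, 0, 1, 1]
best = select_features(X, y)
print(best, fisher_ratio(X, y, best))
```

Because the better half of each generation is carried over unchanged, the best subset found never scores worse than the full feature set that was seeded into the initial population.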

    Development of novel mass spectrometric methods for point-of-care mucosal diagnostics

    Human mucosal surfaces act as key interfaces between microbiota and host. As such, mucosal sampling using medical swabs is performed for diagnostic purposes that most commonly rely upon subsequent microscopy, culture or molecular-based assays. These approaches are limited in providing information on host response, which is a critical facet of pathology. In this thesis, I sought to test the hypothesis that both the presence of specific microbes and their interactions with the human host are reflected in the mucosal metabolome, and that this information could be exploited for mucosal diagnostic applications. The study aimed to develop a method for rapid, direct metabolic profiling from swabs using desorption electrospray ionisation mass spectrometry (DESI-MS). Method optimisation was conducted to determine the instrumental and geometrical conditions essential for swab analysis. The application of the method to mucosal diagnostics was then assessed by characterising the metabolic profile of multiple body sites (oral, nasal and vaginal mucosa), by comparing vaginal mucosa in two different physiological states (non-pregnant vs pregnant), and by detecting a pathological state (bacterial vaginosis). Correlation of DESI-MS vaginal metabolic profiles with matched vaginal microbiota composition (VMC), characterised by 16S rRNA-based metataxonomics during pregnancy, enabled robust discrimination of Lactobacillus-dominant from Lactobacillus-depleted states, as well as prediction of the major vaginal community state types (CSTs). The predictive performance of DESI-MS based models was comparable to that of “gold standard” LC-MS based models. Additionally, bacterial metabolite markers predictive of specific microbial genera were identified through matching to a spectral database constructed using pure cultures of commensal and pathogenic microbes often observed in the vaginal microbiome.
In summary, DESI-MS has the potential to revolutionise current mucosal diagnostics by significantly reducing the time needed to characterise VMC and drug or inflammatory response to only a few minutes, and could therefore enable faster decision making on a patient’s treatment.

    Cognitive Foundations for Visual Analytics
