476 research outputs found

    Proceedings, MSVSCC 2018

    Proceedings of the 12th Annual Modeling, Simulation & Visualization Student Capstone Conference, held on April 19, 2018 at VMASC in Suffolk, Virginia. 155 pp.

    Early detection of morbidity in feedlot cattle using pattern recognition techniques

    Computer algorithms are routinely used to aid in the identification of biological patterns not easily detected with standard statistics. Currently, observed changes in normal patterns of feeding behavior (FB) are used to identify morbid feedlot cattle. The objective of this study was to use pattern classification techniques to develop algorithms capable of identifying morbid (M) cattle earlier than traditional pen checking methods. In two separate studies, individual feeding behavior was obtained from 384 feedlot steers (228 ± 22.7 kg, initial BW) in a 226 d trial (model dataset), and 384 feedlot heifers (322 ± 34.7 kg, initial BW) in a 142 d trial (naive dataset). Data were collected using an automated feed bunk monitoring system. FB variables calculated included feeding duration, inter-meal interval (min., max., avg., SD and total; min/d) and feeding frequency (visits/d). Animal health records including the number of times treated, d in the hospital and d on feed were also collected. Ninety-three and 53 morbid (M) animals were identified in the two trials, respectively, and were categorized into low, moderate and high groups, based on severity of sickness. FB data for 68 cattle from the model dataset (45 classified as Moderate and 25 classified as High) were analyzed to develop an algorithm which would aid in identifying morbid FB. This algorithm was later tested on 18 M animals (12 classified as Moderate and 6 as High) in the naive dataset. The pattern recognition procedure involved reducing data dimensionality via Principal Component Analysis, followed by K-means clustering and finally the development of a binary string to aid in the classification of M feeding behavior. The developed procedure resulted in an overall classification accuracy of 84 % (82.5 and 85 % accuracy for H and M, respectively) for the model dataset, and 75 % overall (100 and 50 % accuracy for H and M, respectively) for the naive dataset. The model predicted morbidity on average 3.3 and 1.2 d earlier than pen checkers could in the two trials, respectively. The application of pattern recognition algorithms to FB shows value as a method of identifying morbid cattle in advance of overt physical signs of morbidity.

    Genome-guided bioprospecting for novel antibiotic lead compounds.

    Antimicrobial resistance continues to pose a threat to health and wellbeing. Unmitigated, it is predicted to be the leading cause of death by 2050. Hence, the sustained development of novel antibiotics is crucial. As over 60% of licensed antibiotics are based on scaffolds derived from less than 1% of all known bacterial species, bacterial secondary metabolites constitute an untapped source of novel antibiotics. The aim of this project therefore was to expand the chemical space of bacteria-derived antibiotic lead compounds, using a genomics approach. To that end, a topsoil sample was collected from the rhizosphere, in which antibiosis occurs naturally. Using starvation stress, sixty-five isolates were recovered from the sample, out of which four were selected based on morphology and designated A13BB, A23BA, A13AA and A23AA. A13BB was identified by 16S rRNA gene sequence comparison as a Pseudomonas sp. and the other three isolates as Hafnia/Obesumbacterium spp. A database search showed that species belonging to these genera have genomes larger than the 3 Mb size above which an increasing proportion of a bacterial genome is dedicated to secondary metabolism. Given their ecological origin, expected genome size and ability to withstand starvation stress, these four isolates were presumed to harbour antibiotic-encoding gene clusters. Isolates A13BB and A23BA were therefore selected for genome mining in the first instance. Illumina and GridION/MinION sequencing data were obtained for both isolates and assembled into high-quality genomes. Isolates' identities were confirmed by FastANI analysis as strains of P. fragi and H. alvei, with 4.94 and 4.77 Mb genomes, respectively. Assembled genomes were mined with antiSMASH. Amongst other secondary metabolite biosynthetic gene clusters (smBGCs) detected, the β-lactone smBGCs in both genomes were selected for activation as their end products bear the hallmarks of an 'ideal antibiotic' that can inhibit several bacteria-specific enzymes simultaneously. Analysis of these smBGCs revealed genes encoding two core enzymes: 2-isopropylmalate synthase (2-IPMS) and acyl CoA ligase homologues. In the biosynthetic pathway, 2-IPMS catalyses the condensation of acetyl CoA with the degradation product of valine or isoleucine to form 2-IPM. 2-IPM is isomerised to 3-IPM which then forms the β-lactone warhead through reactions catalysed by acyl CoA ligase. It was speculated that the β-lactone compound is biosynthesised to efficiently rid the organism of potentially harmful metabolic intermediates as it grows on poor carbon and nitrogen sources. Strain fermentation was therefore performed with 10.8 mM acetate as the main carbon source, and 5 mM L-valine or L-isoleucine as the nitrogen source. Fermentation extracts were analysed by LC-MS with at least thirty-seven metabolite ions detected. Many of these ions have masses in the range m/z 230-750, which is an ideal mass range for antibiotic molecules. As β-lactone compounds are difficult to identify in crude extracts, especially when utilising single-stage mass spectrometry, reactivity-guided screening of extracts with a cysteine thiol probe was performed as the probe forms UV- and MS-visible adducts with β-lactone compounds. However, complete dimerization of the probe at a faster-than-expected rate in extract matrices hindered successful screening. This meant that it was not possible to determine if any crude extract components were β-lactone compounds without further analysis. Measures to limit or eliminate probe dimerization are proposed, together with molecular networking strategies that can afford global visualisation and rapid dereplication of extract components, using tandem mass spectrometry fragmentation patterns of parent ions. This project provides an original and robust workflow that serves as a strong starting point in the isolation of novel β-lactone compounds from crude extracts, followed by structural optimisation and bioactivity profiling. The hitherto unrecognised potential of β-lactone natural compounds as 'ideal antibiotics' is highlighted, and several structural optimisation strategies required to harness this potential are proposed. The genomes assembled here and associated data have been deposited in the repositories of the International Nucleotide Sequence Database Collaboration for repurposing by other researchers. Likewise, the hidden metabolic and biosynthetic potentials of P. fragi and H. alvei species uncovered by RASTtk and antiSMASH analyses have been catalogued and placed in the public domain, with many of these attributes reported for the first time.

    Mining Effective Multi-Segment Sliding Window for Pathogen Incidence Rate Prediction

    Pathogen incidence rate prediction, which can be considered as time series modeling, is an important task for infectious disease incidence rate prediction and for public health. This paper investigates the application of a genetic computation technique, namely gene expression programming (GEP), for pathogen incidence rate prediction. To overcome the shortcomings of traditional sliding windows in GEP-based time series modeling, the paper introduces the problem of mining effective sliding windows, that is, discovering optimal sliding windows for building accurate prediction models. To utilize the periodical characteristic of pathogen incidence rates, a multi-segment sliding window consisting of several segments from different periodical intervals is proposed and used. Since the number of such candidate windows is still very large, a heuristic method is designed for enumerating the candidate effective multi-segment sliding windows. Moreover, methods to find the optimal sliding window and then produce a mathematical model based on that window are proposed. A performance study on real-world datasets shows that the techniques are effective and efficient for pathogen incidence rate prediction.
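
    To make the idea of a multi-segment sliding window concrete, the sketch below builds model inputs from several short segments of the past instead of one contiguous block. The abstract does not give the paper's exact window definition, so the representation used here is an assumption: each segment is a hypothetical `(lag, length)` pair, meaning `length` consecutive values starting `lag` steps before the target time. The evolutionary model fitting and the heuristic window search are not shown.

```python
def multi_segment_window(series, t, segments):
    """Collect the input values used to predict series[t].

    segments is a list of (lag, length) pairs: each pair contributes
    `length` consecutive values starting `lag` steps before t.
    """
    values = []
    for lag, length in segments:
        start = t - lag
        if start < 0 or start + length > t:
            raise ValueError("segment reaches outside the known past")
        values.extend(series[start:start + length])
    return values

def build_samples(series, segments):
    """Return (inputs, target) pairs for every t with enough history."""
    max_lag = max(lag for lag, _ in segments)
    return [(multi_segment_window(series, t, segments), series[t])
            for t in range(max_lag, len(series))]

# Toy monthly series whose values equal their own index, so the picked
# positions are easy to read off. Assumed period: 12 months.
series = list(range(30))
segments = [(2, 2), (12, 1), (24, 1)]  # last two months, one and two years back

window = multi_segment_window(series, 24, segments)
samples = build_samples(series, segments)
print(window)
print(len(samples))
```

    With this layout a window of four inputs spans two years of history, whereas a traditional contiguous window would need 24 inputs to reach the same depth.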

    Crop Disease Detection Using Remote Sensing Image Analysis

    Pest and crop disease threats are often estimated by complex changes in crops and the applied agricultural practices that result mainly from the increasing food demand and climate change at global level. In an attempt to explore high-end and sustainable solutions for both pest and crop disease management, remote sensing technologies have been employed, taking advantage of possible changes deriving from relative alterations in the metabolic activity of infected crops, which in turn are highly associated with crop spectral reflectance properties. Recent developments applied to high-resolution data acquired with remote sensing tools offer the opportunity of mapping the infected field areas in the form of patchy land areas or those areas that are susceptible to diseases. This makes it easier to discriminate between healthy and diseased crops, providing an additional tool for crop monitoring. The current book brings together recent research work comprising innovative applications that involve novel remote sensing approaches and their applications oriented to crop disease detection. The book provides an in-depth view of the developments in remote sensing and explores its potential to assess health status in crops.

    Genomics and spatial surveillance of Chagas disease and American visceral leishmaniasis

    The Trypanosomatidae are a family of parasitic protozoa that infect various animals and plants. Several species within the Trypanosoma and Leishmania genera also pose a major threat to human health. Among these are Trypanosoma cruzi and Leishmania infantum, aetiological agents of the highly debilitating and often deadly vector-borne zoonoses Chagas disease and American visceral leishmaniasis. Current treatment options are far from safe, only partially effective and rarely available in the impoverished regions of Latin America where these ‘neglected tropical diseases’ prevail. Wider-reaching, sustainable protection against T. cruzi and L. infantum might best be achieved by intercepting key routes of zoonotic transmission, but this prophylactic approach requires a better understanding of how these parasites disperse and evolve at various spatiotemporal scales. This dissertation addresses key questions around trypanosomatid parasite biology and spatial epidemiology based on high-resolution, geo-referenced DNA sequence datasets constructed from disease foci throughout Latin America: Which forms of genetic exchange occur in T. cruzi, and are exchange events frequent enough to significantly alter the distribution of important epidemiological traits? How do demographic histories, for example, the recent invasive expansion of L. infantum into the Americas, impact parasite population structure, and do structural changes pose a threat to public health? Can environmental variables predict parasite dispersal patterns at the landscape scale? Following the first chapter’s review of population genetic and genomic approaches in the study of trypanosomatid diseases in Latin America, Chapter 2 describes how reproductive polymorphism segregates T. cruzi populations in southern Ecuador. The study is the first to clearly demonstrate meiotic sex in this species, for decades thought to exchange genetic material only very rarely, and only by non-Mendelian means. T. cruzi subpopulations from the Ecuadorian study site exhibit all major hallmarks of sexual reproduction, including genome-wide Hardy-Weinberg allele frequencies, rapid decay of linkage disequilibrium with map distance and genealogies that fluctuate among chromosomes. The presence of sex promotes the transfer and transformation of genotypes underlying important epidemiological traits, posing great challenges to disease surveillance and the development of diagnostics and drugs. Chapter 3 demonstrates that mating events are also pivotal to L. infantum population structure in Brazil, where introduction bottlenecks have led to striking genetic discontinuities between sympatric strains. Genetic hybridization occurs genome-wide, including at a recently identified ‘miltefosine sensitivity locus’ that appears to be deleted from the majority of Brazilian L. infantum genomes. The study combines an array of genomic and phenotypic analyses to determine whether rapid population expansion or strong purifying selection has driven this prominent > 12 kb deletion to high abundance across Brazil. Results expose deletion size differences that covary with phylogenetic structure and suggest that deletion-carrying strains do not form a private monophyletic clade. These observations are inconsistent with the hypothesis that the deletion genotype rose to high prevalence simply as the result of a founder effect. Enzymatic assays show that loss of ecto-3’-nucleotidase gene function within the deleted locus is coupled to increased ecto-ATPase activity, raising the possibility that alternative metabolic strategies enhance L. infantum fitness in its introduced range. The study also uses demographic simulation modelling to determine whether L. infantum populations in the Americas have expanded from just one or multiple introduction events. Comparison of observed vs. simulated summary statistics using random forests suggests a single introduction from the Old World, but better spatial sampling coverage is required to rule out other demographic scenarios in a pattern-process modelling approach. Further sampling is also necessary to substantiate signs of convergent selection introduced above. Chapter 4 therefore develops a ‘genome-wide locus sequence typing’ (GLST) tool to summarize parasite genetic polymorphism at a fraction of genomic sequencing cost. Applied directly to the infection source (e.g., vector or host tissue), the method also avoids bias from cell purification and culturing steps typically involved prior to sequencing of trypanosomatid and other obligate parasite genomes. GLST scans genomic pilot data for hundreds of polymorphic sequence fragments whose thermodynamic properties permit simultaneous PCR amplification in a single reaction tube. For proof of principle, GLST is applied to metagenomic DNA extracts from various Chagas disease vector species collected in Colombia, Venezuela, and Ecuador. Epimastigote DNA from several T. cruzi reference clones is also analyzed. The method distinguishes 387 single-nucleotide polymorphisms (SNPs) in T. cruzi sub-lineage TcI and an additional 393 SNPs in non-TcI clones. Genetic distances calculated from these SNPs correlate with geographic distances among samples but also distinguish parasites from triatomines collected at common collection sites. The method thereby appears suitable for the agent-based spatio-genetic (simulation) analyses called for in Chapter 3 and further formulated in Chapter 5. The potential to survey parasite genetic diversity abundantly across landscapes compels deeper, more systematic exploration of how environmental variables influence the spread of disease. As environmental context is only marginally considered in the population genetic analyses of Chapters 2 – 4, Chapter 5 proposes a new, spatially explicit modelling framework to predict vector-borne parasite gene flow through heterogeneous environments. In this framework, remotely sensed environmental raster values are re-coded and merged into a composite ‘resistance surface’ that summarizes hypothesized effects of landscape features on parasite transmission among vectors and hosts. Parasite population genetic differentiation is then simulated on this surface and fitted to observed diversity patterns in order to evaluate original hypotheses on how environmental variables modulate parasite gene flow. The chapter thereby makes a maiden step from standard population genetic to ‘landscape genomic’ approaches in understanding the ecology and evolution of vector-borne disease. In summary, this dissertation first demonstrates the power of population genetics and genomics to understand fundamental biological properties of important protist parasites, then identifies areas where analytical tools are missing and creates new technical and conceptual frameworks to help fill these gaps. The general discussion (Chapter 6) also outlines several follow-up projects on the key finding of meiotic genetic signatures in T. cruzi. Exploiting recently developed T. cruzi genome-editing systems for the detection of meiotic gene expression and heterozygosis will help understand why and in which life cycle stage some parasite populations use sex and others do not. Long-read sequencing of parental and recombinant genomes will help understand the extent to which sex is diversifying T. cruzi phenotypes, especially virulence and drug resistance properties conferred by surface molecules with repetitive genetic bases intractable to short-read analysis. Chapter 6 also provides follow-up plans for all other research chapters. Emphasis is placed on advancing the complementarity, transferability and public health benefit of the many different methods and concepts employed in this work.
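
    The resistance-surface step described for Chapter 5 (re-code environmental rasters, merge them into one composite surface) can be sketched as follows. This is a simplified illustration under stated assumptions, not the dissertation's framework: the land-cover classes, lookup tables and weights are invented, and instead of simulating genetic differentiation on the surface the sketch only computes a least-cost distance between two cells with Dijkstra's algorithm, a common stand-in for effective distance in landscape genetics.

```python
import heapq

def recode(raster, table):
    """Map raw raster values to resistance costs through a lookup table."""
    return [[table[cell] for cell in row] for row in raster]

def composite_surface(layers, weights):
    """Merge recoded layers cell by cell as a weighted sum."""
    rows, cols = len(layers[0]), len(layers[0][0])
    return [[sum(w * layer[r][c] for layer, w in zip(layers, weights))
             for c in range(cols)] for r in range(rows)]

def least_cost(surface, start, goal):
    """Dijkstra over 4-connected cells.

    Moving between neighbours costs the mean resistance of the two cells.
    """
    rows, cols = len(surface), len(surface[0])
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        cost, (r, c) = heapq.heappop(queue)
        if (r, c) == goal:
            return cost
        if cost > best.get((r, c), float("inf")):
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                new = cost + (surface[r][c] + surface[nr][nc]) / 2
                if new < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new
                    heapq.heappush(queue, (new, (nr, nc)))
    return float("inf")

# Hypothetical inputs: a land-cover raster ("f" forest, "w" water)
# and a steep-slope mask (1 = steep).
land = [["f", "w", "f"],
        ["f", "w", "f"],
        ["f", "f", "f"]]
slope = [[0, 0, 0],
         [0, 0, 0],
         [0, 1, 0]]

surface = composite_surface(
    [recode(land, {"f": 1, "w": 9}), recode(slope, {0: 0, 1: 2})],
    weights=[1.0, 1.0],
)
distance = least_cost(surface, (0, 0), (0, 2))
print(surface)
print(distance)
```

    In this toy grid the cheapest route between the two top corners detours around the water cells, so the effective distance (8.0) is lower than the cost of the straight crossing (10.0). Changing the lookup tables or weights corresponds to testing a different hypothesis about how the landscape impedes transmission.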

    Applications

    Volume 3 describes how resource-aware machine learning methods and techniques are used to successfully solve real-world problems. The book provides numerous specific application examples: in health and medicine, for risk modelling, diagnosis, and treatment selection for diseases; in electronics, steel production and milling, for quality control during manufacturing processes; in traffic and logistics, for smart cities; and in mobile communications.

    Data Science, Data Visualization, and Digital Twins

    Real-time, web-based, and interactive visualisations are proven to be outstanding methodologies and tools in numerous fields when knowledge in sophisticated data science and visualisation techniques is available. This is because modern data science analytical approaches like machine/deep learning or artificial intelligence, as well as digital twinning, promise to give data insights, enable informed decision-making, and facilitate rich interactions among stakeholders. The benefits of data visualisation, data science, and digital twinning technologies motivate this book, which presents numerous advanced data science and visualisation approaches. Chapters cover such topics as deep learning techniques, web and dashboard-based visualisations during the COVID pandemic, 3D modelling of trees for mobile communications, digital twinning in the mining industry, data science libraries, and potential areas of future data science development.