15 research outputs found

    Using Pan-Genomic Data Structures to Incorporate Diversity Into Genomic Analyses

    The alignment of sequencing reads to the reference genome is subject to reference bias, a phenomenon in which reads containing alternative alleles are less likely to align to the reference than reads that more closely resemble it. Because the human reference genome is composed largely of the genomic sequence of a single individual, changing or modifying the representation of the reference genome to incorporate diversity from other individuals can reduce reference bias. We discuss methods for alleviating reference bias through the use of novel text indexing data structures and algorithms that can incorporate such diversity. First, we present data structures built on top of the Run-Length FM Index that can be used to index and query a pan-genome, i.e., a representation of the genome that incorporates known variation within the species. Then, we use pan-genome indexes in a workflow for constructing a personalized genome from a set of sequencing reads. This personalized genome can be used in lieu of the reference genome during alignment in order to alleviate reference bias. We also discuss how alignments against personalized genomes can be used in downstream analyses by "lifting" these alignments back over to the reference genome.
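
As a rough illustration of the indexing idea in the abstract above, the following Python sketch builds a Burrows-Wheeler transform over two concatenated haplotypes, run-length encodes it, and counts pattern occurrences by FM-index backward search. It is a toy under stated assumptions and not the author's implementation: the function names, the `#` separator, the example haplotypes, and the naive rotation sort and linear-time rank are all chosen here for brevity, whereas a real Run-Length FM Index answers rank queries directly on the compressed runs.

```python
from itertools import groupby
from typing import Dict, List, Tuple


def bwt(text: str) -> str:
    """Burrows-Wheeler transform via sorted rotations (naive, illustration only)."""
    s = text + "$"  # '$' is a unique end-of-text sentinel (assumed absent from text)
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def run_length_encode(b: str) -> List[Tuple[str, int]]:
    """Collapse the BWT into (character, run length) pairs."""
    return [(ch, len(list(run))) for ch, run in groupby(b)]


def count_occurrences(b: str, pattern: str) -> int:
    """Count occurrences of pattern in the indexed text by backward search."""
    # c_table[ch] = number of characters in the BWT that sort before ch
    counts: Dict[str, int] = {}
    for ch in b:
        counts[ch] = counts.get(ch, 0) + 1
    c_table: Dict[str, int] = {}
    total = 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]

    def rank(ch: str, i: int) -> int:
        # Occurrences of ch in b[:i]; linear time here, unlike a real index
        return b[:i].count(ch)

    lo, hi = 0, len(b)  # half-open range of matching rows
    for ch in reversed(pattern):
        if ch not in c_table:
            return 0
        lo = c_table[ch] + rank(ch, lo)
        hi = c_table[ch] + rank(ch, hi)
        if lo >= hi:
            return 0
    return hi - lo


# Two hypothetical haplotypes differing by one SNP, joined by a '#' separator
pangenome = "GATTACA" + "#" + "GATCACA"
index = bwt(pangenome)
runs = run_length_encode(index)
```

Here `count_occurrences(index, "TTA")` and `count_occurrences(index, "TCA")` each find one match, so a read carrying either allele is represented in the index. Because the two haplotypes share most of their sequence, similar contexts sort together and the BWT tends to form runs, which is what makes run-length compression attractive for pan-genomes.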

    Pangenomic Genotyping with the Marker Array

    We present a new method and software tool called rowbowt that applies a pangenome index to the problem of inferring genotypes from short-read sequencing data. The method uses a novel indexing structure called the marker array. Using the marker array, we can genotype variants with respect to large panels like the 1000 Genomes Project while avoiding the reference bias that results when aligning to a single linear reference. rowbowt can infer accurate genotypes using less time and memory than existing graph-based methods.

    Computer software for generating digital cadastral databases

    The most recent major spatial information technology development in Malaysia is the establishment of the National Land Information System (NaLIS), a system that enables the exchange and sharing of land-related information between government bodies, private agencies and the general public. Computerization and digital data are the keywords in the new venture. With this in mind, the Center for Geographic Information & Analysis (CGIA) has taken a step towards establishing a Digital Cadastral Database (DCDB). This paper describes a software package for generating DCDBs that is being developed by CGIA.

    Soil erosion analysis using prime meridian GIS package

    A number of soil erosion models have been developed to predict and characterize the movement of soil. These models provide an understanding of the dynamics of soil and can be used to evaluate the effectiveness of land management practices. Nevertheless, modellers have faced a major problem: the inability to efficiently handle, manipulate and manage large volumes of model parameters. Recent developments in Geographic Information Systems (GIS) provide the opportunities and tools to spatially organize and effectively manage data for such modelling. This research therefore aims to develop an interactive soil erosion modelling and prediction system within a GIS environment. Prime Meridian GIS' Spatial Database Engine by Essential Planning System Pte. Ltd. (EPS) will be the core of the designed system. The system is designed to use the universal soil loss equation and the overlay modelling technique to generate the soil erosion risk map. The goal of developing the system is to provide a spatial decision support tool for environmental impact assessment, as required by the Department of Environment, Ministry of Science, Technology and Environment.
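
The combination of the universal soil loss equation (USLE) with overlay modelling mentioned above can be sketched in a few lines of Python. The USLE estimates average annual soil loss as the product A = R × K × LS × C × P (rainfall erosivity, soil erodibility, slope length and steepness, cover management, and support practice), and an overlay applies that product cell by cell across aligned raster layers. This is a minimal sketch and not the system described in the abstract: the function names, the list-of-lists grids, and the risk class breaks are hypothetical and chosen only for illustration.

```python
from typing import List

Grid = List[List[float]]


def usle_overlay(r: Grid, k: Grid, ls: Grid, c: Grid, p: Grid) -> Grid:
    """Cell-by-cell product A = R * K * LS * C * P over aligned raster layers."""
    rows, cols = len(r), len(r[0])
    for layer in (r, k, ls, c, p):
        if len(layer) != rows or any(len(row) != cols for row in layer):
            raise ValueError("all layers must share the same grid shape")
    return [
        [r[i][j] * k[i][j] * ls[i][j] * c[i][j] * p[i][j] for j in range(cols)]
        for i in range(rows)
    ]


# Hypothetical class breaks (upper bound of soil loss, label); real thresholds
# depend on the study area and the regulator's guidelines.
RISK_BREAKS = [(10.0, "low"), (50.0, "moderate"), (float("inf"), "high")]


def classify(soil_loss: float) -> str:
    """Map one soil loss value to a risk label using RISK_BREAKS."""
    for upper, label in RISK_BREAKS:
        if soil_loss < upper:
            return label
    return RISK_BREAKS[-1][1]


def risk_map(soil_loss: Grid) -> List[List[str]]:
    """Turn a soil loss grid into a grid of risk labels."""
    return [[classify(a) for a in row] for row in soil_loss]


# Made-up factor values for a 1 x 2 grid
loss = usle_overlay(
    r=[[100.0, 200.0]],
    k=[[0.5, 0.5]],
    ls=[[1.0, 2.0]],
    c=[[0.1, 0.5]],
    p=[[1.0, 1.0]],
)
labels = risk_map(loss)
```

Keeping each factor as its own layer mirrors the overlay approach: a change in land cover or conservation practice only requires replacing the C or P layer and recomputing the product.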