6,094 research outputs found

    Fixed-Parameter Algorithms for Computing Kemeny Scores - Theory and Practice

    The central problem in this work is to compute a ranking of a set of elements which is "closest to" a given set of input rankings of the elements. We define "closest to" in an established way as having the minimum sum of Kendall-Tau distances to each input ranking. Unfortunately, the resulting problem, Kemeny consensus, is NP-hard for instances with n input rankings, n being an even integer greater than three. Nevertheless, this problem plays a central role in many rank aggregation problems. It was shown that one can compute the corresponding Kemeny consensus list in f(k) + poly(n) time, where f(k) is a computable function in one of the parameters "score of the consensus", "maximum distance between two input rankings", "number of candidates", and "average pairwise Kendall-Tau distance", and poly(n) is a polynomial in the input size. This work demonstrates the practical usefulness of the corresponding algorithms by applying them to randomly generated data and several real-world data sets. Thus, we show that these fixed-parameter algorithms are not only of theoretical interest. In a more theoretical part of this work, we develop an improved fixed-parameter algorithm for the parameter "score of the consensus" with a better upper bound on the running time than previous algorithms. Comment: Studienarbeit
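To make the definition of "closest to" concrete, the following is a minimal Python sketch of the Kendall-Tau distance and the Kemeny score. It finds a consensus by exhaustive search over all permutations, which is only feasible for a handful of candidates; it is not one of the fixed-parameter algorithms the abstract refers to. The function names and the example votes are illustrative assumptions.

```python
from itertools import combinations, permutations


def kendall_tau_distance(r1, r2):
    """Count candidate pairs that the two rankings order differently."""
    pos1 = {c: i for i, c in enumerate(r1)}
    pos2 = {c: i for i, c in enumerate(r2)}
    return sum(
        1
        for a, b in combinations(r1, 2)
        if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0
    )


def kemeny_score(candidate, rankings):
    """Sum of Kendall-Tau distances from one ranking to every input ranking."""
    return sum(kendall_tau_distance(candidate, r) for r in rankings)


def brute_force_kemeny(rankings):
    """Exhaustive search (hypothetical helper): try every permutation."""
    candidates = sorted(rankings[0])
    best = min(permutations(candidates), key=lambda p: kemeny_score(p, rankings))
    return list(best), kemeny_score(best, rankings)


# Made-up input: three voters ranking three candidates.
votes = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
consensus, score = brute_force_kemeny(votes)
print(consensus, score)
```

The exhaustive search grows factorially with the number of candidates, which is the cost that parameterized approaches aim to confine to a function of a small parameter.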

    High-Density Linkage Maps Based on Genotyping-by-Sequencing (GBS) Confirm a Chromosome-Level Genome Assembly and Reveal Variation in Recombination Rate for the Pacific Oyster Crassostrea gigas

    Studies of linkage and linkage mapping have advanced genetic and biological knowledge for over 100 years. In addition to their growing role today in mapping phenotypes to genotypes, dense linkage maps can help to validate genome assemblies. Previously, we showed that 40% of scaffolds in the first genome assembly for the Pacific oyster Crassostrea gigas were chimeric, containing single nucleotide polymorphisms (SNPs) mapping to different linkage groups. Here, we merge 14 linkage maps constructed of SNPs generated from genotyping-by-sequencing (GBS) methods with five previously constructed linkage maps to create a compendium of nearly 69 thousand SNPs mapped with high confidence. We use this compendium to assess a recently available, chromosome-level assembly of the C. gigas genome, mapping SNPs in 275 of 301 contigs and comparing the ordering of these contigs, by linkage, to their assembly by Hi-C sequencing methods. We find that, while 26% of contigs contain chimeric blocks of SNPs, i.e., adjacent SNPs mapping to different linkage groups than the majority of SNPs in their contig, these apparent misassemblies amount to only 0.08% of the genome sequence. Furthermore, nearly 90% of the 275 contigs mapped by linkage and sequencing are assembled identically; inconsistencies between the two assemblies for the remaining 10% of contigs appear to result from insufficient linkage information. Thus, our compilation of linkage maps strongly supports this chromosome-level assembly of the oyster genome. Finally, we use this assembly to estimate, for the first time in a Lophotrochozoan, genome-wide recombination rates and causes of variation in this fundamental process.
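The notion of a chimeric block (adjacent SNPs that map to a different linkage group than the majority of SNPs in their contig) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' pipeline: the input format, the function name `chimeric_blocks`, the `min_size` parameter, and the example labels are all hypothetical, and the abstract does not state the thresholds actually used.

```python
from collections import Counter
from itertools import groupby


def chimeric_blocks(lg_by_snp, min_size=1):
    """Find runs of SNPs assigned to a non-majority linkage group.

    lg_by_snp: linkage-group labels of one contig's SNPs, in physical order.
    min_size:  hypothetical threshold on the number of adjacent SNPs in a run.
    Returns the majority label and a list of (first_index, last_index, label).
    """
    # Ties are resolved in favour of the label encountered first.
    majority, _ = Counter(lg_by_snp).most_common(1)[0]
    blocks = []
    start = 0
    for lg, run in groupby(lg_by_snp):
        size = len(list(run))
        if lg != majority and size >= min_size:
            blocks.append((start, start + size - 1, lg))
        start += size
    return majority, blocks


# Made-up contig: most SNPs map to linkage group 1.
majority, blocks = chimeric_blocks([1, 1, 1, 7, 7, 1, 1, 3, 1])
print(majority, blocks)
```

Raising `min_size` to 2 would ignore isolated single-SNP disagreements, which may be a reasonable way to separate likely genotyping noise from candidate misassemblies.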

    Two-Phase Analysis in Consensus Genetic Mapping

    Numerous mapping projects conducted on different species have generated an abundance of mapping data. Consequently, many multilocus maps have been constructed using diverse mapping populations and marker sets for the same organism. The quality of maps varies broadly among populations, marker sets, and software used, necessitating efforts to integrate the mapping information and generate consensus maps. The problem of consensus genetic mapping (MCGM) is far more challenging than genetic mapping based on a single dataset, which is itself cumbersome. The additional complications introduced by consensus analysis include inter-population differences in recombination rate and exchange distribution along chromosomes; variations in dominance of the employed markers; and use of different subsets of markers in different labs. Hence, it is necessary to handle arbitrary patterns of shared sets of markers and different levels of mapping data quality. In this article, we introduce a two-phase approach for solving MCGM. In phase 1, for each dataset, multilocus ordering is performed combined with iterative jackknife resampling to evaluate the stability of marker orders. In this phase, the ordering problem is reduced to the well-known traveling salesperson problem (TSP). Namely, for each dataset, we look for the order that gives the minimum sum of recombination distances between adjacent markers. In phase 2, the optimal consensus order of shared markers is selected from the set of allowed orders as the one that gives the minimal sum of total lengths of nonconflicting maps of the chromosome. This criterion may be used in different modifications to take into account the variation in quality of the original data (population size, marker quality, etc.).
    In the foregoing formulation, consensus mapping is considered as a specific version of TSP that can be referred to as "synchronized TSP." The conflicts detected after phase 1 are resolved using either a heuristic algorithm over the entire chromosome or an exact/heuristic algorithm applied subsequently to the revealed small non-overlapping regions with conflicts separated by non-conflicting regions. The proposed approach was tested on a wide range of simulated data and real datasets from maize.
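The phase 1 criterion (choose the marker order that minimizes the sum of recombination distances between adjacent markers) can be illustrated with a minimal Python sketch. It enumerates all orders, so it only works for a few markers, and it stands in for, rather than reproduces, the TSP solvers and jackknife resampling mentioned in the abstract. The marker names, positions, and function names are made-up assumptions.

```python
from itertools import permutations


def map_length(order, dist):
    """Sum of pairwise distances between adjacent markers in the given order."""
    return sum(dist[a][b] for a, b in zip(order, order[1:]))


def best_order(markers, dist):
    """Exhaustive search for the shortest open path through all markers."""
    best = None
    for perm in permutations(markers):
        # An order and its mirror image have the same length; keep one of them.
        if perm[0] > perm[-1]:
            continue
        length = map_length(perm, dist)
        if best is None or length < best[1]:
            best = (list(perm), length)
    return best


# Hypothetical marker positions (in cM) used only to build a distance matrix.
pos = {"m1": 0, "m2": 10, "m3": 25, "m4": 30}
dist = {a: {b: abs(pos[a] - pos[b]) for b in pos} for a in pos}

order, length = best_order(["m3", "m1", "m4", "m2"], dist)
print(order, length)
```

Unlike the classical closed-tour TSP, a genetic map is an open path, which is why the sketch sums only the distances between consecutive markers and does not return to the start.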

    The Effect of Neighborhood Crime Rates on Childhood Obesity in Los Angeles County

    This thesis examines the effect of neighborhood crime rates on childhood obesity in Los Angeles County over the five-year period 2012-2016. Using yearly pooled cross-sectional geocoded data from the University of Southern California (USC) Price Center for Social Innovation Neighborhood Data for Social Change (NDSC) interactive platform, I run multiple ordinary least squares regressions using different measures of crime to determine whether higher neighborhood crime rates influence the percentage of unhealthy 5th, 7th, and 9th grade public school students. I hypothesize that crime influences obesity, that violent crime has a stronger correlation than property crime, and that greater parks access reduces obesity. My regression results fail to support hypotheses one and two. Hypothesis three is supported by the available data.

    Overweight and obesity in adult patients with phenylketonuria: a systematic review.

    Excess weight is a rising concern in patients with phenylketonuria (PKU). It is commonly observed in children and adolescents with PKU, but data on adults are inconsistent. This review aims to summarize available data on excess weight in adult PKU individuals. We conducted a systematic search of literature in English, from inception to October 2021, on PubMed and Embase to identify articles on overweight and obesity in adult PKU patients. Prevalence of overweight and obesity, body mass index (BMI), and gender differences were the outcomes of interest. Of 260 articles identified, only 8 fulfilled quality criteria for inclusion after screening of titles, abstracts, and full texts. The mean BMI of adult PKU patients in these studies ranged from 26 ± 5.4 to 30.3 ± 1.8 kg/m². When compared to matched controls, adult PKU patients had higher BMI and higher prevalence of obesity. However, results were inconsistent when PKU adults were compared to the general population. The prevalence of obesity varied widely across the included studies, from 4.5% to 72%. Obesity was 2-3 times more frequent in female PKU patients. Excess weight is frequent in adult PKU patients, especially in females, even if the difference with the general population is debatable. The heterogeneity of the studies makes it difficult to interpret the results and the factors that contribute to obesity. Content of the diet, psychological status, diet-associated disordered eating, and the patient's social environment and lifestyle are listed as potential contributors to excess weight in the adult PKU population. Further studies are needed to better elucidate this question. In the meantime, weight control and healthy eating habits should be considered in the management and follow-up of these patients.

    The plastic genome of Bordetella pertussis


    PARALLEL INDEPENDENT COMPONENT ANALYSIS WITH REFERENCE FOR IMAGING GENETICS: A SEMI-BLIND MULTIVARIATE APPROACH

    Imaging genetics is an emerging field dedicated to the study of genetic underpinnings of brain structure and function. Over the last decade, brain imaging techniques such as magnetic resonance imaging (MRI) have been increasingly applied to measure morphometry, task-based function and connectivity in living brains. Meanwhile, high-throughput genotyping employing genome-wide techniques has made it feasible to sample the entire genome of a substantial number of individuals. While there is growing interest in image-wide and genome-wide approaches which allow unbiased searches over a large range of variants, one of the most challenging problems is the correction for the huge number of statistical tests used in univariate models. In contrast, a reference-guided multivariate approach shows a specific advantage for simultaneously assessing many variables for aggregate effects while leveraging prior information. It can improve the robustness of the results compared to a fully blind approach. In this dissertation we present a semi-blind multivariate approach, parallel independent component analysis with reference (pICA-R), to better reveal relationships between hidden factors of particular attributes. First, a consistency-based order estimation approach is introduced to advance the application of ICA to genotype data. The pICA-R approach is then presented, where independent components are extracted from two modalities in parallel and inter-modality associations are subsequently optimized for pairs of components. In particular, prior information is incorporated to elicit components of particular interest, which helps identify factors carrying small amounts of variance in large complex datasets. The pICA-R approach is further extended to accommodate multiple references whose interrelationships are unknown, allowing the investigation of functional influence on neurobiological traits of potentially related genetic variants implicated in biology.
    Applied to a schizophrenia study, pICA-R reveals that a complex genetic factor involving multiple pathways underlies schizophrenia-related gray matter deficits in prefrontal and temporal regions. The extended multi-reference approach, when employed to study alcohol dependence, delineates a complex genetic architecture, where the CREB-BDNF pathway plays a key role in the genetic factor underlying a proportion of variation in cue-elicited brain activations, which plays a role in phenotypic symptoms of alcohol dependence. In summary, our work makes several important contributions to advance the application of ICA to imaging genetics studies, which holds the promise to improve our understanding of genetics underlying brain structure and function in health and disease.

    A gene-rich linkage map in the dioecious species Actinidia chinensis (kiwifruit) reveals putative X/Y sex-determining chromosomes

    Background: The genus Actinidia (kiwifruit) consists of woody, scrambling vines, native to China, and only recently propagated as a commercial crop. All species described are dioecious, but the genetic mechanism for sex-determination is unknown, as is the genetic basis for many of the cluster of characteristics making up the unique fruit. It is, however, an important crop in the New Zealand economy, and a classical breeding program would benefit greatly by knowledge of the trait alleles carried by both female and male parents. The application of marker assisted selection (MAS) in seedling populations would also aid the accurate and efficient development of novel fruit types for the market. Results: Gene-rich female, male and consensus linkage maps of the diploid species A. chinensis have been constructed with 644 microsatellite markers. The maps consist of twenty-nine linkage groups corresponding to the haploid number n = 29. We found that sex-linked sequence characterized amplified region (SCAR) markers and the 'Flower-sex' phenotype consistently mapped to a single linkage group, in a subtelomeric region, in a section of inconsistent marker order. The region also contained markers of expressed genes, some of unknown function. Recombination, assessed by allelic distribution and marker order stability, was, in the remainder of the linkage group, in accordance with other linkage groups. Fully informative markers to other genes in this linkage group identified the comparative linkage group in the female map, where recombination ratios determining marker order were similar to the autosomes. Conclusion: We have created genetic linkage maps that define the 29 linkage groups of the haploid genome, and have revealed the position and extent of the sex-determining locus in A. chinensis. As all Actinidia species are dioecious, we suggest that the sex-determining loci of other Actinidia species will be similar to that region defined in our maps. As the extent of the non-recombining region is limited, our result supports the suggestion that the subtelomeric region of an autosome is in the early stages of developing the characteristics of a sex chromosome. The maps provide a reference of genetic information in Actinidia for use in genetic analysis and breeding programs.

    The haplotype-resolved chromosome pairs of a heterozygous diploid African cassava cultivar reveal novel pan-genome and allele-specific transcriptome features

    Background: Cassava (Manihot esculenta) is an important clonally propagated food crop in tropical and subtropical regions worldwide. Genetic gain by molecular breeding has been limited, partially because cassava is a highly heterozygous crop with a repetitive and difficult-to-assemble genome. Findings: Here we demonstrate that Pacific Biosciences high-fidelity (HiFi) sequencing reads, in combination with the assembler hifiasm, produced genome assemblies at near complete haplotype resolution with higher continuity and accuracy compared to conventional long sequencing reads. We present 2 chromosome-scale haploid genomes phased with Hi-C technology for the diploid African cassava variety TME204. With consensus accuracy >QV46, contig N50 >18 Mb, BUSCO completeness of 99%, and 35k phased gene loci, it is the most accurate, continuous, complete, and haplotype-resolved cassava genome assembly so far. Ab initio gene prediction with RNA-seq data and Iso-Seq transcripts identified abundant novel gene loci, with enriched functionality related to chromatin organization, meristem development, and cell responses. During tissue development, differentially expressed transcripts of different haplotype origins were enriched for different functionality. In each tissue, 20-30% of transcripts showed allele-specific expression (ASE) differences. ASE bias was often tissue specific and inconsistent across different tissues. Direction-shifting was observed in <2% of the ASE transcripts. Despite high gene synteny, the HiFi genome assembly revealed extensive chromosome rearrangements and abundant intra-genomic and inter-genomic divergent sequences, with large structural variations mostly related to LTR retrotransposons. We use the reference-quality assemblies to build a cassava pan-genome and demonstrate its importance in representing the genetic diversity of cassava for downstream reference-guided omics analysis and breeding. 
    Conclusions: The phased and annotated chromosome pairs allow a systematic view of the heterozygous diploid genome organization in cassava with improved accuracy, completeness, and haplotype resolution. They will be a valuable resource for cassava breeding and research. Our study may also provide insights into developing cost-effective and efficient strategies for resolving complex genomes with high resolution, accuracy, and continuity.