306 research outputs found

    Conflation of short identity-by-descent segments bias their inferred length distribution

    Identity-by-descent (IBD) is a fundamental concept in genetics with many applications. In a common definition, two haplotypes are said to contain an IBD segment if they share a segment that is inherited from a recent shared common ancestor without intervening recombination. Long IBD segments (> 1cM) can be efficiently detected by a number of algorithms using high-density SNP array data from a population sample. However, these approaches detect IBD based on contiguous segments of identity-by-state, and such segments may exist due to the conflation of smaller, nearby IBD segments. We quantified this effect using coalescent simulations, finding that nearly 40% of inferred segments 1-2cM long are the result of conflating two or more shorter segments, under demographic scenarios typical for modern humans. This biases the inferred IBD segment length distribution, and so can affect downstream inferences. We observed this conflation effect universally across different IBD detection programs and human demographic histories, and found inference of segments longer than 2cM to be much more reliable (less than 5% conflation rate). As an example of how this can negatively affect downstream analyses, we present and analyze a novel estimator of the de novo mutation rate using IBD segments, and demonstrate that the biased length distribution of the IBD segments due to conflation can lead to inflated estimates if the conflation is not modeled. Understanding the conflation effect in detail will make its correction in future methods more tractable.
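    The conflation effect described in this abstract can be illustrated with a toy sketch. It is not the authors' coalescent simulation or any real IBD detector: it assumes that a detector reports one segment whenever two true segments are separated by a gap no larger than a hypothetical tolerance `max_gap` (in cM), and then measures the share of reported segments in a length bin that were built from two or more true segments. The function names and the example coordinates are illustrative assumptions.

```python
from typing import List, Tuple

# A true IBD segment as (start_cM, end_cM) along one chromosome.
Segment = Tuple[float, float]
# A reported segment as (start_cM, end_cM, number_of_true_segments_merged).
Reported = Tuple[float, float, int]


def conflate(segments: List[Segment], max_gap: float) -> List[Reported]:
    """Merge true segments whose gap is <= max_gap (hypothetical detector tolerance)."""
    reported: List[Reported] = []
    for start, end in sorted(segments):
        if reported and start - reported[-1][1] <= max_gap:
            prev_start, prev_end, parts = reported[-1]
            # Extend the previous reported segment and count one more true segment.
            reported[-1] = (prev_start, max(prev_end, end), parts + 1)
        else:
            reported.append((start, end, 1))
    return reported


def conflation_rate(reported: List[Reported], lo: float, hi: float) -> float:
    """Fraction of reported segments with length in [lo, hi) made of >= 2 true segments."""
    in_bin = [r for r in reported if lo <= r[1] - r[0] < hi]
    if not in_bin:
        return 0.0
    return sum(1 for r in in_bin if r[2] >= 2) / len(in_bin)


# Illustrative input: two short neighbours, one genuine 1.5 cM segment, one short loner.
true_segments = [(0.0, 0.6), (0.65, 1.3), (5.0, 6.5), (10.0, 10.4)]
reported_segments = conflate(true_segments, max_gap=0.1)
rate_1_to_2_cM = conflation_rate(reported_segments, 1.0, 2.0)
```

    In this toy input, the two short neighbouring segments are reported as a single 1.3 cM segment, so half of the reported segments in the 1-2 cM bin are conflations. The numbers are made up for illustration and say nothing about the rates reported in the abstract.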

    KLFDAPC : a supervised machine learning approach for spatial genetic structure analysis

    CSC-University of St Andrews Joint Scholarship (to X.Q.); International Postdoctoral Exchange Fellowship Program (Talent-Introduction Program) from China Postdoc Council (to X.Q.); National Institute of General Medical Sciences (NIGMS) of the National Institutes of Health (grant R35GM142783 to C.W.K.C.). Part of the computation for this work is supported by USC’s Center for Advanced Research Computing (https://carc.usc.edu). Geographic patterns of human genetic variation provide important insights into human evolution and disease. A commonly used tool to detect and describe them is principal component analysis (PCA) or the supervised linear discriminant analysis of principal components (DAPC). However, genetic features produced by both approaches can fail to correctly characterize population structure in complex scenarios involving admixture. In this study, we introduce Kernel Local Fisher Discriminant Analysis of Principal Components (KLFDAPC), a supervised non-linear approach for inferring individual geographic genetic structure that could rectify the limitations of these approaches by preserving the multimodal space of samples. We tested the power of KLFDAPC to infer population structure and to predict individual geographic origin using neural networks. Simulation results showed that KLFDAPC has higher discriminatory power than PCA and DAPC. The application of our method to empirical European and East Asian genome-wide genetic datasets indicated that the first two reduced features of KLFDAPC correctly recapitulated the geography of individuals and significantly improved the accuracy of predicting individual geographic origin when compared to PCA and DAPC. Therefore, KLFDAPC can be useful for geographic ancestry inference, design of genome scans and correction for spatial stratification in GWAS that link genes to adaptation or disease susceptibility. Publisher PDF. Peer reviewed.

    What Would It Take to Overcome the Damaging Effects of Structural Racism and Ensure a More Equitable Future?

    This report calls on civic leaders, advocates, elected officials, and philanthropists to address the legacy of structural racism in the United States and advance racial equity by taking steps to close four large equity gaps between people of color and white people. Based on interviews and discussions with experts, advocates, practitioners, and policy makers in the fields of wealth building, public education, employment, and justice policy, the report outlines solutions for each of the four interrelated disparities (in wealth, education, employment and earnings, and policing practices), arguing that greater equity in one area could lead to gains in others.

    Unifying Parsimonious Tree Reconciliation

    Evolution is a process that is influenced by various environmental factors, e.g. the interactions between different species, genes, and biogeographical properties. Hence, it is interesting to study the combined evolutionary history of multiple species, their genes, and the environment they live in. A common approach to address this research problem is to describe each individual evolution as a phylogenetic tree and construct a tree reconciliation which is parsimonious with respect to a given event model. Unfortunately, most of the previous approaches are designed for only one setting: host-parasite systems, gene tree/species tree reconciliation, or biogeography. Hence, a method is desirable that addresses the general problem of mapping phylogenetic trees and covers all varieties of coevolving systems, including, e.g., predator-prey and symbiotic relationships. To close this gap, we introduce a generalized cophylogenetic event model considering the combinatorially complete set of local coevolutionary events. We give a dynamic programming based heuristic for solving the maximum parsimony reconciliation problem in time O(n^2), for two phylogenies each with at most n leaves. Furthermore, we present an exact branch-and-bound algorithm which uses the results from the dynamic programming heuristic for discarding partial reconciliations. The approach has been implemented as a Java application which is freely available from http://pacosy.informatik.uni-leipzig.de/coresym. Comment: Peer-reviewed and presented as part of the 13th Workshop on Algorithms in Bioinformatics (WABI2013)

    HIV-1 gp120 N-linked glycosylation differs between plasma and leukocyte compartments

    Background: N-linked glycosylation is a major mechanism for minimizing virus neutralizing antibody response and is present on the Human Immunodeficiency Virus (HIV) envelope glycoprotein. Although it is known that glycosylation changes can dramatically influence virus recognition by the host antibody, the actual contribution of compartmental differences in N-linked glycosylation patterns remains unclear. Methodology and Principal Findings: We amplified the env gp120 C2-V5 region and analyzed 305 clones derived from plasma and other compartments from 15 HIV-1 patients. Bioinformatics and Bayesian network analyses were used to examine N-linked glycosylation differences between compartments. We found evidence for cell-specific single amino acid changes particular to monocytes, and significant variation was found in the total number of N-linked glycosylation sites between patients. Further, significant differences in the number of glycosylation sites were observed between plasma and cellular compartments. Bayesian network analyses showed an interdependency between N-linked glycosylation sites found in our study, which may have immense functional relevance. Conclusion: Our analyses have identified single cell/compartment-specific amino acid changes and differences in N-linked glycosylation patterns between plasma and diverse blood leukocytes. Bayesian network analyses showed associations inferring alternative glycosylation pathways. We believe that these studies will provide crucial insights into the host immune response and its ability in controlling HIV replication in vivo. These findings could also have relevance in shielding and evasion of HIV-1 from neutralizing antibodies.

    The role of caretakers in disease dynamics

    One of the key challenges in modeling the dynamics of contagion phenomena is to understand how the structure of social interactions shapes the time course of a disease. Complex network theory has provided significant advances in this context. However, awareness of an epidemic in a population typically yields behavioral changes that correspond to changes in the network structure on which the disease evolves. This feedback mechanism has not been investigated in depth. For example, one would intuitively expect susceptible individuals to avoid infected individuals. However, doctors treating patients or parents tending sick children may instead increase the amount of contact they make with infected individuals, in an effort to speed up recovery, while exposing themselves to a higher risk of infection. We study the role of these caretaker links in an adaptive network model where individuals react to a disease by increasing or decreasing the amount of contact they make with infected individuals. We find that pure avoidance, with only a few caretaker links, is the best strategy for curtailing an SIS disease in networks that possess a large topological variability. In more homogeneous networks, disease prevalence is decreased for low concentrations of caretakers, whereas a high prevalence emerges if caretaker concentration passes a well-defined critical value. Comment: 8 pages, 9 figures