10,228 research outputs found

    Fast and scalable inference of multi-sample cancer lineages.

    Somatic variants can be used as lineage markers for the phylogenetic reconstruction of cancer evolution. Since somatic phylogenetics is complicated by sample heterogeneity, novel specialized tree-building methods are required for cancer phylogeny reconstruction. We present LICHeE (Lineage Inference for Cancer Heterogeneity and Evolution), a novel method that automates the phylogenetic inference of cancer progression from multiple somatic samples. LICHeE uses variant allele frequencies of somatic single nucleotide variants obtained by deep sequencing to reconstruct multi-sample cell lineage trees and infer the subclonal composition of the samples. LICHeE is open source and available at http://viq854.github.io/lichee

    CHAPTER 8: DWIGHT READ: TOWARDS A NEW PARADIGM: FOLLOWED BY A DISCUSSION BETWEEN THE AUTHOR AND DWIGHT READ

    Here I report on Dwight Read’s theory for a paradigm change in kinship anthropology, in which kinship terminologies are interpreted as symbolic computational systems based on kin-term products. I also report on how Read argues that two different cultural conceptualizations of sibling, either as resulting by descent from a parent or as defined by shared parentage, lead to building up a descriptive or a classificatory terminology, respectively. Exemplified on the masculine side, these conceptions are rendered by the kin-term products S o F = B [son of father = brother] and F o B = F [father of brother = father]. The chapter also deals with how Read accounts for the relationship between genealogical tracing and the working out of kin terms using kin-term products, and with how the logic of kin-term products is consistent with the extension of kin terms to kin-type categories beyond the primary ones. The paper also reports on a discussion between Dwight Read and the author, initiated by questions and observations from the latter, regarding different aspects of Read’s reasoning.
    Topics covered, though not exhaustively, include: the way kin relationships are concretely worked out using kin-term products; the model of the family space and the nuclear family; group marriage; how the conceptualization of sibling in terms of shared parentage, expressed through the kin-term product F o B = F [father o brother = father], relates to ethnographic data; the nature of the logic of kinship terminologies; the status of the structural equation S o F = B [son o father = brother] when used within the context of a classificatory terminology; the axiomatic nature of a number of kin-term products pertaining to specific kin terminologies; and the equations pertaining to classificatory kinship terminologies that are likely to algebraically reduce chains of kin-term products mapped from corresponding kin-type strings. For example, “son of son of father of father of father” (S o S o F o F o F), mapped from the collateral genealogical relation father’s father’s father’s son’s son (fffss or fffbss), reduces to an irreducible kin term, here father, which is the one native speakers use for that genealogical connection. The discussion also addresses, taking the example of ancient Chinese dialects, the question of what the structural prerequisites should be for a transition from classificatory (Dravidian) terminologies to bifurcate collateral and descriptive terminologies, a transition that is often posited by a number of linguists and anthropologists. Finally, the discussion deals with the question of whether the kinship terminologies of the world all ultimately derive from a pre-dispersal African Proto-Sapiens kinship terminology. Throughout these lines of discussion, the central question raised is why different cultural choices about how siblings are conceptualized were made, leading to different human kinship terminologies and social structures.
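
The algebraic reduction mentioned in this abstract can be illustrated with a small rewriting sketch. This is not Read's formalism, only a toy model under stated assumptions: the rules S o F = B and F o B = F are taken from the abstract, while B o F = F (brother of father = father) is an assumed classificatory equation added here so that the example chain reduces. The names `RULES` and `reduce_product` are hypothetical.

```python
# Toy rewrite rules on adjacent kin terms, read as "X o Y = Z".
RULES = {
    ("S", "F"): "B",  # son of father = brother (stated in the abstract)
    ("F", "B"): "F",  # father of brother = father (stated in the abstract)
    ("B", "F"): "F",  # ASSUMPTION: brother of father = father (not in the abstract)
}


def reduce_product(terms, rules=RULES):
    """Repeatedly rewrite the leftmost matching adjacent pair until none matches."""
    terms = list(terms)
    changed = True
    while changed:
        changed = False
        for i in range(len(terms) - 1):
            pair = (terms[i], terms[i + 1])
            if pair in rules:
                # Replace the two terms with their product; the chain gets shorter,
                # so the loop always terminates.
                terms[i:i + 2] = [rules[pair]]
                changed = True
                break
    return terms


# "son of son of father of father of father"
print(reduce_product(["S", "S", "F", "F", "F"]))
```

Under these three rules the chain S o S o F o F o F reduces step by step to the single term F, matching the abstract's statement that the chain maps to "father". With a different rule set the result would differ, so the output says nothing about any particular terminology.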

    On Approximating Four Covering and Packing Problems

    In this paper, we consider approximability issues of the following four problems: triangle packing, full sibling reconstruction, maximum profit coverage and 2-coverage. All of them are generalized or specialized versions of set-cover and have applications in biology ranging from full-sibling reconstructions in wild populations to biomolecular clusterings; however, as this paper shows, their approximability properties differ considerably. Our inapproximability constant for the triangle packing problem improves upon the previous results; this is done by directly transforming the inapproximability gap of Haastad for the problem of maximizing the number of satisfied equations for a set of equations over GF(2) and is interesting in its own right. Our approximability results on the full sibling reconstruction problems answer questions originally posed by Berger-Wolf et al., and our results on the maximum profit coverage problem provide almost matching upper and lower bounds on the approximation ratio, answering a question posed by Hassin and Or. Comment: 25 pages

    Learning loopy graphical models with latent variables: Efficient methods and guarantees

    The problem of structure estimation in graphical models with latent variables is considered. We characterize conditions for tractable graph estimation and develop efficient methods with provable guarantees. We consider models where the underlying Markov graph is locally tree-like, and the model is in the regime of correlation decay. For the special case of the Ising model, the number of samples $n$ required for structural consistency of our method scales as $n = \Omega(\theta_{\min}^{-\delta\eta(\eta+1)-2}\log p)$, where $p$ is the number of variables, $\theta_{\min}$ is the minimum edge potential, $\delta$ is the depth (i.e., distance from a hidden node to the nearest observed nodes), and $\eta$ is a parameter which depends on the bounds on node and edge potentials in the Ising model. Necessary conditions for structural consistency under any algorithm are derived and our method nearly matches the lower bound on sample requirements. Further, the proposed method is practical to implement and provides flexibility to control the number of latent variables and the cycle lengths in the output graph. Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/12-AOS1070
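
To get a feel for how the stated sample-size scaling behaves, the expression inside the $\Omega(\cdot)$ can be evaluated directly. The sketch below computes only $\theta_{\min}^{-\delta\eta(\eta+1)-2}\log p$; the constant hidden by the $\Omega$ notation is not given in the abstract, so the value is a relative scale, not a sample count. The function name `sample_scaling` and the example parameter values are illustrative assumptions.

```python
import math


def sample_scaling(theta_min, delta, eta, p):
    """Evaluate theta_min^(-delta*eta*(eta+1) - 2) * log(p).

    This is the term inside Omega(.) from the abstract, up to an
    unknown constant factor, so only ratios between values are meaningful.
    """
    if not 0 < theta_min:
        raise ValueError("theta_min must be positive")
    if p <= 1:
        raise ValueError("p must exceed 1 so that log(p) is positive")
    exponent = -delta * eta * (eta + 1) - 2
    return theta_min ** exponent * math.log(p)


# Illustrative values only (assumed, not from the paper).
shallow = sample_scaling(theta_min=0.5, delta=1, eta=1, p=100)
deeper = sample_scaling(theta_min=0.5, delta=2, eta=1, p=100)
print(shallow, deeper, deeper / shallow)
```

With these assumed values, increasing the depth $\delta$ from 1 to 2 changes the exponent from -4 to -6, so the scale grows by a factor of $0.5^{-2} = 4$, which shows how quickly the requirement grows when hidden nodes sit farther from observed ones.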

    Genetic analysis of a major international collection of cultivated apple varieties reveals previously unknown historic heteroploid and inbred relationships

    Domesticated apple (Malus x domestica Borkh.) is a major global crop and the genetic diversity held within the pool of cultivated varieties is important for the development of future cultivars. The aim of this study was to investigate the diversity held within the domesticated form, through the analysis of a major international germplasm collection of cultivated varieties, the UK National Fruit Collection, consisting of over 2,000 selections of named cultivars and seedling varieties. We utilised Diversity Array Technology (DArT) markers to assess the genetic diversity within the collection. Clustering attempts using the software STRUCTURE revealed that the accessions formed a complex and historically admixed group for which clear clustering was challenging. Comparison of accessions using the Jaccard similarity coefficient allowed us to identify clonal and duplicate material, and also revealed pairs and groups that appeared more closely related than a standard parent-offspring or full-sibling relationship. From further investigation, we were able to propose a number of new pedigrees, which revealed that some historically important cultivars were more closely related than previously documented and that some of them were partially inbred. We were also able to elucidate a number of parent-offspring relationships that had resulted in a number of important polyploid cultivars. This included reuniting polyploid cultivars, which in some cases dated as far back as the 18th century, with diploid parents that potentially date back as far as the 13th century.
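
For readers unfamiliar with the Jaccard similarity coefficient mentioned in this abstract, a minimal sketch follows. It assumes each accession is scored as a presence/absence (1/0) profile over the same ordered set of markers; the marker profiles below are invented for illustration and are not data from the study, and the function name `jaccard` is hypothetical.

```python
def jaccard(profile_a, profile_b):
    """Jaccard similarity of two equal-length presence/absence marker profiles.

    Computed as (markers present in both) / (markers present in at least one).
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same markers")
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    # Two profiles with no markers present at all are treated as identical.
    return both / either if either else 1.0


# Invented example profiles (not from the paper).
accession_1 = [1, 1, 0, 1, 0, 1]
accession_2 = [1, 1, 0, 1, 0, 1]  # same profile: candidate clone or duplicate
accession_3 = [1, 0, 0, 1, 1, 1]

print(jaccard(accession_1, accession_2))
print(jaccard(accession_1, accession_3))
```

Identical profiles give a coefficient of 1.0, which is how clonal or duplicate material would stand out; what value separates, say, parent-offspring pairs from more distant relatives depends on the marker set and is not stated in the abstract.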

    Non-invasive genetic monitoring involving citizen science enables reconstruction of current pack dynamics in a re-establishing wolf population

    Background: Carnivores are re-establishing in many human-populated areas, where their presence is often contentious. Reaching consensus on management decisions is often hampered by a dispute over the size of the local carnivore population. Understanding the reproductive dynamics and individual movements of the carnivores can provide support for management decisions, but individual-level information can be difficult to obtain from elusive, wide-ranging species. Non-invasive genetic sampling can yield such information, but makes subsequent reconstruction of population history challenging due to incomplete population coverage and error-prone data. Here, we combine a collaborative, volunteer-based sampling scheme with Bayesian pedigree reconstruction to describe the pack dynamics of an establishing grey wolf (Canis lupus) population in south-west Finland, where wolf breeding was recorded in 2006 for the first time in over a century. Results: Using DNA extracted mainly from faeces collected since 2008, we identified 81 individual wolves and assigned credible full parentages to 70 of these and partial parentages to a further 9, revealing 7 breeding pairs. Individuals used a range of strategies to obtain breeding opportunities, including dispersal to established or new packs, long-distance migration and inheriting breeding roles. Gene flow occurred between all packs but inbreeding events were rare. Conclusions: These findings demonstrate that characterizing ongoing pack dynamics can provide detailed, locally-relevant insight into the ecology of contentious species such as the wolf. Involving various stakeholders in data collection makes these results more likely to be accepted as unbiased and hence reliable grounds for management decisions. Peer reviewed