664 research outputs found

Reconstructing pedigrees: some identifiability questions for a recombination-mutation model

Author: BD McKay
BD Thatte
Bhalchandra D. Thatte
C Semple
H Whitney
J Pearl
JBS Haldane
JT Chang
K Lange
KA Zareckiĭ
L Lovász
M Steel
O Bininda-Emonds
SM Ulam
T Petrie
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 05/09/2011

Pedigrees are directed acyclic graphs that represent ancestral relationships between individuals in a population. Based on a schematic recombination process, we describe two simple Markov models for sequences evolving on pedigrees - Model R (recombinations without mutations) and Model RM (recombinations with mutations). For these models, we ask an identifiability question: is it possible to construct a pedigree from the joint probability distribution of extant sequences? We present partial identifiability results for general pedigrees: we show that when the crossover probabilities are sufficiently small, certain spanning subgraph sequences can be counted from the joint distribution of extant sequences. We demonstrate how pedigrees that earlier seemed difficult to distinguish are distinguished by counting their spanning subgraph sequences. Comment: 40 pages, 9 figures
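The abstract's opening definition (a pedigree is a directed acyclic graph of ancestral relationships) can be illustrated with a minimal Python sketch. The function name `is_valid_pedigree`, the dictionary layout, and the rule that every non-founder has exactly two parents are assumptions made for illustration; they are not taken from the paper.

```python
from collections import deque


def is_valid_pedigree(parents):
    """Check that a parent map describes a directed acyclic graph.

    `parents` maps each individual to a tuple of its parents
    (an empty tuple for founders). Assumption for this sketch:
    every non-founder has exactly two parents.
    """
    individuals = set(parents)
    for ps in parents.values():
        individuals.update(ps)

    # Edges point from parent to child.
    children = {v: [] for v in individuals}
    indegree = {v: 0 for v in individuals}
    for child, ps in parents.items():
        if len(ps) not in (0, 2):
            return False
        for p in ps:
            children[p].append(child)
            indegree[child] += 1

    # Kahn's algorithm: the graph is acyclic exactly when every
    # vertex can be removed in topological order.
    queue = deque(v for v in individuals if indegree[v] == 0)
    visited = 0
    while queue:
        v = queue.popleft()
        visited += 1
        for c in children[v]:
            indegree[c] -= 1
            if indegree[c] == 0:
                queue.append(c)
    return visited == len(individuals)
```

Individuals that appear only as parents are treated as founders, so a call such as `is_valid_pedigree({"c": ("a", "b")})` accepts a single child with two founder parents.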

arXiv.org e-Print Archive

Crossref

The era of the ARG: an empiricist's guide to ancestral recombination graphs

Author: Bradburd Gideon S.
Grundler Michael C.
Lewanski Alexander L.
Publication date: 18/10/2023

In the presence of recombination, the evolutionary relationships between a set of sampled genomes cannot be described by a single genealogical tree. Instead, the genomes are related by a complex, interwoven collection of genealogies formalized in a structure called an ancestral recombination graph (ARG). An ARG extensively encodes the ancestry of the genome(s) and thus is replete with valuable information for addressing diverse questions in evolutionary biology. Despite its potential utility, technological and methodological limitations, along with a lack of approachable literature, have severely restricted awareness and application of ARGs in empirical evolution research. Excitingly, recent progress in ARG reconstruction and simulation has made ARG-based approaches feasible for many questions and systems. In this review, we provide an accessible introduction and exploration of ARGs, survey recent methodological breakthroughs, and describe the potential for ARGs to further existing goals and open avenues of inquiry that were previously inaccessible in evolutionary genomics. Through this discussion, we aim to more widely disseminate the promise of ARGs in evolutionary genomics and encourage the broader development and adoption of ARG-based inference. Comment: 34 pages, 3 figures, 3 tables

arXiv.org e-Print Archive

Graphics for relatedness research

Author: Abecasis
Aitchison
Andoh
Berg
Blouin
Boehnke
Bérénos
Cavalli-Sforza
Cotterman
Croft
Egozcue
Epstein
Foulkes
García-Magariños
Gonder
Hansen
Loughnan
Milligan
Moltke
Nembot-Simo
Oliehoek
Pawlowsky-Glahn
Pemberton
R Core Team
Rosenberg
Snyder-Mackler
Spencer
Stanley
Thompson
Wagner
Weir
Wickham
Publication venue: 'Wiley'
Publication date: 01/01/2017

Studies of relatedness have been crucial in molecular ecology over the last decades. Good evidence of this is the fact that studies of population structure, evolution of social behaviours, genetic diversity and quantitative genetics all involve relatedness research. The main aim of this article is to review the most common graphical methods used in allele sharing studies for detecting and identifying family relationships. Both IBS and IBD based allele sharing studies are considered. Furthermore, we propose two additional graphical methods from the field of compositional data analysis: the ternary diagram and scatterplots of isometric log-ratios of IBS and IBD probabilities. We illustrate all graphical tools with genetic data from the HGDP-CEPH diversity panel, using mainly 377 microsatellites genotyped for 25 individuals from the Maya population of this panel. We enhance all graphics with convex hulls obtained by simulation and use these to confirm the documented relationships. The proposed compositional graphics are shown to be useful in relatedness research, as they also single out the most prominent related pairs. The ternary diagram is advocated for its ability to display all three allele sharing probabilities simultaneously. The log-ratio plots are advocated as an attempt to overcome the problems with the Euclidean distance interpretation in the classical graphics. Peer Reviewed. Postprint (published version)
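The isometric log-ratio idea mentioned in the abstract can be sketched for a three-part composition such as a triple of allele sharing probabilities. The function name `ilr3` and the particular orthonormal basis used here are assumptions; the article may use a different basis or a different ordering of the parts, and it may treat zero probabilities differently.

```python
import math


def ilr3(p0, p1, p2):
    """Isometric log-ratio coordinates of a 3-part composition.

    Uses one common orthonormal basis (an assumption of this sketch):
        z1 = ln(p0 / p1) / sqrt(2)
        z2 = ln(p0 * p1 / p2**2) / sqrt(6)
    All parts must be strictly positive, because the logarithm of a
    zero part is undefined.
    """
    if min(p0, p1, p2) <= 0:
        raise ValueError("all parts must be strictly positive")
    z1 = math.log(p0 / p1) / math.sqrt(2)
    z2 = math.log((p0 * p1) / (p2 * p2)) / math.sqrt(6)
    return z1, z2
```

Because only ratios of parts enter the formulas, the coordinates do not change when the composition is rescaled, so the input does not need to sum to one. Pairs with a sharing probability of exactly zero cannot be transformed without some adjustment, which this sketch does not attempt.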

Consortium of Academic Libraries of Catalonia (CBUC)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

UPCommons. Portal del coneixement obert de la UPC

DUGiDocs – Universitat de Girona

Diposit Digital de Documents de la UAB

Single nucleotide polymorphism-based dispersal estimates using noninvasive sampling

Author: Lynch M.
R Development Core
Rousset F.
Publication venue: 'Wiley'
Publication date: 01/01/2015

Quantifying dispersal within wild populations is an important but challenging task. Here we present a method to estimate contemporary, individual-based dispersal distance from noninvasively collected samples using a specialized panel of 96 SNPs (single nucleotide polymorphisms). One main issue in conducting dispersal studies is the requirement for a high sampling resolution at a geographic scale appropriate for capturing the majority of dispersal events. In this study, fecal samples of brown bear (Ursus arctos) were collected by volunteer citizens, resulting in a high sampling resolution spanning over 45,000 km² in Gävleborg and Dalarna counties in Sweden. SNP genotypes were obtained for unique individuals sampled (n = 433) and subsequently used to reconstruct pedigrees. A Mantel test for isolation by distance suggests that the sampling scale was appropriate for females but not for males, which are known to disperse long distances. Euclidean distance was estimated between mother and offspring pairs identified through the reconstructed pedigrees. The mean dispersal distance was 12.9 km (SE 3.2) and 33.8 km (SE 6.8) for females and males, respectively. These results were significantly different (Wilcoxon's rank-sum test: P-value = 0.02) and are in agreement with the previously identified pattern of male-biased dispersal. Our results illustrate the potential of using a combination of noninvasively collected samples at high resolution and specialized SNPs for pedigree-based dispersal models.
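The distance summary described in the abstract (Euclidean distance between mother and offspring locations, reported as a mean with a standard error) can be sketched as follows. The function name `dispersal_summary` and the input layout are hypothetical, and coordinates are assumed to be planar and already expressed in kilometres; the study's own projection and estimation details are not given in the abstract.

```python
import math
import statistics


def dispersal_summary(pairs):
    """Return (mean, standard error) of mother-offspring distances.

    `pairs` is a list of ((x_mother, y_mother), (x_offspring, y_offspring))
    tuples in planar coordinates (assumed to be kilometres).
    """
    distances = [math.dist(mother, offspring) for mother, offspring in pairs]
    mean = statistics.mean(distances)
    if len(distances) > 1:
        # Standard error of the mean from the sample standard deviation.
        se = statistics.stdev(distances) / math.sqrt(len(distances))
    else:
        se = float("nan")
    return mean, se
```

Running the function separately on female and male offspring would give the two summaries that a rank-sum test could then compare; that test is left out of this sketch.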

Epsilon Open Archive

Crossref

PubMed Central

A general and efficient representation of ancestral recombination graphs

Author: Gorjanc Gregor
Ignatieva Anastasia
Kelleher Jerome
Koskela Jere
Wohns Anthony W
Wong Yan
Publication date: 03/11/2023

Edinburgh Research Explorer

A genetic algorithm based method for stringent haplotyping of family data

Author: C Lamina
D Levine
D Qian
E Sobel
Francois Besnier
GR Abecasis
J Akey
J Hernández-Sánchez
JBS Haldane
JJ Windig
L Crooks
L Excoffier
L Grapes
M Lynch
M Stephens
P Tapadar
PJ Boettcher
T Becker
THE Meuwissen
Örjan Carlborg
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: The linkage phase, or haplotype, is an extra level of information that in addition to genotype and pedigree can be useful for reconstructing the inheritance pattern of the alleles in a pedigree, and computing for example Identity By Descent (IBD) probabilities. If a haplotype is provided, the precision of estimated IBD probabilities increases, as long as the haplotype is estimated without errors. It is therefore important to only use haplotypes that are strongly supported by the available data for IBD estimation, to avoid introducing new errors due to erroneous linkage phases. Results: We propose a genetic algorithm based method for haplotype estimation in family data that includes a stringency parameter. This allows the user to decide the error tolerance level when inferring parental origin of the alleles. This is a novel feature compared to existing methods for haplotype estimation. We show that using a high stringency produces haplotype data with few errors, whereas a low stringency provides haplotype estimates in most situations, but with an increased number of errors. Conclusion: By including a stringency criterion in our haplotyping method, the user is able to maintain the error rate at a suitable level for the particular study; one can select anything from haplotyped data with a very small proportion of errors and a higher proportion of non-inferred haplotypes, to data with phase estimates for every marker, when haplotype errors are tolerable. Giving this choice makes the method more flexible and useful in a wide range of applications as it is able to fulfil different requirements regarding the tolerance for haplotype errors, or uncertain marker-phases.

Crossref

Directory of Open Access Journals

Publikationer från Uppsala Universitet

PubMed Central

Digitala Vetenskapliga Arkivet - Academic Archive On-line