99 research outputs found
Localizing Ashkenazic Jews to Primeval Villages in the Ancient Iranian Lands of Ashkenaz
The Yiddish language is over 1,000 years old and incorporates German, Slavic, and Hebrew elements. The prevalent view claims Yiddish
has a German origin, whereas the opposing view posits a Slavic origin with strong Iranian and weak Turkic substrata. One of the major
difficulties in deciding between these hypotheses is the unknown geographical origin of Yiddish speaking Ashkenazic Jews (AJs). An
analysis of 393 Ashkenazic, Iranian, and mountain Jews and over 600 non-Jewish genomes demonstrated that Greeks, Romans,
Iranians, and Turks exhibit the highest genetic similarity with AJs. The Geographic Population Structure analysis localized most AJs along
major primeval trade routes in northeastern Turkey adjacent to primeval villages with names that may be derived from "Ashkenaz."
Iranian and mountain Jews were localized along trade routes on Turkey's eastern border. Loss of maternal haplogroups was evident
in non-Yiddish speaking AJs. Our results suggest that AJs originated from a Slavo-Iranian confederation, which the Jews call
"Ashkenazic" (i.e., "Scythian"), though these Jews probably spoke Persian and/or Ossete. This is compatible with linguistic evidence
suggesting that Yiddish is a Slavic language created by Irano-Turko-Slavic Jewish merchants along the Silk Roads as a cryptic trade
language, spoken only by its originators to gain an advantage in trade. Later, in the 9th century, Yiddish underwent relexification by
adopting a new vocabulary that consists of a minority of German and Hebrew and a majority of newly coined Germanoid and Hebroid
elements that replaced most of the original Eastern Slavic and Sorbian vocabularies, while keeping the original grammars intact.
HapZipper: sharing HapMap populations just got easier
The rapidly growing amount of genomic sequence data being generated and made publicly available necessitates the development of new data storage and archiving methods. The vast amount of data being shared and manipulated also creates new challenges for network resources. Thus, developing advanced data compression techniques is becoming an integral part of data production and analysis. The HapMap project is one of the largest public resources of human single-nucleotide polymorphisms (SNPs), characterizing over 3 million SNPs genotyped in over 1000 individuals. The standard format and biological properties of HapMap data suggest that a dedicated genetic compression method can outperform generic compression tools. We propose a compression methodology for genetic data by introducing HapZipper, a lossless compression tool tailored to compress HapMap data beyond benchmarks defined by generic tools such as gzip, bzip2 and lzma. We demonstrate the usefulness of HapZipper by compressing HapMap 3 populations to <5% of their original sizes. HapZipper is freely downloadable from https://bitbucket.org/pchanda/hapzipper/downloads/HapZipper.tar.bz
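The generic baseline that the abstract refers to (gzip, bzip2 and lzma) can be approximated with Python's standard library. The sketch below is not HapZipper and does not read real HapMap files; the function names and the synthetic genotype generator are assumptions made for illustration only. It shows how one might measure the compressed-to-original size ratio that a dedicated tool would need to beat.

```python
import bz2
import gzip
import lzma
import random


def compression_ratios(data: bytes) -> dict:
    """Return compressed size / original size for three generic compressors."""
    compressors = {
        "gzip": gzip.compress,
        "bzip2": bz2.compress,
        "lzma": lzma.compress,
    }
    return {name: len(fn(data)) / len(data) for name, fn in compressors.items()}


def synthetic_genotypes(n_snps: int = 2000, n_samples: int = 100, seed: int = 0) -> bytes:
    """Build a toy genotype table (hypothetical layout, not the HapMap format).

    Each line is one SNP; each sample gets a two-letter genotype AA, AG or GG
    drawn from a per-SNP minor-allele frequency.
    """
    rng = random.Random(seed)
    lines = []
    for _ in range(n_snps):
        maf = rng.uniform(0.0, 0.5)  # minor-allele frequency for this SNP
        calls = []
        for _ in range(n_samples):
            a = "G" if rng.random() < maf else "A"
            b = "G" if rng.random() < maf else "A"
            calls.append("".join(sorted((a, b))))
        lines.append(" ".join(calls))
    return "\n".join(lines).encode("ascii")


if __name__ == "__main__":
    for name, ratio in compression_ratios(synthetic_genotypes()).items():
        print(f"{name}: {ratio:.1%} of original size")
```

The ratios obtained on synthetic data say nothing about real HapMap populations; the point is only the measurement procedure.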
Toward high-resolution population genomics using archaeological samples
The term "ancient DNA" (aDNA) is coming of age, with over 1,200 hits in the PubMed database,
beginning in the early 1980s with the studies of "molecular paleontology". Rooted in cloning
and limited sequencing of DNA from ancient remains during the pre-PCR era, the field has
made incredible progress since the introduction of PCR and next-generation sequencing. Over
the last decade, aDNA analysis ushered in a new era in genomics and became the method of
choice for reconstructing the history of organisms, their biogeography, and migration routes,
with applications in evolutionary biology, population genetics, archaeogenetics, paleoepidemiology,
and many other areas. This change was brought about by the development of new strategies
for coping with the challenges in studying aDNA due to damage and fragmentation, scarce
samples, significant historical gaps, and limited applicability of population genetics methods. In this review, we describe the state-of-the-art achievements in aDNA studies, with particular focus
on human evolution and demographic history. We present the current experimental and theoretical
procedures for handling and analysing highly degraded aDNA. We also review the challenges
in the rapidly growing field of ancient epigenomics. Advancement of aDNA tools and
methods signifies a new era in population genetics and evolutionary medicine research.
Cross-Species Analysis of Genic GC(3) Content and DNA Methylation Patterns
The GC content in the third codon position (GC3) exhibits a unimodal distribution in many plant and animal genomes. Interestingly, grasses and homeotherm vertebrates exhibit a unique bimodal distribution. High GC3 was previously found to be associated with variable expression, higher frequency of upstream TATA boxes, and an increase of GC3 from 5′ to 3′. Moreover, GC3-rich genes are predominant in certain gene classes and are enriched in CpG dinucleotides that are potential targets for methylation. Based on the GC3 bimodal distribution we hypothesize that GC3 has a regulatory role involving methylation and gene expression. To test that hypothesis, we selected diverse taxa (rice, thale cress, bee, and human) that varied in the modality of their GC3 distribution and tested the association between GC3, DNA methylation, and gene expression. We examine the relationship between cytosine methylation levels and GC3, gene expression, genome signature, gene length, and other gene compositional features. We find a strong negative correlation (Pearson's correlation coefficient r = −0.67, P value < 0.0001) between GC3 and genic CpG methylation. The comparison between 5′-3′ gradients of CG3-skew and genic methylation for the taxa in the study suggests interplay between gene-body methylation and transcription-coupled cytosine deamination effect. Compositional features are correlated with methylation levels of genes in rice, thale cress, human, bee, and fruit fly (which acts as an unmethylated control). These patterns allow us to generate evolutionary hypotheses about the relationships between GC3 and methylation and how these affect expression patterns. Specifically, we propose that the opposite effects of methylation and compositional gradients along coding regions of GC3-poor and GC3-rich genes are the products of several competing processes.
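The two quantities at the centre of this abstract, GC3 and Pearson's correlation coefficient, are straightforward to compute. The following is a minimal sketch, assuming coding sequences are given in frame as plain strings; the function names and the toy inputs in the demo are hypothetical and are not the study's data or pipeline.

```python
from math import sqrt


def gc3(cds: str) -> float:
    """Fraction of codons whose third position is G or C.

    Assumes the sequence starts in frame; trailing bases that do not
    form a complete codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    third_positions = cds[2:usable:3]
    if not third_positions:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in third_positions) / len(third_positions)


def pearson_r(xs, ys) -> float:
    """Pearson's correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up sequences and methylation levels, for illustration only.
    toy_cds = ["ATGGCCGCG", "ATGAAATTT", "ATGGCATTA", "ATGCCCGGG"]
    toy_methylation = [0.10, 0.80, 0.70, 0.05]
    gc3_values = [gc3(s) for s in toy_cds]
    print(gc3_values, pearson_r(gc3_values, toy_methylation))
```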
Identifying compositionally homogeneous and nonhomogeneous domains within the human genome using a novel segmentation algorithm
It has been suggested that the mammalian genome is composed mainly of long compositionally homogeneous domains. Such domains are frequently identified using recursive segmentation algorithms based on the Jensen–Shannon divergence. However, a common difficulty with such methods is deciding when to halt the recursive partitioning and what criteria to use in deciding whether a detected boundary between two segments is real or not. We demonstrate that commonly used halting criteria are intrinsically biased, and propose IsoPlotter, a parameter-free segmentation algorithm that overcomes such biases by using a simple dynamic halting criterion and tests the homogeneity of the inferred domains. IsoPlotter was compared with an alternative segmentation algorithm, DJS, using two sets of simulated genomic sequences. Our results show that IsoPlotter was able to infer both long and short compositionally homogeneous domains with low GC content dispersion, whereas DJS failed to identify short compositionally homogeneous domains and sequences with low compositional dispersion. By segmenting the human genome with IsoPlotter, we found that one-third of the genome is composed of compositionally nonhomogeneous domains and the remainder is a mixture of many short compositionally homogeneous domains and relatively few long ones.
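The core step shared by recursive segmentation methods of this kind is finding the split point that maximizes the Jensen–Shannon divergence between the two resulting segments. The sketch below illustrates that single step only, under the simplifying assumption that each base is reduced to a binary GC/AT symbol. It is not IsoPlotter and it deliberately includes no halting criterion, since the abstract's point is that the choice of criterion is the hard part; the function names are hypothetical.

```python
from math import log2


def binary_entropy(p: float) -> float:
    """Shannon entropy (bits) of a two-symbol source with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1.0 - p) * log2(1.0 - p))


def best_split(seq: str):
    """Return (position, divergence) of the split maximizing the length-weighted
    Jensen-Shannon divergence between seq[:position] and seq[position:].

    Bases are reduced to GC versus non-GC. Returns (None, 0.0) when no split
    yields a positive divergence, e.g. for a perfectly uniform sequence.
    """
    is_gc = [base in "GC" for base in seq.upper()]
    n = len(is_gc)
    if n < 2:
        return None, 0.0
    total_gc = sum(is_gc)
    h_whole = binary_entropy(total_gc / n)

    best_pos, best_djs = None, 0.0
    left_gc = 0
    for i in range(1, n):
        left_gc += is_gc[i - 1]  # GC count in seq[:i]
        h_left = binary_entropy(left_gc / i)
        h_right = binary_entropy((total_gc - left_gc) / (n - i))
        # D_JS = H(whole) - w_left * H(left) - w_right * H(right)
        djs = h_whole - (i / n) * h_left - ((n - i) / n) * h_right
        if djs > best_djs:
            best_pos, best_djs = i, djs
    return best_pos, best_djs


if __name__ == "__main__":
    print(best_split("ATATATATATGCGCGCGCGC"))
```

A recursive segmenter would call `best_split` on each resulting segment and stop according to some rule; a fixed divergence threshold is the kind of static criterion the abstract argues is biased.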
The Missing Link of Jewish European Ancestry: Contrasting the Rhineland and the Khazarian Hypotheses
The question of Jewish ancestry has been the subject of controversy for over
two centuries and has yet to be resolved. The "Rhineland Hypothesis" proposes
that Eastern European Jews emerged from a small group of German Jews who
migrated eastward and expanded rapidly. Alternatively, the "Khazarian
Hypothesis" suggests that Eastern European Jews descended from Judean tribes who
joined the Khazars, an amalgam of Turkic clans that settled the Caucasus in the
early centuries CE and converted to Judaism in the 8th century. The Judaized
Empire was continuously reinforced with Mesopotamian and Greco-Roman Jews until
the 13th century. Following the collapse of their empire, the Judeo-Khazars
fled to Eastern Europe. The rise of European Jewry is therefore explained by
the contribution of the Judeo-Khazars. Thus far, however, their contribution
has been estimated only empirically; the absence of genome-wide data from
Caucasus populations precluded testing the Khazarian Hypothesis. Recent
sequencing of modern Caucasus populations prompted us to revisit the Khazarian
Hypothesis and compare it with the Rhineland Hypothesis. We applied a wide
range of population genetic analyses - including principal component,
biogeographical origin, admixture, identity by descent, allele sharing
distance, and uniparental analyses - to compare these two hypotheses. Our
findings support the Khazarian Hypothesis and portray the European Jewish
genome as a mosaic of Caucasus, European, and Semitic ancestries, thereby
consolidating previous contradictory reports of Jewish ancestry.
Comment: 21 pages, 7 figures, 1 table, 7 supplementary figures, 7 supplementary tables
The GenoChip: A New Tool for Genetic Anthropology
The Genographic Project is an international effort aimed at charting human migratory history. The project is nonprofit and nonmedical, and, through its Legacy Fund, supports locally led efforts to preserve indigenous and traditional cultures. Although the first phase of the project was focused on uniparentally inherited markers on the Y-chromosome and mitochondrial DNA (mtDNA), the current phase focuses on markers from across the entire genome to obtain a more complete understanding of human genetic variation. Although many commercial arrays exist for genome-wide single-nucleotide polymorphism (SNP) genotyping, they were designed for medical genetic studies and contain medically related markers that are inappropriate for global population genetic studies. GenoChip, the Genographic Project's new genotyping array, was designed to resolve these issues and enable higher resolution research into outstanding questions in genetic anthropology. The GenoChip includes ancestry informative markers obtained for over 450 human populations, an ancient human (Saqqaq), and two archaic hominins (Neanderthal and Denisovan) and was designed to identify all known Y-chromosome and mtDNA haplogroups. The chip was carefully vetted to avoid inclusion of medically relevant markers. To demonstrate its capabilities, we compared the FST distributions of GenoChip SNPs to those of two commercial arrays. Although all arrays yielded similarly shaped (inverse J) FST distributions, the GenoChip autosomal and X-chromosomal distributions had the highest mean FST, attesting to its ability to discern subpopulations. The chip's performance is illustrated in a principal component analysis for 14 worldwide populations. In summary, the GenoChip is a dedicated genotyping platform for genetic anthropology.
With an unprecedented number of approximately 12,000 Y-chromosomal and approximately 3,300 mtDNA SNPs and over 130,000 autosomal and X-chromosomal SNPs without any known health, medical, or phenotypic relevance, the GenoChip is a useful tool for genetic anthropology and population genetics.
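The abstract compares arrays by the distribution of FST across SNPs. As a rough illustration of what a per-SNP FST value measures, the sketch below uses Nei's heterozygosity-based definition, FST = (H_T − H_S) / H_T, with equal population weights. This estimator, the function name, and the example frequencies are assumptions chosen for simplicity; the abstract does not state which estimator the authors used.

```python
def nei_fst(allele_freqs) -> float:
    """Nei's FST for one biallelic SNP.

    allele_freqs: frequency of the same allele in each subpopulation.
    Subpopulations are weighted equally (an assumption of this sketch).
    Returns 0.0 when the SNP is monomorphic across all subpopulations.
    """
    freqs = list(allele_freqs)
    if not freqs:
        raise ValueError("need at least one subpopulation")
    if any(p < 0.0 or p > 1.0 for p in freqs):
        raise ValueError("allele frequencies must lie in [0, 1]")

    k = len(freqs)
    # H_S: mean expected heterozygosity within subpopulations
    h_s = sum(2.0 * p * (1.0 - p) for p in freqs) / k
    # H_T: expected heterozygosity of the pooled population
    p_bar = sum(freqs) / k
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    if h_t == 0.0:
        return 0.0
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Hypothetical frequencies in three subpopulations.
    print(nei_fst([0.1, 0.5, 0.9]))
```

A SNP fixed for different alleles in two subpopulations gives the maximum value of 1, while identical frequencies give 0, which is why a higher mean FST indicates markers that better separate subpopulations.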