A Differentiation-Based Phylogeny of Cancer Subtypes

Histopathological classification of human tumors relies in part on the degree of differentiation of the tumor sample. To date, there is no objective systematic method to categorize tumor subtypes by maturation. In this paper, we introduce a novel computational algorithm to rank tumor subtypes according to the dissimilarity of their gene expression from that of stem cells and fully differentiated tissue, and thereby construct a phylogenetic tree of cancer. We validate our methodology with expression data of leukemia, breast cancer and liposarcoma subtypes and then apply it to a broader group of sarcomas. This ranking of tumor subtypes resulting from the application of our methodology allows the identification of genes correlated with differentiation and may help to identify novel therapeutic targets. Our algorithm represents the first phylogeny-based tool to analyze the differentiation status of human tumors
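
The ranking idea in the abstract above can be illustrated with a small sketch. It is not the authors' algorithm. It assumes each subtype is summarized by one expression vector, uses Euclidean distance as the dissimilarity measure, and combines the two distances into a single score; the function names and the toy values are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_by_differentiation(subtypes, stem, differentiated):
    """Order subtypes from most stem-like to most differentiated.

    score = d(stem) / (d(stem) + d(differentiated)) lies in [0, 1]:
    0 means identical to the stem-cell profile, 1 means identical to
    the fully differentiated profile. (Assumed scoring rule.)
    """
    scores = {}
    for name, profile in subtypes.items():
        d_stem = euclidean(profile, stem)
        d_diff = euclidean(profile, differentiated)
        total = d_stem + d_diff
        scores[name] = d_stem / total if total > 0 else 0.5
    return sorted(scores.items(), key=lambda kv: kv[1])

# Toy, made-up expression values for three genes (illustration only).
stem = [9.0, 1.0, 1.0]
differentiated = [1.0, 9.0, 9.0]
subtypes = {
    "subtype_A": [8.0, 2.0, 1.5],
    "subtype_B": [5.0, 5.0, 5.0],
    "subtype_C": [2.0, 8.0, 8.5],
}
ranking = rank_by_differentiation(subtypes, stem, differentiated)
```

Ordering subtypes along a single score is only a first step; building a tree from the pairwise distances would need an additional method that the abstract does not specify.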

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Genealogy Reconstruction: Methods and applications in cancer and wild populations

Author: Riester Markus
Publication date: 23/06/2010

Genealogy reconstruction is widely used in biology when relationships among entities are studied. Phylogenies, or evolutionary trees, show the differences between species. They are of profound importance because they help us better understand evolutionary processes. Pedigrees, or family trees, on the other hand, visualize the relatedness between individuals in a population. The reconstruction of pedigrees and the inference of parentage in general is now a cornerstone of molecular ecology. Applications include the direct inference of gene flow, estimation of the effective population size and parameters describing the population’s mating behaviour such as rates of inbreeding. In the first part of this thesis, we construct genealogies of various types of cancer. Histopathological classification of human tumors relies in part on the degree of differentiation of the tumor sample. To date, there is no objective systematic method to categorize tumor subtypes by maturation. We introduce a novel algorithm to rank tumor subtypes according to the dissimilarity of their gene expression from that of stem cells and fully differentiated tissue, and thereby construct a phylogenetic tree of cancer. We validate our methodology with expression data of leukemia and liposarcoma subtypes and then apply it to a broader group of sarcomas and of breast cancer subtypes. This ranking of tumor subtypes resulting from the application of our methodology allows the identification of genes correlated with differentiation and may help to identify novel therapeutic targets. Our algorithm represents the first phylogeny-based tool to analyze the differentiation status of human tumors. In contrast to asexually reproducing cancer cell populations, pedigrees of sexually reproducing populations cannot be represented by phylogenetic trees.
Pedigrees are directed acyclic graphs (DAGs) and therefore more closely resemble phylogenetic networks, where reticulate events are indicated by vertices with two incoming arcs. In the second part of the thesis, we present a software package for pedigree reconstruction in natural populations using co-dominant genomic markers such as microsatellites and single nucleotide polymorphisms (SNPs). If available, the algorithm makes use of prior information such as known relationships (sub-pedigrees) or the age and sex of individuals. Statistical confidence is estimated by Markov chain Monte Carlo (MCMC) sampling. The accuracy of the algorithm is demonstrated for simulated data as well as an empirical data set with known pedigree. The parentage inference is robust even in the presence of genotyping errors. We further demonstrate the accuracy of the algorithm on simulated clonal populations. We show that the joint estimation of parameters of interest such as the rate of self-fertilization or clonality is possible with high accuracy even with marker panels of moderate power. Classical methods can only assign a very limited number of statistically significant parentages in this case and would therefore fail. The method is implemented in fast and easy-to-use open source software that scales to large datasets with many thousands of individuals.

Table of contents:
Abstract
Acknowledgments
1 Introduction
2 Cancer Phylogenies
2.1 Introduction
2.2 Background
2.2.1 Phylogenetic Trees
2.2.2 Microarrays
2.3 Methods
2.3.1 Dataset compilation
2.3.2 Statistical Methods and Analysis
2.3.3 Comparison of our methodology to other methods
2.4 Results
2.4.1 Phylogenetic tree reconstruction method
2.4.2 Comparison of tree reconstruction methods to other algorithms
2.4.3 Systematic analysis of methods and parameters
2.5 Discussion
3 Wild Pedigrees
3.1 Introduction
3.2 The molecular ecologist’s tools of the trade
3.2.1 Sibship inference and parental reconstruction
3.2.2 Parentage and paternity inference
3.2.3 Multigenerational pedigree reconstruction
3.3 Background
3.3.1 Pedigrees
3.3.2 Genotypes
3.3.3 Mendelian segregation probability
3.3.4 LOD Scores
3.3.5 Genotyping Errors
3.3.6 IBD coefficients
3.3.7 Bayesian MCMC
3.4 Methods
3.4.1 Likelihood Model
3.4.2 Efficient Likelihood Calculation
3.4.3 Maximum Likelihood Pedigree
3.4.4 Full siblings
3.4.5 Algorithm
3.4.6 Missing Values
3.4.7 Allele frequencies
3.4.8 Rates of Self-fertilization
3.4.9 Rates of Clonality
3.5 Results
3.5.1 Real Microsatellite Data
3.5.2 Simulated Human Population
3.5.3 Simulated Clonal Plant Population
3.6 Discussion
4 Conclusions
A FRANz
A.1 Availability
A.2 Input files
A.2.1 Main input file
A.2.2 Known relationships
A.2.3 Allele frequencies
A.2.4 Sampling locations
A.3 Output files
A.4 Web 2.0 Interface
List of Figures
List of Tables
List of Abbreviations
Bibliography
Curriculum Vitae
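
Parentage inference with co-dominant markers, as described in the thesis abstract above, rests on Mendelian segregation probabilities. The following is a minimal sketch of that textbook calculation for a mother-father-offspring trio. It is not taken from the FRANz software: the function names are hypothetical, loci are assumed independent, and mutation and genotyping error are ignored.

```python
from itertools import product

def transmission_prob(child, mother, father):
    """P(child genotype | parental genotypes) at one co-dominant locus.

    Genotypes are 2-tuples of allele labels. Each parent passes on either
    of its two alleles with probability 1/2, so the four gamete
    combinations are equally likely.
    """
    target = sorted(child)
    hits = sum(1 for m, f in product(mother, father) if sorted((m, f)) == target)
    return hits / 4.0

def trio_likelihood(child, mother, father):
    """Product of per-locus transmission probabilities (independent loci assumed)."""
    prob = 1.0
    for c, m, f in zip(child, mother, father):
        prob *= transmission_prob(c, m, f)
    return prob

# Two made-up microsatellite loci, alleles coded as integers.
child = [(1, 2), (3, 3)]
mother = [(1, 1), (3, 4)]
father = [(2, 2), (3, 3)]
likelihood = trio_likelihood(child, mother, father)
```

A likelihood of zero at any locus excludes the candidate pair outright, which is why practical methods add an explicit genotyping-error model before comparing candidates.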

Qucosa - Publikationsserver der Universität Leipzig

Medoidshift clustering applied to genomic bulk tumor data.

Author: Roman Theodore
Schwartz Russell
Xie Lu
Publication venue: Springer Science and Business Media LLC
Publication date: 11/01/2016

Despite the enormous medical impact of cancers and intensive study of their biology, detailed characterization of tumor growth and development remains elusive. This difficulty occurs in large part because of enormous heterogeneity in the molecular mechanisms of cancer progression, both tumor-to-tumor and cell-to-cell in single tumors. Advances in genomic technologies, especially at the single-cell level, are improving the situation, but these approaches are held back by limitations of the biotechnologies for gathering genomic data from heterogeneous cell populations and the computational methods for making sense of those data. One popular way to gain the advantages of whole-genome methods without the cost of single-cell genomics has been the use of computational deconvolution (unmixing) methods to reconstruct clonal heterogeneity from bulk genomic data. These methods, too, are limited by the difficulty of inferring genomic profiles of rare or subtly varying clonal subpopulations from bulk data, a problem that can be computationally reduced to that of reconstructing the geometry of point clouds of tumor samples in a genome space. Here, we present a new method to improve that reconstruction by better identifying subspaces corresponding to tumors produced from mixtures of distinct combinations of clonal subpopulations. We develop a nonparametric clustering method based on medoidshift clustering for identifying subgroups of tumors expected to correspond to distinct trajectories of evolutionary progression. We show on synthetic and real tumor copy-number data that this new method substantially improves our ability to resolve discrete tumor subgroups, a key step in the process of accurately deconvolving tumor genomic data and inferring clonal heterogeneity from bulk data
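
For readers unfamiliar with medoidshift, the sketch below shows the generic idea the abstract above builds on: each point is shifted to the data point that minimizes a kernel-weighted sum of squared distances, and points whose shifts lead to the same fixed point form one cluster. This is a simplified illustration, not the authors' method. The Gaussian weight function, the `bandwidth` parameter, and the toy points are assumptions.

```python
import math

def medoidshift(points, bandwidth):
    """Assign each point to a mode by following medoid-shift pointers."""
    n = len(points)
    # Squared Euclidean distances between all pairs of points.
    d2 = [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]
    # Gaussian weights: nearby points influence a shift more (assumed kernel).
    w = [[math.exp(-d2[i][j] / bandwidth ** 2) for j in range(n)] for i in range(n)]
    # One shift per point j: the data point k that minimizes the weighted
    # sum of squared distances to the neighbors of j.
    nxt = [min(range(n), key=lambda k: sum(d2[k][i] * w[i][j] for i in range(n)))
           for j in range(n)]
    # Follow pointers until a fixed point (a mode) is reached; the `seen`
    # set guards against cycles.
    labels = []
    for j in range(n):
        seen = set()
        while nxt[j] != j and j not in seen:
            seen.add(j)
            j = nxt[j]
        labels.append(j)
    return labels

# Two well-separated toy groups in two dimensions (made-up values).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
labels = medoidshift(points, bandwidth=1.0)
```

Because the shifts only ever land on existing data points, the number of clusters does not have to be fixed in advance; it follows from the bandwidth.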

Springer - Publisher Connector

PubMed Central

D-Scholarship@Pitt

Spatial intratumoral heterogeneity and temporal clonal evolution in esophageal squamous cell carcinoma.

Author: Berman Benjamin P
Cai Yan
Chang Chen
Dinh Huy Q
Hao Jia-Jie
Jiang Yan-Yi
Jiang Ye
Koeffler H Phillip
Lin De-Chen
Lu Chen-Chen
Mayakonda Anand
Shi Zhi-Zhou
Wang Jin-Wu
Wang Ming-Rong
Wei Wen-Qiang
Xu Xin
Zhan Qi-Min
Zhang Yu
Publication venue: eScholarship, University of California
Publication date: 01/12/2016

Esophageal squamous cell carcinoma (ESCC) is among the most common malignancies, but little is known about its spatial intratumoral heterogeneity (ITH) and temporal clonal evolutionary processes. To address this, we performed multiregion whole-exome sequencing on 51 tumor regions from 13 ESCC cases and multiregion global methylation profiling for 3 of these 13 cases. We found an average of 35.8% heterogeneous somatic mutations with strong evidence of ITH. Half of the driver mutations located on the branches of tumor phylogenetic trees targeted oncogenes, including PIK3CA, NFE2L2 and MTOR, among others. By contrast, the majority of truncal and clonal driver mutations occurred in tumor-suppressor genes, including TP53, KMT2D and ZNF750, among others. Interestingly, phyloepigenetic trees robustly recapitulated the topological structures of the phylogenetic trees, indicating a possible relationship between genetic and epigenetic alterations. Our integrated investigations of spatial ITH and clonal evolution provide an important molecular foundation for enhanced understanding of tumorigenesis and progression in ESCC

eScholarship - University of California

A unified phylogeny-based nomenclature for histone variants

Histone variants are non-allelic protein isoforms that play key roles in diversifying chromatin structure. The known number of such variants has greatly increased in recent years, but the lack of naming conventions for them has led to a variety of naming styles, multiple synonyms and misleading homographs that obscure variant relationships and complicate database searches. We propose here a unified nomenclature for variants of all five classes of histones that uses consistent but flexible naming conventions to produce names that are informative and readily searchable. The nomenclature builds on historical usage and incorporates phylogenetic relationships, which are strong predictors of structure and function. A key feature is the consistent use of punctuation to represent phylogenetic divergence, making explicit the relationships among variant subtypes that have previously been implicit or unclear. We recommend that by default new histone variants be named with organism-specific paralog-number suffixes that lack phylogenetic implication, while letter suffixes be reserved for structurally distinct clades of variants. For clarity and searchability, we encourage the use of descriptors that are separate from the phylogeny-based variant name to indicate developmental and other properties of variants that may be independent of structure

Hal - Université Grenoble Alpes

Directory of Open Access Journals

Open Access LMU

Carolina Digital Repository

The Australian National University

NORA - Norwegian Open Research Archives

Hal-Diderot

University of Bergen

University of Missouri: MOspace

Crossref

Harvard University - DASH

Springer - Publisher Connector

HAL-Inserm

PubMed Central

eScholarship - University of California

Oxford University Research Archive

univOAK

University of Melbourne Institutional Repository

ScholarBank@NUS

The Evolutionary Dynamics of the Lion Panthera leo Revealed by Host and Viral Population Genomics

Author: A Antunes
AD Barnosky
Agostinho Antunes
AR Rogers
AR Templeton
Arnaud Estoup
C Packer
C Pitra
CA Driscoll
Christiaan Winterbach
Craig Packer
D Falush
D Posada
David Wildt
DL Swofford
DW Hutchison
E Minch
EW Brown
F Rousset
G Petter
G Spong
Graham Hemson
Gus Mills
H Bauer
H Hemmer
H Whitehead
Hanlie Winterbach
HC Harpending
J Burger
J Dubach
J Felsenstein
J Goudet
J Pecon-Slattery
J Rozas
Jennifer L. Troyer
JH Kim
Jill Pecon-Slattery
JK Pritchard
JL Troyer
JM Cornuet
JP Benzécri
K Belkir
KA Crandall
Katherine C. Prager
Kathy A. Alexander
L Excoffier
L Werdelin
Laurence Frank
LG Nersting
LL Cavalli-Sforza
LL Knowles
Ludwig Siefert
M Clement
M Menotti-Raymond
M Panchal
M Raymond
MA Beaumont
MA Carpenter
Margaret Driciru
ME Roelke-Parker
Melody E. Roelke
Mitch Bush
N Mantel
N Neff
N Takahata
N Takezaki
P Arctander
Paul J. Funston
Philip Stander
R Barnett
R Biek
R Leblois
RM Nowak
S Nyakaana
S Piry
S Schneider
S VandeWoude
SJ Luo
SJ O'Brien
Stephen J. O'Brien
T Partridge
TH van Andel
V King
Warren E. Johnson
WE Johnson
Y-X Fu
Publication venue: Public Library of Science
Publication date: 01/11/2008

The lion Panthera leo is one of the world's most charismatic carnivores and is one of Africa's key predators. Here, we used a large dataset from 357 lions comprehending 1.13 megabases of sequence data and genotypes from 22 microsatellite loci to characterize its recent evolutionary history. Patterns of molecular genetic variation in multiple maternal (mtDNA), paternal (Y-chromosome), and biparental nuclear (nDNA) genetic markers were compared with patterns of sequence and subtype variation of the lion feline immunodeficiency virus (FIVPle), a lentivirus analogous to human immunodeficiency virus (HIV). In spite of the ability of lions to disperse long distances, patterns of lion genetic diversity suggest substantial population subdivision (mtDNA ΦST = 0.92; nDNA FST = 0.18), and reduced gene flow, which, along with large differences in sero-prevalence of six distinct FIVPle subtypes among lion populations, refute the hypothesis that African lions consist of a single panmictic population. Our results suggest that extant lion populations derive from several Pleistocene refugia in East and Southern Africa (∼324,000–169,000 years ago), which expanded during the Late Pleistocene (∼100,000 years ago) into Central and North Africa and into Asia. During the Pleistocene/Holocene transition (∼14,000–7,000 years), another expansion occurred from southern refugia northwards towards East Africa, causing population interbreeding. In particular, lion and FIVPle variation affirms that the large, well-studied lion population occupying the greater Serengeti Ecosystem is derived from three distinct populations that admixed recently

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

NSU Works

Reconstruction of ancestral brains: Exploring the evolutionary process of encephalization in amniotes

Author: Gotoh Hitoshi
Murakami Yasunori
Nomura Tadashi
Ono Katsuhiko
Publication venue: The Authors. Published by Elsevier Ireland Ltd.
Publication date: 30/09/2014

There is huge divergence in the size and complexity of vertebrate brains. Notably, mammals and birds have bigger brains than other vertebrates, largely because these animal groups established larger dorsal telencephali. Fossil evidence suggests that this anatomical trait could have evolved independently. However, recent comparative developmental analyses demonstrate surprising commonalities in neuronal subtypes among species, although this interpretation is highly controversial. In this review, we introduce intriguing evidence regarding brain evolution collected from recent studies in paleontology and developmental biology, and we discuss possible evolutionary changes in the cortical developmental programs that led to the encephalization and structural complexity of amniote brains. New research concepts and approaches will shed light on the origin and evolutionary processes of amniote brains, particularly the mammalian cerebral cortex.

Elsevier - Publisher Connector

The GDR : a novel approach to detect large-scale genomic sequence patterns

Author: Bull Nora Borge
Publication venue: Norwegian University of Life Sciences, Ås
Publication date: 01/01/2021

The emergence of new sequencing technologies over the past two decades has enabled us to dive deeper into the biomolecular aspects of the human recipe. Entire genomes from several hundred thousand people are already accessible, but interpreting the connections between these blueprints and phenotypes is complicated, even for the best machine learning algorithms. Prediction of the microRNA-mRNA targeting network, which is involved in the regulation of nearly all cellular processes, is a classic example. These non-coding features make up complex networks of interactions, in which microRNAs primarily target 3’UTRs through partial complementary base-pairing. Investigating patterns in such large-scale genomic sequence data therefore requires new approaches. The Group Diversity Ratio (GDR) metric is presented here as a novel approach to this challenge. The GDR quantifies genome-wide structure in large-scale sequence data with a statistically testable result. It measures group structure in a distance-based phylogenetic tree of the sequence data for predefined groups of leaves, where the groups represent a feature that may be related to variation in the sequence samples. It offers a way to quickly gain insight into genomic regions of interest and can be used to guide further research. To demonstrate the metric, the GDR was used to identify ethnicity-associated variation patterns in more than 1000 human 3’UTRs. The study set came from the 1000 Genomes Project, which was the first major effort to address the problem of ethnic bias in genetic studies and contains more than 2500 whole-genome sequences from 26 ethnic lineages. In addition to detecting significantly distinct 3’UTR elements for ethnic populations, the key finding of this study was the high potential of the GDR to facilitate higher-throughput characterization of genomic sequence data.

M-BIA

Brage NMBU