38 research outputs found

STRP Screening Sets for the human genome at 5 cM density

Author: A Kong
A Lynn
AM Bowcock
B Brinkmann
B Yuan
C Ober
C Wijmenga
C Zhao
DF Callen
F Calafell
G Tóth
GA Huttley
I Simonic
J. Dubovsky
JL Weber
JL Weber
JL Weber
JL Weber
JM Gastier
JS Beckmann
KW Broman
L Jin
L Kruglyak
M Cullen
MJ Brownstein
NA Rosenberg
P de Kniff
R Chakraborty
R Deka
RA Ophoff
S Giglio
T Varilo
Utah Marker Development Group
VC Sheffield
VL Magnuson
WJ Kent
Publication venue: BioMed Central
Publication date: 01/02/2003

BACKGROUND: Short tandem repeat polymorphisms (STRPs) are powerful tools for gene mapping and other applications. A STRP genome scan at 10 cM density is usually adequate for mapping single-gene disorders. However, mapping studies involving genetically complex disorders, and especially association (linkage disequilibrium) studies, often require higher STRP density. RESULTS: We report the development of two separate 10 cM human STRP Screening Sets (Sets 12 and 52) which span all chromosomes. When combined, the two Sets contain a total of 782 STRPs, with average STRP spacing of 4.8 cM, average heterozygosity of 0.72, and total sex-average coverage of 3535 cM. The current Sets consist almost entirely of STRPs based on tri- and tetranucleotide repeats. We also report correction of primer sequences for many STRPs used in previous Screening Sets. Detailed information for the new Screening Sets is available from our web site. CONCLUSION: Our new human STRP Screening Sets will improve the quality and cost effectiveness of genotyping for gene mapping and other applications.
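The two summary statistics quoted in this abstract, average heterozygosity and average marker spacing, can be illustrated with a minimal Python sketch. It assumes the standard expected-heterozygosity definition H = 1 - sum(p_i^2); the abstract does not state how its values were computed, and the function names and example numbers below are hypothetical.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity H = 1 - sum(p_i^2) for a single marker.

    Assumption: allele_freqs holds the population frequencies of every
    allele at the marker and therefore sums to 1.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


def mean_spacing(positions_cm):
    """Mean distance (cM) between adjacent markers on one chromosome."""
    if len(positions_cm) < 2:
        raise ValueError("need at least two markers")
    ordered = sorted(positions_cm)
    # Adjacent gaps telescope, so the mean gap is span / number of intervals.
    return (ordered[-1] - ordered[0]) / (len(ordered) - 1)


# Hypothetical marker with four equally frequent alleles.
h = expected_heterozygosity([0.25, 0.25, 0.25, 0.25])
# Hypothetical map positions in cM.
spacing = mean_spacing([0.0, 5.0, 10.0, 15.0])
```

A marker with many alleles of similar frequency scores close to 1, which is why highly polymorphic repeats are attractive for a screening set.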

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

A codon substitution model that incorporates the effect of the GC contents, the gene density and the density of CpG islands of human chromosomes

Author: A Varriale
AL Hughes
AP Bird
E Scarano
F Antequera
F Vogel
G Lunter
GA Huttley
J Felsenstein
J Sullivan
J Taylor
JC Walser
JL Leroy
K Katoh
K Misawa
K Misawa
K Misawa
K Misawa
Kazuharu Misawa
KJ Fryxell
KJ Fryxell
M Krawczak
M Nei
MA Larkin
R Development Core Team
R Grantham
RA Gibbs
S Horai
S Kaneko
S Tyekucheva
SF Altschul
T Miyata
TH Jukes
WH Li
Y Suzuki
Z Yang
Z Yang
Publication venue: BioMed Central
Publication date: 01/01/2011

BACKGROUND: Developing a model for codon substitutions is essential for the analysis of protein sequences. Recent studies on mutation rates in non-coding regions have shown that CpG mutation rates in the human genome are negatively correlated with local GC content and with the densities of functional elements. This study aimed to understand the effect of genomic features, namely GC content, gene density, and frequency of CpG islands, on the rates of codon substitution in human chromosomes. RESULTS: Codon substitution rates of CpG to TpG mutations, TpG to CpG mutations, and non-CpG transitions and transversions in humans were estimated by comparing the coding regions of thousands of human and chimpanzee genes and inferring their ancestral sequences by using macaque genes as the outgroup. Because the genomic features depend on one another, partial regression coefficients of these features were obtained. CONCLUSION: The substitution rates of codons depend on gene densities of the chromosomes. Transcription-associated mutation is one such pressure. On the basis of these results, a model of codon substitutions that incorporates the effect of genomic features on codon substitution in human chromosomes was developed.
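The outgroup step described in this abstract can be sketched in Python as follows. This is a simple per-site parsimony rule, assumed here for illustration only; the abstract does not specify the authors' inference procedure, and the function names and toy sequences are hypothetical.

```python
def infer_ancestor(human, chimp, macaque):
    """Parsimony ancestor of human and chimp, using macaque as outgroup.

    Returns a list with one state per aligned site; None marks sites
    where all three species differ and the ancestor is ambiguous.
    """
    if not (len(human) == len(chimp) == len(macaque)):
        raise ValueError("sequences must be aligned to the same length")
    ancestor = []
    for h, c, m in zip(human, chimp, macaque):
        if h == c:
            ancestor.append(h)        # no change on either lineage
        elif m == h or m == c:
            ancestor.append(m)        # outgroup decides the ancestral state
        else:
            ancestor.append(None)     # cannot be resolved by parsimony
    return ancestor


def count_human_cpg_to_tpg(human, chimp, macaque):
    """Count ancestral CpG sites that became TpG on the human lineage."""
    anc = infer_ancestor(human, chimp, macaque)
    count = 0
    for i in range(len(anc) - 1):
        ancestral_cpg = anc[i] == "C" and anc[i + 1] == "G"
        derived_tpg = human[i] == "T" and human[i + 1] == "G"
        if ancestral_cpg and derived_tpg:
            count += 1
    return count


# Toy alignment: the C at index 3 of a CpG became T in human only.
n_changes = count_human_cpg_to_tpg("ATGTGA", "ATGCGA", "ATGCGA")
```

The sketch counts only changes at the C of a CpG on the given strand; a fuller treatment would also handle the complementary-strand case and work codon by codon.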

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Identifying Selected Regions from Heterozygosity and Divergence Using a Light-Coverage Genomic Dataset from Two Human Populations

Author: A Berry
A Dove
BF Voight
BS Weir
CC Spencer
CCA Spencer
CS Carlson
D Altshuler
Dennis A. Gilbert
DJ Begun
EJ Parra
EJ Vallender
ET Wang
FM De La Vega
FM De la Vega
Francisco M. De La Vega
GA Huttley
J Maynard Smith
JL Kelley
JM Akey
JP Pollinger
KA Frazer
Kai Zhao
KM Teshima
LJ Engle
LL Cavalli-Sforza
LL Cavalli-Sforza
M Bamshad
M Nei
Matthew W. Hahn
MF Taylor
Michael W. Smith
MT Hamblin
MT Hamblin
MW Smith
PC Sabeti
PC Sabeti
PC Sabeti
R Nielsen
RC Lewontin
S Olson
S Wright
SA Karl
SA Tishkoff
SA Tishkoff
SA Tishkoff
Stephen J. O'Brien
T Bersaglieri
Taras K. Oleksyk
X Kong
Publication venue: Public Library of Science
Publication date: 01/01/2008

When a selective sweep occurs in the chromosomal region around a target gene in two populations that have recently separated, it produces three dramatic genomic consequences: 1) decreased multi-locus heterozygosity in the region; 2) elevated or diminished genetic divergence (FST) of multiple polymorphic variants adjacent to the selected locus between the divergent populations, due to the alternative fixation of alleles; and 3) a consequent regional increase in the variance of FST (S²FST) for the same clustered variants, due to the increased alternative fixation of alleles in the loci surrounding the selection target. In the first part of our study, to search for potential targets of directional selection, we developed and validated a resampling-based computational approach; we then scanned an array of 31 different-sized moving windows of SNP variants (5–65 SNPs) across the human genome in a set of European and African American population samples with 183,997 SNP loci after correcting for the recombination rate variation. The analysis revealed 180 regions of recent selection with very strong evidence in either population or both. In the second part of our study, we compared the newly discovered putative regions to those sites previously postulated in the literature, using methods based on inspecting patterns of linkage disequilibrium, population divergence and other methodologies. The newly found regions were cross-validated with those found in nine other studies that have searched for selection signals. Our study was replicated especially well in those regions confirmed by three or more studies. These validated regions were independently verified, using a combination of different methods and different databases in other studies, and should include fewer false positives.
The main strength of our analysis method compared to others is that it does not require dense genotyping and therefore can be used with data from population-based genome SNP scans from smaller studies of humans or other species.
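The three window statistics named in this abstract (heterozygosity, FST, and the variance of FST) can be illustrated with a minimal Python sketch. It assumes biallelic SNPs, two equally weighted populations, and Wright's FST = (HT - HS) / HT. It is not the authors' resampling-based method, it applies no recombination-rate correction, and all names and frequencies are hypothetical.

```python
from statistics import mean, variance


def snp_heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) of a biallelic SNP."""
    return 2.0 * p * (1.0 - p)


def snp_fst(p1, p2):
    """Wright's FST for one SNP in two equally weighted populations."""
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # same allele fixed in both; treated as no divergence
    h_within = (snp_heterozygosity(p1) + snp_heterozygosity(p2)) / 2.0
    return (h_total - h_within) / h_total


def window_stats(freqs_pop1, freqs_pop2, window):
    """Slide a fixed-size window of SNPs along two frequency lists."""
    if len(freqs_pop1) != len(freqs_pop2):
        raise ValueError("both populations need the same SNPs")
    if window < 2:
        raise ValueError("window must hold at least two SNPs")
    results = []
    for start in range(len(freqs_pop1) - window + 1):
        w1 = freqs_pop1[start:start + window]
        w2 = freqs_pop2[start:start + window]
        fsts = [snp_fst(a, b) for a, b in zip(w1, w2)]
        results.append({
            "start": start,
            "het_pop1": mean([snp_heterozygosity(p) for p in w1]),
            "het_pop2": mean([snp_heterozygosity(p) for p in w2]),
            "mean_fst": mean(fsts),
            "var_fst": variance(fsts),  # sample variance of FST in window
        })
    return results


# Hypothetical allele frequencies for five SNPs in two populations.
pop1 = [0.1, 0.5, 0.9, 0.3, 0.7]
pop2 = [0.2, 0.5, 0.1, 0.3, 0.6]
stats = window_stats(pop1, pop2, window=3)
```

A candidate sweep region would show low heterozygosity together with unusual mean and variance of FST relative to a null distribution; building that null distribution is the part the resampling approach in the abstract addresses and this sketch omits.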

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

NSU Works

The Embedding Problem for Markov Models of Nucleotide Substitution

Author: Anuj Pahwa
B Pakendorf
B Singer
C Sheffield
D Barr
D Penny
G Elfving
GA Doerge
Gavin A. Huttley
GS Goodman
H Frydman
H Frydman
H Lindsay
H Song
HW Schranz
J Geweke
J Sumner
JFC Kingman
JT Chang
JT Runnenberg
K Tamura
K Tamura
Klara L. Verbyla
Konrad Scheffler
L Bofkin
LS Jermiin
M Kallersjo
M Kanehisa
M Oscamou
M Wolf
MJD Powell
N Galtier
N Galtier
N Lartillot
NGC Smith
P Carette
P Carette
PA Goloboff
PG Foster
PG Foster
PJ Lockhart
R Bevan
R Hardison
R Knight
S Johansen
S Johansen
S Johansen
SYW Ho
V Jayaswal
V Jayaswal
Von Bing Yap
WL Goffe
XH Xia
Yunli Shao
Z Yang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 30/07/2013

DOI: 10.1371/journal.pone.0069187 (PLoS ONE)

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

The Australian National University

ScholarBank@NUS

The Francis Crick Institute

Relationship between amino acid composition and gene expression in the mouse genome

BACKGROUND: Codon bias refers to differences in the frequencies of synonymous codons among different genes. In many organisms, natural selection is considered to be a cause of codon bias because codon usage in highly expressed genes is biased toward optimal codons. Methods have previously been developed to predict the expression level of genes from their nucleotide sequences, based on the observation that synonymous codon usage shows an overall bias toward a few codons called major codons. However, the relationship between codon bias and gene expression level, as proposed by the translation-selection model, is less evident in mammals. FINDINGS: We investigated the correlations between the expression levels of 1,182 mouse genes and amino acid composition, as well as between gene expression and codon preference. We found that a weak but significant correlation exists between gene expression levels and amino acid composition in mouse. In total, less than 10% of the variation in expression levels is explained by amino acid components. The effect of codon preference on gene expression was weaker than the effect of amino acid composition: no significant correlations were observed with respect to codon preference. CONCLUSION: These results suggest that it is difficult to predict expression level from amino acid components or from codon bias in mouse.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

PyEvolve: a toolkit for statistical modelling of molecular evolution

Author: Butterfield A
Huttley GA
Isaev A
Lang E
Lawrence C
Vedagiri V
Wakefield MJ
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 05/01/2004

BACKGROUND: Examining the distribution of variation has proven an extremely profitable technique in the effort to identify sequences of biological significance. Most approaches in the field, however, evaluate only the conserved portions of sequences, ignoring the biological significance of sequence differences. A suite of sophisticated likelihood-based statistical models from the field of molecular evolution provides the basis for extracting the information from the full distribution of sequence variation. The number of different problems to which phylogeny-based maximum likelihood calculations can be applied is extensive. Available software packages that can perform likelihood calculations suffer from a lack of flexibility and scalability, or employ error-prone approaches to model parameterisation. RESULTS: Here we describe the implementation of PyEvolve, a toolkit for the application of existing, and development of new, statistical methods for molecular evolution. We present the object architecture and design schema of PyEvolve, which includes an adaptable multi-level parallelisation schema. The approach for defining new methods is illustrated by implementing a novel dinucleotide model of substitution that includes a parameter for mutation of methylated CpGs, which required 8 lines of standard Python code to define. Benchmarking was performed using either a dinucleotide or codon substitution model applied to an alignment of BRCA1 sequences from 20 mammals, or a 10 species subset. Up to five-fold parallel performance gains over serial were recorded. Compared to leading alternative software, PyEvolve exhibited significantly better real-world performance for parameter-rich models with a large data set, reducing the time required for optimisation from approximately 10 days to approximately 6 hours.
CONCLUSION: PyEvolve provides flexible functionality that can be used either for statistical modelling of molecular evolution, or the development of new methods in the field. The toolkit can be used interactively or by writing and executing scripts. The toolkit uses efficient processes for specifying the parameterisation of statistical models, and implements numerous optimisations that make highly parameter-rich likelihood functions solvable within hours on multi-CPU hardware. PyEvolve can be readily adapted in response to changing computational demands and hardware configurations to maximise performance. PyEvolve is released under the GPL and can be downloaded from http://cbis.anu.edu.au/software

PubMed Central

University of Melbourne Institutional Repository