109 research outputs found
Methods for Obtaining and Analyzing Whole Chloroplast Genome Sequences
During the past decade there has been a rapid increase in our understanding of plastid genome organization and evolution due to the availability of many new completely sequenced genomes. Currently there are 43 complete genomes published and ongoing projects are likely to increase this sampling to nearly 200 genomes during the next five years. Several groups of researchers including ours have been developing new techniques for gathering and analyzing entire plastid genome sequences and details of these developments are summarized in this chapter. The most important recent developments that enhance our ability to generate whole chloroplast genome sequences involve the generation of pure fractions of chloroplast genomes by whole genome amplification using rolling circle amplification, cloning genomes into Fosmid or BAC vectors, and the development of an organellar annotation program (DOGMA). In addition to providing details of these methods, we provide an overview of methods for analyzing complete plastid genome sequences for repeats and gene content, as well as approaches for using gene order and sequence data for phylogeny reconstruction. This explosive increase in the number of sequenced plastid genomes and improved computational tools will provide many insights into the evolution of these genomes and much new data for assessing relationships at deep nodes in plants and other photosynthetic organisms.
The Formation and Evolution of Virgo Cluster Galaxies - I. Broadband Optical & Infrared Colours
We use a combination of deep optical (gri) and near-infrared (H) photometry
to study the radially-resolved colours of a broad sample of 300 Virgo cluster
galaxies. For most galaxy types, we find that the median g-H colour gradient is
either flat (gas-poor giants and gas-rich dwarfs) or negative (i.e., colours
become bluer with increasing radius; gas-poor dwarfs, spirals, and gas-poor
peculiars). Later-type galaxies typically exhibit more negative gradients than
early-types. Given the lack of a correlation between the central colours and
axis ratios of Virgo spiral galaxies, we argue that dust likely plays a small
role, if any, in setting those colour gradients. We search for possible
correlations between galaxy colour and photometric structure or environment and
find that the Virgo galaxy colours become redder with increasing concentration,
luminosity and surface brightness, while no dependence on cluster-centric
radius or local galaxy density is detected (over a range of ~2 Mpc and ~3-16
Mpc^-2, respectively). However, the colours of gas-rich Virgo galaxies do
correlate with neutral gas deficiency, such that these galaxies become redder
with higher deficiencies. Comparisons with stellar population models suggest
that these colour gradients arise principally from variations in stellar
metallicity within these galaxies, while age variations only make a significant
contribution to the colour gradients of Virgo irregulars. A detailed stellar
population analysis based on this material is presented in Roediger et al.
(2011b; arXiv:1011.3511). Comment: 34 pages, 12 figures, 1 table, submitted to MNRAS; Paper II
(arXiv:1011.3511) has also been updated.
Analysis of 81 Genes From 64 Plastid Genomes Resolves Relationships in Angiosperms and Identifies Genome-Scale Evolutionary Patterns
Angiosperms are the largest and most successful clade of land plants with >250,000 species distributed in nearly every terrestrial habitat. Many phylogenetic studies have been based on DNA sequences of one to several genes, but, despite decades of intensive efforts, relationships among early diverging lineages and several of the major clades remain either incompletely resolved or weakly supported. We performed phylogenetic analyses of 81 plastid genes in 64 sequenced genomes, including 13 new genomes, to estimate relationships among the major angiosperm clades, and the resulting trees are used to examine the evolution of gene and intron content. Phylogenetic trees from multiple methods, including model-based approaches, provide strong support for the position of Amborella as the earliest diverging lineage of flowering plants, followed by Nymphaeales and Austrobaileyales. The plastid genome trees also provide strong support for a sister relationship between eudicots and monocots, and this group is sister to a clade that includes Chloranthales and magnoliids. Resolution of relationships among the major clades of angiosperms provides the necessary framework for addressing numerous evolutionary questions regarding the rapid diversification of angiosperms. Gene and intron content are highly conserved among the early diverging angiosperms and basal eudicots, but 62 independent gene and intron losses are limited to the more derived monocot and eudicot clades. Moreover, a lineage-specific correlation was detected between rates of nucleotide substitutions, indels, and genomic rearrangements.
Respiratory diseases among U.S. military personnel: countering emerging threats.
Emerging respiratory disease agents, increased antibiotic resistance, and the loss of effective vaccines threaten to increase the incidence of respiratory disease in military personnel. We examine six respiratory pathogens (adenoviruses, influenza viruses, Streptococcus pneumoniae, Streptococcus pyogenes, Mycoplasma pneumoniae, and Bordetella pertussis) and review the impact of the diseases they cause, past efforts to control these diseases in U.S. military personnel, as well as current treatment and surveillance strategies, limitations in diagnostic testing, and vaccine needs.
Epistasis: Obstacle or Advantage for Mapping Complex Traits?
Identification of genetic loci in complex traits has focused largely on one-dimensional genome scans to search for associations between single markers and the phenotype. There is mounting evidence that locus interactions, or epistasis, are a crucial component of the genetic architecture of biologically relevant traits. However, epistasis is often viewed as a nuisance factor that reduces power for locus detection. Counter to expectations, recent work shows that fitting full models, instead of testing marker main effect and interaction components separately, in exhaustive multi-locus genome scans can have higher power to detect loci when epistasis is present than single-locus scans, an improvement that comes despite a much larger multiple testing alpha-adjustment in such searches. We demonstrate, both theoretically and via simulation, that the expected power to detect loci when fitting full models is often larger when these loci act epistatically than when they act additively. Additionally, we show that the power for single-locus detection may be improved in cases of epistasis compared to the additive model. Our exploration of a two-step model selection procedure shows that identifying the true model is difficult. However, this difficulty is certainly not exacerbated by the presence of epistasis; on the contrary, in some cases the presence of epistasis can aid in model selection. The impact of allele frequencies on both power and model selection is dramatic.
Building prognostic models for breast cancer patients using clinical variables and hundreds of gene expression signatures
Background: Multiple breast cancer gene expression profiles have been developed that appear to provide similar abilities to predict outcome and may outperform clinical-pathologic criteria; however, the extent to which seemingly disparate profiles provide additive prognostic information is not known, nor do we know whether prognostic profiles perform equally across clinically defined breast cancer subtypes. We evaluated whether combining the prognostic powers of standard breast cancer clinical variables with a large set of gene expression signatures could improve our ability to predict patient outcomes. Methods: Using clinical-pathological variables and a collection of 323 gene expression "modules", including 115 previously published signatures, we built multivariate Cox proportional hazards models using a dataset of 550 node-negative systemically untreated breast cancer patients. Models predictive of pathological complete response (pCR) to neoadjuvant chemotherapy were also built using this approach. Results: We identified statistically significant prognostic models for relapse-free survival (RFS) at 7 years for the entire population, and for the subgroups of patients with ER-positive, or Luminal tumors. Furthermore, we found that combined models that included both clinical and genomic parameters improved prognostication compared with models with either clinical or genomic variables alone. Finally, we were able to build statistically significant combined models for pCR predictions for the entire population. Conclusions: Integration of gene expression signatures and clinical-pathological factors is an improved method over either variable type alone. Highly prognostic models could be created when using all patients, and for the subset of patients with lymph node-negative and ER-positive breast cancers. Other variables beyond gene expression and clinical-pathological variables, like gene mutation status or DNA copy number changes, will be needed to build robust prognostic models for ER-negative breast cancer patients. This combined clinical and genomics model approach can also be used to build predictors of therapy responsiveness, and could ultimately be applied to other tumor types.
The Impact of Multifunctional Genes on "Guilt by Association" Analysis
Many previous studies have shown that by using variants of “guilt-by-association”, gene function predictions can be made with very high statistical confidence. In these studies, it is assumed that the “associations” in the data (e.g., protein interaction partners) of a gene are necessary in establishing “guilt”. In this paper we show that multifunctionality, rather than association, is a primary driver of gene function prediction. We first show that knowledge of the degree of multifunctionality alone can produce astonishingly strong performance when used as a predictor of gene function. We then demonstrate how multifunctionality is encoded in gene interaction data (such as protein interactions and coexpression networks) and how this can feed forward into gene function prediction algorithms. We find that high-quality gene function predictions can be made using data that possess no information on which gene interacts with which. By examining a wide range of networks from mouse, human and yeast, as well as multiple prediction methods and evaluation metrics, we provide evidence that this problem is pervasive and does not reflect the failings of any particular algorithm or data type. We propose computational controls that can be used to provide a more meaningful baseline when estimating gene function prediction performance. We suggest that this source of bias due to multifunctionality is important to control for, with widespread implications for the interpretation of genomics studies.