24 research outputs found
Cardiac fibrosis in aging mice
Dystrophic cardiac calcinosis (DCC), also called epicardial and myocardial fibrosis and mineralization, has been detected in mice of a number of laboratory inbred strains, most commonly C3H/HeJ and DBA/2J. In previous mouse breeding studies between these DCC-susceptible strains and the DCC-resistant strain C57BL/6J, 4 genetic loci harboring genes involved in DCC inheritance were identified and subsequently termed Dyscalc loci 1 through 4. Here, we report susceptibility to cardiac fibrosis, a sub-phenotype of DCC, at 12 and 20 months of age and close to natural death in a survey of 28 inbred mouse strains. Eight strains showed cardiac fibrosis, with the highest frequency and severity in the moribund mice. Using genotype and phenotype information of the 28 investigated strains, we performed genome-wide association studies (GWAS) and identified the most significant associations on chromosome (Chr) 15 at 72 million base pairs (Mb) (P < 10^-13) and Chr 4 at 122 Mb (P < 10^-11) and 134 Mb (P < 10^-7). At the Chr 15 locus, Col22a1 and Kcnk9 were identified. Both have been reported to be morphologically and functionally important in the heart muscle. The strongest Chr 4 associations were located approximately 6 Mb away from the Dyscalc 2 quantitative trait locus peak within the boundaries of the Extl1 gene and in close proximity to the Trim63 and Cap1 genes. In addition, a single-nucleotide polymorphism association was found on chromosome 11. This study provides evidence for more than the previously reported 4 genetic loci determining cardiac fibrosis and DCC. The study also highlights the power of GWAS in the mouse for dissecting complex genetic traits. The authors thank Jesse Hammer and Josiah Raddar for technical assistance. Research reported in this publication was supported by the Ellison Medical Foundation, Parker B. Francis Foundation, and the National Institutes of Health (R01AR055225 and K01AR064766).
Mouse colonies were supported by the National Institutes of Health under Award Number AG025707 for the Jackson Aging Center. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The Jackson Laboratory Shared Scientific Services were supported in part by a Basic Cancer Center Core Grant from the National Cancer Institute (CA34196). This is the author accepted manuscript. The final version is available from Springer via http://dx.doi.org/10.1007/s00335-016-9634-
Management, Integration, and Mining of Tumor Data
Genomics is expected to soon overtake astronomy, particle physics, and even YouTube as the biggest creator of digital information (1). Analysis of this information has already led to important and groundbreaking discoveries relevant to our health, but ongoing work will require creative solutions to the multitude of challenges arising from this volume of data. Practically speaking, one such challenge comes from determining what data should be collected and how it is to be managed. As cohort sizes in population-based studies grow into the hundreds of thousands, practical issues about collection, storage, and filtering have begun to come more into focus. Additionally, frameworks that seamlessly integrate disparate datasets and also allow for flexible analysis will be required. Finally, as technical challenges and limitations arise, new analytical approaches and designs will have to be considered. This dissertation work comprised three projects relating to these questions as approached from the perspective of a bioinformatician. These projects describe the development of new software and methods for sample management, data integration and analysis, and design strategies to improve signal in noisy data. The first chapter of this dissertation consists of background material relating to the projects, including a description of the state of prostate cancer genomics, the development of biomarkers for its detection, and an exploration of a promising new biomarker, cell-free DNA (cfDNA). It also includes a discussion of some of the overarching questions of my PhD. The second chapter describes a web-based sample management system called Samasy. Born out of necessity, this tool addresses the very practical issue of sample subsetting that is often required in resequencing studies. Samasy was used to facilitate the selection of 16,600 samples from a much larger cohort of 54,000 while preserving ethnicity and age balance among cases and controls.
This tool integrates with liquid handling systems and provides a visually intuitive interface for plate/sample management and batch sample transfer execution. The third chapter details Orchid, a framework designed to make machine learning of cancer variant data easy and extendible. It does so by integrating a variety of biological annotations (or features) and simple somatic tumor data available from large repositories like The Cancer Genome Atlas (TCGA) or the International Cancer Genome Consortium (ICGC). This tool supports an efficient data store, MemSQL, that allows for very fast retrieval and filtering, and extends the popular Python pandas and scikit-learn packages to facilitate machine learning of this data. Finally, the fourth chapter outlines the creation of a custom targeted sequencing panel for prostate cancer that was designed for screening tumor variants in cfDNA. Building upon the power of Orchid, we detail how machine learning on whole-genome prostate tumor datasets can be used to rank mutations by likelihood of being found in a patient with few mutations, or in other words, involved in early-stage disease. This ranking was used to build a targeted sequencing panel for detection of tumor-derived cfDNA variants. This panel was then validated and applied to a cohort of nine UCSF prostate cancer patients with multiple tumor foci that were collected at the time of radical prostatectomy (RP). Taken together, the information described in this dissertation provides tools and methodologies for the analysis of germline and somatic variants in prostate and other cancers. It also attempts to further the technological development of cfDNA as a biomarker for the detection or monitoring of diseases like cancer.
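The abstract mentions selecting a subset of samples while preserving ethnicity and age balance among cases and controls. As an illustration only, the following is a minimal sketch of proportional stratified sampling in Python. It is not Samasy's actual algorithm, and the field names (`status`, `ethnicity`, `age_band`) and the toy cohort are invented for the example.

```python
import random
from collections import defaultdict

def stratified_subset(samples, strata_keys, fraction, seed=0):
    """Draw roughly `fraction` of the samples from every stratum.

    `samples` is a list of dicts; `strata_keys` names the fields that
    define a stratum (hypothetical field names, not Samasy's schema).
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for sample in samples:
        strata[tuple(sample[k] for k in strata_keys)].append(sample)

    chosen = []
    for key in sorted(strata):
        members = strata[key]
        # Same sampling fraction in every stratum keeps the proportions.
        n_pick = round(len(members) * fraction)
        chosen.extend(rng.sample(members, n_pick))
    return chosen

# Toy cohort of 540 samples; all values are invented for illustration.
cohort = [
    {
        "id": i,
        "status": "case" if i % 2 else "control",
        "ethnicity": ("A", "B", "C")[i % 3],
        "age_band": ("<60", "60+")[(i // 6) % 2],
    }
    for i in range(540)
]
subset = stratified_subset(
    cohort, ("status", "ethnicity", "age_band"), fraction=1 / 3
)
```

Because every stratum is sampled at the same rate, the case/control, ethnicity, and age-band proportions of the subset match those of the full cohort up to rounding.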
Samasy: an automated system for sample selection and robotic transfer
Sample automation and management is increasingly important as the number and size of population-scale and high-throughput projects grow. This is particularly the case in large-scale population studies where sample size is far outpacing the commonly used 96-well plate format. To facilitate management and transfer of samples in this format, we present Samasy, a web-based application for the construction of a sample database, intuitive display of sample and batch information, and facilitation of automated sample transfer or subsetting. Samasy is designed with ease-of-use in mind, can be quickly set up, and runs in any web browser.
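To make the 96-well plate format concrete, here is a small Python sketch that maps a running sample index to a plate number and well label, assuming the standard 8-row (A-H) by 12-column layout. The function names and the fill-order option are illustrative assumptions and are not taken from Samasy.

```python
import string

ROWS = string.ascii_uppercase[:8]        # rows A-H
N_COLS = 12                              # columns 1-12
WELLS_PER_PLATE = len(ROWS) * N_COLS     # 96 wells

def plate_position(sample_index, column_major=False):
    """Map a 0-based sample index to (1-based plate number, well label)."""
    plate, offset = divmod(sample_index, WELLS_PER_PLATE)
    if column_major:
        # Fill A1, B1, ..., H1, then A2, ...
        col, row = divmod(offset, len(ROWS))
    else:
        # Fill A1, A2, ..., A12, then B1, ...
        row, col = divmod(offset, N_COLS)
    return plate + 1, f"{ROWS[row]}{col + 1}"

def plates_needed(n_samples):
    """Number of 96-well plates required to hold n_samples (ceiling division)."""
    return -(-n_samples // WELLS_PER_PLATE)
```

For example, `plate_position(96)` returns `(2, "A1")`, and a 16,600-sample selection like the one mentioned in the dissertation abstract above would need `plates_needed(16600)`, that is 173 plates, if every well were used.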
Early gene expression differences in inbred mouse strains with susceptibility to pulmonary adenomas.
Lung cancer is the most common cause of cancer-related deaths in both men and women, and effective preventatives are rare due to the difficulty of early detection. Specific gene expression signatures have been identified in individuals who have already developed lung cancer. To determine whether gene expression differences could be detected in individuals before the onset of the disease, we obtained lung tissues for microarray analysis from young, healthy mice of 9 inbred strains with known differences in their susceptibility to spontaneous pulmonary adenomas when aged. We found that the most common differentially expressed genes among all 36 possible strain comparisons showed significant associations with cancer- and inflammation-related processes. Significant expression differences between susceptible and resistant strains were detected for Aldh3a1, Cxcr1 and Cxcr7, Dpt, and Nptx1 (genes with known cancer-related functions), and for Cd209, Cxcr1 and Cxcr7, and Pla2g1b (genes with known inflammation-related functions). Whereas Aldh3a1, Cd209, Dpt, and Pla2g1b had increased expression, Cxcr1 and Cxcr7, and Nptx1 had decreased expression in strains susceptible to pulmonary adenomas. Thus, our study shows that expression differences between susceptible and resistant strains can be detected in young and healthy mice without manifestation of pulmonary adenomas and may therefore provide an opportunity for early detection. Finally, the identified genes have previously been reported for human non-small cell lung cancer, suggesting that molecular pathways may be shared between these two cancer types.
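The figure of 36 strain comparisons follows from taking every unordered pair among the 9 strains, i.e. 9 × 8 / 2 = 36. A short Python check, using placeholder strain labels because the abstract does not name the strains:

```python
from itertools import combinations
from math import comb

# Placeholder labels; the actual strain names are not given in the abstract.
strains = [f"strain_{i}" for i in range(1, 10)]

# Every unordered pair of distinct strains is one comparison.
pairs = list(combinations(strains, 2))
n_comparisons = len(pairs)

# The count agrees with the binomial coefficient C(9, 2).
assert n_comparisons == comb(len(strains), 2) == 36
```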
A major X-linked locus affects kidney function in mice.
Chronic kidney disease is a common disease with increasing prevalence in the western population. One common reason for chronic kidney failure is diabetic nephropathy. Diabetic nephropathy and hyperglycemia are characteristics of the mouse inbred strain KK/HlJ, which is predominantly used as a model for metabolic syndrome due to its inherited glucose intolerance and insulin resistance. We used KK/HlJ, an albuminuria-sensitive strain, and C57BL/6J, an albuminuria-resistant strain, to perform a quantitative trait locus (QTL) cross to identify the genetic basis for chronic kidney failure. Albumin-creatinine ratio (ACR) was measured in 130 F2 male offspring. One significant QTL was identified on chromosome (Chr) X and four suggestive QTL were found on Chrs 6, 7, 12, and 13. Narrowing of the QTL region was focused on the X-linked QTL and performed by incorporating genotype and expression analyses for genes located in the region. From the 485 genes identified in the X-linked QTL region, a few candidate genes were identified using a combination of bioinformatic evidence based on genomic comparison of the parental strains and known function in urine homeostasis. Finally, this study demonstrates the significance of the X chromosome in the genetic determination of albuminuria.
A machine learning approach to optimizing cell-free DNA sequencing panels: with an application to prostate cancer.
Background: The use of cell-free DNA (cfDNA) as a biomarker in cancer is challenging due to the genetic heterogeneity of malignancies and the rarity of tumor-derived molecules. Here we describe and demonstrate a novel machine-learning guided panel design strategy for improving the detection of tumor variants in cfDNA. Using this approach, we first generated a model to classify and score candidate variants for inclusion on a prostate cancer targeted sequencing panel. We then used this panel to screen tumor variants from prostate cancer patients with localized disease in both in silico and hybrid capture settings. Methods: Whole-genome sequence (WGS) data from 550 prostate tumors were analyzed to build a targeted sequencing panel of single point and small (< 200 bp) indel mutations, which was subsequently screened in silico against prostate tumor sequences from 5 patients to assess performance against commonly used alternative panel designs. The panel's ability to detect tumor-derived cfDNA variants was then assessed using prospectively collected cfDNA and tumor foci from a test set of 18 prostate cancer patients with localized disease undergoing radical prostatectomy. Results: The panel generated from this approach identified as top candidates mutations in known driver genes (e.g. HRAS) and prostate cancer related transcription factor binding sites (e.g. MYC, AR). It outperformed two commonly used designs in detecting somatic mutations found in the cfDNA of 5 prostate cancer patients when analyzed in an in silico setting. Additionally, hybrid capture and 2500X sequencing of cfDNA molecules using the panel resulted in detection of tumor variants in all 18 patients of a test set, where 15 of the 18 patients had detected variants found in multiple foci. Conclusion: Machine learning-prioritized targeted sequencing panels may prove useful for broad and sensitive variant detection in the cfDNA of heterogeneous diseases.
This strategy has implications for disease detection and monitoring when applied to the cfDNA isolated from prostate cancer patients.