145 research outputs found

    Statistical identification of gene association by CID in application of constructing ER regulatory network

    Background: A variety of high-throughput techniques are now available for constructing comprehensive gene regulatory networks in systems biology. In this study, we report a new statistical approach for facilitating in silico inference of regulatory network structure. The new measure of association, coefficient of intrinsic dependence (CID), is model-free and can be applied to both continuous and categorical distributions. When given two variables X and Y, CID answers whether Y is dependent on X by examining the conditional distribution of Y given X. In this paper, we apply CID to analyze the regulatory relationships between transcription factors (TFs) (X) and their downstream genes (Y) based on clinical data. More specifically, we use estrogen receptor α (ERα) as the variable X, and the analyses are based on 48 clinical breast cancer gene expression arrays (48A). Results: The analytical utility of CID was evaluated in comparison with four commonly used statistical methods: Galton-Pearson's correlation coefficient (GPCC), Student's t-test (STT), coefficient of determination (CoD), and mutual information (MI). Compared with GPCC, CoD, and MI, CID is better able to discover regulatory associations where the distribution of the mRNA expression levels of X and Y does not fit linear models. On the other hand, when CID is used to measure the association of a continuous variable (Y) against a discrete variable (X), it performs similarly to STT and appears to outperform CoD and MI. In addition, this study established a two-layer transcriptional regulatory network to exemplify the usage of CID, in combination with GPCC, in deciphering gene networks based on gene expression profiles from patient arrays. Conclusion: CID is shown to provide useful information for identifying associations between genes and transcription factors of interest in patient arrays. When coupled with the relationships detected by GPCC, the associations predicted by CID are applicable to the construction of transcriptional regulatory networks. This study shows how information from different data sources and learning algorithms can be integrated to investigate whether relevant regulatory mechanisms identified in cell models can also be partially re-identified in clinical samples of breast cancers. Availability: The implementation of CID in R code can be freely downloaded from http://homepage.ntu.edu.tw/~lyliu/BC/.

    Construction and characterization of an expressed sequenced tag library for the mosquito vector Armigeres subalbatus

    Background: The mosquito, Armigeres subalbatus, mounts a distinctively robust innate immune response when infected with the nematode Brugia malayi, a causative agent of lymphatic filariasis. In order to mine the transcriptome for new insight into the cascade of events that takes place in response to infection in this mosquito, 6 cDNA libraries were generated from tissues of adult female mosquitoes subjected to immune-response activation treatments that lead to well-characterized responses, and from aging, naïve mosquitoes. Expressed sequence tags (ESTs) from each library were produced, annotated, and subjected to comparative analyses. Results: Six libraries were constructed and used to generate 44,940 expressed sequence tags, of which 38,079 passed quality filters to be included in the annotation project and subsequent analyses. All of these sequences were collapsed into clusters resulting in 8,020 unique sequence clusters or singletons. EST clusters were annotated and curated manually within ASAP (A Systematic Annotation Package for Community Analysis of Genomes) web portal according to BLAST results from comparisons to Genbank, and the Anopheles gambiae and Drosophila melanogaster genome projects. Conclusion: The resulting dataset is the first of its kind for this mosquito vector and provides a basis for future studies of mosquito vectors regarding the cascade of events that occurs in response to infection, and thereby providing insight into vector competence and innate immunity.

    Cardiac Myosin Binding Protein C and MAP-Kinase Activating Death Domain-Containing Gene Polymorphisms and Diastolic Heart Failure

    OBJECTIVE: Myosin binding protein C (MYBPC3) plays a role in ventricular relaxation. The aim of the study was to investigate the association between cardiac myosin binding protein C (MYBPC3) gene polymorphisms and diastolic heart failure (DHF) in a human case-control study. METHODS: A total of 352 participants out of 1752 consecutive patients from the National Taiwan University Hospital and its affiliated hospital were enrolled. 176 patients diagnosed with DHF confirmed by echocardiography were recruited. Controls were matched 1-to-1 by age, sex, hypertension, diabetes, renal function and medication use. We genotyped 12 single nucleotide polymorphisms (SNPs) according to the HapMap Han Chinese Beijing databank across a 40 kb genetic region containing the MYBPC3 gene and the neighboring DNA sequences to capture 100% of haplotype variance in all SNPs with minor allele frequencies ≥ 5%. We also analyzed associations of these tagging SNPs and haplotypes with DHF and the linkage disequilibrium (LD) structure of the MYBPC3 gene. RESULTS: In a single locus analysis, SNP rs2290149 was associated with DHF (allele-specific p = 0.004; permuted p = 0.031). The SNP, with a minor allele frequency of 9.4%, had an odds ratio of 2.14 (95% CI 1.25-3.66; p = 0.004) for the additive model and 2.06 for the autosomal dominant model (GG+GA : AA, 95% CI 1.17-3.63; p = 0.013), corresponding to a population attributable risk fraction of 12.02%. The haplotype in an LD block of rs2290149 (C-C-G-C) was also significantly associated with DHF (odds ratio 2.10 (1.53-2.89); permuted p = 0.029). CONCLUSIONS: We identified a SNP (rs2290149) among the tagging SNP set that was significantly associated with early DHF in a Chinese population.

    Serine Protease PRSS23 Is Upregulated by Estrogen Receptor α and Associated with Proliferation of Breast Cancer Cells

    Serine protease PRSS23 is a newly discovered protein that has been associated with tumor progression in various types of cancers. Interestingly, PRSS23 is coexpressed with estrogen receptor α (ERα), which is a prominent biomarker and therapeutic target for human breast cancer. Estrogen signaling through ERα is also known to affect cell proliferation, apoptosis, and survival, which promotes tumorigenesis by regulating the production of numerous downstream effector proteins

    Identification of IGF1, SLC4A4, WWOX, and SFMBT1 as Hypertension Susceptibility Genes in Han Chinese with a Genome-Wide Gene-Based Association Study

    Hypertension is a complex disorder with high prevalence rates all over the world. We conducted the first genome-wide gene-based association scan for hypertension in a Han Chinese population. By analyzing genome-wide single-nucleotide-polymorphism data of 400 matched pairs of young-onset hypertensive patients and normotensive controls genotyped with the Illumina HumanHap550-Duo BeadChip, 100 susceptibility genes for hypertension were identified and also validated with permutation tests. Seventeen of the 100 genes exhibited differential allelic and expression distributions between patient and control groups. These genes provided a good molecular signature for classifying hypertensive patients and normotensive controls. Among the 17 genes, IGF1, SLC4A4, WWOX, and SFMBT1 were not only identified by our gene-based association scan and gene expression analysis but were also replicated by a gene-based association analysis of the Hong Kong Hypertension Study. Moreover, cis-acting expression quantitative trait loci associated with the differentially expressed genes were found and linked to hypertension. IGF1, which encodes insulin-like growth factor 1, is associated with cardiovascular disorders, metabolic syndrome, decreased body weight/size, and changes of insulin levels in mice. SLC4A4, which encodes the electrogenic sodium bicarbonate cotransporter 1, is associated with decreased body weight/size and abnormal ion homeostasis in mice. WWOX, which encodes the WW domain-containing protein, is related to hypoglycemia and hyperphosphatemia. SFMBT1, which encodes the scm-like with four MBT domains protein 1, is a novel hypertension gene. GRB14, TMEM56 and KIAA1797 exhibited highly significant differential allelic and expressed distributions between hypertensive patients and normotensive controls. GRB14 was also found relevant to blood pressure in a previous genetic association study in East Asian populations. 
    TMEM56 and KIAA1797 may be specific to Taiwanese populations, because they were not validated by the two replication studies. Identification of these genes enriches the collection of hypertension susceptibility genes, thereby shedding light on the etiology of hypertension in Han Chinese populations.

    Meta-analysis of genome-wide association studies identifies eight new loci for type 2 diabetes in east Asians

    We conducted a three-stage genetic study to identify susceptibility loci for type 2 diabetes (T2D) in east Asian populations. We followed our stage 1 meta-analysis of eight T2D genome-wide association studies (6,952 cases with T2D and 11,865 controls) with a stage 2 in silico replication analysis (5,843 cases and 4,574 controls) and a stage 3 de novo replication analysis (12,284 cases and 13,172 controls). The combined analysis identified eight new T2D loci reaching genome-wide significance, which mapped in or near GLIS3, PEPD, FITM2-R3HDML-HNF4A, KCNK16, MAEA, GCC1-PAX4, PSMD6 and ZFAND3. GLIS3, which is involved in pancreatic beta cell development and insulin gene expression [1,2], is known for its association with fasting glucose levels [3,4]. The evidence of an association with T2D for PEPD [5] and HNF4A [6,7] has been shown in previous studies. KCNK16 may regulate glucose-dependent insulin secretion in the pancreas. These findings, derived from an east Asian population, provide new perspectives on the etiology of T2D.

    Genetic Drivers of Heterogeneity in Type 2 Diabetes Pathophysiology

    Type 2 diabetes (T2D) is a heterogeneous disease that develops through diverse pathophysiological processes [1,2] and molecular mechanisms that are often specific to cell type [3,4]. Here, to characterize the genetic contribution to these processes across ancestry groups, we aggregate genome-wide association study data from 2,535,601 individuals (39.7% not of European ancestry), including 428,452 cases of T2D. We identify 1,289 independent association signals at genome-wide significance (P < 5 × 10⁻⁸) that map to 611 loci, of which 145 loci are, to our knowledge, previously unreported. We define eight non-overlapping clusters of T2D signals that are characterized by distinct profiles of cardiometabolic trait associations. These clusters are differentially enriched for cell-type-specific regions of open chromatin, including pancreatic islets, adipocytes, endothelial cells and enteroendocrine cells. We build cluster-specific partitioned polygenic scores [5] in a further 279,552 individuals of diverse ancestry, including 30,288 cases of T2D, and test their association with T2D-related vascular outcomes. Cluster-specific partitioned polygenic scores are associated with coronary artery disease, peripheral artery disease and end-stage diabetic nephropathy across ancestry groups, highlighting the importance of obesity-related processes in the development of vascular outcomes. Our findings show the value of integrating multi-ancestry genome-wide association study data with single-cell epigenomics to disentangle the aetiological heterogeneity that drives the development and progression of T2D. This might offer a route to optimize global access to genetically informed diabetes care.
