23 research outputs found
A novel tissue-specific meta-analysis approach for gene expression predictions, initiated with a mammalian gene expression testis database
<p>Abstract</p> <p>Background</p> <p>In recent years, the number of gene expression profiling reports has risen. Unfortunately, it has not been possible to make full use of the available gene expression data. Many databases and programs can be used to derive the possible expression patterns of mammalian genes, based on existing data. However, these available resources have limitations. For example, it is not possible to obtain a list of genes that are expressed in certain conditions. To overcome such limitations, we have adopted a new strategy to predict gene expression patterns using available information, for one tissue at a time.</p> <p>Results</p> <p>The first step of this approach involved manual collection of as much data as possible from large-scale (genome-wide) gene expression studies pertaining to mammalian testis. These data have been compiled into a Mammalian Gene Expression Testis-database (MGEx-Tdb). This process resulted in a richer collection of gene expression data, for multiple testicular conditions, than is available in other databases/resources. The gene lists collected this way were in turn used to derive a 'consensus' expression status for each gene, across studies. The expression information obtained from the newly developed database mostly agreed with results from multiple small-scale studies on selected genes. A comparative analysis showed that MGEx-Tdb can retrieve gene expression information more efficiently than other commonly used databases. It can provide a clear expression status (transcribed or dormant) for most genes in the testis tissue, under several specific physiological/experimental conditions and/or cell types.</p> <p>Conclusions</p> <p>Manual compilation of gene expression data, which can be a painstaking process, followed by determination of a consensus expression status for specific locations and conditions, can be a reliable way of using the existing data to predict gene expression patterns. 
MGEx-Tdb provides expression information for 14 different combinations of specific locations and conditions in humans (25,158 genes), 79 in mice (22,919 genes) and 23 in rats (14,108 genes). It is also the first system that can predict expression of genes with a 'reliability-score', which is calculated based on the extent of agreements and contradictions across gene-sets/studies. This new platform is publicly available at the following web address: <url>http://resource.ibab.ac.in/MGEx-Tdb/</url></p>
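The abstract does not state the formula behind the 'reliability-score', so the following is only a minimal sketch of how a consensus status and an agreement-based score could be derived from per-study calls. The function name `consensus_status`, the status labels, and the scoring rule (agreements minus contradictions, divided by the number of reports) are assumptions made for illustration and are not taken from MGEx-Tdb.

```python
from collections import Counter

VALID_CALLS = {"transcribed", "dormant"}


def consensus_status(calls):
    """Return (status, score) for one gene from per-study expression calls.

    `calls` holds one "transcribed" or "dormant" string per study/gene-set.
    The score is a hypothetical stand-in for a reliability score:
    (agreements - contradictions) / number of reports, in the range 0..1.
    """
    unknown = set(calls) - VALID_CALLS
    if unknown:
        raise ValueError(f"unexpected call labels: {sorted(unknown)}")
    if not calls:
        return ("unknown", 0.0)

    counts = Counter(calls)
    transcribed = counts["transcribed"]
    dormant = counts["dormant"]
    if transcribed == dormant:
        # Evenly split evidence: no consensus can be declared.
        return ("ambiguous", 0.0)

    status = "transcribed" if transcribed > dormant else "dormant"
    agreements = max(transcribed, dormant)
    contradictions = min(transcribed, dormant)
    return (status, (agreements - contradictions) / len(calls))
```

Under this assumed rule, a gene reported as transcribed by three studies and dormant by one would be called transcribed with a score of 0.5, while unanimous reports would give a score of 1.0.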
MGEx-Udb: A Mammalian Uterus Database for Expression-Based Cataloguing of Genes across Conditions, Including Endometriosis and Cervical Cancer
Gene expression profiling of uterus tissue has been performed in various contexts, but a significant amount of the data remains underutilized because it is not covered by the existing general resources. MGEx-Udb can be queried with gene names/IDs, sub-tissue locations, and various conditions such as cervical cancer, endometrial cycles and disorders, and experimental treatments. The output is either (a) the transcribed and dormant genes listed for the queried condition/location, or (b) the expression profile of the gene of interest in various uterine conditions. The results also include the reliability score for the expression status of each gene. MGEx-Udb also provides information related to Gene Ontology annotations, protein-protein interactions, transcripts, promoters, and expression status by other sequencing techniques, and facilitates various other types of analysis of individual genes or co-expressed gene clusters. In brief, MGEx-Udb enables easy cataloguing of co-expressed genes and also facilitates biomarker discovery for various uterine conditions.
Schematic representation of MGEx-Udb.
<p>The figure represents the data collection (top portion), architecture (central portion) and operation (bottom portion) of the database.</p>
Datasets
<p>(<b>with gene count</b>) <b>collected from various sources.</b> In case of “PubMed & GEO” and “PubMed & ArrayExpress”, smaller gene lists came from validation experiments and were collected from PubMed, while raw/processed data were always collected from the repositories (GEO/ArrayExpress).</p>
Number of datasets
<p>(<b>and studies</b>) <b>in MGEx-Udb corresponding to various physiological and pathological uterine conditions.</b> ‘Others’ represents post-parturition, genetic-ablation, artificial insemination and embryo implantation. Studies of tissues that are used as controls but may not be absolutely ‘normal’ have been grouped in the <i>‘may be normal’</i> category (examples: “normal tissue adjacent to tumor/cancer tissue”, “vehicle-treated”).</p>
Example hierarchy of the conditions and sub-conditions.
<p>An example <i>(“stage IIA non-keratinizing squamous cell cervical carcinoma”)</i> hierarchy of the conditions and sub-conditions for which data have been collected and for which drop-down options are provided in the query and upload pages of MGEx-Udb. Currently, the database allows queries across up to four levels of the hierarchy.</p>
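A four-level condition hierarchy of this kind can be pictured as a nested mapping that drives the drop-down options. The sketch below is illustrative only: the way the example condition is split into levels, the `HIERARCHY` contents, and the helper names `resolve` and `options` are all assumptions, not the actual MGEx-Udb schema.

```python
MAX_LEVELS = 4  # the caption states that up to four levels can be queried

# Hypothetical decomposition of the example condition into four levels.
HIERARCHY = {
    "cervical cancer": {
        "squamous cell carcinoma": {
            "non-keratinizing": {
                "stage IIA": {},
            },
        },
    },
}


def resolve(path, tree=HIERARCHY):
    """Return the sub-tree reached by `path`, or None if a level is missing."""
    if len(path) > MAX_LEVELS:
        raise ValueError(f"at most {MAX_LEVELS} levels can be queried")
    node = tree
    for level in path:
        if level not in node:
            return None
        node = node[level]
    return node


def options(path):
    """Return the sorted drop-down choices available below `path`."""
    node = resolve(path)
    return sorted(node) if node is not None else []
```

For instance, `options(["cervical cancer"])` would list the sub-conditions one level down, and an empty list signals that no further refinement is available.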
Source of data across various mammalian species in MGEx-Udb.
<p>Other species include cow and pig. Among the data collected from GEO or “PubMed & GEO”, 85% of the studies were also present in ArrayExpress, even though this is not indicated in the figure.</p>
Identification of Potential Biomarkers for Group I Pulmonary Hypertension Based on Machine Learning and Bioinformatics Analysis
A number of processes and pathways have been reported in the development of Group I pulmonary hypertension (Group I PAH); however, novel biomarkers need to be identified for better diagnosis and management. We employed a robust rank aggregation (RRA) algorithm to shortlist the key differentially expressed genes (DEGs) between Group I PAH patients and controls. An optimal diagnostic model was obtained by comparing seven machine learning algorithms and was verified in an independent dataset. The functional roles of key DEGs and biomarkers were analyzed using various in silico methods. Finally, the biomarkers and a set of key candidates were experimentally validated using patient samples and a cell line model. A total of 48 key DEGs with preferable diagnostic value were identified. A gradient boosting decision tree algorithm was utilized to build a diagnostic model with three biomarkers, PBRM1, CA1, and TXLNG. An immune-cell infiltration analysis revealed significant differences in the relative abundances of seven immune cells between controls and PAH patients and a correlation with the biomarkers. Experimental validation confirmed the upregulation of the three biomarkers in Group I PAH patients. In conclusion, machine learning and bioinformatics analysis, along with experimental techniques, identified PBRM1, CA1, and TXLNG as potential biomarkers for Group I PAH.
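The general idea of combining ranked gene lists from several datasets can be sketched with a much simpler technique than RRA. The example below uses plain mean normalized rank, which is a deliberate substitute and not the RRA algorithm used in the study; the function name `aggregate_ranks`, the rule for genes missing from a list, and the placeholder gene labels in the usage note are assumptions for illustration.

```python
def aggregate_ranks(ranked_lists):
    """Combine ranked gene lists by mean normalized rank (lower is better).

    `ranked_lists` is a list of gene-symbol lists, each ordered best first.
    A gene absent from a list is given the worst normalized rank (1.0).
    This is a simple stand-in for illustration, not the RRA algorithm.
    """
    genes = set().union(*ranked_lists)
    scores = {}
    for gene in genes:
        total = 0.0
        for ranking in ranked_lists:
            if gene in ranking:
                # Position 1 of n maps to 1/n, the last position maps to 1.0.
                total += (ranking.index(gene) + 1) / len(ranking)
            else:
                total += 1.0
        scores[gene] = total / len(ranked_lists)
    # Sort by score, then by name so that ties are ordered deterministically.
    return sorted(scores.items(), key=lambda item: (item[1], item[0]))
```

Calling `aggregate_ranks([["A", "B", "C"], ["B", "A", "C"], ["A", "C", "B"]])` with placeholder labels places `"A"` first, because it sits at or near the top of every list.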
Endometrial Receptivity: A Revisit to Functional Genomics Studies on Human Endometrium and Creation of HGEx-ERdb
<div><p>Background</p><p>The endometrium acquires structural and functional competence for embryo implantation only during the receptive phase of the menstrual cycle in fertile women. Sizeable data indicate that this ability is acquired by modulation of the expression of several genes/gene products. However, there is little consensus on the identity and number of expressed/not-detected genes or on their pattern of expression (up or down regulation).</p> <p>Methods</p><p>A literature search was carried out to retrieve data on endometrial expression of genes/proteins in various conditions. Data were compiled to generate a comprehensive database, the Human Gene Expression Endometrial Receptivity database (HGEx-ERdb). The database was used to identify the Receptivity Associated Genes (RAGs), which display a similar pattern of expression across different investigations. Transcript levels of select RAGs encoding cell adhesion proteins were compared between two human endometrial epithelial cell lines, RL95-2 and HEC-1-A, by quantitative real time polymerase chain reaction (q-RT-PCR). Further, select RAGs were investigated for their expression in pre-receptive (n = 4) and receptive phase (n = 4) human endometrial tissues by immunohistochemical studies. JAr spheroid attachment assays were carried out to assess the functional significance of two RAGs.</p> <p>Results</p><p>HGEx-ERdb (<a href="http://resource.ibab.ac.in/HGEx-ERdb/" target="_blank">http://resource.ibab.ac.in/HGEx-ERdb/</a>) helped identify 179 RAGs, of which 151 genes were consistently expressed and upregulated and 28 consistently not-detected and downregulated in the receptive phase as compared to the pre-receptive phase. q-RT-PCR confirmed significantly higher (p<0.005) expression of Thrombospondin1 (THBS1), CD36 and Mucin 16 transcripts in RL95-2 as compared to HEC-1-A. 
Further, pretreatment with antibodies against CD36 and COMP led to a reduction in the percentage of JAr spheroids attached to RL95-2. Immunohistochemical studies demonstrated significantly higher (p<0.05) expression of endometrial THBS1, Cartilage Oligomeric Matrix Protein (COMP) and CD36 in the receptive phase as compared to pre-receptive phase human endometrial tissues.</p> <p>Conclusion</p><p>HGEx-ERdb is a catalogue of 19,285 genes reported for their expression in human endometrium. Further, 179 genes were identified as RAGs. Expression analysis of some RAGs validated the utility of the approach employed in the creation of HGEx-ERdb. Studies aimed at defining the specific functions of RAGs and their potential networks may yield relevant information about the major ‘nodes’ which regulate endometrial receptivity.</p> </div>