22 research outputs found

    In search of the right literature search engine(s)

    *Background*
Collecting scientific publications related to a specific topic is crucial for different phases of research, health care and ‘effective text mining’. Available bio-literature search engines vary in their ability to scan different sections of articles for user-provided search terms and/or phrases. Since a thorough scientific analysis of all major bibliographic tools has not been done, their selection has often remained subjective. We considered most of the existing bio-literature search engines (http://www.shodhaka.com/startbioinfo/LitSearch.html) and performed an extensive analysis of 18 literature search engines over a period of about 3 years. Eight different topics were chosen and about 50 searches were performed using the selected search engines. The relevance of retrieved citations was carefully assessed after every search to estimate the citation retrieval efficiency. Various other features of the search tools were also compared using a semi-quantitative method.
*Results*
The study provides the first tangible comparative account of the relative retrieval efficiency, input and output features, resource coverage and a few other utilities of the bio-literature search tools. The results show that using a single search tool can lead to the loss of up to 75% of relevant citations in some cases. Hence, the use of multiple search tools is recommended. However, it would not be practical to use all, or too many, search engines. The detailed observations made in the study can assist researchers and health professionals in making a more objective selection among the search engines. A corollary study revealed the relative advantages and disadvantages of the full-text scanning tools.
*Conclusion*
While many studies have attempted to compare literature search engines, important questions have remained unanswered to date. The following are some of those questions, along with the answers provided by the current study:
a)	Which tools should be used to get the maximum number of relevant citations with a reasonable effort? ANSWER: _Using PubMed, Scopus, Google Scholar and HighWire Press individually, and then compiling the hits into a union list, is the best option. Citation-Compiler (http://www.shodhaka.com/compiler) can help to compile the results from each of the recommended tools._
b)	What is the approximate percentage of relevant citations expected to be lost if only one search engine is used? ANSWER: _About 39% of the total relevant citations were lost in searches across 4 topics; 49% of hits were lost when using PubMed or HighWire Press, while losses of 37% and 20% were observed when using Google Scholar and Scopus, respectively._
c)	Which full-text search engines can be recommended in general? ANSWER: _HighWire Press and Google Scholar._
d)	Among the most used search engines, which one can be recommended for best precision? ANSWER: _EBIMed._
e)	Among the most used search engines, which one can be recommended for best recall? ANSWER: _Depending on the type of query used, best recall could be obtained by HighWire Press or Scopus._
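
As a rough illustration of the union-list idea in answer (a) and the kind of loss figure discussed in answer (b), the sketch below pools the relevant hits from several engines and reports the share of the pooled citations that each engine alone would miss. The citation IDs and per-engine hit sets are made-up placeholders, not data from the study, and the function names `union_of_hits` and `loss_per_engine` are hypothetical. It does not reproduce Citation-Compiler.

```python
def union_of_hits(hits_by_engine):
    """Pool the relevant citations from all engines into one de-duplicated set."""
    pooled = set()
    for hits in hits_by_engine.values():
        pooled |= set(hits)
    return pooled


def loss_per_engine(hits_by_engine):
    """Percentage of the pooled relevant citations that each engine alone misses."""
    pooled = union_of_hits(hits_by_engine)
    if not pooled:
        # Nothing relevant was found by any engine, so nothing can be lost.
        return {engine: 0.0 for engine in hits_by_engine}
    return {
        engine: 100.0 * len(pooled - set(hits)) / len(pooled)
        for engine, hits in hits_by_engine.items()
    }


# Hypothetical relevant hits (citation IDs) for one topic; NOT data from the study.
example_hits = {
    "PubMed": {"c1", "c2", "c3"},
    "Scopus": {"c2", "c3", "c4", "c5"},
    "Google Scholar": {"c1", "c4"},
    "HighWire Press": {"c5", "c6"},
}

if __name__ == "__main__":
    for engine, loss in loss_per_engine(example_hits).items():
        print(f"{engine}: {loss:.0f}% of pooled relevant citations missed")
```

Sets are used so that a citation returned by several engines is counted once in the union list. In practice the hard part is matching records across engines, which this sketch sidesteps by assuming each citation already has a shared ID.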

    A novel tissue-specific meta-analysis approach for gene expression predictions, initiated with a mammalian gene expression testis database

    *Background*
In recent years, there has been a rise in gene expression profiling reports. Unfortunately, it has not been possible to make maximum use of the available gene expression data. Many databases and programs can be used to derive the possible expression patterns of mammalian genes, based on existing data. However, these available resources have limitations. For example, it is not possible to obtain a list of genes that are expressed in certain conditions. To overcome such limitations, we have adopted a new strategy to predict gene expression patterns using available information, for one tissue at a time.
*Results*
The first step of this approach involved manual collection of as much data as possible from large-scale (genome-wide) gene expression studies pertaining to mammalian testis. These data have been compiled into a Mammalian Gene Expression Testis-database (MGEx-Tdb). This process resulted in a richer collection of gene expression data compared to other databases/resources, for multiple testicular conditions. The gene lists collected this way were in turn used to derive a 'consensus' expression status for each gene, across studies. The expression information obtained from the newly developed database mostly agreed with results from multiple small-scale studies on selected genes. A comparative analysis showed that MGEx-Tdb can retrieve gene expression information more efficiently than other commonly used databases. It can provide a clear expression status (transcribed or dormant) for most genes in the testis tissue, under several specific physiological/experimental conditions and/or cell types.
*Conclusions*
Manual compilation of gene expression data, which can be a painstaking process, followed by determination of a consensus expression status for specific locations and conditions, can be a reliable way of using existing data to predict gene expression patterns. MGEx-Tdb provides expression information for 14 different combinations of specific locations and conditions in humans (25,158 genes), 79 in mice (22,919 genes) and 23 in rats (14,108 genes). It is also the first system that can predict expression of genes with a 'reliability-score', which is calculated based on the extent of agreements and contradictions across gene-sets/studies. This new platform is publicly available at the following web address: http://resource.ibab.ac.in/MGEx-Tdb/
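
The abstract does not give the formula behind the 'consensus' status or the 'reliability-score', so the following is only a minimal sketch of one way such a score could be built from agreements and contradictions across studies. The majority-vote rule, the score (agreeing minus contradicting calls, divided by total calls) and the function name `consensus_status` are all assumptions made for illustration, not the method used by MGEx-Tdb.

```python
from collections import Counter


def consensus_status(calls):
    """Return (status, score) for one gene in one condition.

    `calls` is a list of per-study calls, each "transcribed" or "dormant".
    ASSUMPTION: score = |transcribed - dormant| / total calls, chosen for
    illustration only; the abstract does not specify the real formula.
    """
    if not calls:
        return ("unknown", 0.0)
    counts = Counter(calls)
    transcribed = counts.get("transcribed", 0)
    dormant = counts.get("dormant", 0)
    total = transcribed + dormant
    if total == 0 or transcribed == dormant:
        # No usable calls, or studies split evenly: no consensus.
        return ("uncertain", 0.0)
    status = "transcribed" if transcribed > dormant else "dormant"
    score = abs(transcribed - dormant) / total
    return (status, score)


if __name__ == "__main__":
    # Hypothetical calls from five studies for a single gene and condition.
    example_calls = ["transcribed"] * 4 + ["dormant"]
    print(consensus_status(example_calls))
```

Under this assumed scoring, full agreement across studies gives a score of 1.0 and an even split gives 0.0, so contradictions lower the score without discarding the gene.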

    MGEx-Udb: A Mammalian Uterus Database for Expression-Based Cataloguing of Genes across Conditions, Including Endometriosis and Cervical Cancer

    Gene expression profiling of uterus tissue has been performed in various contexts, but a significant amount of the data remains underutilized as it is not covered by the existing general resources. MGEx-Udb can be queried with gene names/IDs, sub-tissue locations, and various conditions such as cervical cancer, endometrial cycles and disorders, and experimental treatments. Accordingly, the output is either a) the transcribed and dormant genes listed for the queried condition/location, or b) the expression profile of the gene of interest in various uterine conditions. The results also include the reliability score for the expression status of each gene. MGEx-Udb also provides information related to Gene Ontology annotations, protein-protein interactions, transcripts, promoters, and expression status by other sequencing techniques, and facilitates various other types of analysis of individual genes or co-expressed gene clusters. In brief, MGEx-Udb enables easy cataloguing of co-expressed genes and also facilitates bio-marker discovery for various uterine conditions.

    Example hierarchy of the conditions and sub-conditions.

    An example (“stage IIA non-keratinizing squamous cell cervical carcinoma”) hierarchy of the conditions and sub-conditions for which data have been collected and drop-down options provided in the query and upload pages of MGEx-Udb. Currently the database allows queries over up to four levels of the hierarchy.

    Datasets (with gene count) collected from various sources.

    In the case of “PubMed & GEO” and “PubMed & ArrayExpress”, smaller gene lists came from validation experiments and were collected from PubMed, while raw/processed data were always collected from the repositories (GEO/ArrayExpress).

    Number of datasets (and studies) in MGEx-Udb corresponding to various physiological and pathological uterine conditions.

    ‘Others’ represents post-parturition, genetic-ablation, artificial insemination and embryo implantation. Studies considering tissues that are used as controls but may not be absolutely ‘normal’ have been grouped in the ‘may be normal’ category (examples: “normal tissue adjacent to tumor/cancer tissue”, “vehicle-treated”).

    Schematic representation of MGEx-Udb.

    The figure represents the data collection (top portion), architecture (central portion) and operation (bottom portion) of the database.

    Source of data across various mammalian species in MGEx-Udb.

    Other species include cow and pig. Among the data collected from GEO or “PubMed & GEO”, 85% of the studies were also present in ArrayExpress, even though this is not indicated in the figure.

    Endometrial Receptivity: A Revisit to Functional Genomics Studies on Human Endometrium and Creation of HGEx-ERdb

    *Background*
Endometrium acquires structural and functional competence for embryo implantation only during the receptive phase of the menstrual cycle in fertile women. Sizeable data are available to indicate that this ability is acquired by modulation in the expression of several genes/gene products. However, there is little consensus on the identity and number of expressed/not-detected genes and their pattern of expression (up or down regulation).
*Methods*
A literature search was carried out to retrieve data on the endometrial expression of genes/proteins in various conditions. Data were compiled to generate a comprehensive database, the Human Gene Expression Endometrial Receptivity database (HGEx-ERdb). The database was used to identify the Receptivity Associated Genes (RAGs), which display a similar pattern of expression across different investigations. Transcript levels of select RAGs encoding cell adhesion proteins were compared between two human endometrial epithelial cell lines, RL95-2 and HEC-1-A, by quantitative real time polymerase chain reaction (q-RT-PCR). Further, select RAGs were investigated for their expression in pre-receptive (n = 4) and receptive phase (n = 4) human endometrial tissues by immunohistochemical studies. JAr spheroid attachment assays were carried out to assess the functional significance of two RAGs.
*Results*
HGEx-ERdb (http://resource.ibab.ac.in/HGEx-ERdb/) helped identify 179 RAGs, of which 151 genes were consistently expressed and upregulated and 28 were consistently not-detected and downregulated in the receptive phase as compared to the pre-receptive phase. q-RT-PCR confirmed significantly higher (p<0.005) expression of Thrombospondin1 (THBS1), CD36 and Mucin 16 transcripts in RL95-2 as compared to HEC-1-A. Further, pretreatment with antibodies against CD36 and COMP led to a reduction in the percentage of JAr spheroids attached to RL95-2. Immunohistochemical studies demonstrated significantly higher (p<0.05) expression of endometrial THBS1, Cartilage Oligomeric Matrix Protein (COMP) and CD36 in the receptive phase as compared to pre-receptive phase human endometrial tissues.
*Conclusion*
HGEx-ERdb is a catalogue of 19,285 genes reported for their expression in human endometrium. Further, 179 genes were identified as the RAGs. Expression analysis of some RAGs validated the utility of the approach employed in the creation of HGEx-ERdb. Studies aimed at defining the specific functions of RAGs and their potential networks may yield relevant information about the major ‘nodes’ which regulate endometrial receptivity.