
2,045 research outputs found

Cancer based pharmacogenomics network supported with scientific evidences: from the view of drug repurposing

Publication venue: BioMed Central
Publication date: 25/02/2015

TCMGeneDIT: a database for associated traditional Chinese medicine, gene and disease information using text mining

Author: A Bensoussan
AP Lu
C Cao
C Zhang
CD Manning
DR Swanson
G Wang
H Ogata
Hsin-Hsi Chen
Hsuan-Cheng Huang
Hsueh-Fen Juan
J Jiang
J Jiang
JT Cheng
K Chan
K Fundel
KC Cheng
L Hu
LA Ramshaw
M He
M Hewett
RA Erhardt
RL Strausberg
S Ananiadou
S Li
S Peri
WS Ko
X Chen
X Chen
X Qiao
X Zhou
Y Feng
Y Jiang
Y Mu
Y Tsuruoka
YC Fang
YC Hseu
Yu-Ching Fang
Z Wu
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Traditional Chinese Medicine (TCM), regarded as a complementary and alternative medical system in Western countries, has been used to treat various diseases for thousands of years in East Asian countries. In recent years, many herbal medicines have been found to exhibit a variety of effects by regulating the expression of a wide range of genes or the activities of proteins. As available TCM data continue to accumulate rapidly, there is an urgent need to explore these resources systematically so that the large volume of literature can be used effectively. Methods: TCM, gene, disease, biological pathway and protein-protein interaction information was collected from public databases. For association discovery, TCM names, gene names, disease names, TCM ingredients and effects were used to annotate the literature corpus obtained from PubMed. Entity associations were mined using hypothesis testing and collocation analysis. The annotated corpus was processed with natural language processing tools, and rule-based approaches were applied to the sentences to extract the relations between TCM effecters and effects. Results: We developed a database, TCMGeneDIT, to provide association information about TCMs, genes, diseases, TCM effects and TCM ingredients mined from a vast amount of biomedical literature. Integrated protein-protein interaction and biological pathway information is also available for exploring the regulation of genes associated with TCM curative effects. In addition, transitive relationships among genes, TCMs and diseases can be inferred through shared intermediates. Furthermore, TCMGeneDIT is useful for understanding the possible therapeutic mechanisms of TCMs via gene regulation and for deducing synergistic or antagonistic contributions of the prescription components to the overall therapeutic effects. The database is available at http://tcm.lifescience.ntu.edu.tw/.
Conclusion: TCMGeneDIT is a unique database that offers diverse association information on TCMs. It integrates TCMs with biomedical studies, which should facilitate clinical research and help elucidate the possible therapeutic mechanisms of TCMs and gene regulation.
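The abstract names hypothesis testing and collocation analysis but does not say which test was used. As a minimal sketch only, the following assumes a Pearson chi-square test on a 2x2 co-occurrence table; the function names, the entity labels and the toy corpus are hypothetical and are not taken from TCMGeneDIT.

```python
def cooccurrence_table(docs, a, b):
    """Count documents containing both, only a, only b, or neither entity.

    `docs` is an iterable of sets of annotated entity names (an assumed format).
    """
    both = a_only = b_only = neither = 0
    for entities in docs:
        has_a, has_b = a in entities, b in entities
        if has_a and has_b:
            both += 1
        elif has_a:
            a_only += 1
        elif has_b:
            b_only += 1
        else:
            neither += 1
    return both, a_only, b_only, neither


def chi_square_2x2(both, a_only, b_only, neither):
    """Pearson chi-square statistic for a 2x2 contingency table (1 degree of freedom)."""
    n = both + a_only + b_only + neither
    denom = ((both + a_only) * (b_only + neither)
             * (both + b_only) * (a_only + neither))
    if denom == 0:
        # A zero marginal means the test is undefined; report no evidence.
        return 0.0
    return n * (both * neither - a_only * b_only) ** 2 / denom


# Toy corpus: each set holds the entities annotated in one abstract.
docs = [{"ginseng", "TNF"}] * 3 + [{"ginseng"}, {"TNF"}] + [set()] * 3
table = cooccurrence_table(docs, "ginseng", "TNF")
print(table, chi_square_2x2(*table))
```

With one degree of freedom, a statistic above about 3.84 corresponds to p < 0.05, so a pair would be kept as a candidate association only above that threshold in this sketch.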

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

National Taiwan University Repository

Literature Based Discovery (LBD): Towards Hypothesis Generation and Knowledge Discovery in Biomedical Text Mining

Author: Bhasuran Balu
Murugesan Gurusamy
Natarajan Jeyakumar
Publication date: 03/10/2023

Biomedical knowledge is growing at an astounding pace, and the majority of it is represented as scientific publications. Text mining tools and methods are automatic approaches for extracting hidden patterns and trends from this semi-structured and unstructured data. In biomedical text mining, Literature Based Discovery (LBD) is the process of automatically discovering novel associations between medical terms that are otherwise mentioned in disjoint literature sets. LBD approaches have proven successful in reducing the discovery time of potential associations hidden in the vast amount of scientific literature. The process focuses on creating concept profiles for medical terms, such as a disease or symptom, and connecting them with a drug and treatment based on the statistical significance of the shared profiles. This knowledge discovery approach, introduced in 1989, remains a core task in text mining. Currently, the two approaches based on the ABC principle, namely open discovery and closed discovery, are the most explored in LBD. This review starts with a general introduction to text mining, followed by biomedical text mining, and introduces various literature resources such as MEDLINE, UMLS, MeSH, and SemMedDB. This is followed by a brief introduction to the core ABC principle and its two associated approaches, open discovery and closed discovery. The review also discusses deep learning applications in LBD by reviewing the role of transformer models and neural-network-based LBD models and their future aspects. Finally, it reviews the key biomedical discoveries generated through LBD approaches and concludes with the current limitations and future directions of LBD. Comment: 43 pages, 5 figures, 4 tables.
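The ABC principle can be illustrated with a small sketch. It assumes term pairs that co-occur in the literature are already available, and it ranks candidates by the number of shared intermediate terms, which is a simplification of the statistical scoring the abstract mentions. The function names and the toy pairs are hypothetical.

```python
from collections import defaultdict


def build_neighbours(pairs):
    """Map every term to the set of terms it co-occurs with."""
    neighbours = defaultdict(set)
    for x, y in pairs:
        neighbours[x].add(y)
        neighbours[y].add(x)
    return neighbours


def open_discovery(pairs, a):
    """Start from A, walk to its B terms, and collect C terms not directly linked to A."""
    neighbours = build_neighbours(pairs)
    direct = neighbours[a]
    candidates = defaultdict(set)
    for b in direct:
        for c in neighbours[b]:
            if c != a and c not in direct:
                candidates[c].add(b)
    # More shared intermediates first, then alphabetical for a stable order.
    return sorted(candidates.items(), key=lambda kv: (-len(kv[1]), kv[0]))


def closed_discovery(pairs, a, c):
    """Given A and C, list the intermediate B terms that connect them."""
    neighbours = build_neighbours(pairs)
    return sorted(neighbours[a] & neighbours[c])


# Toy co-occurrence pairs (illustrative data only).
pairs = [
    ("raynaud", "blood viscosity"),
    ("raynaud", "platelet aggregation"),
    ("fish oil", "blood viscosity"),
    ("fish oil", "platelet aggregation"),
    ("aspirin", "platelet aggregation"),
]
for term, via in open_discovery(pairs, "raynaud"):
    print(term, sorted(via))
print(closed_discovery(pairs, "raynaud", "fish oil"))
```

Open discovery generates hypotheses from a single starting term, while closed discovery explains a hypothesised A to C link by listing the shared B terms.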

arXiv.org e-Print Archive

GEOGLE: context mining tool for the correlation between gene expression and the phenotypic distinction

Author: Ding Guohui
Hao Pei
Li Yixue
Li Yun
Ping Jie
Tu Kang
Yu Yao
Zheng Siyuan
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: In the post-genomic era, high-throughput gene expression detection technology produces huge amounts of experimental data, which challenges the traditional pipelines for data processing and analysis in scientific research. Results: We integrated gene expression information from the Gene Expression Omnibus (GEO), biomedical ontology from Medical Subject Headings (MeSH) and signaling pathway knowledge from sigPathway entries to develop GEOGLE, a context mining tool for gene expression analysis. GEOGLE offers a rapid and convenient way to search for relevant experimental datasets, pathways and biological terms using multiple types of queries, including biomedical vocabularies, GDS IDs, gene IDs, pathway names and signature lists. Moreover, GEOGLE summarizes the signature genes from a subset of GDSes and estimates the correlation between gene expression and the phenotypic distinction with an integrated p value. Conclusion: This approach of searching expression data globally may extend the traditional way of collecting heterogeneous gene expression experiment data. GEOGLE is a novel tool that gives researchers a quantitative way to understand the correlation between gene expression and phenotypic distinction through meta-analysis of gene expression datasets from different experiments, as well as the biological meaning behind it. The web site and user guide of GEOGLE are available at: http://omics.biosino.org:14000/kweb/workflow.jsp?id=00020

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Detection of the inferred interaction network in hepatocellular carcinoma from EHCO (Encyclopedia of Hepatocellular Carcinoma genes Online)

Author: Chang Chao-Hui
Chen Chang-Han
Hsu Chun-Nan
Hsu Wen-Lian
Huang Chi-Ying F
Kao Cheng-Yan
Lai Jin-Mei
Lee DT
Lee Gen-Cher
Lee Sheng-An
Lin Chih-Yun
Lin Kuan-Ting
Liu Chia-Hung
Shiue Yow-Ling
Su Li-Jen
Sung Ting-Yi
Tseng Huei-Hun
Yeh Chang-Wei
Yeh Hsu-Hua
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: The significant advances in microarray and proteomics analyses have resulted in an exponential increase in potential new targets and have promised to shed light on the identification of disease markers and cellular pathways. We aim to collect and decipher the HCC-related genes at the systems level. RESULTS: Here, we build an integrative platform, the Encyclopedia of Hepatocellular Carcinoma genes Online, dubbed EHCO, to systematically collect, organize and compare the pileup of unsorted HCC-related studies by using natural language processing and softbots. Among the eight gene set collections, ranging across PubMed, SAGE, microarray, and proteomics data, there are 2,906 genes in total; however, more than 77% of the genes are included only once, suggesting that tremendous efforts need to be exerted to characterize the relationship between HCC and these genes. Of these HCC inventories, protein binding represents the largest proportion (~25%) in Gene Ontology analysis. In fact, many differentially expressed gene sets in EHCO could form interaction networks (e.g. the HBV-associated HCC network) by using available human protein-protein interaction datasets. To further highlight the potential new targets in the inferred network from EHCO, we combine comparative genomics and interactomics approaches to analyze 120 evolutionarily conserved and overexpressed genes in HCC. 47 out of 120 queries can form a highly interactive network, with 18 queries serving as hubs. CONCLUSION: This architectural map may represent the first step toward deciphering hepatocarcinogenesis at the systems level. Targeting hubs and/or disrupting the network formation might reveal novel strategies for HCC treatment.
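The abstract does not state how hubs were defined. As a rough sketch, the following assumes a hub is a query gene with at least `min_degree` distinct interaction partners among the other query genes; the threshold, the function name and the gene labels are all hypothetical.

```python
from collections import Counter


def find_hubs(interactions, queries, min_degree=3):
    """Return (connected query genes, hub genes) for the query-induced subnetwork."""
    query_set = set(queries)
    # Keep each undirected interaction once and drop self-interactions.
    edges = {
        frozenset((a, b))
        for a, b in interactions
        if a != b and a in query_set and b in query_set
    }
    degree = Counter()
    for edge in edges:
        for gene in edge:
            degree[gene] += 1
    connected = sorted(degree)
    hubs = sorted(g for g, d in degree.items() if d >= min_degree)
    return connected, hubs


# Placeholder gene labels; "X9" is outside the query list and is ignored.
interactions = [
    ("G1", "G2"), ("G1", "G3"), ("G1", "G4"),
    ("G2", "G1"), ("G2", "G3"), ("G5", "X9"),
]
queries = ["G1", "G2", "G3", "G4", "G5", "G6"]
connected, hubs = find_hubs(interactions, queries)
print(connected, hubs)
```

Counting distinct partners rather than raw interaction records keeps duplicated database entries from inflating a gene's degree.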

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

National Chung Hsing University Institutional Repository

National Health Research Institutes

PubMed Central

National Taiwan University Repository

Using Unsupervised Patterns to Extract Gene Regulation Relationships for Network Construction

Author: A Ozgur
BJ Stapley
C Blaschke
C Nedellec
C Rodriguez-Penagos
CC van der Eijk
CF Schaefer
D Klein
D Klein
Dongxiao Zhu
E Buyko
Hei-Chia Wang
HM Muller
Hung-Yu Kao
J Saric
J Saric
JH Chiang
K Fundel
L Tanabe
M Huang
R Chowdhary
R Hoffmann
R Jelier
S Kim
S Pyysalo
Shaw-Jenq Tsai
Shuo-Jang Li
T Ono
TK Jenssen
U Hahn
Yi-Tsung Tang
Publication venue: Public Library of Science
Publication date: 01/01/2011

BACKGROUND: Gene expression is usually described in the literature in the form "transcription factor X regulates target gene Y". Previous studies have discovered gene regulations by using information from the biomedical literature, but most of them require human annotators to build the training dataset. Moreover, the amount of textual knowledge recorded in the biomedical literature grows very rapidly, so creating patterns manually from the literature becomes more difficult. There is an increasing need to automate the process of establishing patterns. METHODOLOGY/PRINCIPAL FINDINGS: In this article, we describe an unsupervised pattern generation method called AutoPat. It is a gene expression mining system that can generate unsupervised patterns automatically from a given set of seed patterns. The high scalability and low maintenance cost of the unsupervised patterns could help our system extract gene expression from PubMed abstracts more precisely and effectively. CONCLUSIONS/SIGNIFICANCE: Experiments on several regulators show reasonable precision and recall rates, which validate AutoPat's practical applicability. The resulting regulation networks could also be built precisely and effectively. The system in this study is available at http://ikmbio.csie.ncku.edu.tw/AutoPat/

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

An integrated method for cancer classification and rule extraction from microarray data

Author: Huang Liang-Tsung
Publication venue: BioMed Central
Publication date: 01/01/2009

Different microarray techniques have recently been used successfully to investigate useful information for cancer diagnosis at the gene expression level, owing to their ability to measure thousands of gene expression levels in a massively parallel way. One important issue is to improve the classification performance on microarray data. However, it would be ideal if influential genes and even interpretable rules could be explored at the same time to offer biological insight.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Systematic Analysis of Endometrial Cancer-Associated Hub Proteins Based on Text Mining

Publication venue: Hindawi Limited
Publication date: 01/01/2015

Crossref

Applications of New Technologies and New Methods in ZHENG Differentiation

Author: Cao Huijuan
Dai Jianye
Gou Xiaojun
Su Shibing
Sun Shujun
Wang Wenyu
Zhang Yongyu
Zheng Ningning
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2012

In the hope of providing an effective approach for personalized clinical diagnosis and treatment, Traditional Chinese Medicine (TCM) is receiving increasing attention as a complementary and alternative medicine. It performs treatment based on ZHENG (TCM syndrome) differentiation, in which ZHENG can be identified as special clinical phenotypes from the symptoms and signs of patients. However, it has attracted skepticism and criticism because ZHENG classification depends only on the observation, knowledge, and clinical experience of TCM practitioners and therefore lacks objectivity and repeatability. Scientists have carried out fruitful research on its objectivity and standardization. In this paper, the applications of new technologies and new methods to ZHENG differentiation, including the acquisition, analysis, and integration of clinical data or information, are systematically reviewed and compared with the four traditional diagnostic methods (looking, listening and smelling, asking, and touching). Furthermore, the characteristics and application range of these technologies and methods are summarized. The review provides a reference for further research.

Directory of Open Access Journals

PubMed Central

Network based integrated analysis of phenotype-genotype data for prioritization of candidate symptom genes

Author: Hu J.
Jia C.
Li X.
Liu B.
Peng Yonghong
Sun C.
Yu J.
Zhang R.
Zhou X.
Publication venue: Hindawi Limited
Publication date: 01/01/2014

Symptoms and signs (symptoms in brief) are the essential clinical manifestations for individualized diagnosis and treatment in traditional Chinese medicine (TCM). To gain insights into the molecular mechanism of symptoms, we develop a computational approach to identify the candidate genes of symptoms. This paper presents a network-based approach for the integrated analysis of multiple phenotype-genotype data sources and the prioritization of genes for the associated symptoms. The method first calculates the similarities between symptoms and diseases based on the symptom-disease relationships retrieved from the PubMed bibliographic database. Then the disease-gene associations and protein-protein interactions are used to construct a phenotype-genotype network. Finally, the PRINCE algorithm is used to rank the potential genes for the associated symptoms. The proposed method produces a reliable gene rank list, with an AUC (area under curve) of 0.616 in classification. Some novel genes such as CALCA, ESR1, and MTHFR were predicted to be associated with headache symptoms; they are not recorded in the benchmark data set but have been reported in recently published literature. Our study demonstrates that integrating phenotype-genotype relationships into a complex network framework provides an effective approach to identifying candidate genes of symptoms. Funding: NSFC Project (61105055, 81230086), China 973 Program (2014CB542903), the National Key Technology R&D Program (2013BAI02B01, 2013BAI13B04), the National S&T Major Special Project on Major New Drug Innovation (2012ZX09503-001-003), and the Fundamental Research Funds for the Central Universities.

Crossref

Directory of Open Access Journals

PubMed Central

Bradford Scholars