
229 research outputs found

Recommended from our members

An improved hidden vector state model approach and its adaptation in extracting protein interaction information from biomedical literature

Author: He Yulan
Kwoh Chee Keong
Zhou Deyu
Publication date: 01/01/2006

A large body of knowledge that is important for biological researchers seeking to unveil the mechanism of life is often hidden in the literature, such as journal articles, reports, and books. Many approaches to extracting information from unstructured text, such as pattern matching and shallow and full parsing, have been proposed, especially for biomedical applications. In this paper, we present an information extraction system employing a semantic parser using the Hidden Vector State (HVS) model for protein-protein interactions. We found that it performed better than other established statistical methods and achieved 58.3% recall and 76.8% precision. Moreover, the purely data-driven HVS model can be easily adapted to other domains, a property that is rarely mentioned or possessed by other approaches. Experimental results show that a model trained on one domain can still generate satisfactory results when shifted to another domain with a small amount of adaptation training data.

Open Research Online


Effective reranking for extracting protein-protein interactions from biomedical literature

Author: He Yulan
Kwoh Chee Keong
Zhou Deyu
Publication date: 01/01/2007

A semantic parser based on the hidden vector state (HVS) model has been proposed for extracting protein-protein interactions. The HVS model is an extension of the basic discrete hidden Markov model, in which context is encoded as a stack-oriented state vector and state transitions are factored into a stack shift operation followed by the push of a new preterminal category label. In this paper, we investigate three different models, log-linear regression (LLR), neural networks (NNs) and support vector machines (SVMs), to rerank parses generated by the HVS model for protein-protein interaction extraction. The features used for reranking are manually defined and include the parse information, the structure information, and the complexity information. The experimental results show that reranking can indeed improve the performance of protein-protein interaction extraction, and that reranking based on SVMs gives more stable performance than LLR and NNs.
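The factored state transition described in this abstract (a stack shift followed by the push of one new preterminal label) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `hvs_transition`, the `max_depth` limit, and the labels `SS`, `PROTEIN`, and `ACTIVATE` are hypothetical, and the probabilities that the real model attaches to each shift and push are left out.

```python
from typing import Tuple

# A vector state is the stack of category labels, root first, preterminal last.
State = Tuple[str, ...]


def hvs_transition(state: State, n_pop: int, new_label: str,
                   max_depth: int = 4) -> State:
    """Apply one transition: pop n_pop labels (stack shift), then push new_label.

    max_depth is an assumed bound on the stack size, used only for illustration.
    """
    if not 0 <= n_pop <= len(state):
        raise ValueError("n_pop must be between 0 and the current stack depth")
    shifted = state[:len(state) - n_pop]   # stack shift: drop the top n_pop labels
    new_state = shifted + (new_label,)     # push exactly one new preterminal label
    if len(new_state) > max_depth:
        raise ValueError("stack depth exceeds max_depth")
    return new_state


# Hypothetical walk through three transitions, starting from a root label.
state: State = ("SS",)
path = [state]
for n_pop, label in [(0, "PROTEIN"), (1, "ACTIVATE"), (0, "PROTEIN")]:
    state = hvs_transition(state, n_pop, label)
    path.append(state)

for step in path:
    print(step)
```

Because each transition pushes exactly one label, the stack can grow by at most one element per step, which is what keeps the state space of such a model bounded.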

Open Research Online

Gene, Environment and Methylation (GEM): a tool suite to efficiently navigate large scale epigenome wide association studies and integrate genotype and interaction between genotype and environment

Author: Holbrook Jonna D.
Karnani Neerja
Kwoh Chee Keong
Pan Hong
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 02/08/2016

DOI: 10.1186/s12859-016-1161-z; BMC Bioinformatics, volume 17, issue 1, article number 299; GUSTO (Growing up towards Healthy Outcomes

ScholarBank@NUS

Dual Stage Stylization Modulation for Domain Generalized Semantic Segmentation

Author: Kwoh Chee-Keong
Liu Ping
Tjio Gabriel
Zhou Joey Tianyi
Publication date: 29/07/2023

Obtaining sufficient labeled data for training deep models is often challenging in real-life applications. To address this issue, we propose a novel solution for single-source domain generalized semantic segmentation. Recent approaches have explored data diversity enhancement using hallucination techniques. However, excessive hallucination can degrade performance, particularly for imbalanced datasets. As shown in our experiments, minority classes are more susceptible than majority classes to performance reduction due to hallucination. To tackle this challenge, we introduce a dual-stage Feature Transform (dFT) layer within the Adversarial Semantic Hallucination+ (ASH+) framework. The ASH+ framework performs a dual-stage manipulation of hallucination strength. By leveraging semantic information for each pixel, our approach adaptively adjusts the pixel-wise hallucination strength, thus providing fine-grained control over hallucination. We validate the effectiveness of our proposed method through comprehensive experiments on publicly available semantic segmentation benchmark datasets (Cityscapes and SYNTHIA). Quantitative and qualitative comparisons demonstrate that our approach is competitive with state-of-the-art methods for the Cityscapes dataset and surpasses existing solutions for the SYNTHIA dataset. Code for our framework will be made readily available to the research community.

arXiv.org e-Print Archive

Ultra-Scalable Spectral Clustering and Ensemble Clustering

Author: Huang Dong
Kwoh Chee-Keong
Lai Jian-Huang
Wang Chang-Dong
Wu Jian-Sheng
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019

This paper focuses on the scalability and robustness of spectral clustering for extremely large-scale datasets with limited resources. Two novel algorithms are proposed, namely, ultra-scalable spectral clustering (U-SPEC) and ultra-scalable ensemble clustering (U-SENC). In U-SPEC, a hybrid representative selection strategy and a fast approximation method for K-nearest representatives are proposed for the construction of a sparse affinity sub-matrix. By interpreting the sparse sub-matrix as a bipartite graph, the transfer cut is then utilized to efficiently partition the graph and obtain the clustering result. In U-SENC, multiple U-SPEC clusterers are further integrated into an ensemble clustering framework to enhance the robustness of U-SPEC while maintaining high efficiency. Based on the ensemble generation via multiple U-SPEC clusterers, a new bipartite graph is constructed between objects and base clusters and then efficiently partitioned to achieve the consensus clustering result. It is noteworthy that both U-SPEC and U-SENC have nearly linear time and space complexity, and are capable of robustly and efficiently partitioning ten-million-level nonlinearly-separable datasets on a PC with 64GB memory. Experiments on various large-scale datasets have demonstrated the scalability and robustness of our algorithms. The MATLAB code and experimental data are available at https://www.researchgate.net/publication/330760669.

Comment: To appear in IEEE Transactions on Knowledge and Data Engineering, 201
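To make the idea of a sparse affinity sub-matrix between objects and representatives more concrete, here is a small pure-Python sketch. It is only an illustration under simplifying assumptions and not the authors' MATLAB code: representatives are picked uniformly at random rather than by the hybrid selection strategy, the K nearest representatives are found by exact search rather than by the fast approximation, the Gaussian kernel with a mean-distance bandwidth is an assumed choice, and the transfer cut partitioning step is omitted. The function names `sqdist` and `sparse_affinity` are hypothetical.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple

Point = Sequence[float]


def sqdist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sparse_affinity(points: List[Point], p: int, k: int,
                    seed: int = 0) -> Tuple[List[int], List[Dict[int, float]]]:
    """Build an N x p affinity sub-matrix with only k nonzeros per row.

    Returns the indices of the chosen representatives and, for each object,
    a dict mapping representative column -> affinity.
    """
    if not 1 <= k <= p <= len(points):
        raise ValueError("need 1 <= k <= p <= number of points")
    rng = random.Random(seed)
    rep_idx = rng.sample(range(len(points)), p)   # simplification: random selection
    reps = [points[i] for i in rep_idx]

    # Exact K-nearest-representative search (the paper uses an approximation).
    nearest = [sorted((sqdist(x, r), j) for j, r in enumerate(reps))[:k]
               for x in points]

    # Assumed bandwidth: mean distance over all kept object-representative pairs.
    dists = [math.sqrt(d2) for row in nearest for d2, _ in row]
    sigma = sum(dists) / len(dists) or 1.0

    rows = [{j: math.exp(-d2 / (2 * sigma ** 2)) for d2, j in row}
            for row in nearest]
    return rep_idx, rows


# Toy data: two well-separated groups of 2-D points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
rep_idx, rows = sparse_affinity(points, p=3, k=2, seed=1)
for i, row in enumerate(rows):
    print(i, row)
```

Each row stores only k entries, so memory grows with N times k rather than N squared, which reflects the motivation for working with a sparse object-to-representative sub-matrix instead of a full affinity matrix.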

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)