229 research outputs found
An improved hidden vector state model approach and its adaptation in extracting protein interaction information from biomedical literature
A large quantity of knowledge that is important for biological researchers seeking to unveil the mechanisms of life is often hidden in the literature, such as journal articles, reports and books. Many approaches to extracting information from unstructured text, such as pattern matching and shallow and full parsing, have been proposed, especially for biomedical applications. In this paper, we present an information extraction system that employs a semantic parser based on the Hidden Vector State (HVS) model to extract protein-protein interactions. We found that it performed better than other established statistical methods, achieving 58.3% recall and 76.8% precision. Moreover, the purely data-driven HVS model can be easily adapted to other domains, a property that is rarely discussed for, or offered by, other approaches. Experimental results show that a model trained on one domain can still produce satisfactory results when moved to another domain with a small amount of adaptation training data.
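The abstract reports recall and precision but no combined score. As a reading aid only, the balanced F-measure (the harmonic mean of the two) can be derived from the reported figures. This is a minimal sketch; the function name `f1_score` is illustrative and the resulting value is not a figure quoted from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Balanced F-measure: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract above.
reported_precision = 0.768
reported_recall = 0.583

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.663"
```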
Effective reranking for extracting protein-protein interactions from biomedical literature
A semantic parser based on the hidden vector state (HVS) model has been proposed for extracting protein-protein interactions. The HVS model is an extension of the basic discrete hidden Markov model in which context is encoded as a stack-oriented state vector, and state transitions are factored into a stack shift operation followed by the push of a new preterminal category label. In this paper, we investigate three different models, log-linear regression (LLR), neural networks (NNs) and support vector machines (SVMs), for reranking the parses generated by the HVS model for protein-protein interaction extraction. The features used for reranking are manually defined and include parse information, structure information, and complexity information. The experimental results show that reranking can indeed improve the performance of protein-protein interaction extraction, and that reranking based on SVMs gives more stable performance than LLR and NNs.
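The factored transition described above (a stack shift followed by the push of one new preterminal label) can be illustrated with a small sketch. It shows only the state update, not the probability model or the parser. The concept labels (`SS`, `PROTEIN`, `ACTIVATE`), the function name `hvs_transition`, and the depth limit `max_depth` are assumptions made for illustration and are not taken from the paper.

```python
from typing import Tuple

# A vector state is modeled here as a tuple of labels, bottom of stack first.
VectorState = Tuple[str, ...]


def hvs_transition(state: VectorState, shift: int, new_label: str,
                   max_depth: int = 4) -> VectorState:
    """Apply one factored transition: pop `shift` labels, then push one label.

    `max_depth` is a hypothetical bound on stack depth used in this sketch.
    """
    if not 0 <= shift <= len(state):
        raise ValueError("shift must be between 0 and the current stack depth")
    remaining = state[:len(state) - shift]   # stack shift (pop `shift` labels)
    new_state = remaining + (new_label,)     # push exactly one preterminal
    if len(new_state) > max_depth:
        raise ValueError("stack depth limit exceeded")
    return new_state


# Hypothetical walk over three words with made-up concept labels.
state: VectorState = ("SS",)
state = hvs_transition(state, shift=0, new_label="PROTEIN")
state = hvs_transition(state, shift=1, new_label="ACTIVATE")
state = hvs_transition(state, shift=0, new_label="PROTEIN")
print(state)  # ('SS', 'ACTIVATE', 'PROTEIN')
```

Because every transition pushes exactly one label, the whole context at each word is captured by the current stack, which is what lets the model remain a hidden Markov model over vector-valued states.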
Gene, Environment and Methylation (GEM): a tool suite to efficiently navigate large scale epigenome wide association studies and integrate genotype and interaction between genotype and environment
DOI: 10.1186/s12859-016-1161-z. BMC Bioinformatics 17(1), Article number 299. GUSTO (Growing up towards Healthy Outcomes
Dual Stage Stylization Modulation for Domain Generalized Semantic Segmentation
Obtaining sufficient labeled data for training deep models is often
challenging in real-life applications. To address this issue, we propose a
novel solution for single-source domain generalized semantic segmentation.
Recent approaches have explored data diversity enhancement using hallucination
techniques. However, excessive hallucination can degrade performance,
particularly for imbalanced datasets. As shown in our experiments, minority
classes are more susceptible to performance reduction due to hallucination
compared to majority classes. To tackle this challenge, we introduce a
dual-stage Feature Transform (dFT) layer within the Adversarial Semantic
Hallucination+ (ASH+) framework. The ASH+ framework performs a dual-stage
manipulation of hallucination strength. By leveraging semantic information for
each pixel, our approach adaptively adjusts the pixel-wise hallucination
strength, thus providing fine-grained control over hallucination. We validate
the effectiveness of our proposed method through comprehensive experiments on
publicly available semantic segmentation benchmark datasets (Cityscapes and
SYNTHIA). Quantitative and qualitative comparisons demonstrate that our
approach is competitive with state-of-the-art methods for the Cityscapes
dataset and surpasses existing solutions for the SYNTHIA dataset. Code for our
framework will be made readily available to the research community.
Ultra-Scalable Spectral Clustering and Ensemble Clustering
This paper focuses on scalability and robustness of spectral clustering for
extremely large-scale datasets with limited resources. Two novel algorithms are
proposed, namely, ultra-scalable spectral clustering (U-SPEC) and
ultra-scalable ensemble clustering (U-SENC). In U-SPEC, a hybrid representative
selection strategy and a fast approximation method for K-nearest
representatives are proposed for the construction of a sparse affinity
sub-matrix. By interpreting the sparse sub-matrix as a bipartite graph, the
transfer cut is then utilized to efficiently partition the graph and obtain the
clustering result. In U-SENC, multiple U-SPEC clusterers are further integrated
into an ensemble clustering framework to enhance the robustness of U-SPEC while
maintaining high efficiency. Based on the ensemble generation via multiple
U-SPEC clusterers, a new bipartite graph is constructed between objects and base
clusters and then efficiently partitioned to achieve the consensus clustering
result. It is noteworthy that both U-SPEC and U-SENC have nearly linear time
and space complexity, and are capable of robustly and efficiently partitioning
ten-million-level nonlinearly-separable datasets on a PC with 64GB memory.
Experiments on various large-scale datasets have demonstrated the scalability
and robustness of our algorithms. The MATLAB code and experimental data are
available at https://www.researchgate.net/publication/330760669.
Comment: To appear in IEEE Transactions on Knowledge and Data Engineering, 201
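The first stage of U-SPEC described above, building a sparse affinity sub-matrix between objects and their K-nearest representatives, can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the authors' MATLAB implementation: the nearest representatives are found by brute force rather than by the fast approximation the abstract mentions, the Gaussian kernel and its bandwidth `sigma` are assumptions, and the transfer cut partitioning step is omitted. All function and parameter names are hypothetical.

```python
import math
import random
from typing import Dict, List, Optional, Sequence

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_representatives(points: List[Point], p: int, candidate_factor: int = 10,
                           iters: int = 5, seed: int = 0) -> List[List[float]]:
    """Hybrid selection sketch: sample random candidates, then run a few
    k-means (Lloyd) iterations on the candidates to obtain p representatives."""
    rng = random.Random(seed)
    n_candidates = min(len(points), p * candidate_factor)
    candidates = rng.sample(list(points), n_candidates)
    centers = [list(c) for c in candidates[:p]]
    for _ in range(iters):
        groups: List[List[Point]] = [[] for _ in range(p)]
        for c in candidates:
            nearest = min(range(p), key=lambda j: squared_distance(c, centers[j]))
            groups[nearest].append(c)
        for j, members in enumerate(groups):
            if members:  # keep the old center if a group is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def sparse_affinity(points: List[Point], reps: List[Point], k: int,
                    sigma: Optional[float] = None) -> List[Dict[int, float]]:
    """Return one sparse row per object: {representative index: affinity},
    keeping only the k nearest representatives (brute force, for clarity)."""
    kept = [sorted((squared_distance(x, r), j) for j, r in enumerate(reps))[:k]
            for x in points]
    if sigma is None:
        # Assumed bandwidth: mean distance from objects to their kept representatives.
        dists = [math.sqrt(d) for row in kept for d, _ in row]
        sigma = (sum(dists) / len(dists)) or 1.0
    return [{j: math.exp(-d / (2 * sigma ** 2)) for d, j in row} for row in kept]


# Toy data: two well-separated 2-D blobs.
rng = random.Random(1)
points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)]
points += [(rng.gauss(8, 1), rng.gauss(8, 1)) for _ in range(100)]

reps = select_representatives(points, p=10)
B = sparse_affinity(points, reps, k=3)
print(len(B), "rows,", len(B[0]), "non-zeros per row")
```

Each row holds only `k` non-zero entries, so storage grows linearly with the number of objects. The rows of `B` can then be read as the edges of a bipartite graph between objects and representatives, which is the structure the abstract says is partitioned by the transfer cut.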