
    Interaction-aware Factorization Machines for Recommender Systems

    The Factorization Machine (FM) is a widely used supervised learning approach that effectively models feature interactions. Despite the successful application of FM and its many deep-learning variants, treating every feature interaction equally may degrade performance. For example, interactions involving a useless feature may introduce noise, and the importance of a feature may differ when it interacts with different features. In this work, we propose a novel model named \emph{Interaction-aware Factorization Machine} (IFM) by introducing an Interaction-Aware Mechanism (IAM), comprising a \emph{feature aspect} and a \emph{field aspect}, to learn flexible interactions on two levels. The feature aspect learns feature interaction importance via an attention network, while the field aspect learns the feature interaction effect as a parametric similarity between the feature interaction vector and the corresponding field interaction prototype. IFM introduces more structured control and learns feature interaction importance in a stratified manner, which allows more leverage in tweaking interactions at both the feature-wise and field-wise levels. In addition, we present a more generalized architecture and propose the Interaction-aware Neural Network (INN) and DeepIFM to capture higher-order interactions. To further improve both the performance and efficiency of IFM, a sampling scheme is developed to select interactions based on field-aspect importance. Experimental results on two well-known datasets show the superiority of the proposed models over state-of-the-art methods.
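    The feature-aspect idea described in the abstract — weighting each pairwise FM interaction by an attention score — can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function name, network shape, and all parameters are hypothetical, and the attention network here is a single ReLU layer with softmax normalization over feature pairs.

```python
import numpy as np

rng = np.random.default_rng(0)

def fm_attention_score(x, V, w_att, b_att, h_att):
    """Attention-weighted FM pairwise score (hypothetical feature-aspect sketch).

    x : (n,) feature values; V : (n, k) latent embeddings;
    w_att (d, k), b_att (d,), h_att (d,) : one-layer attention network.
    """
    n = V.shape[0]
    inters, logits = [], []
    for i in range(n):
        for j in range(i + 1, n):
            v = (V[i] * V[j]) * (x[i] * x[j])   # interaction vector for pair (i, j)
            inters.append(v)
            logits.append(h_att @ np.maximum(w_att @ v + b_att, 0.0))
    logits = np.array(logits)
    a = np.exp(logits - logits.max())
    a /= a.sum()                                # softmax attention over feature pairs
    return float(sum(w * v.sum() for w, v in zip(a, inters)))

n, k, d = 5, 4, 8
x = rng.normal(size=n)
V = rng.normal(size=(n, k))
score = fm_attention_score(x, V,
                           w_att=rng.normal(size=(d, k)),
                           b_att=rng.normal(size=d),
                           h_att=rng.normal(size=d))
print(score)
```

    In a trained model the attention parameters would be learned jointly with the embeddings, so that pairs involving uninformative features receive low weights.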

    PI: An open-source software package for validation of the SEQUEST result and visualization of mass spectrum

    Background: Tandem mass spectrometry (MS/MS) has emerged as the leading method for high-throughput protein identification in proteomics. Recent technological breakthroughs have dramatically increased the efficiency of MS/MS data generation. Meanwhile, sophisticated algorithms have been developed for identifying proteins from peptide MS/MS data by searching available protein sequence databases for the peptide most likely to have produced the observed spectrum. The popular SEQUEST algorithm relies on the cross-correlation between the experimental mass spectrum and the theoretical spectrum of a peptide. It uses a simplified fragmentation model that assigns a fixed, identical intensity to all major ions and a fixed, lower intensity to their neutral losses. In this way, the common issues involved in predicting theoretical spectra are circumvented. In practice, however, an experimental spectrum is usually not similar to its SEQUEST-predicted theoretical one, and as a result, incorrect identifications are often generated. Results: A better understanding of peptide fragmentation is required to produce more accurate and sensitive peptide sequencing algorithms. Here, we designed the software PI with novel and effective algorithms that make good use of the intensity properties of a spectrum. Conclusions: Experiments have shown that PI is able to validate and improve the results of SEQUEST to a more satisfactory degree.
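    The cross-correlation scoring that SEQUEST relies on can be illustrated with a small sketch: score a peptide by the correlation of experimental and theoretical spectra at zero lag, minus the mean correlation over a window of lags as a background correction. This is an assumed simplification for illustration, not the actual SEQUEST implementation; the function name, window size, and toy spectra are hypothetical.

```python
import numpy as np

def xcorr_score(experimental, theoretical, max_lag=75):
    """XCorr-style sketch (hypothetical): zero-lag correlation minus the
    mean correlation over nearby lags, as a background correction."""
    corr = np.correlate(experimental, theoretical, mode="full")
    zero = len(experimental) - 1                 # index of lag 0 for equal lengths
    background = corr[zero - max_lag: zero + max_lag + 1].mean()
    return corr[zero] - background

# Toy binned spectra: three shared peaks, one predicted ion unmatched.
spec = np.zeros(200)
spec[[30, 75, 120]] = 1.0                        # experimental peaks
theo = np.zeros(200)
theo[[30, 75, 120, 150]] = 1.0                   # predicted fragment ions
print(xcorr_score(spec, theo))
```

    Matched peaks contribute strongly at zero lag, while the background term penalizes spectra that correlate equally well at arbitrary shifts.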

    ProbPS: A new model for peak selection based on quantifying the dependence of the existence of derivative peaks on primary ion intensity

    Background: The analysis of mass spectra suggests that the existence of derivative peaks is strongly dependent on the intensity of the primary peaks. Peak selection from a tandem mass spectrum is used to filter out noise and contaminant peaks. It is widely accepted that a valid primary peak tends to have high intensity and is accompanied by derivative peaks, including isotopic peaks, neutral-loss peaks, and complementary peaks. Existing models for peak selection, however, assume that the existence of derivative peaks and the intensity of primary peaks are independent; this assumption is contrary to real data and prone to error. Results: In this paper, we propose a statistical model, named ProbPS, that quantitatively measures the dependence of a derivative peak's existence on the primary peak's intensity and uses it to guide peak selection. Our results show that this quantitative understanding can successfully guide the peak selection process. By comparing ProbPS with AuDeNS, we demonstrate the advantages of our method both in filtering out noise peaks and in improving de novo identification. In addition, we present a tag identification approach based on our peak selection method. On a test data set, our tag identification method (876 correct tags in 1000 spectra) outperforms PepNovoTag (790 correct tags in 1000 spectra). Conclusions: ProbPS improves the accuracy of peak selection, which further enhances the performance of de novo sequencing and tag identification. Thus, our model saves valuable computation time and improves the accuracy of the results.
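    The core claim — that the probability of observing a derivative peak depends on primary-peak intensity, contradicting the independence assumption — can be checked empirically by estimating conditional probabilities per intensity bin and comparing them with the marginal (independence) baseline. The sketch below uses synthetic data with the dependence built in; it illustrates the kind of statistic ProbPS quantifies, not the paper's actual model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic illustration: primary-peak intensities in [0, 1) and whether a
# derivative peak (e.g. isotopic) was observed; existence probability is
# constructed to rise with intensity.
intensity = rng.uniform(0, 1, size=5000)
has_derivative = rng.uniform(size=5000) < intensity

bins = np.linspace(0, 1, 6)                      # 5 equal-width intensity bins
bin_idx = np.digitize(intensity, bins) - 1

# Conditional probability P(derivative | intensity bin) vs. the marginal
# P(derivative) that an independence model would use for every peak.
p_cond = np.array([has_derivative[bin_idx == b].mean() for b in range(5)])
p_marginal = has_derivative.mean()
print(p_cond, p_marginal)
```

    Under independence all five conditional estimates would hover around the marginal; a monotone trend across bins is exactly the dependence a peak-selection model should exploit.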

    Non-invasive preoperative prediction of Edmondson-Steiner grade of hepatocellular carcinoma based on contrast-enhanced ultrasound using ensemble learning

    Purpose: This study aimed to explore the clinical value of non-invasive preoperative Edmondson-Steiner grading of hepatocellular carcinoma (HCC) using contrast-enhanced ultrasound (CEUS). Methods: 212 cases of HCC were retrospectively included, comprising 83 high-grade and 129 low-grade HCCs. Three representative CEUS images were selected from the arterial, portal venous, and delayed phases and stored in a three-dimensional array. ITK-SNAP was used to segment the tumor lesions manually. Radiomics methods were applied to extract high-dimensional features from these contrast-enhanced ultrasound images. The independent-sample t-test and the Least Absolute Shrinkage and Selection Operator (LASSO) were then employed to reduce the feature dimensionality. The selected features were modeled by an ensemble-learning classifier, and the Edmondson-Steiner grade was predicted on an independent testing set using this model. Results: A total of 1338 features were extracted from the 3D images. After dimensionality reduction, 10 features were selected to establish the model. On the independent testing set, the integrated model performed best, with an AUC of 0.931. Conclusion: This study proposed an Edmondson-Steiner grading method for HCC based on CEUS. The method shows good classification performance on an independent testing set and can provide quantitative support for clinical decision-making.
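    The t-test filter, LASSO shrinkage, ensemble classifier pipeline described above can be sketched with scikit-learn on synthetic stand-in data. This is an assumed reconstruction of the workflow, not the study's code: the feature matrix is random with a few planted informative features, the p-value cutoff and classifier choice (a random forest as the ensemble) are hypothetical, and the real study worked on radiomics features from segmented CEUS images.

```python
import numpy as np
from scipy.stats import ttest_ind
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LassoCV
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in: 212 lesions x 1338 features, first 10 carry grade signal.
X = rng.normal(size=(212, 1338))
y = rng.integers(0, 2, size=212)
X[:, :10] += y[:, None] * 1.5

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Step 1: univariate t-test filter between the two grade groups.
_, pvals = ttest_ind(X_tr[y_tr == 0], X_tr[y_tr == 1])
keep = np.where(pvals < 0.05)[0]

# Step 2: LASSO shrinks the filtered set to a sparse signature.
lasso = LassoCV(cv=5, random_state=0).fit(X_tr[:, keep], y_tr)
selected = keep[lasso.coef_ != 0]

# Step 3: ensemble classifier on the selected features, evaluated by AUC.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_tr[:, selected], y_tr)
auc = roc_auc_score(y_te, clf.predict_proba(X_te[:, selected])[:, 1])
print(f"selected {len(selected)} features, test AUC: {auc:.3f}")
```

    Fitting the filter and LASSO on the training split only, as here, is what keeps the reported test-set AUC honest.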

    Upregulation of Barrel GABAergic Neurons Is Associated with Cross-Modal Plasticity in Olfactory Deficit

    Background: Loss of a sensory function is often followed by hypersensitivity of other modalities in mammals, which keeps them well aware of environmental changes. The cellular and molecular mechanisms underlying cross-modal sensory plasticity remain to be documented. Methodology/Principal Findings: Multidisciplinary approaches, including electrophysiology, behavioral tasks, and immunohistochemistry, were used to examine the involvement of specific types of neurons in cross-modal plasticity. We established a mouse model in which olfactory deficit leads to upregulated whisking, and studied how GABAergic neurons are involved in this cross-modal plasticity. While inducing whisker tactile hypersensitivity, the olfactory injury recruits more GABAergic neurons and their fine processes in the barrel cortex, and upregulates their capacity for encoding action potentials. The hyperpolarization driven by inhibitory inputs strengthens the encoding ability of their target cells. Conclusion/Significance: The upregulation of GABAergic neurons and the functional enhancement of neuronal networks may play an important role in cross-modal sensory plasticity. This finding provides clues for developing therapeutic strategies.

    The Genome of Ganoderma lucidum Provides Insights into Triterpene Biosynthesis and Wood Degradation

    BACKGROUND: Ganoderma lucidum (Reishi or Ling Zhi) is one of the most famous Traditional Chinese Medicines and has been widely used in the treatment of various human diseases in Asian countries. It is also a fungus with strong wood-degradation ability and potential in bioenergy production. However, the genes, pathways, and mechanisms underlying these functions are still unknown. METHODOLOGY/PRINCIPAL FINDINGS: The genome of G. lucidum was sequenced and assembled into a 39.9-megabase (Mb) draft genome, which encodes 12,080 protein-coding genes, ∼83% of which are similar to publicly available sequences. We comprehensively annotated the G. lucidum genes and compared them with genes in other fungal genomes. Genes involved in the biosynthesis of the main active ingredients of G. lucidum, ganoderic acids (GAs), were characterized. Among the GA synthases, we identified a fusion gene whose N- and C-terminal regions are homologous to two different enzymes. Moreover, the fusion gene was found only in basidiomycetes. As a white-rot fungus with wood-degradation ability, G. lucidum carries abundant carbohydrate-active enzymes and ligninolytic enzymes in its genome, which were compared with those of other fungi. CONCLUSIONS/SIGNIFICANCE: The genome sequence and thorough annotation of G. lucidum will provide new insights for functional analyses, including its medicinal mechanism. The characterization of genes in triterpene biosynthesis and wood degradation will facilitate bio-engineering research into the production of its active ingredients and bioenergy.

    The Genomes of Oryza sativa: A History of Duplications

    We report improved whole-genome shotgun sequences for the genomes of indica and japonica rice, both with multimegabase contiguity, an almost 1,000-fold improvement over the drafts of 2002. Tested against a nonredundant collection of 19,079 full-length cDNAs, 97.7% of the genes are aligned, without fragmentation, to the mapped super-scaffolds of one or the other genome. We introduce a gene identification procedure for plants that does not rely on similarity to known genes to remove erroneous predictions resulting from transposable elements. Using the available EST data to adjust for residual errors in the predictions, the estimated gene count is at least 38,000–40,000. Only 2%–3% of the genes are unique to any one subspecies, comparable to the amount of sequence that might still be missing. Despite this lack of variation in gene content, there is enormous variation in the intergenic regions. At least a quarter of the two sequences could not be aligned, and where they could be aligned, single nucleotide polymorphism (SNP) rates varied from as little as 3.0 SNP/kb in the coding regions to 27.6 SNP/kb in the transposable elements. A more inclusive new approach for analyzing duplication history is introduced here. It reveals an ancient whole-genome duplication, a recent segmental duplication on Chromosomes 11 and 12, and massive ongoing individual gene duplications. We find 18 distinct pairs of duplicated segments that cover 65.7% of the genome; 17 of these pairs date back to a common time before the divergence of the grasses. More importantly, ongoing individual gene duplications provide a never-ending source of raw material for gene genesis and are major contributors to the differences between members of the grass family.
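    The per-kilobase SNP rates quoted above (3.0 SNP/kb in coding regions, 27.6 SNP/kb in transposable elements) are densities of mismatches over aligned positions. A minimal sketch of that computation, on toy aligned sequences (the function name and gap handling are illustrative assumptions, not the paper's pipeline):

```python
def snp_rate_per_kb(seq_a: str, seq_b: str) -> float:
    """SNP density between two aligned, equal-length sequences, per kilobase.

    Columns containing a gap ('-') are excluded from both the SNP count
    and the aligned-length denominator.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    snps = sum(1 for a, b in zip(seq_a, seq_b)
               if a != b and a != '-' and b != '-')
    aligned = sum(1 for a, b in zip(seq_a, seq_b)
                  if a != '-' and b != '-')
    return 1000.0 * snps / aligned

print(snp_rate_per_kb("ACGTACGTAC", "ACGTACGTAT"))  # 1 mismatch / 10 bp -> 100.0
```

    At genome scale the same ratio is computed per annotation class, which is how coding regions and transposable elements end up with such different rates.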