
    Learning k-Nearest Neighbor Naive Bayes for Ranking

    Abstract. Accurate probability-based ranking of instances is crucial in many real-world data mining applications. KNN (k-nearest neighbor) [1] has been intensively studied as an effective classification model for decades, but its performance in ranking is unknown. In this paper, we conduct a systematic study on the ranking performance of KNN. First, we compare KNN and KNNDW (distance-weighted KNN) to decision trees and naive Bayes in ranking, measured by AUC (the area under the Receiver Operating Characteristic curve). Then, we propose to improve the ranking performance of KNN by combining KNN with naive Bayes (NB for short). The idea is that a naive Bayes is learned using the k nearest neighbors of the test instance as the training data and is then used to classify the test instance. A critical problem in combining KNN with naive Bayes is the lack of training data when k is small. We propose to deal with this using cloning to expand the training data: each of the k nearest neighbors is “cloned” and the clones are added to the training data. We call our new model instance cloning local naive Bayes (ICLNB for short). We conduct an extensive empirical comparison of the related algorithms in two groups in terms of AUC, using the 36 UCI datasets recommended by Weka [2]. In the first group, we compare ICLNB with KNN, NB, NBTree [3], and C4.4 [4]. In the second group, we compare ICLNB with KNN, KNNDW, and LWNB [5]. Our experimental results show that ICLNB significantly outperforms all of these algorithms. From our study, we draw two conclusions. First, KNN-related algorithms perform well in ranking. Second, our new algorithm ICLNB performs best among the algorithms compared in this paper and could be used in applications in which an accurate ranking is desired.
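
    The cloning step can be made concrete with a short sketch. The following Python fragment is a minimal illustration rather than the paper's implementation: it assumes clone counts proportional to each neighbor's similarity to the test instance and uses a Gaussian naive Bayes as the local model, whereas the paper works with nominal attributes; the function name and constants are our own.

    import numpy as np
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neighbors import NearestNeighbors

    def iclnb_predict_proba(X_train, y_train, x_test, k=10, max_clones=5):
        """Class probabilities for one test instance from a local naive Bayes."""
        nn = NearestNeighbors(n_neighbors=k).fit(X_train)
        dist, idx = nn.kneighbors(x_test.reshape(1, -1))
        dist, idx = dist[0], idx[0]

        # Closer neighbors receive more clones (assumed scheme), expanding
        # the tiny local training set that a small k would otherwise yield.
        sim = 1.0 / (1.0 + dist)
        clones = np.maximum(1, np.round(max_clones * sim / sim.max()).astype(int))
        X_local = np.repeat(X_train[idx], clones, axis=0)
        y_local = np.repeat(y_train[idx], clones)

        # Fit naive Bayes only on the cloned neighborhood, then score the
        # test instance by its class-membership probabilities.
        nb = GaussianNB().fit(X_local, y_local)
        return nb.predict_proba(x_test.reshape(1, -1))[0]

    # Toy usage: probabilities for one instance of a synthetic problem
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3)); y = (X[:, 0] > 0).astype(int)
    print(iclnb_predict_proba(X, y, X[0], k=10))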

    Augmenting Naive Bayes for Ranking

    Naive Bayes is an effective and efficient learning algorithm for classification. In many applications, however, an accurate ranking of instances based on the class probability is more desirable. Unfortunately, naive Bayes has been found to produce poor probability estimates. Numerous techniques have been proposed to extend naive Bayes for better classification accuracy, of which selective Bayesian classifiers (SBC) (Langley & Sage
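
    As an aside, the evaluation setting this abstract describes, ranking instances by their class probability and scoring the ranking with AUC, takes only a few lines to reproduce. The sketch below is illustrative only; the synthetic dataset and the Gaussian variant of naive Bayes are placeholder choices, not the paper's setup.

    from sklearn.datasets import make_classification
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB

    # Synthetic binary problem; any probabilistic classifier slots in here.
    X, y = make_classification(n_samples=500, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    nb = GaussianNB().fit(X_tr, y_tr)
    scores = nb.predict_proba(X_te)[:, 1]   # P(class = 1) is the ranking score
    print("AUC:", roc_auc_score(y_te, scores))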

    One Dependence Augmented Naive Bayes

    Abstract. In real-world data mining applications, an accurate ranking is as important as an accurate classification. Naive Bayes (NB for short) has been widely used in data mining as a simple and effective classification and ranking algorithm. Since its conditional independence assumption is rarely true, numerous algorithms have been proposed to improve naive Bayes, for example, SBC [1] and TAN [2]. Indeed, experimental results show that SBC and TAN achieve a significant improvement in terms of classification accuracy. Unfortunately, however, our experiments also show that SBC and TAN perform even worse than naive Bayes in ranking, measured by AUC [3, 4] (the area under the Receiver Operating Characteristic curve). This fact raises the question of whether we can improve naive Bayes to achieve both accurate classification and accurate ranking. In this paper, responding to this question, we present a new learning algorithm called One Dependence Augmented Naive Bayes (ODANB for short). Our motivation is to develop a new algorithm that improves naive Bayes' performance not only in classification, measured by accuracy, but also in ranking, measured by AUC. We experimentally tested our algorithm, using all 36 UCI datasets recommended by Weka [5], and compared it to NB, SBC [1], and TAN [2]. The experimental results show that our algorithm significantly outperforms all the other algorithms in yielding accurate rankings, while at the same time slightly outperforming them in terms of classification accuracy.
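
    To make the one-dependence idea concrete, here is a minimal Python sketch of a classifier over discrete attributes in which each attribute may take one extra parent besides the class, chosen greedily by conditional mutual information. The parent-selection rule, the 1e-3 threshold, and the Laplace smoothing are our assumptions; the paper's actual ODANB procedure may differ.

    import numpy as np
    from collections import Counter

    def cmi(a, b, c):
        """Conditional mutual information I(A; B | C) from parallel arrays."""
        n = len(c)
        n_ac, n_bc, n_c = Counter(zip(a, c)), Counter(zip(b, c)), Counter(c)
        total = 0.0
        for (va, vb, vc), n_abc in Counter(zip(a, b, c)).items():
            total += (n_abc / n) * np.log(
                n_abc * n_c[vc] / (n_ac[(va, vc)] * n_bc[(vb, vc)])
            )
        return total

    class ODANB:
        """Naive Bayes where each attribute may gain one parent attribute."""

        def fit(self, X, y):
            n, d = X.shape
            self.classes_ = np.unique(y)
            self.X, self.y = X, y
            # Greedy structure step: the best parent by CMI, kept only when
            # it exceeds a small threshold (the threshold is our guess).
            self.parent_ = []
            for i in range(d):
                score, j = max((cmi(X[:, i], X[:, j], y), j)
                               for j in range(d) if j != i)
                self.parent_.append(j if score > 1e-3 else None)
            return self

        def predict_proba(self, x):
            # P(c) * prod_i P(x_i | c, x_parent(i)), Laplace smoothed.
            probs = []
            for c in self.classes_:
                in_c = self.y == c
                p = (in_c.sum() + 1) / (len(self.y) + len(self.classes_))
                for i, xi in enumerate(x):
                    j = self.parent_[i]
                    cond = in_c if j is None else in_c & (self.X[:, j] == x[j])
                    k = len(np.unique(self.X[:, i]))
                    p *= ((self.X[cond, i] == xi).sum() + 1) / (cond.sum() + k)
                probs.append(p)
            probs = np.array(probs)
            return probs / probs.sum()

    # Toy usage on discrete data: two attributes, binary class
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [0, 0], [1, 1]])
    y = np.array([0, 0, 1, 1, 0, 1])
    print(ODANB().fit(X, y).predict_proba(np.array([0, 1])))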

    Learning Tree Augmented Naive Bayes for Ranking

    Abstract. Naive Bayes has been widely used in data mining as a simple and effective classification algorithm. Since its conditional independence assumption is rarely true, numerous algorithms have been proposed to improve naive Bayes, among which tree augmented naive Bayes (TAN) [3] achieves a significant improvement in terms of classification accuracy while maintaining efficiency and model simplicity. In many real-world data mining applications, however, an accurate ranking is more desirable than a classification. Thus it is interesting to ask whether TAN also achieves a significant improvement in terms of ranking, measured by AUC (the area under the Receiver Operating Characteristic curve) [8, 1]. Unfortunately, our experiments show that TAN performs even worse than naive Bayes in ranking. Responding to this fact, we present a novel learning algorithm, called forest augmented naive Bayes (FAN), obtained by modifying the traditional TAN learning algorithm. We experimentally test our algorithm on all 36 data sets recommended by Weka [12] and compare it to naive Bayes, SBC [6], TAN [3], and C4.4 [10], in terms of AUC. The experimental results show that our algorithm significantly outperforms all the other algorithms in yielding accurate rankings. Our work provides an effective and efficient data mining algorithm for applications in which an accurate ranking is required.
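
    The structural difference from TAN can be sketched compactly: TAN's Chow-Liu step builds a single maximum-weight spanning tree over the attributes, whereas keeping only edges whose conditional mutual information is high enough yields a forest. The Python fragment below illustrates that selection step under our own assumptions; the threshold and the toy edge weights are made up, and the paper's exact criterion may differ.

    def max_weight_forest(n_attrs, weights, threshold=0.01):
        """Kruskal-style forest over attributes.

        weights maps pairs (i, j), i < j, to I(A_i; A_j | C); edges below
        the threshold are dropped, so the result may be a forest, not a tree.
        """
        parent = list(range(n_attrs))            # union-find over attributes

        def find(u):
            while parent[u] != u:
                parent[u] = parent[parent[u]]    # path halving
                u = parent[u]
            return u

        forest = []
        for (i, j), w in sorted(weights.items(), key=lambda kv: -kv[1]):
            if w < threshold:                    # weakest edges never enter
                break
            ri, rj = find(i), find(j)
            if ri != rj:                         # skip cycle-creating edges
                parent[ri] = rj
                forest.append((i, j))
        return forest

    # Toy usage: 4 attributes; edge (2, 3) is too weak and is dropped
    edges = {(0, 1): 0.30, (1, 2): 0.25, (0, 2): 0.20, (2, 3): 0.005}
    print(max_weight_forest(4, edges))           # [(0, 1), (1, 2)]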

    Attribute Value Weighted Average of One-Dependence Estimators

    Of the numerous proposals to improve the accuracy of naive Bayes by weakening its attribute independence assumption, semi-naive Bayesian classifiers that utilize one-dependence estimators (ODEs) have been shown to approximate the ground-truth attribute dependencies well; meanwhile, the probability estimation in ODEs is effective, leading to excellent performance. In previous studies, ODEs were exploited directly in a simple way. For example, averaged one-dependence estimators (AODE) weaken the attribute independence assumption by directly averaging all of a constrained class of classifiers. However, all one-dependence estimators in AODE receive the same weight and are treated equally. In this study, we propose a new paradigm based on a simple, efficient, and effective attribute value weighting approach, called attribute value weighted average of one-dependence estimators (AVWAODE). AVWAODE assigns discriminative weights to different ODEs by computing the correlation between each root attribute value and the class. Our approach uses two different attribute value weighting measures, the Kullback–Leibler (KL) measure and the information gain (IG) measure, and thus two different versions are created, denoted AVWAODE-KL and AVWAODE-IG, respectively. We experimentally tested them using a collection of 36 University of California at Irvine (UCI) datasets and found that both achieved better performance than some other state-of-the-art Bayesian classifiers used for comparison.
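
    A minimal sketch of the attribute-value weighting idea, under our own assumptions: the ODE rooted at the test instance's value v of attribute A_i is weighted by the Kullback–Leibler divergence between P(C | A_i = v) and the prior P(C), and the ODEs' probability estimates are then averaged with these weights instead of uniformly. The normalization and the handling of unseen values below are illustrative, not the paper's exact formulas.

    import numpy as np

    def kl_weight(col, y, v, classes):
        """KL(P(C | attribute = v) || P(C)) as the weight for one root value."""
        mask = col == v
        w = 0.0
        for c in classes:
            p_c = (y == c).mean()
            p_c_v = (y[mask] == c).mean() if mask.any() else p_c
            if p_c_v > 0:
                w += p_c_v * np.log(p_c_v / p_c)
        return w

    def avwaode_proba(spode_probas, weights):
        """Weighted average of per-ODE class probabilities.

        spode_probas: (n_odes, n_classes) rows, row i from the ODE rooted
        at the test instance's value of attribute i; weights: per-root KLs.
        """
        w = np.asarray(weights, dtype=float)
        w = w / w.sum() if w.sum() > 0 else np.full_like(w, 1.0 / len(w))
        return w @ np.asarray(spode_probas)

    # Toy usage: three ODEs over two classes; the first root dominates
    print(avwaode_proba([[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]], [0.5, 0.1, 0.02]))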

    Evaluation and comparison of in vitro antioxidant activities of unsaponifiable fraction of 11 kinds of edible vegetable oils

    The radical scavenging capabilities of the extracts from eleven edible vegetable oils were investigated using 2,2-diphenyl-1-picrylhydrazyl (DPPH), 2,2′-azino-bis(3-ethylbenzothiazoline-6-sulfonic acid) (ABTS), and ferric reducing ability of plasma (FRAP) assays. The results indicated that rapeseed oil and sesame oil showed higher radical scavenging abilities than the other vegetable oils. When the radical scavenging capabilities of the extracts from virgin camellia oils and commercially available refined camellia oils were evaluated by the FRAP assay, the results showed that the antioxidant capabilities of the former were higher than those of the latter. Therefore, it is recommended that moderate refining processes be used to minimize the loss of antioxidant components, and that people consume virgin or less processed edible vegetable oils for higher antioxidant activities.