
22 research outputs found

Chebyshev spectral-finite element method for three-dimensional vorticity equation

Author: C. Bernardi
C. Canuto
Guo Benyu
He Songnian
J.L. Lions
Ma Heping
P.G. Ciarlet
V. Girault
Y. Maday
Publication venue: 'Springer Science and Business Media LLC'

Improving text classification using local latent semantic indexing

Author: Benyu Zhang
Gongyi Wu
Tao Liu
Wei-ying Ma
Zheng Chen

Latent Semantic Indexing (LSI) has been shown to be extremely useful in information retrieval, but it is not an optimal representation for text classification. Applied to the whole training set (global LSI), it always degrades classification performance, because this completely unsupervised method ignores class discrimination and concentrates only on representation. Some local LSI methods have been proposed to improve classification by utilizing class discrimination information. However, their performance improvements over the original term vectors are still very limited. In this paper, we propose a new local LSI method called “Local Relevancy Weighted LSI” to improve text classification by performing a separate Singular Value Decomposition (SVD) on the transformed local region of each class. Experimental results show that our method is much better than global LSI and traditional local LSI methods on classification within a much smaller LSI dimension.

CiteSeerX

Affinity Rank: A New Scheme for Efficient Web Search

Author: Benyu Zhang
Michael R. Lyu
Wei-ying Ma
Yi Liu
Zheng Chen

Maximizing only the relevance between queries and documents will not satisfy users if they want the top search results to present a wide coverage of topics through a few representative documents. In this paper, we propose two new metrics to evaluate the performance of information retrieval: diversity, which measures the topic coverage of a group of documents, and information richness, which measures the amount of information contained in a document. We then present a novel ranking scheme, Affinity Rank, which utilizes these two metrics to improve search results. We demonstrate how Affinity Rank works on a toy data set, and verify our method through experiments on real-world data sets.

CiteSeerX

Quantitative Analysis and Stability Study on Iridoid Glycosides from Seed Meal of Eucommia ulmoides Oliver

Author: Benyu Liu
Changjian Wang
Huijuan Yu
Lulu Ma
Ning Meng
Shan Huang
Xin Chai
Yuefei Wang
Publication venue: 'MDPI AG'
Publication date: 01/09/2022

As a traditional Chinese medicine, Eucommia ulmoides Oliver (E. ulmoides Oliv.) is an important medicinal plant, and its bark, male flowers, leaves, and fruits have high utilization value. The seed meal of E. ulmoides Oliv. is the waste residue produced after oil extraction from the seeds of E. ulmoides Oliv. Though the seed meal of E. ulmoides Oliv. is an ideal feed additive, its medicinal value is far from being developed and utilized. We identified six natural iridoid compounds from the seed meal of E. ulmoides Oliv., namely geniposidic acid (GPA), scyphiphin D (SD), ulmoidoside A (UA), ulmoidoside B (UB), ulmoidoside C (UC), and ulmoidoside D (UD). The six compounds were validated to have anti-inflammatory activities. Hence, the six compounds were quantified under the optimum extraction conditions in the seed meal of E. ulmoides Oliv. by an established ultra-performance liquid chromatography (UPLC) method. Some interesting conversion phenomena of the six tested compounds were uncovered by a systematic study of stability performed under different temperatures and pH levels. GPA was confirmed to be stable. SD, UA, and UC were hydrolyzed only in strongly alkaline solution. UB and UD were affected by high temperature, alkaline, and strongly acidic conditions. Our findings reveal the active compounds and explore the quantitative analysis of the tested compounds, contributing to the rational utilization of the seed residues of E. ulmoides Oliv.

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

PubMed Central

The powerrank web link analysis algorithm

Author: Benyu Zhang
Michael R. Lyu
Wei-ying Ma
Wensi Xi
Yi Liu
Yizhou Lu
Zheng Chen
Publication date: 01/01/2004


CiteSeerX

Mining ratio rules via principal sparse non-negative matrix factorization

Author: Benyu Zhang
Chenyong Hu
Jun Yan
Qiang Yang
Shuicheng Yan
Wei-ying Ma
Zheng Chen
Publication date: 01/01/2004

Association rules are traditionally designed to capture statistical relationships among itemsets in a given database. To additionally capture quantitative association knowledge, F. Korn et al. recently proposed a paradigm named Ratio Rules [4] for quantifiable data mining. However, their approach is mainly based on Principal Component Analysis (PCA) and, as a result, it cannot guarantee that the ratio coefficient is non-negative. This may lead to serious problems in the rules’ application. In this paper, we propose a new method, called Principal Sparse Non-Negative Matrix Factorization (PSNMF), for learning the associations between itemsets in the form of Ratio Rules. In addition, we provide a support measurement to weigh the importance of each rule for the entire dataset.

CiteSeerX

Web-page Classification through Summarization

Author: Benyu Zhang
Dou Shen
Hua-Jun Zeng
Qiang Yang
Wei-Ying Ma
Yuchang Lu
Zheng Chen
Publication date: 01/01/2004

Web-page classification is much more difficult than pure-text classification due to the large variety of noisy information embedded in Web pages. In this paper, we propose a new Web-page classification algorithm based on Web summarization to improve accuracy. We first give empirical evidence that ideal Web-page summaries generated by human editors can indeed improve the performance of Web-page classification algorithms. We then propose a new Web summarization-based classification algorithm and evaluate it along with several other state-of-the-art text summarization algorithms on the LookSmart Web directory. Experimental results show that our proposed summarization-based classification algorithm achieves an approximately 8.8% improvement compared to the pure-text-based classification algorithm. We further introduce an ensemble classifier using the improved summarization algorithm and show that it achieves about a 12.9% improvement over pure-text-based methods.

CiteSeerX

Crossref

Improving web search results using affinity graph

Author: Benyu Zhang
Hua Li
Lei Ji
Wei-ying Ma
Weiguo Fan
Wensi Xi
Yi Liu
Zheng Chen
Publication date: 01/01/2005

In this paper, we propose a novel ranking scheme named Affinity Ranking (AR) to re-rank search results by optimizing two metrics: (1) diversity, which indicates the variance of topics in a group of documents; and (2) information richness, which measures the coverage of a single document to its topic. Both metrics are calculated from a directed link graph named Affinity Graph (AG). AG models the structure of a group of documents based on the asymmetric content similarities between each pair of documents. Experimental results on Yahoo! Directory, ODP Data, and Newsgroup data demonstrate that our proposed ranking algorithm significantly improves the search performance. Specifically, the algorithm achieves a relative improvement of 31% in diversity and 12% in information richness within the top 10 search results.

CiteSeerX

Crossref

Mathematical

Author: Benyu Zhang
Michael R. Lyu
Wei-ying Ma
Wensi Xi
Yi Liu
Yizhou Lu
Zheng Chen


CiteSeerX

Ocfs: Optimal orthogonal centroid feature selection for text categorization

Author: Benyu Zhang
Jun Yan
Ning Liu
Qiansheng Cheng
Shuicheng Yan
Wei-ying Ma
Weiguo Fan
Zheng Chen

Text categorization is an important research area in many Information Retrieval (IR) applications. To save storage space and computation time in text categorization, efficient and effective algorithms for reducing the data before analysis are highly desired. Traditional techniques for this purpose can generally be classified into feature extraction and feature selection. Because of its efficiency, the latter is more suitable for text data such as web documents. However, many popular feature selection techniques, such as Information Gain (IG) and the χ²-test (CHI), are greedy in nature and thus may not be optimal according to some criterion. Moreover, the performance of these greedy methods may deteriorate when the reserved data dimension is extremely low. In this paper, we propose an efficient optimal feature selection algorithm, called Orthogonal Centroid Feature Selection (OCFS), which optimizes the objective function of the Orthogonal Centroid (OC) subspace learning algorithm in a discrete solution space. Experiments on 20 Newsgroups (20NG), Reuters Corpus Volume 1 (RCV1), and Open Directory Project (ODP) data show that OCFS is consistently better than IG and CHI with smaller computation time, especially when the reduced dimension is extremely small.

CiteSeerX