Search CORE

14,739 research outputs found

Query expansion with naive bayes for searching distributed collections

Author: Yang Hui
Zhang Minjie
Publication venue
Publication date: 01/01/2002
Field of study

The proliferation of online information resources increases the importance of effective and efficient distributed searching. However, the problem of word mismatch seriously hurts the effectiveness of distributed information retrieval. Automatic query expansion has been suggested as a technique for dealing with the fundamental issue of word mismatch. In this paper, we propose a method - query expansion with Naive Bayes to address the problem, discuss its implementation in IISS system, and present experimental results demonstrating its effectiveness. Such technique not only enhances the discriminatory power of typical queries for choosing the right collections but also hence significantly improves retrieval results

CiteSeerX

Open Research Online (The Open University)

Recommended from our members

Real estate portfolio construction and estimation risk

Author: Lee Stephen
Stevenson Simon
Publication venue: University of Reading
Publication date: 01/01/2000
Field of study

The use of MPT in the construction real estate portfolios has two serious limitations when used in an ex-ante framework: (1) the intertemporal instability of the portfolio weights and (2) the sharp deterioration in performance of the optimal portfolios outside the sample period used to estimate asset mean returns. Both problems can be traced to wide fluctuations in sample means Jorion (1985). Thus the use of a procedure that ignores the estimation risk due to the uncertain in mean returns is likely to produce sub-optimal results in subsequent periods. This suggests that the consideration of the issue of estimation risk is crucial in the use of MPT in developing a successful real estate portfolio strategy. Therefore, following Eun & Resnick (1988), this study extends previous ex-ante based studies by evaluating optimal portfolio allocations in subsequent test periods by using methods that have been proposed to reduce the effect of measurement error on optimal portfolio allocations

Central Archive at the University of Reading

A systematic comparison of supervised classifiers

Author: Amancio D. R.
Bruno O. M.
Casanova D.
Comin C. H.
Costa L. da F.
Rodrigues F. A.
Travieso G.
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 16/10/2013
Field of study

Pattern recognition techniques have been employed in a myriad of industrial, medical, commercial and academic applications. To tackle such a diversity of data, many techniques have been devised. However, despite the long tradition of pattern recognition research, there is no technique that yields the best classification in all scenarios. Therefore, the consideration of as many as possible techniques presents itself as an fundamental practice in applications aiming at high accuracy. Typical works comparing methods either emphasize the performance of a given algorithm in validation tests or systematically compare various algorithms, assuming that the practical use of these methods is done by experts. In many occasions, however, researchers have to deal with their practical classification tasks without an in-depth knowledge about the underlying mechanisms behind parameters. Actually, the adequate choice of classifiers and parameters alike in such practical circumstances constitutes a long-standing problem and is the subject of the current paper. We carried out a study on the performance of nine well-known classifiers implemented by the Weka framework and compared the dependence of the accuracy with their configuration parameter configurations. The analysis of performance with default parameters revealed that the k-nearest neighbors method exceeds by a large margin the other methods when high dimensional datasets are considered. When other configuration of parameters were allowed, we found that it is possible to improve the quality of SVM in more than 20% even if parameters are set randomly. Taken together, the investigation conducted in this paper suggests that, apart from the SVM implementation, Weka's default configuration of parameters provides an performance close the one achieved with the optimal configuration

arXiv.org e-Print Archive

CiteSeerX

Directory of Open Access Journals

PubMed Central

Universidade de São Paulo

Evaluation of Machine Learning Algorithms for Intrusion Detection System

Author: Alkasassbeh Mouhammd
Almseidin Mohammad
Alzubi Maen
Kovacs Szilveszter
Publication venue
Publication date: 08/01/2018
Field of study

Intrusion detection system (IDS) is one of the implemented solutions against harmful attacks. Furthermore, attackers always keep changing their tools and techniques. However, implementing an accepted IDS system is also a challenging task. In this paper, several experiments have been performed and evaluated to assess various machine learning classifiers based on KDD intrusion dataset. It succeeded to compute several performance metrics in order to evaluate the selected classifiers. The focus was on false negative and false positive performance metrics in order to enhance the detection rate of the intrusion detection system. The implemented experiments demonstrated that the decision table classifier achieved the lowest value of false negative while the random forest classifier has achieved the highest average accuracy rate

arXiv.org e-Print Archive

Crossref