2,434 research outputs found

Using online linear classifiers to filter spam Emails

Author: Jones Gareth J.F.
Wang Bin
Wenfeng Pan
Publication venue: Springer Verlag
Publication date: 01/11/2007

The performance of two online linear classifiers, the Perceptron and Littlestone’s Winnow, is explored on two anti-spam filtering benchmark corpora, PU1 and Ling-Spam. We study the performance for varying numbers of features, along with three different feature selection methods: Information Gain (IG), Document Frequency (DF) and Odds Ratio. The size of the training set and the number of training iterations are also investigated for both classifiers. The experimental results show that both the Perceptron and Winnow perform much better with IG or DF than with Odds Ratio. It is further demonstrated that, when using IG or DF, the classifiers are insensitive to the number of features and the number of training iterations, and not greatly sensitive to the size of the training set. Winnow is shown to slightly outperform the Perceptron. It is also demonstrated that both of these online classifiers perform much better than a standard Naïve Bayes method. The theoretical and implementation computational complexity of these two classifiers is very low, and they are very easily adaptively updated. They outperform most of the published results, while being significantly easier to train and adapt. The analysis and promising experimental results indicate that the Perceptron and Winnow are two very competitive classifiers for anti-spam filtering.
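To make the two mistake-driven update rules in this abstract concrete, here is a minimal sketch in Python. It is not the authors' implementation: the binary bag-of-words features, the toy vocabulary, the threshold of half the feature count, and the promotion factor `alpha = 2.0` are all assumptions chosen for illustration, and the Winnow shown is the basic promote/demote form, which may differ from the variant used in the paper.

```python
from typing import List, Sequence, Tuple

# (binary feature vector, label) where label 1 = spam, 0 = legitimate
Example = Tuple[Sequence[int], int]


def dot(w: Sequence[float], x: Sequence[int]) -> float:
    return sum(wi * xi for wi, xi in zip(w, x))


def train_perceptron(data: List[Example], n_features: int,
                     epochs: int = 5, lr: float = 1.0) -> Tuple[List[float], float]:
    """Additive updates: on a mistake, move the weights toward the true class."""
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        for x, label in data:
            y = 1 if label == 1 else -1
            if y * (dot(w, x) + b) <= 0:  # misclassified (or on the boundary)
                for i, xi in enumerate(x):
                    w[i] += lr * y * xi
                b += lr * y
    return w, b


def predict_perceptron(w: Sequence[float], b: float, x: Sequence[int]) -> int:
    return 1 if dot(w, x) + b > 0 else 0


def train_winnow(data: List[Example], n_features: int,
                 epochs: int = 5, alpha: float = 2.0) -> Tuple[List[float], float]:
    """Multiplicative updates: promote or demote only the active features."""
    w = [1.0] * n_features
    theta = n_features / 2  # assumed threshold, not taken from the paper
    for _ in range(epochs):
        for x, label in data:
            pred = 1 if dot(w, x) >= theta else 0
            if pred == label:
                continue
            factor = alpha if label == 1 else 1.0 / alpha
            for i, xi in enumerate(x):
                if xi:
                    w[i] *= factor
    return w, theta


def predict_winnow(w: Sequence[float], theta: float, x: Sequence[int]) -> int:
    return 1 if dot(w, x) >= theta else 0


# Hypothetical vocabulary: ["free", "winner", "meeting", "report"]
TOY_DATA: List[Example] = [
    ([1, 1, 0, 0], 1), ([0, 0, 1, 1], 0),
    ([1, 0, 0, 0], 1), ([0, 0, 1, 0], 0),
    ([0, 1, 0, 0], 1), ([0, 0, 0, 1], 0),
]

p_w, p_b = train_perceptron(TOY_DATA, n_features=4)
w_w, w_theta = train_winnow(TOY_DATA, n_features=4)
perceptron_preds = [predict_perceptron(p_w, p_b, x) for x, _ in TOY_DATA]
winnow_preds = [predict_winnow(w_w, w_theta, x) for x, _ in TOY_DATA]
```

Both learners touch each example once per pass and update only on mistakes, which is consistent with the abstract's remark that they are cheap to train and easy to update adaptively.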

Irish Universities

DCU Online Research Access Service

Automatic Chinese Postal Address Block Location Using Proximity Descriptors and Cooperative Profit Random Forests.

Author: Dong J.
Dong X.
Sun Jianyuan
Tao D.
Zhou H.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 19/10/2017

Locating the destination address block is key to the automated sorting of mail. Due to the characteristics of Chinese envelopes used in mainland China, we exploit proximity cues to describe the investigated regions on envelopes. We propose two proximity descriptors encoding spatial distributions of the connected components obtained from the binary envelope images. To locate the destination address block, these descriptors are used together with cooperative profit random forests (CPRFs). Experimental results show that the proposed proximity descriptors are superior to two component descriptors, which only exploit the shape characteristics of the individual components, and that the CPRF classifier produces higher recall values than seven state-of-the-art classifiers. These promising results are due to the fact that the proposed descriptors encode the proximity characteristics of the binary envelope images, and the CPRF classifier uses an effective tree node split approach.

Queen's University Belfast Research Portal

Crossref

OPUS - University of Technology Sydney

Bournemouth University Research Online

Leicester Research Archive

Improving music genre classification using automatically induced harmony rules

Author: Anglade A.
Benetos E.
Dixon S.
Mauch M.
Publication venue: 'Informa UK Limited'
Publication date: 01/12/2010

We present a new genre classification framework using both low-level signal-based features and high-level harmony features. A state-of-the-art statistical genre classifier based on timbral features is extended using a first-order random forest containing, for each genre, rules derived from harmony or chord sequences. This random forest has been automatically induced, using the first-order logic induction algorithm TILDE, from a dataset covering classical, jazz and pop genre classes, in which the degree and chord category are identified for each chord. The audio descriptor-based genre classifier contains 206 features, covering spectral, temporal, energy, and pitch characteristics of the audio signal. The fusion of the harmony-based classifier with the extracted feature vectors is tested on three-genre subsets of the GTZAN and ISMIR04 datasets, which contain 300 and 448 recordings, respectively. Machine learning classifiers were tested using 5 × 5-fold cross-validation and feature selection. Results indicate that the proposed harmony-based rules combined with the timbral descriptor-based genre classification system lead to improved genre classification rates.
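The evaluation protocol mentioned in this abstract can be sketched as follows, assuming that "5 × 5-fold cross-validation" means five repetitions of 5-fold cross-validation with a fresh shuffle each time. The function name `repeated_kfold` and the seed are hypothetical, and the splits here are not stratified by genre, which the original study may or may not have done.

```python
import random
from typing import Iterator, List, Tuple


def repeated_kfold(n_samples: int, k: int = 5, repeats: int = 5,
                   seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for `repeats` rounds of k-fold CV."""
    rng = random.Random(seed)
    for _ in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        # Fold j takes every k-th shuffled index, so fold sizes differ by at most one.
        folds = [indices[j::k] for j in range(k)]
        for j in range(k):
            test_idx = folds[j]
            train_idx = [i for m, fold in enumerate(folds) if m != j for i in fold]
            yield train_idx, test_idx


# 300 matches the size of the GTZAN three-genre subset quoted in the abstract.
splits = list(repeated_kfold(300))
```

Each of the resulting 25 train/test splits would be used to fit and score a classifier, and the scores averaged; repeating the shuffle reduces the dependence of the estimate on any single partition.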

City Research Online

Crossref

Improving music genre classification using automatically induced harmony rules

Author: Amélie Anglade
Aucouturier J.-J.
Cataltepe Z.
Emmanouil Benetos
Fukunaga K.
Lawson C. L.
Matthias Mauch
Piston W.
Pérez-Sancho C.
Quinlan J. R.
Schölkopf B.
Simon Dixon
Tzanetakis G.
van der Hedjen F.
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2009

City Research Online

Crossref

Ghent University Academic Bibliography

University of Miami: Scholarship Miami

The University of Manchester - Institutional Repository

Radboud Repository