
    Textual Case-based Reasoning for Spam Filtering: a Comparison of Feature-based and Feature-free Approaches

    Spam filtering is a text classification task to which Case-Based Reasoning (CBR) has been successfully applied. We describe the ECUE system, which classifies emails using a feature-based form of textual CBR. Then, we describe an alternative way to compute the distances between cases in a feature-free fashion, using a distance measure based on text compression. This distance measure has the advantages of having no set-up costs and being resilient to concept drift. We report an empirical comparison, which shows the feature-free approach to be more accurate than the feature-based system. These results are fairly robust over different compression algorithms in that we find that the accuracy when using a Lempel-Ziv compressor (GZip) is approximately the same as when using a statistical compressor (PPM). We note, however, that the feature-free systems take much longer to classify emails than the feature-based system. Improvements in the classification time of both kinds of systems can be obtained by applying case base editing algorithms, which aim to remove noisy and redundant cases from a case base while maintaining, or even improving, generalisation accuracy. We report empirical results using the Competence-Based Editing (CBE) technique. We show that CBE removes more cases when we use the distance measure based on text compression (without significant changes in generalisation accuracy) than it does when we use the feature-based approach.
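    The abstract does not spell out the compression-based distance, so the sketch below assumes the widely used Normalised Compression Distance (NCD), computed with Python's built-in GZip compressor, plus a simple k-nearest-neighbour vote standing in for the case-based classifier; names such as classify and case_base are illustrative, not the ECUE system's API.

        import gzip

        def compressed_size(data: bytes) -> int:
            # C(x): length of the GZip-compressed byte string.
            return len(gzip.compress(data))

        def ncd(x: str, y: str) -> float:
            # Normalised Compression Distance:
            # NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y))
            cx = compressed_size(x.encode("utf-8"))
            cy = compressed_size(y.encode("utf-8"))
            cxy = compressed_size((x + y).encode("utf-8"))
            return (cxy - min(cx, cy)) / max(cx, cy)

        def classify(email: str, case_base: list, k: int = 3) -> str:
            # case_base is a list of (text, label) pairs, label "spam" or "ham".
            neighbours = sorted(case_base, key=lambda case: ncd(email, case[0]))[:k]
            labels = [label for _, label in neighbours]
            return max(set(labels), key=labels.count)

    Because the distance is computed directly on raw text, no feature extraction or indexing is needed, which reflects the "no set-up costs" property; swapping GZip for a PPM compressor would only change compressed_size.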

    The Visvalingam algorithm: metrics, measures and heuristics

    This paper provides the background necessary for a clear understanding of forthcoming papers relating to the Visvalingam algorithm for line generalisation, for example on the testing and usage of its implementations. It distinguishes the algorithm from implementation-specific issues to explain why it is possible to get inconsistent but equally valid output from different implementations. By tracing relevant developments within the now-disbanded Cartographic Information Systems Research Group (CISRG) of the University of Hull, it explains: a) why a partial metric-driven implementation was, and still is, sufficient for many projects but not for others; b) why the Effective Area (EA) is a measure derived from a metric; c) why this measure (EA) may serve as a heuristic indicator for in-line feature segmentation and model-based generalisation; and d) how metrics may be combined to change the order of point elimination. The issues discussed in this paper also apply to the use of other metrics. It is hoped that the background and guidance provided in this paper will enable others to participate in further research based on the algorithm.
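    As a rough illustration of the measure at the heart of the algorithm, the sketch below implements Effective Area-driven point elimination in Python: each interior point is scored by the area of the triangle it forms with its two neighbours, and the point with the smallest area is removed first. This is a minimal, unoptimised sketch (practical implementations typically use a priority queue and retain the EA values) and is not any of the CISRG implementations discussed in the paper.

        def triangle_area(a, b, c):
            # Effective Area of point b: area of the triangle (a, b, c),
            # where each point is an (x, y) tuple.
            return abs((a[0] - c[0]) * (b[1] - a[1])
                       - (a[0] - b[0]) * (c[1] - a[1])) / 2.0

        def visvalingam(points, keep):
            # Repeatedly drop the interior point with the smallest Effective
            # Area until only `keep` points remain; endpoints are never removed.
            pts = list(points)
            while len(pts) > max(keep, 2):
                areas = [triangle_area(pts[i - 1], pts[i], pts[i + 1])
                         for i in range(1, len(pts) - 1)]
                del pts[areas.index(min(areas)) + 1]
            return pts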

    Inductive Visual Localisation: Factorised Training for Superior Generalisation

    End-to-end trained Recurrent Neural Networks (RNNs) have been successfully applied to numerous problems that require processing sequences, such as image captioning, machine translation, and text recognition. However, RNNs often struggle to generalise to sequences longer than the ones encountered during training. In this work, we propose to optimise neural networks explicitly for induction. The idea is to first decompose the problem into a sequence of inductive steps and then to explicitly train the RNN to reproduce such steps. Generalisation is achieved as the RNN is not allowed to learn an arbitrary internal state; instead, it is tasked with mimicking the evolution of a valid state. In particular, the state is restricted to a spatial memory map that tracks parts of the input image which have been accounted for in previous steps. The RNN is trained for single inductive steps, where it produces updates to the memory in addition to the desired output. We evaluate our method on two visual recognition problems involving sequences: (1) text spotting, i.e. joint localisation and reading of text in images containing multiple lines (or a block) of text, and (2) sequential counting of objects in aerial images. We show that inductive training of recurrent models enhances their generalisation ability on challenging image datasets. Comment: In BMVC 2018 (spotlight).
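    The abstract describes the training scheme but not the network architecture, so the following PyTorch sketch is purely hypothetical: a single inductive step takes precomputed image features and the current spatial memory map, and is supervised to produce both the step output and the next memory map. All module names and layer sizes are assumptions for illustration.

        import torch
        import torch.nn as nn

        class InductiveStep(nn.Module):
            # One inductive step: image features plus the current memory map
            # (marking regions already accounted for) are mapped to a step
            # prediction and an updated memory map.
            def __init__(self, feat_channels=64, hidden=128, num_classes=10):
                super().__init__()
                self.encode = nn.Conv2d(feat_channels + 1, hidden, 3, padding=1)
                self.to_output = nn.Linear(hidden, num_classes)
                self.to_memory = nn.Conv2d(hidden, 1, kernel_size=1)

            def forward(self, feats, memory):
                # feats: (B, C, H, W) features; memory: (B, 1, H, W) in [0, 1].
                h = torch.relu(self.encode(torch.cat([feats, memory], dim=1)))
                out = self.to_output(h.mean(dim=(2, 3)))       # step prediction
                new_memory = torch.sigmoid(self.to_memory(h))  # updated map
                return out, new_memory

        def step_loss(model, feats, memory_t, target, memory_t1):
            # Supervise a single step: feed the ground-truth memory for step t
            # and penalise both the prediction and the reconstructed map for t+1.
            out, new_memory = model(feats, memory_t)
            return (nn.functional.cross_entropy(out, target)
                    + nn.functional.binary_cross_entropy(new_memory, memory_t1))

    At test time the predicted memory map would be fed back in place of the ground truth, so the same single-step module can be unrolled over sequences of arbitrary length, which is the intuition behind the improved generalisation.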