6,315 research outputs found
The identification of context-sensitive features: A formal definition of context for concept learning
A large body of research in machine learning is concerned with supervised learning from examples. The examples are typically represented as vectors in a multi- dimensional feature space (also known as attribute-value descriptions). A teacher partitions a set of training examples into a finite number of classes. The task of the learning algorithm is to induce a concept from the training examples. In this paper, we formally distinguish three types of features: primary, contextual, and irrelevant features. We also formally define what it means for one feature to be context-sensitive to another feature. Context-sensitive features complicate the task of the learner and potentially impair the learner's performance. Our formal definitions make it possible for a learner to automatically identify context-sensitive features. After context-sensitive features have been identified, there are several strategies that the learner can employ for managing the features; however, a discussion of these strategies is outside of the scope of this paper. The formal definitions presented here correct a flaw in previously proposed definitions. We discuss the relationship between our work and a formal definition of relevance
Recommended from our members
A Boolean complete neural model of adaptive behavior
A multi-layered neural assembly is developed which has the capability of learning arbitrary Boolean functions. Though the model neuron is more powerful than those previously considered, assemblies of neurons are needed to detect non-linearly separable patterns. Algorithms for learning at the neuron and assembly level are described. The model permits multiple output systens to share a common memory. Learned evaluation allows sequences of actions to be organized. Computer simulations demonstrate the capabilities of the model
A Winnow-Based Approach to Context-Sensitive Spelling Correction
A large class of machine-learning problems in natural language require the
characterization of linguistic context. Two characteristic properties of such
problems are that their feature space is of very high dimensionality, and their
target concepts refer to only a small subset of the features in the space.
Under such conditions, multiplicative weight-update algorithms such as Winnow
have been shown to have exceptionally good theoretical properties. We present
an algorithm combining variants of Winnow and weighted-majority voting, and
apply it to a problem in the aforementioned class: context-sensitive spelling
correction. This is the task of fixing spelling errors that happen to result in
valid words, such as substituting "to" for "too", "casual" for "causal", etc.
We evaluate our algorithm, WinSpell, by comparing it against BaySpell, a
statistics-based method representing the state of the art for this task. We
find: (1) When run with a full (unpruned) set of features, WinSpell achieves
accuracies significantly higher than BaySpell was able to achieve in either the
pruned or unpruned condition; (2) When compared with other systems in the
literature, WinSpell exhibits the highest performance; (3) The primary reason
that WinSpell outperforms BaySpell is that WinSpell learns a better linear
separator; (4) When run on a test set drawn from a different corpus than the
training set was drawn from, WinSpell is better able than BaySpell to adapt,
using a strategy we will present that combines supervised learning on the
training set with unsupervised learning on the (noisy) test set.Comment: To appear in Machine Learning, Special Issue on Natural Language
Learning, 1999. 25 page
Recommended from our members
Local search: A guide for the information retrieval practitioner
There are a number of combinatorial optimisation problems in information retrieval in which the use of local search methods are worthwhile. The purpose of this paper is to show how local search can be used to solve some well known tasks in information retrieval (IR), how previous research in the field is piecemeal, bereft of a structure and methodologically flawed, and to suggest more rigorous ways of applying local search methods to solve IR problems. We provide a query based taxonomy for analysing the use of local search in IR tasks and an overview of issues such as fitness functions, statistical significance and test collections when conducting experiments on combinatorial optimisation problems. The paper gives a guide on the pitfalls and problems for IR practitioners who wish to use local search to solve their research issues, and gives practical advice on the use of such methods. The query based taxonomy is a novel structure which can be used by the IR practitioner in order to examine the use of local search in IR
- …