Solving Multiclass Learning Problems via Error-Correcting Output Codes
Multiclass learning problems involve finding a definition for an unknown
function f(x) whose range is a discrete set containing k > 2 values (i.e., k
"classes"). The definition is acquired by studying collections of training
examples of the form [x_i, f(x_i)]. Existing approaches to multiclass learning
problems include direct application of multiclass algorithms such as the
decision-tree algorithms C4.5 and CART, application of binary concept learning
algorithms to learn individual binary functions for each of the k classes, and
application of binary concept learning algorithms with distributed output
representations. This paper compares these three approaches to a new technique
in which error-correcting codes are employed as a distributed output
representation. We show that these output representations improve the
generalization performance of both C4.5 and backpropagation on a wide range of
multiclass learning tasks. We also demonstrate that this approach is robust
with respect to changes in the size of the training sample, the assignment of
distributed representations to particular classes, and the application of
overfitting avoidance techniques such as decision-tree pruning. Finally, we
show that, like the other methods, the error-correcting code technique can
provide reliable class probability estimates. Taken together, these results
demonstrate that error-correcting output codes provide a general-purpose method
for improving the performance of inductive learning programs on multiclass
problems. Comment: See http://www.jair.org/ for any accompanying file
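The core idea in the abstract above can be illustrated with a small sketch: each class is assigned a binary codeword, one binary learner is trained per bit position, and a prediction is decoded to the class whose codeword is nearest in Hamming distance. The codebook, class names, and function names below are hypothetical choices made for illustration; they are not taken from the paper, and the binary learners themselves are left out.

```python
from itertools import combinations
from typing import Dict, List, Sequence

# Hypothetical 7-bit codebook for 4 classes (an assumption for illustration).
# Every pair of codewords differs in 4 bit positions.
CODEBOOK: Dict[str, List[int]] = {
    "a": [1, 1, 1, 1, 1, 1, 1],
    "b": [0, 0, 0, 0, 1, 1, 1],
    "c": [0, 0, 1, 1, 0, 0, 1],
    "d": [0, 1, 0, 1, 0, 1, 0],
}


def hamming(u: Sequence[int], v: Sequence[int]) -> int:
    """Number of positions at which two equal-length bit strings differ."""
    return sum(a != b for a, b in zip(u, v))


def min_distance(codebook: Dict[str, List[int]]) -> int:
    """Smallest Hamming distance between any two codewords."""
    return min(hamming(u, v) for u, v in combinations(codebook.values(), 2))


def binary_targets(labels: Sequence[str], bit: int,
                   codebook: Dict[str, List[int]] = CODEBOOK) -> List[int]:
    """Relabel multiclass training labels as 0/1 targets for one bit position.

    A separate binary learner would be trained on each such relabeling.
    """
    return [codebook[y][bit] for y in labels]


def decode(bits: Sequence[int],
           codebook: Dict[str, List[int]] = CODEBOOK) -> str:
    """Map the binary learners' outputs to the class with the nearest codeword."""
    return min(codebook, key=lambda c: hamming(codebook[c], bits))


if __name__ == "__main__":
    d = min_distance(CODEBOOK)
    print("minimum distance:", d, "correctable bit errors:", (d - 1) // 2)
    # Codeword for "c" with its first bit flipped, as if one learner erred.
    print(decode([1, 0, 1, 1, 0, 0, 1]))
```

With a minimum distance of 4, this particular codebook tolerates one wrong binary prediction per example; a one-bit-per-class code has minimum distance 2 and tolerates none, which is the intuition behind the robustness claimed above.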
Error-correcting output codes : a general method for improving multiclass inductive learning programs
Multiclass learning problems involve finding a definition for an unknown function f(x) whose range is a discrete set containing k > 2 values (i.e., k "classes"). The definition is acquired by studying large collections of training examples of the form (x_i, f(x_i)). Existing approaches to this problem include (a) direct application of multiclass algorithms such as the decision-tree algorithms ID3 and CART, (b) application of binary concept learning algorithms to learn individual binary functions for each of the k classes, and (c) application of binary concept learning algorithms with distributed output codes such as those employed by Sejnowski and Rosenberg in the NETtalk system. This paper compares these three approaches to a new technique in which BCH error-correcting codes are employed as a distributed output representation. We show that these output representations improve the performance of ID3 on the NETtalk task and of backpropagation on an isolated-letter speech-recognition task. These results demonstrate that error-correcting output codes provide a general-purpose method for improving the performance of inductive learning programs on multiclass problems
A comparison of ID3 and backpropagation for English text-to-speech mapping
The performance of the error backpropagation (BP) and ID3 learning algorithms was compared on the task of mapping English text to phonemes and stresses. Under the distributed output code developed by Sejnowski and Rosenberg, it is shown that BP consistently outperforms ID3 on this task by several percentage points. Three hypotheses explaining this difference were explored: (a) ID3 is overfitting the training data, (b) BP is able to share hidden units across several output units and hence can learn the output units better, and (c) BP captures statistical information that ID3 does not. We conclude that only hypothesis (c) is correct. By augmenting ID3 with a simple statistical learning procedure, the performance of BP can be approached but not matched. More complex statistical procedures can improve the performance of both BP and ID3 substantially. A study of the residual errors suggests that there is still substantial room for improvement in learning methods for text-to-speech mapping
A comparative study of ID3 and backpropagation for English text-to-speech mapping
The performance of the error backpropagation (BP) and ID3 learning algorithms was compared on the task of mapping English text to phonemes and stresses. Under the distributed output code developed by Sejnowski and Rosenberg, it is shown that BP consistently outperforms ID3 on this task by several percentage points. Three hypotheses explaining this difference were explored: (a) ID3 is overfitting the training data, (b) BP is able to share hidden units across several output units and hence can learn the output units better, and (c) BP captures statistical information that ID3 does not. We conclude that only hypothesis (c) is correct. By augmenting ID3 with a simple statistical learning procedure, the performance of BP can be approached but not matched. More complex statistical procedures can improve the performance of both BP and ID3 substantially. A study of the residual errors suggests that there is still substantial room for improvement in learning methods for text-to-speech mapping
Post-Partum Pituitary Insufficiency and Livedo Reticularis Presenting a Diagnostic Challenge in a Resource Limited Setting in Tanzania: A Case Report, Clinical Discussion and Brief Review of Existing Literature.
Pituitary disorders following pregnancy are an important yet under-reported clinical entity in the developing world. Conversely, post-partum panhypopituitarism has a more devastating impact on women in such settings due to high fertility rates, poor obstetric care and scarcity of diagnostic and therapeutic resources. A 37-year-old African female presented ten years post-partum with features of multiple endocrine deficiencies including hypothyroidism, hypoadrenalism, lactation failure and secondary amenorrhea. In addition, she had clinical features of an underlying autoimmune condition. These included a history of post-partum thyroiditis, alopecia areata, livedo reticularis and deranged coagulation indices. A remarkable clinical response followed appropriate hormone replacement therapy including steroids. This constellation has never been reported before; we therefore present an interesting clinical discussion including a brief review of existing literature. Post-partum pituitary insufficiency is an under-reported condition of immense clinical importance, especially in the developing world. A high clinical index of suspicion is vital to ensure an early and correct diagnosis, which will have a direct bearing on management and patient outcome
Municipal mortality due to thyroid cancer in Spain
BACKGROUND: Thyroid cancer is a tumor with a low but growing incidence in Spain. This study sought to depict its spatial municipal mortality pattern, using the classic model proposed by Besag, York and Mollié. METHODS: It was possible to compile and ascertain the posterior distribution of relative risk on the basis of a single Bayesian spatial model covering all of Spain's 8077 municipal areas. Maps were plotted depicting standardized mortality ratios, smoothed relative risk (RR) estimates, and the posterior probability that RR > 1. RESULTS: From 1989 to 1998 a total of 2,538 thyroid cancer deaths were registered in 1,041 municipalities. The highest relative risks were mostly situated in the Canary Islands, the province of Lugo, the east of La Coruña (Corunna) and western areas of Asturias and Orense. CONCLUSION: The observed mortality pattern coincides with areas in Spain where goiter has been declared endemic. The higher frequency in these same areas of undifferentiated, more aggressive carcinomas could be reflected in the mortality figures. Other unknown genetic or environmental factors could also play a role in the etiology of this tumor