A weighted nearest neighbor algorithm for learning with symbolic features

Author: B.W. Mathews
C. Stanfill
D. Aha
D. Fisher
D. Medin
D. Rumelhart
D. Waltz
F. Cohen
F. Crick
F. Preparata
G. Towell
J. Garnier
J. McClelland
J. Shavlik
L. Holley
M. O'Neill
N. Qian
P. Chou
R. Lathrop
R. Mooney
R. Nosofsky
S. Fertig
S. Hanson
S. Reed
S. Salzberg
S. Weiss
T. Cover
T. Dietterich
T. Sejnowski
V. Lim
W. Kabsch
Publication venue: 'Springer Science and Business Media LLC'

Comparative Experiments on Disambiguating Word Senses: An Illustration of the Role of Bias in Machine Learning

Author: Mooney Raymond J.
Publication date: 01/01/1996

This paper describes an experimental comparison of seven different learning algorithms on the problem of learning to disambiguate the meaning of a word from context. The algorithms tested include statistical, neural-network, decision-tree, rule-based, and case-based classification techniques. The specific problem tested involves disambiguating six senses of the word "line" using the words in the current and preceding sentence as context. The statistical and neural-network methods perform the best on this particular problem and we discuss a potential reason for this observed difference. We also discuss the role of bias in machine learning and its importance in explaining performance differences observed on specific problems. Comment: 10 page

arXiv.org e-Print Archive

CiteSeerX

k-Nearest Neighbour Classifiers: 2nd Edition (with Python examples)

Author: Cunningham Padraig
Delany Sarah Jane
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 29/04/2020

Perhaps the most straightforward classifier in the arsenal of machine learning techniques is the Nearest Neighbour Classifier -- classification is achieved by identifying the nearest neighbours to a query example and using those neighbours to determine the class of the query. This approach to classification is of particular importance because issues of poor run-time performance are not such a problem these days with the computational power that is available. This paper presents an overview of techniques for Nearest Neighbour classification focusing on: mechanisms for assessing similarity (distance), computational issues in identifying nearest neighbours, and mechanisms for reducing the dimension of the data. This paper is the second edition of a paper previously published as a technical report. Sections on similarity measures for time-series, retrieval speed-up and intrinsic dimensionality have been added. An Appendix is included providing access to Python code for the key methods. Comment: 22 pages, 15 figures: An updated edition of an older tutorial on kN

arXiv.org e-Print Archive

Arrow@TUDublin
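
The nearest-neighbour classification idea summarised in the abstract above (find the closest training examples to a query, then let them vote on its class) can be illustrated with a minimal sketch. This is not the code from the paper's appendix: the function name knn_classify, the choice of Euclidean distance, and the default k=3 are assumptions made for illustration only.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train is a list of (feature_vector, label) pairs; Euclidean distance is
    an assumed similarity measure, chosen here only for simplicity.
    """
    # Rank all stored examples by distance to the query and keep the k closest.
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Majority vote over the labels of the retained neighbours.
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters.
train = [
    ((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((1.0, 0.0), "a"),
    ((5.0, 5.0), "b"), ((5.0, 6.0), "b"), ((6.0, 5.0), "b"),
]
print(knn_classify(train, (0.2, 0.1)))
print(knn_classify(train, (5.5, 5.5)))
```

The sketch scans every stored example for each query, which is the run-time cost the abstract refers to; the retrieval speed-up techniques it mentions aim to avoid that full scan.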

Memory-Based Learning: Using Similarity for Smoothing

Author: Daelemans Walter
Zavrel Jakub
Publication date: 01/01/1997

This paper analyses the relation between the use of similarity in Memory-Based Learning and the notion of backed-off smoothing in statistical language modeling. We show that the two approaches are closely related, and we argue that feature weighting methods in the Memory-Based paradigm can offer the advantage of automatically specifying a suitable domain-specific hierarchy between most specific and most general conditioning information without the need for a large number of parameters. We report two applications of this approach: PP-attachment and POS-tagging. Our method achieves state-of-the-art performance in both domains, and allows the easy integration of diverse information sources, such as rich lexical representations. Comment: 8 pages, uses aclap.sty, To appear in Proc. ACL/EACL 9

arXiv.org e-Print Archive

CiteSeerX

Institutional Repository Universiteit Antwerpen

Tilburg University Repository

Instance-based prediction of real-valued attributes

Author: Aha David W.
Kibler Dennis
Publication venue: eScholarship, University of California
Publication date: 24/03/1988

Instance-based representations have been applied to numerous classification tasks with a fair amount of success. These tasks predict a symbolic class based on observed attributes. This paper presents a method for predicting a numeric value based on observed attributes. We prove that if the numeric values are generated by continuous functions with bounded slope, then the predicted values are accurate approximations of the actual values. We demonstrate the utility of this approach by comparing it with standard approaches for value-prediction. The approach requires no background knowledge

eScholarship - University of California
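
The abstract above describes predicting a numeric value, rather than a symbolic class, from stored instances. A generic sketch of that idea follows. It is not the authors' specific algorithm: the function name knn_predict, the unweighted mean over the k nearest instances, and Euclidean distance are all assumptions chosen for illustration.

```python
import math


def knn_predict(train, query, k=3):
    """Predict a numeric value as the mean target of the k nearest instances.

    train is a list of (feature_vector, value) pairs and must be non-empty.
    The unweighted mean is an assumed, simple way of combining neighbours.
    """
    # Keep the k stored instances closest to the query.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Average their numeric targets to form the prediction.
    return sum(value for _, value in nearest) / len(nearest)


# Hypothetical data sampled from the continuous function f(x) = 2x.
train = [((float(x),), 2.0 * x) for x in range(10)]
print(knn_predict(train, (4.0,), k=3))
```

Because the sampled function has bounded slope, nearby instances have similar target values, which is the condition under which the abstract states that predictions approximate the true values accurately.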

Do not forget: Full memory in memory-based learning of word pronunciation

Author: Bosch Antal van den
Daelemans Walter
Publication date: 01/01/1998

Memory-based learning, keeping full memory of learning material, appears a viable approach to learning NLP tasks, and is often superior in generalisation accuracy to eager learning approaches that abstract from learning material. Here we investigate three partial memory-based learning approaches which remove from memory specific task instance types estimated to be exceptional. The three approaches each implement one heuristic function for estimating exceptionality of instance types: (i) typicality, (ii) class prediction strength, and (iii) friendly-neighbourhood size. Experiments are performed with the memory-based learning algorithm IB1-IG trained on English word pronunciation. We find that removing instance types with low prediction strength (ii) is the only tested method which does not seriously harm generalisation accuracy. We conclude that keeping full memory of types rather than tokens, and excluding minority ambiguities appear to be the only performance-preserving optimisations of memory-based learning. Comment: uses conll98, epsf, and ipamacs (WSU IPA

arXiv.org e-Print Archive

CiteSeerX

Institutional Repository Universiteit Antwerpen

Tilburg University Repository

Probabilistic Inference from Arbitrary Uncertainty using Mixtures of Factorized Generalized Gaussians

Author: Garrido M. C.
Lopez-de-Teruel P. E.
Ruiz A.
Publication venue: 'AI Access Foundation'
Publication date: 18/05/2011

This paper presents a general and efficient framework for probabilistic inference and learning from arbitrary uncertain information. It exploits the calculation properties of finite mixture models, conjugate families and factorization. Both the joint probability density of the variables and the likelihood function of the (objective or subjective) observation are approximated by a special mixture model, in such a way that any desired conditional distribution can be directly obtained without numerical integration. We have developed an extended version of the expectation maximization (EM) algorithm to estimate the parameters of mixture models from uncertain training examples (indirect observations). As a consequence, any piece of exact or uncertain information about both input and output values is consistently handled in the inference and learning stages. This ability, extremely useful in certain situations, is not found in most alternative methods. The proposed framework is formally justified from standard probabilistic principles and illustrative examples are provided in the fields of nonparametric pattern classification, nonlinear regression and pattern completion. Finally, experiments on a real application and comparative results over standard databases provide empirical evidence of the utility of the method in a wide range of applications

arXiv.org e-Print Archive

Crossref

Rule-based Machine Learning Methods for Functional Prediction

Author: Indurkhya N.
Weiss S. M.
Publication date: 01/01/1995

We describe a machine learning method for predicting the value of a real-valued function, given the values of multiple input variables. The method induces solutions from samples in the form of ordered disjunctive normal form (DNF) decision rules. A central objective of the method and representation is the induction of compact, easily interpretable solutions. This rule-based decision model can be extended to search efficiently for similar cases prior to approximating function values. Experimental results on real-world data demonstrate that the new techniques are competitive with existing machine learning and statistical methods and can sometimes yield superior regression performance. Comment: See http://www.jair.org/ for any accompanying file

arXiv.org e-Print Archive

CiteSeerX