26,967 research outputs found
Morphological Analysis as Classification: an Inductive-Learning Approach
Morphological analysis is an important subtask in text-to-speech conversion,
hyphenation, and other language engineering tasks. The traditional approach to
performing morphological analysis is to combine a morpheme lexicon, sets of
(linguistic) rules, and heuristics to find a most probable analysis. In
contrast we present an inductive learning approach in which morphological
analysis is reformulated as a segmentation task. We report on a number of
experiments in which five inductive learning algorithms are applied to three
variations of the task of morphological analysis. Results show (i) that the
generalisation performance of the algorithms is good, and (ii) that the lazy
learning algorithm IB1-IG performs best on all three tasks. We conclude that
lazy learning of morphological analysis as a classification task is indeed a
viable approach; moreover, it has the strong advantages over the traditional
approach of avoiding the knowledge-acquisition bottleneck, being fast and
deterministic in learning and processing, and being language-independent.Comment: 11 pages, 5 encapsulated postscript figures, uses non-standard NeMLaP
proceedings style nemlap.sty; inputs ipamacs (international phonetic
alphabet) and epsf macro
SkILL - a Stochastic Inductive Logic Learner
Probabilistic Inductive Logic Programming (PILP) is a rel- atively unexplored
area of Statistical Relational Learning which extends classic Inductive Logic
Programming (ILP). This work introduces SkILL, a Stochastic Inductive Logic
Learner, which takes probabilistic annotated data and produces First Order
Logic theories. Data in several domains such as medicine and bioinformatics
have an inherent degree of uncer- tainty, that can be used to produce models
closer to reality. SkILL can not only use this type of probabilistic data to
extract non-trivial knowl- edge from databases, but it also addresses
efficiency issues by introducing a novel, efficient and effective search
strategy to guide the search in PILP environments. The capabilities of SkILL
are demonstrated in three dif- ferent datasets: (i) a synthetic toy example
used to validate the system, (ii) a probabilistic adaptation of a well-known
biological metabolism ap- plication, and (iii) a real world medical dataset in
the breast cancer domain. Results show that SkILL can perform as well as a
deterministic ILP learner, while also being able to incorporate probabilistic
knowledge that would otherwise not be considered
Building Machines That Learn and Think Like People
Recent progress in artificial intelligence (AI) has renewed interest in
building systems that learn and think like people. Many advances have come from
using deep neural networks trained end-to-end in tasks such as object
recognition, video games, and board games, achieving performance that equals or
even beats humans in some respects. Despite their biological inspiration and
performance achievements, these systems differ from human intelligence in
crucial ways. We review progress in cognitive science suggesting that truly
human-like learning and thinking machines will have to reach beyond current
engineering trends in both what they learn, and how they learn it.
Specifically, we argue that these machines should (a) build causal models of
the world that support explanation and understanding, rather than merely
solving pattern recognition problems; (b) ground learning in intuitive theories
of physics and psychology, to support and enrich the knowledge that is learned;
and (c) harness compositionality and learning-to-learn to rapidly acquire and
generalize knowledge to new tasks and situations. We suggest concrete
challenges and promising routes towards these goals that can combine the
strengths of recent neural network advances with more structured cognitive
models.Comment: In press at Behavioral and Brain Sciences. Open call for commentary
proposals (until Nov. 22, 2016).
https://www.cambridge.org/core/journals/behavioral-and-brain-sciences/information/calls-for-commentary/open-calls-for-commentar
Machine Learning in Automated Text Categorization
The automated categorization (or classification) of texts into predefined
categories has witnessed a booming interest in the last ten years, due to the
increased availability of documents in digital form and the ensuing need to
organize them. In the research community the dominant approach to this problem
is based on machine learning techniques: a general inductive process
automatically builds a classifier by learning, from a set of preclassified
documents, the characteristics of the categories. The advantages of this
approach over the knowledge engineering approach (consisting in the manual
definition of a classifier by domain experts) are a very good effectiveness,
considerable savings in terms of expert manpower, and straightforward
portability to different domains. This survey discusses the main approaches to
text categorization that fall within the machine learning paradigm. We will
discuss in detail issues pertaining to three different problems, namely
document representation, classifier construction, and classifier evaluation.Comment: Accepted for publication on ACM Computing Survey
Inducing Probabilistic Grammars by Bayesian Model Merging
We describe a framework for inducing probabilistic grammars from corpora of
positive samples. First, samples are {\em incorporated} by adding ad-hoc rules
to a working grammar; subsequently, elements of the model (such as states or
nonterminals) are {\em merged} to achieve generalization and a more compact
representation. The choice of what to merge and when to stop is governed by the
Bayesian posterior probability of the grammar given the data, which formalizes
a trade-off between a close fit to the data and a default preference for
simpler models (`Occam's Razor'). The general scheme is illustrated using three
types of probabilistic grammars: Hidden Markov models, class-based -grams,
and stochastic context-free grammars.Comment: To appear in Grammatical Inference and Applications, Second
International Colloquium on Grammatical Inference; Springer Verlag, 1994. 13
page
- …