PonyGE2: Grammatical Evolution in Python
Grammatical Evolution (GE) is a population-based evolutionary algorithm,
where a formal grammar is used in the genotype to phenotype mapping process.
PonyGE2 is an open source implementation of GE in Python, developed at UCD's
Natural Computing Research and Applications group. It is intended as an
advertisement and a starting-point for those new to GE, a reference for
students and researchers, a rapid-prototyping medium for our own experiments,
and a Python workout. As well as providing the characteristic genotype to
phenotype mapping of GE, a search algorithm engine is also provided. A number
of sample problems and tutorials on how to use and adapt PonyGE2 have been
developed. Comment: 8 pages, 4 figures, submitted to the 2017 GECCO Workshop on
Evolutionary Computation Software Systems (EvoSoft
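The genotype-to-phenotype mapping mentioned in this abstract can be illustrated with a minimal sketch. The toy grammar, the function name `map_genotype`, and the `max_wraps` parameter below are assumptions made for illustration and are not taken from PonyGE2; the sketch also consumes one codon per non-terminal, which is a simplification.

```python
# Toy grammar (hypothetical, not PonyGE2's): each non-terminal maps to
# a list of alternative productions.
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["<var>"]],
    "<op>": [["+"], ["*"]],
    "<var>": [["x"], ["y"]],
}


def map_genotype(genome, start="<expr>", max_wraps=2):
    """Map a list of integer codons to a phenotype string.

    The leftmost non-terminal is expanded first, and each codon selects
    a production via `codon % number_of_choices`. The genome is reused
    (wrapped) up to `max_wraps` times; if non-terminals remain after
    that, the individual is treated as invalid and None is returned.
    """
    symbols = [start]
    output = []
    used = 0
    limit = len(genome) * (max_wraps + 1)
    while symbols:
        symbol = symbols.pop(0)
        if symbol not in GRAMMAR:
            output.append(symbol)  # terminal: emit as-is
            continue
        if used >= limit:
            return None  # ran out of codons: invalid individual
        choices = GRAMMAR[symbol]
        codon = genome[used % len(genome)]
        used += 1
        symbols = list(choices[codon % len(choices)]) + symbols
    return " ".join(output)


print(map_genotype([0, 1, 0, 1, 1, 1]))  # x * y
```

A genome such as `[0]` always selects the recursive production and is therefore reported as invalid once the wrapping limit is reached.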
Analysing Symbolic Regression Benchmarks under a Meta-Learning Approach
The definition of a concise and effective testbed for Genetic Programming
(GP) is a recurrent matter in the research community. This paper takes a new
step in this direction, proposing a different approach to measure the quality
of the symbolic regression benchmarks quantitatively. The proposed approach is
based on meta-learning and uses a set of dataset meta-features---such as the
number of examples or output skewness---to describe the datasets. Our idea is
to correlate these meta-features with the errors obtained by a GP method. These
meta-features define a space of benchmarks that should, ideally, have datasets
(points) covering different regions of the space. An initial analysis of 63
datasets showed that current benchmarks are concentrated in a small region of
this benchmark space. We also found that the number of instances and output
skewness are the most relevant meta-features to GP output error. Both
conclusions can help define which datasets should compose an effective testbed
for symbolic regression methods. Comment: 8 pages, 3 figures, Proceedings of the Genetic and Evolutionary
Computation Conference Companion, Kyoto, Japan
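The idea of correlating a dataset meta-feature with GP error can be sketched as follows. This is only an illustration under assumptions: the helper names are hypothetical, and the choice of population skewness and Pearson correlation is ours, since the abstract does not state which estimators the authors used.

```python
import math


def skewness(values):
    """Population skewness (third standardized moment) of a sample."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def meta_feature_relevance(datasets, errors):
    """Correlate two meta-features with per-dataset GP errors.

    `datasets` is a list of output-value lists (one per benchmark) and
    `errors` holds the error a GP method obtained on each of them.
    """
    n_instances = [len(outputs) for outputs in datasets]
    output_skew = [skewness(outputs) for outputs in datasets]
    return {
        "n_instances": pearson(n_instances, errors),
        "output_skewness": pearson(output_skew, errors),
    }
```

In use, `errors` would come from actually running a GP method on each benchmark; no such values are supplied here.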
Adaptive text mining: Inferring structure from sequences
Text mining is about inferring structure from sequences representing natural language text, and may be defined as the process of analyzing text to extract information that is useful for particular purposes. Although hand-crafted heuristics are a common practical approach for extracting information from text, a general, and generalizable, approach requires adaptive techniques. This paper studies the way in which the adaptive techniques used in text compression can be applied to text mining. It develops several examples: extraction of hierarchical phrase structures from text, identification of keyphrases in documents, locating proper names and quantities of interest in a piece of text, text categorization, word segmentation, acronym extraction, and structure recognition. We conclude that compression forms a sound unifying principle that allows many text mining problems to be tackled adaptively.
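As a rough illustration of compression-based text categorization, the sketch below assigns a document to the category whose training text lets it be compressed most cheaply. It uses `zlib` as a stand-in compressor; this is an assumption made for brevity and not necessarily the adaptive model used in the paper. The function names and the sample corpora are hypothetical.

```python
import zlib


def compressed_size(text):
    """Number of bytes needed to store `text` after zlib compression."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def extra_bytes(corpus, document):
    """Extra bytes required to encode `document` once `corpus` is known."""
    return compressed_size(corpus + " " + document) - compressed_size(corpus)


def categorize(document, corpora):
    """Pick the category whose corpus makes `document` cheapest to encode."""
    return min(corpora, key=lambda name: extra_bytes(corpora[name], document))


# Hypothetical training corpora, repeated so the compressor has
# enough material to find matches.
corpora = {
    "pets": "the cat sat on the mat and the dog chased the cat around the garden " * 5,
    "finance": "stock prices rose sharply as investors bought shares in technology firms " * 5,
}

print(categorize("the dog chased the cat around the garden", corpora))  # pets
```

The intuition is that a document resembling the training text of a category shares long substrings with it, so the compressor can encode the document mostly as back-references.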
Spring School on Language, Music, and Cognition: Organizing Events in Time
The interdisciplinary spring school “Language, music, and cognition: Organizing events in time” was held from February 26 to March 2, 2018 at the Institute of Musicology of the University of Cologne. Language, speech, and music as events in time were explored from different perspectives including evolutionary biology, social cognition, developmental psychology, cognitive neuroscience of speech, language, and communication, as well as computational and biological approaches to language and music. There were 10 lectures, 4 workshops, and 1 student poster session.
Overall, the spring school investigated language and music as neurocognitive systems and focused on a mechanistic approach exploring the neural substrates underlying musical, linguistic, social, and emotional processes and behaviors. In particular, researchers approached questions concerning cognitive processes, computational procedures, and neural mechanisms underlying the temporal organization of language and music, mainly from two perspectives: one was concerned with syntax or structural representations of language and music as neurocognitive systems (i.e., an intrapersonal perspective), while the other emphasized social interaction and emotions in their communicative function (i.e., an interpersonal perspective). The spring school not only acted as a platform for knowledge transfer and exchange but also generated a number of important research questions as challenges for future investigations
Automatic grammar rule extraction and ranking for definitions
Learning texts contain much implicit knowledge which is ideally presented to the learner in a structured manner - a
typical example being definitions of terms in the text, which would ideally be presented separately as a glossary for
easy access. The problem is that manual extraction of such information can be tedious and time consuming. In this
paper we describe two experiments carried out to enable the automated extraction of definitions from non-technical
learning texts using evolutionary algorithms. A genetic programming approach is used to learn grammatical rules
helpful in discriminating between definitions and non-definitions, after which a genetic algorithm is used to learn the
relative importance of these features, thus enabling the ranking of candidate sentences in order of confidence. The
results achieved are promising, and we show that it is possible for a Genetic Program to automatically learn rules similar
to those derived by a human linguistic expert and for a Genetic Algorithm to then give a weighted score to those rules so
as to rank extracted definitions in order of confidence in an effective manner. (Peer-reviewed.)
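The second stage described in this abstract, scoring candidate sentences by weighted rules and ranking them, might look like the sketch below. The rules and weights are hypothetical placeholders: in the paper the rules are learned by genetic programming and the weights by a genetic algorithm, neither of which is reproduced here.

```python
import re

# Hypothetical rules standing in for those a genetic program would learn.
RULES = {
    "is_a": re.compile(r"\bis (a|an|the)\b"),
    "refers_to": re.compile(r"\brefers to\b"),
    "defined_as": re.compile(r"\bdefined as\b"),
}

# Hypothetical weights standing in for those a genetic algorithm would learn.
WEIGHTS = {"is_a": 0.3, "refers_to": 0.6, "defined_as": 0.9}


def score(sentence):
    """Sum the weights of all rules that match the sentence."""
    lowered = sentence.lower()
    return sum(WEIGHTS[name] for name, rule in RULES.items() if rule.search(lowered))


def rank(sentences):
    """Order candidate sentences from most to least definition-like."""
    return sorted(sentences, key=score, reverse=True)


candidates = [
    "A glossary is defined as a list of terms.",
    "The weather was nice.",
    "A noun is a word.",
]
for sentence in rank(candidates):
    print(round(score(sentence), 2), sentence)
```

Keeping rule matching separate from weighting mirrors the two-step setup in the abstract: the rule set can be fixed while the weights are tuned independently.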
2014 Undergraduate Research Symposium Abstract Book
Abstract book from the 2014 Fourteenth Annual UMM Undergraduate Research Symposium (URS) which celebrates student scholarly achievement and creative activities
Pattern Learning for Detecting Defect Reports and Improvement Requests in App Reviews
Online reviews are an important source of feedback for understanding
customers. In this study, we follow novel approaches that target the absence
of actionable insights by classifying reviews as defect reports and requests
for improvement. Unlike traditional classification methods based on expert
rules, we reduce the manual labour by employing a supervised system that is
capable of learning lexico-semantic patterns through genetic programming.
Additionally, we experiment with a distantly-supervised SVM that makes use of
noisy labels generated by patterns. Using a real-world dataset of app reviews,
we show that the automatically learned patterns outperform the manually created
ones. Also, the distantly-supervised SVM models are not far
behind the pattern-based solutions, showing the usefulness of this approach
when the amount of annotated data is limited. Comment: Accepted for publication in the 25th International Conference on
Natural Language & Information Systems (NLDB 2020), DFKI Saarbrücken,
Germany, June 24-26, 2020