219 research outputs found
Unshackling evolution: evolving soft robots with multiple materials and a powerful generative encoding
In 1994 Karl Sims showed that computational evolution can produce interesting morphologies that resemble natural organisms. Despite nearly two decades of work since, evolved morphologies are not obviously more complex or natural, and the field seems to have hit a complexity ceiling. One hypothesis for the lack of increased complexity is that most work, including Sims’, evolves morphologies composed of rigid elements, such as solid cubes and cylinders, limiting the design space. A second hypothesis is that the encodings of previous work have been overly regular, not allowing complex regularities with variation. Here we test both hypotheses by evolving soft robots with multiple materials and a powerful generative encoding called a compositional pattern-producing network (CPPN). Robots are selected for locomotion speed. We find that CPPNs evolve faster robots than a direct encoding and that the CPPN morphologies appear more natural. We also find that locomotion performance increases as more materials are added, that diversity of form and behavior can be increased with different cost functions without stifling performance, and that organisms can be evolved at different levels of resolution. These findings suggest the ability of generative soft-voxel systems to scale towards evolving a large diversity of complex, natural, multi-material creatures. Our results suggest that future work that combines the evolution of CPPN-encoded soft, multi-material robots with modern diversity-encouraging techniques could finally enable the creation of creatures far more complex and interesting than those produced by Sims nearly twenty years ago.
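The role of the CPPN can be illustrated concretely: the network is a composition of smooth, periodic, and symmetric functions queried once per voxel coordinate, so regularities such as segmentation and symmetry fall out of the function choices, and the same genome can be sampled at any resolution. A minimal sketch follows; the node functions, thresholds, and material names are invented for illustration and are not taken from the paper:

```python
import math

def cppn(x, y, z):
    """Toy CPPN: composes periodic and symmetric functions of voxel
    coordinates into a single activation value."""
    h1 = math.sin(4.0 * x)      # periodic node -> repeated segments
    h2 = abs(y - 0.5)           # symmetric node -> bilateral symmetry
    h3 = math.tanh(3.0 * x * z)
    return math.tanh(h1 + h2 + h3)

def material(x, y, z):
    """Threshold the CPPN output into one of several voxel materials."""
    a = cppn(x, y, z)
    if a < -0.3:
        return "empty"
    if a < 0.0:
        return "soft"
    if a < 0.5:
        return "muscle"
    return "bone"

# The same network can be queried at any resolution: 4^3 or 40^3 voxels.
n = 4
lattice = [[[material(i / n, j / n, k / n) for k in range(n)]
            for j in range(n)] for i in range(n)]
```

Because the genome is a function of space rather than a list of voxels, "evolving at different levels of resolution" is just a change of the sampling grid.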
Linear algebra with transformers
Transformers can learn to perform numerical computations from examples only.
I study nine problems of linear algebra, from basic matrix operations to
eigenvalue decomposition and inversion, and introduce and discuss four encoding
schemes to represent real numbers. On all problems, transformers trained on
sets of random matrices achieve high accuracies (over 90%). The models are
robust to noise, and can generalize out of their training distribution. In
particular, models trained to predict Laplace-distributed eigenvalues
generalize to different classes of matrices: Wigner matrices or matrices with
positive eigenvalues. The reverse is not true.
Comment: Transactions on Machine Learning Research (TMLR), October 2022.
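The idea of an encoding scheme can be made concrete: a real number is rounded to a few significant digits and written as a short sequence of sign, mantissa, and exponent tokens that a transformer can consume. The sketch below is a hypothetical scheme in this spirit, not necessarily one of the four the paper studies:

```python
def encode_float(x, digits=3):
    """Encode x as [sign, mantissa, exponent] tokens with `digits`
    significant digits, so 3.14159 -> ['+', 'M314', 'E-2']."""
    if x == 0.0:
        return ["+", "M0", "E0"]
    sign = "+" if x > 0 else "-"
    m, e = abs(x), 0
    while m >= 10 ** digits:        # shrink mantissa into range
        m /= 10
        e += 1
    while m < 10 ** (digits - 1):   # grow mantissa into range
        m *= 10
        e -= 1
    return [sign, f"M{round(m)}", f"E{e}"]

def decode_float(tokens):
    """Invert the encoding (up to the rounding done while encoding)."""
    sign, m, e = tokens
    value = int(m[1:]) * 10 ** int(e[1:])
    return value if sign == "+" else -value
```

A matrix then becomes a flat sequence of such token triples, which is what makes "examples only" training on random matrices possible.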
AI Methods in Algorithmic Composition: A Comprehensive Survey
Algorithmic composition is the partial or total automation of the process of music composition
by using computers. Since the 1950s, different computational techniques related to
Artificial Intelligence have been used for algorithmic composition, including grammatical
representations, probabilistic methods, neural networks, symbolic rule-based systems, constraint
programming and evolutionary algorithms. This survey aims to be a comprehensive
account of research on algorithmic composition, presenting a thorough view of the field for
researchers in Artificial Intelligence.
This study was partially supported by a grant for the MELOMICS project (IPT-300000-2010-010) from the Spanish Ministerio de Ciencia e Innovación, and a grant for the CAUCE project (TSI-090302-2011-8) from the Spanish Ministerio de Industria, Turismo y Comercio. The first author was supported by a grant for the GENEX project (P09-TIC-5123) from the Consejería de Innovación y Ciencia de Andalucía.
Effective Task Transfer Through Indirect Encoding
An important goal for machine learning is to transfer knowledge between tasks. For example, learning to play RoboCup Keepaway should contribute to learning the full game of RoboCup soccer. Often approaches to task transfer focus on transforming the original representation to fit the new task. Such representational transformations are necessary because the target task often requires new state information that was not included in the original representation. In RoboCup Keepaway, changing from the 3 vs. 2 variant of the task to 4 vs. 3 adds state information for each of the new players. In contrast, this dissertation explores the idea that transfer is most effective if the representation is designed to be the same even across different tasks. To this end, (1) the bird’s eye view (BEV) representation is introduced, which can represent different tasks on the same two-dimensional map. Because the BEV represents state information associated with positions instead of objects, it can be scaled to more objects without manipulation. In this way, both the 3 vs. 2 and 4 vs. 3 Keepaway tasks can be represented on the same BEV, which is (2) demonstrated in this dissertation. Yet a challenge for such a representation is that a raw two-dimensional map is high-dimensional and unstructured. This dissertation demonstrates how this problem is addressed naturally by the Hypercube-based NeuroEvolution of Augmenting Topologies (HyperNEAT) approach. HyperNEAT evolves an indirect encoding, which compresses the representation by exploiting its geometry. The dissertation then explores further exploiting the power of such encoding, beginning by (3) enhancing the configuration of the BEV with a focus on modularity. The need for further nonlinearity is then (4) investigated through the addition of hidden nodes. Furthermore, (5) the size of the BEV can be manipulated because it is indirectly encoded. 
Thus the resolution of the BEV, which is dictated by its size, is increased in precision and culminates in a HyperNEAT extension that is expressed at effectively infinite resolution. Additionally, scaling to higher resolutions through gradually increasing the size of the BEV is explored. Finally, (6) the ambitious problem of scaling from the Keepaway task to the Half-field Offense task is investigated with the BEV. Overall, this dissertation demonstrates that advanced representations in conjunction with indirect encoding can contribute to scaling learning techniques to more challenging tasks, such as the Half-field Offense RoboCup soccer domain.
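The core BEV property, that state attaches to map positions rather than to objects, can be sketched in a few lines; the grid size and one-channel-per-object-type layout below are illustrative assumptions:

```python
def bird_eye_view(teammates, opponents, ball, size=8):
    """Rasterize object positions (x, y in [0, 1)) onto a fixed
    size x size grid with one channel per object type. Adding a
    player adds a mark on the map, not a new input dimension."""
    grid = [[[0.0, 0.0, 0.0] for _ in range(size)] for _ in range(size)]

    def paint(pos, channel):
        x, y = pos
        grid[int(y * size)][int(x * size)][channel] = 1.0

    for p in teammates:
        paint(p, 0)
    for p in opponents:
        paint(p, 1)
    paint(ball, 2)
    return grid

# 3 vs. 2 and 4 vs. 3 Keepaway states share the same representation:
s32 = bird_eye_view([(0.1, 0.1), (0.5, 0.2), (0.9, 0.1)],
                    [(0.4, 0.5), (0.6, 0.5)], ball=(0.1, 0.12))
s43 = bird_eye_view([(0.1, 0.1), (0.5, 0.2), (0.9, 0.1), (0.5, 0.9)],
                    [(0.4, 0.5), (0.6, 0.5), (0.5, 0.7)], ball=(0.1, 0.12))
assert len(s32) == len(s43) == 8   # same shape despite extra players
```

The price of this fixed shape is the high-dimensional, unstructured input that HyperNEAT's geometry-aware indirect encoding is then used to compress.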
Will they take this offer? A machine learning price elasticity model for predicting upselling acceptance of premium airline seating
Employing customer information from one of the world's largest airline companies, we develop a price elasticity model (PREM) using machine learning to identify customers likely to purchase an upgrade offer from economy to premium class and predict a customer's acceptable price range. A simulation of 64.3 million flight bookings and 14.1 million email offers over three years mirroring actual data indicates that PREM implementation results in approximately 1.12 million (7.94%) fewer non-relevant customer email messages, a predicted increase of 72,200 (37.2%) offers accepted, and an estimated $72.2 million (37.2%) of increased revenue. Our results illustrate the potential of automated pricing information and targeted marketing messages for upselling acceptance. We also identified three customer segments: (1) Never Upgrades are those who never take the upgrade offer, (2) Upgrade Lovers are those who generally upgrade, and (3) Upgrade Lover Lookalikes have no historical record but fit the profile of those that tend to upgrade. We discuss the implications for airline companies and related travel and tourism industries.
© 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
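The targeting pipeline described above can be caricatured as scoring each customer's acceptance probability at a candidate price and emailing only those above a threshold. The logistic coefficients, feature choices, and example customers below are invented for illustration and are not PREM's actual model:

```python
import math

def acceptance_probability(past_upgrades, price, willingness_to_pay):
    """Toy stand-in for PREM: a logistic score from a customer's upgrade
    history and how the offered price compares to their estimated
    acceptable range. Coefficients are illustrative only."""
    z = 1.5 * past_upgrades - 0.08 * (price - willingness_to_pay)
    return 1.0 / (1.0 + math.exp(-z))

def select_offers(customers, price, threshold=0.5):
    """Email only customers whose predicted acceptance clears the
    threshold, cutting non-relevant messages."""
    return [c["id"] for c in customers
            if acceptance_probability(c["past_upgrades"], price,
                                      c["wtp"]) >= threshold]

customers = [
    {"id": "never",     "past_upgrades": 0, "wtp": 20.0},  # Never Upgrades
    {"id": "lover",     "past_upgrades": 5, "wtp": 80.0},  # Upgrade Lover
    {"id": "lookalike", "past_upgrades": 0, "wtp": 90.0},  # Lookalike
]
targeted = select_offers(customers, price=70.0)
```

Note how the "lookalike" customer is selected despite having no upgrade history: the price sits inside their estimated acceptable range, which is the profile-based behavior the third segment describes.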
Do It Like a Syntactician: Using Binary Grammaticality Judgements to Train Sentence Encoders and Assess Their Sensitivity to Syntactic Structure
The binary nature of grammaticality judgements and their use to assess the structure of syntax are a staple of modern linguistics. However, computational models of natural language rarely make use of grammaticality in their training or application. Furthermore, developments in modern neural NLP have produced a myriad of methods that push the baselines in many complex tasks, but those methods are typically not evaluated from a linguistic perspective. In this dissertation I use grammaticality judgements with artificially generated ungrammatical sentences to assess the performance of several neural encoders and propose them as a suitable training target to make models learn specific syntactic rules. I generate artificial ungrammatical sentences via two methods: first, by randomly pulling words following the n-gram distribution of a corpus of real sentences (I call these word salads); second, by corrupting sentences from a real corpus by altering them (changing verbal or adjectival agreement or removing the main verb). I then train models with an encoder using word embeddings and long short-term memory networks (LSTMs) to discriminate between real sentences and ungrammatical sentences. I show that the model can distinguish word salads well for low-order n-grams but does not generalize well to higher orders. Furthermore, the word salads do not help the model in recognizing corrupted sentences. I then test the contributions of pre-trained word embeddings, deep LSTMs, and bidirectional LSTMs. I find that the biggest contribution comes from adding pre-trained word embeddings. I also find that additional layers contribute differently to the performance of unidirectional and bidirectional models, and that deeper models have more performance variability across training runs.
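The first corruption method, sampling word salads from the n-gram distribution of a real corpus, can be sketched as follows for the bigram case (toy corpus; the dissertation's exact sampling procedure may differ):

```python
import random
from collections import defaultdict

def bigram_model(sentences):
    """Count the successors of each word, including sentence boundaries."""
    successors = defaultdict(list)
    for s in sentences:
        words = ["<s>"] + s.split() + ["</s>"]
        for a, b in zip(words, words[1:]):
            successors[a].append(b)
    return successors

def word_salad(successors, rng, max_len=20):
    """Sample a pseudo-sentence by walking the bigram distribution:
    locally plausible transitions, but no global syntax."""
    word, out = "<s>", []
    while len(out) < max_len:
        word = rng.choice(successors[word])
        if word == "</s>":
            break
        out.append(word)
    return " ".join(out)

corpus = ["the cat sat on the mat", "the dog sat on the rug"]
model = bigram_model(corpus)
salad = word_salad(model, random.Random(0))
```

Every adjacent pair in the output is an attested bigram, which is exactly why a low-order discriminator finds these easy while higher-order structure is needed to catch more global ungrammaticality.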
Methods for Learning Directed and Undirected Graphical Models
Probabilistic graphical models provide a general framework for modeling relationships between multiple random variables. The main tool in this framework is a mathematical object called graph which visualizes the assertions of conditional independence between the variables. This thesis investigates methods for learning these graphs from observational data.
Regarding undirected graphical models, we propose a new scoring criterion for learning a dependence structure of a Gaussian graphical model. The scoring criterion is derived as an approximation to often intractable Bayesian marginal likelihood. We prove that the scoring criterion is consistent and demonstrate its applicability to high-dimensional problems when combined with an efficient search algorithm.
Secondly, we present a non-parametric method for learning undirected graphs from continuous data. The method combines a conditional mutual information estimator with a permutation test in order to perform conditional independence testing without assuming any specific parametric distributions for the involved random variables. Accompanying this test with a constraint-based structure learning algorithm creates a method which performs well in numerical experiments when the data generating mechanisms involve non-linearities.
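The permutation step can be illustrated as follows: compute a dependence statistic between X and Y, then compare it to its distribution under shuffles of X within strata of the conditioning variable Z, so that the null samples preserve the X-Z relationship. As a stand-in for the thesis's conditional mutual information estimator, this sketch uses a simple correlation statistic and binned conditioning:

```python
import random

def dep_stat(x, y):
    """Absolute Pearson correlation as a simple dependence statistic."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x) ** 0.5
    vy = sum((b - my) ** 2 for b in y) ** 0.5
    return abs(cov / (vx * vy))

def ci_permutation_test(x, y, z_bins, n_perm=200, rng=None):
    """p-value for X independent of Y given Z: x is permuted only within
    each bin of z, so the null distribution keeps the X-Z dependence."""
    rng = rng or random.Random(0)
    observed = dep_stat(x, y)
    count = 0
    for _ in range(n_perm):
        xp = list(x)
        for b in set(z_bins):
            idx = [i for i, zb in enumerate(z_bins) if zb == b]
            vals = [xp[i] for i in idx]
            rng.shuffle(vals)
            for i, v in zip(idx, vals):
                xp[i] = v
        if dep_stat(xp, y) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)
```

Swapping `dep_stat` for a conditional mutual information estimator gives the distribution-free flavor of the thesis's test, since no parametric form is assumed anywhere in the procedure.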
For directed graphical models, we propose a new scoring criterion for learning Bayesian network structures from discrete data. The criterion approximates a hard-to-compute quantity called the normalized maximum likelihood. We study the theoretical properties of the score and compare it experimentally to popular alternatives. Experiments show that the proposed criterion provides a robust and safe choice for structure learning and prediction over a wide variety of different settings.
Finally, as an application of directed graphical models, we derive a closed form expression for the Bayesian network Fisher kernel. This provides us with a similarity measure over discrete data vectors, capable of taking into account the dependence structure between the components. We illustrate the similarity measured by this kernel with an example where we use it to seek sets of observations that are important and representative of the underlying Bayesian network model.
Probabilistic graphical models are a general-purpose way of modeling relationships between several random variables. The central tool in these models is a network, or graph, with which the dependence structure between the variables can be represented visually. This dissertation examines various methods for learning undirected and directed graphs from observed data.
Regarding undirected graphs, this thesis presents two methods, suited to different situations, for learning graph structure. First, a model selection criterion is introduced for learning graph structures when the variables are normally distributed. The criterion is derived as an approximation to the often computationally demanding Bayesian marginal likelihood. The theoretical properties of the criterion are studied, and it is shown experimentally to perform well in situations where the number of variables is large.
The second method presented is non-parametric, meaning roughly that no exact assumptions about the distribution of the input variables are required. The method makes use of information-theoretic quantities estimated from the data together with a permutation test. Experimental results show that the method performs well when the dependencies between the variables in the input data are non-linear.
The second part of the dissertation concerns Bayesian networks, which are directed graphical models. A new model selection criterion for learning Bayesian networks over discrete variables is presented. This criterion is studied theoretically and compared experimentally to other commonly used model selection criteria.
Finally, the dissertation presents an application of directed graphical models by deriving a Fisher kernel based on a Bayesian network. The resulting Fisher kernel can be used to measure the similarity of data vectors while taking into account the dependencies between the components of the vectors, which is illustrated experimentally.
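The Fisher kernel construction referred to above follows the standard definition: a data vector is represented by the gradient of its log-likelihood under the fitted model (here a Bayesian network), and similarity is an inner product weighted by the inverse Fisher information:

```latex
K(x, x') = g(x)^{\top} \mathcal{I}^{-1}\, g(x'),
\qquad
g(x) = \nabla_{\theta} \log p(x \mid \theta),
\qquad
\mathcal{I} = \mathbb{E}_{x \sim p(\cdot \mid \theta)}
    \left[ g(x)\, g(x)^{\top} \right].
```

Two vectors are similar when they would pull the model's parameters in the same direction, which is how the kernel encodes the dependence structure between components.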