4,391 research outputs found
Zipf law in the popularity distribution of chess openings
We perform a quantitative analysis of extensive chess databases and show that
the frequencies of opening moves are distributed according to a power-law with
an exponent that increases linearly with the game depth, whereas the pooled
distribution of all opening weights follows Zipf's law with universal exponent.
We propose a simple stochastic process that is able to capture the observed
playing statistics and show that the Zipf law arises from the self-similar
nature of the game tree of chess. Thus, in the case of hierarchical
fragmentation the scaling is truly universal and independent of a particular
generating mechanism. Our findings are of relevance in general processes with
composite decisions.Comment: 5 pages, 4 figure
Temporal Difference Learning in Complex Domains
PhDThis thesis adapts and improves on the methods of TD(k) (Sutton 1988) that were
successfully used for backgammon (Tesauro 1994) and applies them to other complex
games that are less amenable to simple pattem-matching approaches. The games
investigated are chess and shogi, both of which (unlike backgammon) require
significant amounts of computational effort to be expended on search in order to
achieve expert play. The improved methods are also tested in a non-game domain.
In the chess domain, the adapted TD(k) method is shown to successfully learn the
relative values of the pieces, and matches using these learnt piece values indicate that
they perform at least as well as piece values widely quoted in elementary chess books.
The adapted TD(X) method is also shown to work well in shogi, considered by many
researchers to be the next challenge for computer game-playing, and for which there
is no standardised set of piece values.
An original method to automatically set and adjust the major control parameters used
by TD(k) is presented. The main performance advantage comes from the learning
rate adjustment, which is based on a new concept called temporal coherence.
Experiments in both chess and a random-walk domain show that the temporal
coherence algorithm produces both faster learning and more stable values than both
human-chosen parameters and an earlier method for learning rate adjustment.
The methods presented in this thesis allow programs to learn with as little input of
external knowledge as possible, exploring the domain on their own rather than by
being taught. Further experiments show that the method is capable of handling many
hundreds of weights, and that it is not necessary to perform deep searches during the
leaming phase in order to learn effective weight
Temoral Difference Learning in Complex Domains
Submitted to the University of London for the Degree of Doctor of Philosophy in Computer Scienc
Spatial-temporal reasoning applications of computational intelligence in the game of Go and computer networks
Spatial-temporal reasoning is the ability to reason with spatial images or information about space over time. In this dissertation, computational intelligence techniques are applied to computer Go and computer network applications. Among four experiments, the first three are related to the game of Go, and the last one concerns the routing problem in computer networks.
The first experiment represents the first training of a modified cellular simultaneous recurrent network (CSRN) trained with cellular particle swarm optimization (PSO). Another contribution is the development of a comprehensive theoretical study of a 2x2 Go research platform with a certified 5 dan Go expert. The proposed architecture successfully trains a 2x2 game tree. The contribution of the second experiment is the development of a computational intelligence algorithm calledcollective cooperative learning (CCL). CCL learns the group size of Go stones on a Go board with zero knowledge by communicating only with the immediate neighbors. An analysis determines the lower bound of a design parameter that guarantees a solution. The contribution of the third experiment is the proposal of a unified system architecture for a Go robot. A prototype Go robot is implemented for the first time in the literature. The last experiment tackles a disruption-tolerant routing problem for a network suffering from link disruption. This experiment represents the first time that the disruption-tolerant routing problem has been formulated with a Markov Decision Process. In addition, the packet delivery rate has been improved under a range of link disruption levels via a reinforcement learning approach --Abstract, page iv
How Hard Is It to Control an Election by Breaking Ties?
We study the computational complexity of controlling the result of an
election by breaking ties strategically. This problem is equivalent to the
problem of deciding the winner of an election under parallel universes
tie-breaking. When the chair of the election is only asked to break ties to
choose between one of the co-winners, the problem is trivially easy. However,
in multi-round elections, we prove that it can be NP-hard for the chair to
compute how to break ties to ensure a given result. Additionally, we show that
the form of the tie-breaking function can increase the opportunities for
control. Indeed, we prove that it can be NP-hard to control an election by
breaking ties even with a two-stage voting rule.Comment: Revised and expanded version including longer proofs and additional
result
- …