Generalizing backdoors
Abstract. A powerful intuition in the design of search methods is that one wants to proactively select variables that simplify the problem instance as much as possible when these variables are assigned values. The notion of “Backdoor” variables follows this intuition. In this work we generalize Backdoors to allow more general classes of sub-solvers, both complete and heuristic. To do so, Pseudo-Backdoors and Heuristic-Backdoors are formally introduced and then applied first to a simple Multiple Knapsack Problem and second to a complex combinatorial optimization problem in the area of stochastic inventory control. Our preliminary computational experience shows the effectiveness of these approaches, which are able to produce very low run times and, in the case of Heuristic-Backdoors, high quality solutions by employing very simple heuristic rules such as greedy local search strategies.
A Field Guide to Genetic Programming
xiv, 233 p. : ill. ; 23 cm. Electronic book.
A Field Guide to Genetic Programming (ISBN 978-1-4092-0073-4) is an introduction to genetic programming (GP). GP is a systematic, domain-independent method for getting computers to solve problems automatically starting from a high-level statement of what needs to be done. Using ideas from natural evolution, GP starts from an ooze of random computer programs, and progressively refines them through processes of mutation and sexual recombination, until solutions emerge. All this without the user having to know or specify the form or structure of solutions in advance. GP has generated a plethora of human-competitive results and applications, including novel scientific discoveries and patentable inventions.
Introduction --
Representation, initialisation and operators in Tree-based GP --
Getting ready to run genetic programming --
Example genetic programming run --
Alternative initialisations and operators in Tree-based GP --
Modular, grammatical and developmental Tree-based GP --
Linear and graph genetic programming --
Probabilistic genetic programming --
Multi-objective genetic programming --
Fast and distributed genetic programming --
GP theory and its applications --
Applications --
Troubleshooting GP --
Conclusions.
Contents
1 Introduction
1.1 Genetic Programming in a Nutshell
1.2 Getting Started
1.3 Prerequisites
1.4 Overview of this Field Guide
I Basics
2 Representation, Initialisation and Operators in Tree-based GP
2.1 Representation
2.2 Initialising the Population
2.3 Selection
2.4 Recombination and Mutation
3 Getting Ready to Run Genetic Programming
3.1 Step 1: Terminal Set
3.2 Step 2: Function Set
3.2.1 Closure
3.2.2 Sufficiency
3.2.3 Evolving Structures other than Programs
3.3 Step 3: Fitness Function
3.4 Step 4: GP Parameters
3.5 Step 5: Termination and solution designation
4 Example Genetic Programming Run
4.1 Preparatory Steps
4.2 Step-by-Step Sample Run
4.2.1 Initialisation
4.2.2 Fitness Evaluation
4.2.3 Selection, Crossover and Mutation
4.2.4 Termination and Solution Designation
II Advanced Genetic Programming
5 Alternative Initialisations and Operators in Tree-based GP
5.1 Constructing the Initial Population
5.1.1 Uniform Initialisation
5.1.2 Initialisation may Affect Bloat
5.1.3 Seeding
5.2 GP Mutation
5.2.1 Is Mutation Necessary?
5.2.2 Mutation Cookbook
5.3 GP Crossover
5.4 Other Techniques
6 Modular, Grammatical and Developmental Tree-based GP
6.1 Evolving Modular and Hierarchical Structures
6.1.1 Automatically Defined Functions
6.1.2 Program Architecture and Architecture-Altering
6.2 Constraining Structures
6.2.1 Enforcing Particular Structures
6.2.2 Strongly Typed GP
6.2.3 Grammar-based Constraints
6.2.4 Constraints and Bias
6.3 Developmental Genetic Programming
6.4 Strongly Typed Autoconstructive GP with PushGP
7 Linear and Graph Genetic Programming
7.1 Linear Genetic Programming
7.1.1 Motivations
7.1.2 Linear GP Representations
7.1.3 Linear GP Operators
7.2 Graph-Based Genetic Programming
7.2.1 Parallel Distributed GP (PDGP)
7.2.2 PADO
7.2.3 Cartesian GP
7.2.4 Evolving Parallel Programs using Indirect Encodings
8 Probabilistic Genetic Programming
8.1 Estimation of Distribution Algorithms
8.2 Pure EDA GP
8.3 Mixing Grammars and Probabilities
9 Multi-objective Genetic Programming
9.1 Combining Multiple Objectives into a Scalar Fitness Function
9.2 Keeping the Objectives Separate
9.2.1 Multi-objective Bloat and Complexity Control
9.2.2 Other Objectives
9.2.3 Non-Pareto Criteria
9.3 Multiple Objectives via Dynamic and Staged Fitness Functions
9.4 Multi-objective Optimisation via Operator Bias
10 Fast and Distributed Genetic Programming
10.1 Reducing Fitness Evaluations/Increasing their Effectiveness
10.2 Reducing Cost of Fitness with Caches
10.3 Parallel and Distributed GP are Not Equivalent
10.4 Running GP on Parallel Hardware
10.4.1 Master–slave GP
10.4.2 GP Running on GPUs
10.4.3 GP on FPGAs
10.4.4 Sub-machine-code GP
10.5 Geographically Distributed GP
11 GP Theory and its Applications
11.1 Mathematical Models
11.2 Search Spaces
11.3 Bloat
11.3.1 Bloat in Theory
11.3.2 Bloat Control in Practice
III Practical Genetic Programming
12 Applications
12.1 Where GP has Done Well
12.2 Curve Fitting, Data Modelling and Symbolic Regression
12.3 Human Competitive Results – the Humies
12.4 Image and Signal Processing
12.5 Financial Trading, Time Series, and Economic Modelling
12.6 Industrial Process Control
12.7 Medicine, Biology and Bioinformatics
12.8 GP to Create Searchers and Solvers – Hyper-heuristics
12.9 Entertainment and Computer Games
12.10 The Arts
12.11 Compression
IV Tricks of the Trade
13 Troubleshooting GP
13.1 Is there a Bug in the Code?
13.2 Can you Trust your Results?
13.3 There are No Silver Bullets
13.4 Small Changes can have Big Effects
13.5 Big Changes can have No Effect
13.6 Study your Populations
13.7 Encourage Diversity
13.8 Embrace Approximation
13.9 Control Bloat
13.10 Checkpoint Results
13.11 Report Well
13.12 Convince your Customers
14 Conclusions
A Resources
A.1 Key Books
A.2 Key Journals
A.3 Key International Meetings
A.4 GP Implementations
A.5 On-Line Resources
B TinyGP
B.1 Overview of TinyGP
B.2 Input Data Files for TinyGP
B.3 Source Code
B.4 Compiling and Running TinyGP
Bibliography
Index
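The evolutionary loop summarised in the abstract of this entry (a random initial population, refined by mutation and recombination until good programs emerge) can be illustrated with a minimal tree-based sketch in Python. This is not the book's TinyGP code. The primitive sets, the target function x*x + x + 1, the parameter values, the size cap and every function name below are assumptions chosen only for illustration.

```python
import math
import operator
import random

# Hypothetical primitive sets for this sketch (not taken from the book).
FUNCTIONS = [(operator.add, 2), (operator.sub, 2), (operator.mul, 2)]
TERMINALS = ["x", 1.0, 2.0]
CASES = [i / 10 for i in range(-10, 11)]  # sample points in [-1, 1]


def random_tree(rng, depth):
    """Grow a random expression tree as nested tuples (function, child, ...)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    func, arity = rng.choice(FUNCTIONS)
    return (func,) + tuple(random_tree(rng, depth - 1) for _ in range(arity))


def evaluate(tree, x):
    """Evaluate a tree for a given value of the variable x."""
    if tree == "x":
        return x
    if isinstance(tree, float):
        return tree
    func, *children = tree
    return func(*(evaluate(child, x) for child in children))


def fitness(tree):
    """Sum of absolute errors against the assumed target (lower is better)."""
    error = sum(abs(evaluate(tree, x) - (x * x + x + 1)) for x in CASES)
    return error if math.isfinite(error) else math.inf


def paths(tree, prefix=()):
    """Yield the index path to every node in the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))


def get(tree, path):
    """Return the subtree found by following an index path."""
    for i in path:
        tree = tree[i]
    return tree


def replace(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]


def crossover(rng, a, b):
    """Subtree crossover: copy a random subtree of b into a random point of a."""
    donor = get(b, rng.choice(list(paths(b))))
    return replace(a, rng.choice(list(paths(a))), donor)


def mutate(rng, tree):
    """Subtree mutation: replace a random node with a freshly grown subtree."""
    return replace(tree, rng.choice(list(paths(tree))), random_tree(rng, 2))


def tournament(rng, population, k=3):
    """Pick the fittest of k randomly sampled individuals."""
    return min(rng.sample(population, k), key=fitness)


def evolve(seed=0, pop_size=60, generations=20, max_nodes=60):
    """Run the loop; return the best tree and the best error per generation."""
    rng = random.Random(seed)
    population = [random_tree(rng, 3) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        best = min(population, key=fitness)
        history.append(fitness(best))
        offspring = [best]  # elitism: the current best always survives
        while len(offspring) < pop_size:
            parent = tournament(rng, population)
            if rng.random() < 0.9:
                child = crossover(rng, parent, tournament(rng, population))
            else:
                child = mutate(rng, parent)
            # Crude size cap (an assumption of this sketch) to limit bloat.
            small = sum(1 for _ in paths(child)) <= max_nodes
            offspring.append(child if small else parent)
        population = offspring
    best = min(population, key=fitness)
    history.append(fitness(best))
    return best, history
```

Calling `evolve(seed=1)` returns the best tree found together with the best error recorded at each generation. Because the best individual is copied unchanged into the next generation, that error can never increase from one generation to the next.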
Principles and Concepts of Agent-Based Modelling for Developing Geospatial Simulations
The aim of this paper is to outline fundamental concepts and principles of the Agent-Based Modelling (ABM) paradigm, with particular reference to the development of geospatial simulations. The paper begins with a brief definition of modelling, followed by a classification of model types, and a comment regarding a shift (in certain circumstances) towards modelling systems at the individual-level. In particular, automata approaches (e.g. Cellular Automata, CA, and ABM) have been particularly popular, with ABM moving to the fore. A definition of agents and agent-based models is given; identifying their advantages and disadvantages, especially in relation to geospatial modelling. The potential use of agent-based models is discussed, and how-to instructions for developing an agent-based model are provided. Types of simulation / modelling systems available for ABM are defined, supplemented with criteria to consider before choosing a particular system for a modelling endeavour. Information pertaining to a selection of simulation / modelling systems (Swarm, MASON, Repast, StarLogo, NetLogo, OBEUS, AgentSheets and AnyLogic) is provided, categorised by their licensing policy (open source, shareware / freeware and proprietary systems). The evaluation (i.e. verification, calibration, validation and analysis) of agent-based models and their output is examined, and noteworthy applications are discussed. Geographical Information Systems (GIS) are a particularly useful medium for representing model input and output of a geospatial nature. However, GIS are not well suited to dynamic modelling (e.g. ABM). In particular, problems of representing time and change within GIS are highlighted. Consequently, this paper explores the opportunity of linking (through coupling or integration / embedding) a GIS with a simulation / modelling system purposely built, and therefore better suited to supporting the requirements of ABM.
This paper concludes with a synthesis of the discussion that has preceded.
Improving National and Homeland Security through a proposed Laboratory for Information Globalization and Harmonization Technologies (LIGHT)
A recent National Research Council study found that: "Although there are many private and public databases that
contain information potentially relevant to counter terrorism programs, they lack the necessary context definitions
(i.e., metadata) and access tools to enable interoperation with other databases and the extraction of meaningful and
timely information" [NRC02, p.304, emphasis added]. That sentence succinctly describes the objectives of this
project. Improved access and use of information are essential to better identify and anticipate threats, protect
against and respond to threats, and enhance national and homeland security (NHS), as well as other national
priority areas, such as Economic Prosperity and a Vibrant Civil Society (ECS) and Advances in Science and
Engineering (ASE). This project focuses on the creation and contributions of a Laboratory for Information
Globalization and Harmonization Technologies (LIGHT) with two interrelated goals:
(1) Theory and Technologies: To research, design, develop, test, and implement theory and technologies for
improving the reliability, quality, and responsiveness of automated mechanisms for reasoning and resolving semantic
differences that hinder the rapid and effective integration (int) of systems and data (dmc) across multiple
autonomous sources, and the use of that information by public and private agencies involved in national and
homeland security and the other national priority areas involving complex and interdependent social systems (soc).
This work builds on our research on the COntext INterchange (COIN) project, which focused on the integration
of diverse distributed heterogeneous information sources using ontologies, databases, context mediation algorithms,
and wrapper technologies to overcome information representational conflicts. The COIN approach makes it
substantially easier and more transparent for individual receivers (e.g., applications, users) to access and exploit
distributed sources. Receivers specify their desired context to reduce ambiguities in the interpretation of information
coming from heterogeneous sources. This approach significantly reduces the overhead involved in the integration of
multiple sources, improves data quality, increases the speed of integration, and simplifies maintenance in an
environment of changing source and receiver context - which will lead to an effective and novel distributed
information grid infrastructure. This research also builds on our Global System for Sustainable Development
(GSSD), an Internet platform for information generation, provision, and integration of multiple domains, regions,
languages, and epistemologies relevant to international relations and national security.
(2) National Priority Studies: To experiment with and test the developed theory and technologies on practical
problems of data integration in national priority areas. Particular focus will be on national and homeland security,
including data sources about conflict and war, modes of instability and threat, international and regional
demographic, economic, and military statistics, money flows, and contextualizing terrorism defense and response.
Although LIGHT will leverage the results of our successful prior research projects, this will be the first research
effort to simultaneously and effectively address ontological and temporal information conflicts as well as
dramatically enhance information quality. Addressing problems of national priorities in such rapidly changing
complex environments requires extraction of observations from disparate sources, using different interpretations, at
different points in time, for different purposes, with different biases, and for a wide range of different uses and
users. This research will focus on integrating information both over individual domains and across multiple domains.
Another innovation is the concept and implementation of Collaborative Domain Spaces (CDS), within which
applications in a common domain can share, analyze, modify, and develop information. Applications also can span
multiple domains via Linked CDSs. The PIs have considerable experience with these research areas and the
organization and management of such large scale international and diverse research projects.
The PIs come from three different Schools at MIT: Management, Engineering, and Humanities, Arts & Social
Sciences. The faculty and graduate students come from about a dozen nationalities and diverse ethnic, racial, and
religious backgrounds. The currently identified external collaborators come from over 20 different organizations
and many different countries, industrial as well as developing. Specific efforts are proposed to engage even more
women, underrepresented minorities, and persons with disabilities.
The anticipated results apply to any complex domain that relies on heterogeneous distributed data to address and
resolve compelling problems. This initiative is supported by international collaborators from (a) scientific and
research institutions, (b) business and industry, and (c) national and international agencies. Research products
include: a System for Harmonized Information Processing (SHIP), a software platform, and diverse applications in
research and education which are anticipated to significantly impact the way complex organizations, and society in
general, understand and manage critical challenges in NHS, ECS, and ASE.
Road network equilibrium approaches to environmental sustainability
Environmental sustainability is closely related to transportation, especially to the road network, because vehicle emissions and noise damage the environment and have adverse effects on human health. It is, therefore, important to take their effect into account when designing and managing road networks. Road network equilibrium approaches have been used to estimate this impact and to design and manage road networks accordingly. However, no comprehensive review has summarized the applications of these approaches to the design and management of road networks that explicitly address environmental concerns. More importantly, it is necessary to identify this gap in the literature so that future research can improve the existing methodologies. Hence, this paper summarizes these applications and identifies potential future research directions in terms of theories, modelling approaches, algorithms, analyses, and applications.