3 research outputs found

    Estimation of Distribution Algorithms and Minimum Relative Entropy

    In the field of optimization using probabilistic models of the search space, this thesis identifies and elaborates several advances in which the principles of maximum entropy and minimum relative entropy from information theory are used to estimate a probability distribution. The probability distribution over the search space is represented by a graphical model (a factorization, a Bayesian network or a junction tree). An estimation of distribution algorithm (EDA) is an evolutionary optimization algorithm which uses a graphical model to sample a population in the search space and then estimates a new graphical model from the selected individuals of that population.

    First, the Factorized Distribution Algorithm (FDA) has so far built a factorization or Bayesian network from a given additive structure of the objective function using a greedy algorithm which considers only a subset of the variable dependencies, so important connections can be lost. This thesis presents a heuristic subfunction-merge algorithm which is able to consider all dependencies between the variables (as long as the marginal distributions of the model do not become too large). On a 2-D grid structure, this algorithm builds a pentavariate factorization which allows the deceptive grid benchmark problem to be solved with a much smaller population size than the conventional factorization. Especially for small population sizes, calculating large marginal distributions from smaller ones using maximum entropy and iterative proportional fitting leads to a further improvement.
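    The following is a minimal sketch of that last step, assuming a toy three-variable example with made-up target marginals (not the thesis's implementation): starting from the uniform, maximum-entropy table, iterative proportional fitting repeatedly rescales the joint distribution so that it matches each given lower-order marginal.

```python
import numpy as np

# Toy example: estimate the joint table p(x0, x1, x2) over binary variables
# from two given pairwise marginals, starting from the uniform distribution
# (the maximum-entropy choice when nothing else is known).
q01 = np.array([[0.3, 0.1], [0.2, 0.4]])      # assumed target marginal of (x0, x1)
q12 = np.array([[0.35, 0.15], [0.25, 0.25]])  # assumed target marginal of (x1, x2)

p = np.full((2, 2, 2), 1.0 / 8.0)             # uniform start

for _ in range(100):                          # iterative proportional fitting
    m01 = p.sum(axis=2)                       # current marginal of (x0, x1)
    p *= (q01 / m01)[:, :, None]              # rescale to match q01
    m12 = p.sum(axis=0)                       # current marginal of (x1, x2)
    p *= (q12 / m12)[None, :, :]              # rescale to match q12

print(p.sum(axis=2))   # ~ q01
print(p.sum(axis=0))   # ~ q12
```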
    The second topic is the generalization of graphical models to loopy structures. Using the Bethe-Kikuchi approximation, the loopy graphical model (region graph) can learn the Boltzmann distribution of an objective function by a generalized belief propagation (GBP) algorithm. GBP minimizes the free energy, a notion adopted from statistical physics which is equivalent to the relative entropy with respect to the Boltzmann distribution. Previous attempts to combine the Kikuchi approximation with an EDA relied on an expensive Gibbs sampling procedure to generate a population from this loopy probabilistic model. This thesis presents a combination with a factorization which allows more efficient sampling. The free energy is generalized to incorporate the inverse temperature β, and the factorization-building algorithm mentioned above can be employed here as well. The dynamics of GBP is investigated, and the method is applied to Ising spin glass ground state search. Small instances (7 x 7) are solved without difficulty. Larger instances (10 x 10 and 15 x 15) do not converge to the true optimum at large β, but sampling from the factorization can find the optimum within about 1000-10000 sampling attempts, depending on the instance. If GBP does not converge, it can be replaced by a concave-convex procedure which guarantees convergence.

    Third, if no probabilistic structure is given for the objective function, a Bayesian network can be learned to capture the dependencies in the population. The relative entropy between the population-induced distribution and the Bayesian network distribution is equivalent to the log-likelihood of the model. The log-likelihood has been generalized to the BIC/MDL score, which reduces overfitting by penalizing complicated Bayesian network structures. A previous information-theoretic analysis of BIC/MDL in the context of EDAs is continued, and empirical evidence is given that the method is able to learn the correct structure of an objective function, given a sufficiently large population.
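    As a minimal illustrative sketch of such a score (assumed helper names and toy data, not the thesis's code): the BIC/MDL score of a candidate network over binary variables is the maximum log-likelihood of the selected population under the network minus a penalty of half the number of free parameters times log N.

```python
import numpy as np
from itertools import product

def bic_score(data, parents):
    """BIC/MDL score of a Bayesian network over binary variables.

    data    : (N, n) array of 0/1 samples (the selected population)
    parents : dict mapping each variable index to a tuple of parent indices
    """
    N, _ = data.shape
    score = 0.0
    for x, pa in parents.items():
        n_params = 2 ** len(pa)               # one free Bernoulli parameter per parent configuration
        for cfg in product((0, 1), repeat=len(pa)):
            mask = np.all(data[:, list(pa)] == cfg, axis=1) if pa else np.ones(N, bool)
            n_cfg = int(mask.sum())
            if n_cfg == 0:
                continue
            n1 = int(data[mask, x].sum())
            for c in (n1, n_cfg - n1):        # log-likelihood of both outcomes under the MLE parameters
                if c > 0:
                    score += c * np.log(c / n_cfg)
        score -= 0.5 * n_params * np.log(N)   # BIC/MDL complexity penalty
    return score

# Toy population in which x1 strongly depends on x0: the network with the edge x0 -> x1 scores higher.
rng = np.random.default_rng(0)
x0 = rng.integers(0, 2, 1000)
x1 = (x0 ^ (rng.random(1000) < 0.1)).astype(int)
pop = np.column_stack([x0, x1])
print(bic_score(pop, {0: (), 1: ()}))      # independent model
print(bic_score(pop, {0: (), 1: (0,)}))    # model with the dependency
```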
    Finally, a way to reduce the search space of an EDA is presented by combining it with a local search heuristic. The Kernighan-Lin hill climber, originally known from the traveling salesman problem and graph bipartitioning, is generalized to arbitrary binary problems (see the sketch below). It can be applied stand-alone, as an iterative (1+1) search algorithm, or in combination with an EDA. On the MAXSAT problem it performs on a similar scale to the specialized SAT solver Walksat. An analysis of the Kernighan-Lin local optima indicates that the combination with an EDA is favorable.

    The thesis shows how evolutionary optimization can be improved using interdisciplinary results from information theory, statistics, probability calculus and statistical physics. The principles of information theory for estimating probability distributions are applicable in many areas. EDAs are a good application because an improved estimation directly affects the optimization success.
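    The sketch below illustrates the Kernighan-Lin idea for arbitrary binary problems under assumed names and a toy trap objective (not the thesis's implementation): a pass greedily flips the best remaining bit even when the move worsens the objective, locks it, and after all bits have been flipped reverts to the best prefix of the move sequence; passes are repeated until no further improvement is found.

```python
import numpy as np

def kl_pass(x, f):
    """One Kernighan-Lin-style pass over a binary vector x for an objective f to be maximized."""
    x = x.copy()
    free = set(range(len(x)))
    best_val, best_x = f(x), x.copy()
    while free:
        gains = {}
        for i in free:                        # evaluate every remaining single-bit flip
            y = x.copy()
            y[i] = 1 - y[i]
            gains[i] = f(y)
        i = max(gains, key=gains.get)         # take the best flip, even if it is worsening
        x[i] = 1 - x[i]
        free.remove(i)
        if gains[i] > best_val:               # remember the best prefix of the move sequence
            best_val, best_x = gains[i], x.copy()
    return best_x, best_val

def kl_hillclimb(f, n, rng):
    """Repeat passes from a random start until a pass yields no improvement ((1+1)-style search)."""
    x = rng.integers(0, 2, n)
    val = f(x)
    while True:
        x_new, val_new = kl_pass(x, f)
        if val_new <= val:
            return x, val
        x, val = x_new, val_new

def trap4(x):
    """Toy deceptive objective on blocks of 4 bits (an assumption for illustration)."""
    blocks = x.reshape(-1, 4).sum(axis=1)
    return float(np.sum(np.where(blocks == 4, 5, 3 - blocks)))

rng = np.random.default_rng(1)
print(kl_hillclimb(trap4, 20, rng))
```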

    Law, econometrics and statistics

    Doctoral thesis, Universidade de Brasília, Faculdade de Direito, Programa de Pós-Graduação em Direito, 2017.
    This thesis argues, from a theoretical point of view, for the importance of the interaction between Law and different quantitative techniques such as Statistics, Econometrics, Machine Learning and Complexity Theory, among other quantitative methods. Special attention is given to Econometrics because it allows a debate about what causal phenomena are and how they are understood, questions that are highly relevant to the evaluation of many legal matters. Such quantitative techniques can help to identify patterns, both pre-empirical patterns, which exist in the interpreter's mind before he begins to think about how or what to research, and empirical patterns, which may be the central theme of scientific research or even the subject of judicial decisions. The thesis shows how some judicial decisions, especially foreign ones, have taken econometric arguments into consideration in important cases. From an empirical point of view, the thesis analyzed 6,732 decisions handed down by CADE commissioners between 2004 and 2014 to gauge the level of statistical and econometric debate within that agency, finding very few citations of quantitative terms in CADE's precedents. The thesis also sought to measure whether the Brazilian legal academy uses Econometrics in its research. To this end, an accessible population of 381,338 academic works (theses and dissertations) in electronic format was obtained with a web-scraping robot programmed in Python. From this population a random sample stratified by year, type of work and university was drawn, resulting in 3,202 works. In this sample, occurrences of 23 quantitative terms, such as p-value, null hypothesis, Econometrics and confidence interval, were counted in each thesis and dissertation. Of the 3,202 works, only 78 were by Law students, and only 2 of those 78 legal studies mentioned any of the quantitative terms (two terms in total); no legal work actually ran a regression or performed even a minimal statistical or econometric hypothesis test. A word cloud of all the legal works in the sample was built to give a broader picture of the words most used in Law, revealing a high level of self-reference. Finally, the case law of the state and federal courts was surveyed using the courts' own search engines. Entering the term "Direito" (a Portuguese term that can be translated as "Law" or "Right") returned 14,674,155 precedents, whereas repeating the same methodology with "mínimos quadrados" ("least squares") returned only 7 precedents, all from São Paulo, most of them assessing whether the compensation paid in property-expropriation cases matched the average market price. In all 7 cases the court simply accepted the judicial expert's findings without examining the suitability of the methodology in any depth. The aim of this thesis is to show that there are other quantitative perspectives that could be explored to improve the quality of the legal and social debate.
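    The term-count survey could be reproduced along the following lines; this is a minimal sketch with an assumed file layout and a shortened term list (the study used 23 terms), not the thesis's actual robot.

```python
import re
from pathlib import Path
from collections import Counter

# Assumed layout: one plain-text file per thesis or dissertation in ./works/.
TERMS = ["p-valor", "hipótese nula", "econometria", "intervalo de confiança", "mínimos quadrados"]

def count_terms(text, terms=TERMS):
    """Count case-insensitive occurrences of each quantitative term in one document."""
    text = text.lower()
    return Counter({t: len(re.findall(re.escape(t), text)) for t in terms})

totals = Counter()
works_with_hits = 0
for path in Path("works").glob("*.txt"):
    counts = count_terms(path.read_text(encoding="utf-8", errors="ignore"))
    if sum(counts.values()) > 0:
        works_with_hits += 1
    totals += counts

print("works mentioning at least one term:", works_with_hits)
print("occurrences per term:", dict(totals))
```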

    Stochastic Analysis of Cellular Automata and the Voter Model

    We make a stochastic analysis of both deterministic and stochastic cellular automata. The theory takes a mesoscopic view, i.e. it works with probabilities instead of the individual configurations used in micro-simulations. We perform an exact analysis using the theory of Markov processes; this is feasible for small problems only. For larger problems we approximate the distribution by products of low-order marginal distributions. The approximation uses new developments in the efficient computation of probabilities based on factorizations of the distribution. We investigate the popular voter model and show that, in one dimension, the bifurcation at α = 1/3 is an artifact of the mean-field approximation.
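    A minimal sketch of the mesoscopic idea under stated assumptions (a small one-dimensional voter model with periodic boundaries in which one randomly chosen cell copies a random neighbour per step; not the authors' code): the full distribution over configurations is evolved exactly with the Markov transition matrix, and its exact pair marginal is compared with the product of single-cell marginals that a first-order factorized approximation would use.

```python
import numpy as np
from itertools import product

n = 6                                        # 2**n configurations: small enough for an exact analysis
states = list(product((0, 1), repeat=n))
idx = {s: k for k, s in enumerate(states)}

# Transition matrix of the voter model: pick a random cell, it copies a random neighbour.
T = np.zeros((2**n, 2**n))
for s in states:
    for i in range(n):
        for j in (i - 1, (i + 1) % n):       # periodic neighbours
            t = list(s)
            t[i] = s[j]
            T[idx[s], idx[tuple(t)]] += 1.0 / (2 * n)

p = np.full(2**n, 1.0 / 2**n)                # start from the uniform (independent) distribution
for _ in range(50):                          # evolve the full distribution exactly
    p = p @ T

m0 = sum(p[idx[s]] for s in states if s[0] == 1)                  # exact marginal P(x0 = 1)
m1 = sum(p[idx[s]] for s in states if s[1] == 1)                  # exact marginal P(x1 = 1)
pair = sum(p[idx[s]] for s in states if s[0] == 1 and s[1] == 1)  # exact pair marginal
print("exact pair marginal  :", pair)
print("product of marginals :", m0 * m1)     # what a first-order factorization would predict
```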