5 research outputs found
Policy-Gradient Algorithms for Partially Observable Markov Decision Processes
Partially observable Markov decision processes are interesting because of their ability to model most conceivable real-world learning problems, for example robot navigation, driving a car, speech recognition, stock trading, and playing games. The downside of this generality is that exact algorithms are computationally intractable. Such computational complexity motivates approximate approaches. One such class of algorithms is the so-called policy-gradient methods from reinforcement learning. They seek to adjust the parameters of an agent in the direction that maximises the long-term average of a reward signal. Policy-gradient methods are attractive as a scalable approach for controlling partially observable Markov decision processes (POMDPs). In the most general case, POMDP policies require some form of internal state, or memory, in order to act optimally. Policy-gradient methods have shown promise for problems admitting memory-less policies, but have been less successful when memory is required. This thesis develops several improved algorithms for learning policies with memory in an infinite-horizon setting: directly, when the dynamics of the world are known, and via Monte-Carlo methods otherwise. The algorithms simultaneously learn how to act and what to remember.
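The core idea the abstract describes, adjusting policy parameters in the direction that increases expected reward, can be illustrated with a minimal sketch. This is not the thesis's algorithm; it is a generic REINFORCE-style update on a hypothetical two-armed bandit (a memory-less stand-in for the POMDPs discussed above), with all parameters chosen for illustration:

```python
import numpy as np

# Hypothetical toy problem: two actions with unknown reward means.
# A softmax policy over preferences theta is updated along the
# score function times (reward - baseline), an unbiased estimate
# of the gradient of expected reward.

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.8])   # assumed reward means per action
theta = np.zeros(2)                 # policy parameters
alpha = 0.1                         # learning rate
baseline = 0.0                      # running reward baseline (variance reduction)

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

for t in range(2000):
    probs = softmax(theta)
    a = rng.choice(2, p=probs)                  # sample an action
    r = rng.normal(true_means[a], 0.1)          # observe a reward
    grad_log = -probs                            # grad log pi(a) = one_hot(a) - probs
    grad_log[a] += 1.0
    theta += alpha * (r - baseline) * grad_log   # policy-gradient step
    baseline += 0.05 * (r - baseline)            # track average reward

print(np.argmax(softmax(theta)))  # index of the preferred arm
```

After training, the policy concentrates probability on the higher-reward arm; the baseline is one common variance-reduction choice, not the specific estimator used in the thesis.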
Development of algorithms for digital real time image processing on a DSP Processor
Face recognition is a complex process that aims to recognize human faces in images or video sequences. Applications include surveillance and identification systems, but face recognition is also invaluable in research on computer vision and artificial intelligence. Face recognition systems are often based on either image analysis or neural networks.
This work implements an algorithm based around the use of so-called eigenfaces. Eigenfaces are the result of a form of Principal Component Analysis (PCA), which extracts important facial features from the original image and is based on solving a linear matrix equation of the covariance matrix, eigenvalues and eigenvectors. A face that is to be recognized is thus projected onto the eigenspace; the results of that operation can be interpreted as the comparison of this face with an existing database of known faces. Before executing the actual recognition algorithm, faces need to be located inside the image and prepared (by doing normalization, lighting compensation and noise removal). Many algorithms exist, but this work uses a color based face detection algorithm, which is both fast and sufficient for this application. The face detection and recognition algorithms are implemented on a Blackfin ADSP-BF561 DSP processor from Analog Devices.
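The eigenfaces pipeline described above (PCA over a face database, then nearest-neighbour matching in the eigenspace) can be sketched in a few lines. This is a generic illustration on synthetic data, not the thesis's DSP implementation; the image size, number of faces, and number of components are all assumptions:

```python
import numpy as np

# Hypothetical database: 5 flattened "face images" of 64 pixels each.
rng = np.random.default_rng(1)
n_people, dim, k = 5, 64, 3
faces = rng.normal(size=(n_people, dim))

# PCA: centre the data, then take the top-k right singular vectors.
# SVD of the centred data gives the eigenvectors of the covariance
# matrix (the "eigenfaces") without forming the matrix explicitly.
mean_face = faces.mean(axis=0)
centered = faces - mean_face
_, _, vt = np.linalg.svd(centered, full_matrices=False)
eigenfaces = vt[:k]                     # top-k principal components

# Project the database into the eigenspace once, up front.
train_proj = centered @ eigenfaces.T

def recognize(probe):
    """Project a probe face and return the index of the nearest known face."""
    w = (probe - mean_face) @ eigenfaces.T
    return int(np.argmin(np.linalg.norm(train_proj - w, axis=1)))

# A slightly noisy copy of face 2 should match the stored face 2.
probe = faces[2] + rng.normal(scale=0.01, size=dim)
print(recognize(probe))
```

In practice the probe would first be localized and normalized (lighting compensation, noise removal) as the abstract notes; this sketch assumes that preprocessing has already happened.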