5 research outputs found

    Policy-Gradient Algorithms for Partially Observable Markov Decision Processes

    No full text
    Partially observable Markov decision processes are interesting because of their ability to model most conceivable real-world learning problems, for example robot navigation, driving a car, speech recognition, stock trading, and playing games. The downside of this generality is that exact algorithms are computationally intractable, which motivates approximate approaches. One such class of algorithms is the policy-gradient methods from reinforcement learning, which adjust the parameters of an agent in the direction that maximises the long-term average of a reward signal. Policy-gradient methods are attractive as a scalable approach for controlling partially observable Markov decision processes (POMDPs). In the most general case, POMDP policies require some form of internal state, or memory, in order to act optimally. Policy-gradient methods have shown promise for problems admitting memory-less policies but have been less successful when memory is required. This thesis develops several improved algorithms for learning policies with memory in an infinite-horizon setting: directly, when the dynamics of the world are known, and via Monte-Carlo methods otherwise. The algorithms simultaneously learn how to act and what to remember. …
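
    To make the gradient direction concrete, the following is a minimal sketch of a memory-less, REINFORCE-style policy-gradient update for a tabular softmax policy. It is an illustrative assumption, not the thesis's algorithm: the thesis additionally learns internal state (memory), and all names below (softmax_policy, policy_gradient_step, the (obs, action, reward) trajectory format) are hypothetical.

        import numpy as np

        def softmax_policy(theta, obs):
            # theta: (num_observations, num_actions) table of action preferences
            logits = theta[obs]
            p = np.exp(logits - logits.max())
            return p / p.sum()

        def policy_gradient_step(theta, trajectory, learning_rate=0.01):
            # trajectory: list of (obs, action, reward) tuples from one rollout.
            # The mean reward serves as a simple baseline; actions that beat it
            # become more probable, actions that fall short become less probable.
            baseline = np.mean([r for _, _, r in trajectory])
            for obs, action, reward in trajectory:
                probs = softmax_policy(theta, obs)
                grad_log = -probs                 # gradient of log pi(action | obs) w.r.t. theta[obs]
                grad_log[action] += 1.0
                theta[obs] += learning_rate * (reward - baseline) * grad_log
            return theta

    When the dynamics of the world are unknown, updates of this kind are estimated from sampled trajectories, which is the Monte-Carlo setting the abstract refers to.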

    Development of algorithms for digital real time image processing on a DSP Processor

    Get PDF
    Face recognition is a complex process that aims to recognize human faces in images or video sequences. Applications include surveillance and identification systems, and face recognition is also important in computer vision and artificial intelligence research. Face recognition systems are often based on image analysis or on neural networks. This work implements an algorithm based on so-called eigenfaces. Eigenfaces are the result of Principal Component Analysis (PCA), which extracts the most important facial features from the original image; the method amounts to solving a linear matrix equation in which the eigenvalues and eigenvectors of a known covariance matrix are computed. A face to be recognized is projected into the eigenspace, and recognition is performed by comparing it with an existing database of faces projected into the same eigenspace. Before recognition, the face must be located in the image and pre-processed (normalization, lighting compensation, and noise removal). Many face-localization algorithms exist, but this work uses a skin-color-based detector, which is fast and sufficient for this application. The face detection and recognition algorithms are implemented on a Blackfin ADSP-BF561 DSP processor from Analog Devices.
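
    As an illustration of the eigenface pipeline described above, here is a minimal NumPy sketch of the training (PCA) and matching (eigenspace projection plus nearest-neighbour comparison) steps. It shows only the mathematics, not the fixed-point Blackfin implementation, and the function and parameter names (train_eigenfaces, recognize, num_components) are hypothetical.

        import numpy as np

        def train_eigenfaces(faces, num_components=20):
            # faces: (n_images, height*width) array, one flattened, aligned face per row
            mean_face = faces.mean(axis=0)
            centered = faces - mean_face
            # Principal components of the face set via SVD of the centered data
            # (equivalent to the eigenvectors of the covariance matrix).
            _, _, vt = np.linalg.svd(centered, full_matrices=False)
            eigenfaces = vt[:num_components]           # each row is one eigenface
            gallery_weights = centered @ eigenfaces.T  # known faces projected into eigenspace
            return mean_face, eigenfaces, gallery_weights

        def recognize(face, mean_face, eigenfaces, gallery_weights):
            # Project the probe face into the same eigenspace and return the
            # index and distance of the closest face in the gallery.
            probe = (face - mean_face) @ eigenfaces.T
            distances = np.linalg.norm(gallery_weights - probe, axis=1)
            return int(np.argmin(distances)), float(distances.min())

    In practice the probe face would first be located and normalized (lighting compensation, noise removal) exactly as the abstract describes, before being passed to the projection step.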