91 research outputs found

    Advanced methods for earth observation data synergy for geophysical parameter retrieval

    The first part of the thesis focuses on the analysis of relevant factors to estimate the response time between satellite-based and in-situ soil moisture (SM) using Dynamic Time Warping (DTW). DTW was applied to the SMOS L4 SM product and compared to in-situ root-zone SM in the REMEDHUS network in Western Spain. The method was customized to control the evolution of the time lag during wetting and drying conditions. Climate factors in combination with crop growing seasons were studied to reveal SM-related processes. The heterogeneity of land use was analyzed using high-resolution NDVI images from Sentinel-2 to characterize the spatial representativity of SMOS data at each in-situ station. The comparison of long-term precipitation records and potential evapotranspiration allowed estimation of SM seasons describing different SM conditions depending on climate and soil properties. The second part of the thesis focuses on data-driven methods for sea ice segmentation and parameter retrieval. A Bayesian framework is employed to segment sets of multi-source satellite data. The Bayesian unsupervised learning algorithm makes it possible to investigate the ‘hidden link’ between multiple data sources. The statistical properties are accounted for by a Gaussian Mixture Model, and the spatial interactions are reflected using hidden Markov random fields. The algorithm segments spatial data into a number of classes, which are represented as a latent field in physical space and as clusters in feature space. In a first application, a two-step probabilistic approach based on Expectation-Maximization and the Bayesian segmentation algorithm was used to segment SAR images to discriminate surface water from sea ice types. Information on surface roughness is contained in the radar backscattering images, which can, in principle, be used to detect melt ponds and to estimate high-resolution sea ice concentration (SIC).
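The DTW step described above can be sketched in a few lines. The following is a minimal, illustrative implementation of classic dynamic time warping on two short soil-moisture-like series, reading an average time lag off the warping path. The customized wetting/drying control and the real SMOS L4 data handling are not reproduced here, and the toy series are invented.

```python
# Minimal Dynamic Time Warping (DTW) sketch: align a satellite soil-moisture
# series with an in-situ series and read the time lag off the warping path.
# Illustrative only -- not the customized wetting/drying-aware DTW of the thesis.

def dtw_path(a, b):
    """Return the optimal warping path between sequences a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    # Backtrack from (n, m) to (1, 1) to recover the path.
    path, i, j = [], n, m
    while (i, j) != (1, 1):
        path.append((i - 1, j - 1))
        steps = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(steps, key=lambda p: cost[p[0]][p[1]])
    path.append((0, 0))
    return path[::-1]

def mean_lag(path):
    """Average index offset along the path: positive when the second
    series lags the first."""
    return sum(j - i for i, j in path) / len(path)

# Toy example: b is a copy of a delayed by two time steps.
a = [0.1, 0.3, 0.6, 0.5, 0.3, 0.2, 0.1, 0.1]
b = [0.1, 0.1, 0.1, 0.3, 0.6, 0.5, 0.3, 0.2]
path = dtw_path(a, b)
lag = mean_lag(path)   # positive: b trails a by roughly two steps
```

On real data the path would typically be constrained (e.g., with a Sakoe-Chiba band) so the estimated lag cannot drift arbitrarily along the series.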
In a second study, the algorithm was applied to multi-incidence-angle TB data from the SMOS L1C product to harness its sensitivity to thin ice. The spatial patterns clearly discriminate well-determined areas of open water, old sea ice, and a transition zone that is sensitive to thin sea ice thickness (SIT) and SIC. In a third application, SMOS and AMSR2 data are used to examine the joint effect of CIMR-like observations. The information contained in the low-frequency channels reveals ranges of thin sea ice, and thicker ice can be determined from the relationship between the high-frequency channels and changing conditions as the sea ice ages. The proposed approach is suitable for merging large data sets, provides metrics for class analysis, and supports informed choices about integrating data from future missions into sea ice products. A regression neural network approach was investigated with the goal of inferring SIT using TB data from the Flexible Microwave Payload 2 (FMPL-2) of the FSSCat mission. Two models, covering thin ice up to 0.6 m and the full range of SIT, were trained on Arctic data using ground truth derived from SMOS and CryoSat-2. This work demonstrates that moderate-cost CubeSat missions can provide valuable data for applications in Earth observation.
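The statistical core of the segmentation described above can be illustrated with a small Expectation-Maximization fit of a two-component Gaussian mixture, assigning each sample to its most likely class. This is a deliberately reduced stand-in: the thesis couples the mixture model with a hidden Markov random field for spatial smoothing and works on multi-sensor feature spaces, none of which appears here, and the "open water" and "ice" brightness-temperature numbers are invented.

```python
# Minimal sketch of the unsupervised core: fit a two-component Gaussian
# Mixture Model with EM and label each sample by its closest class mean.
# The hidden-Markov-random-field spatial term of the thesis is omitted.
import numpy as np

def fit_gmm_1d(x, iters=50):
    """EM for a 2-component 1-D GMM; returns means, stds, weights."""
    mu = np.array([x.min(), x.max()], dtype=float)   # spread initial means
    sigma = np.array([x.std(), x.std()]) + 1e-6
    w = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each sample
        pdf = (w / (sigma * np.sqrt(2 * np.pi)) *
               np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2))
        resp = pdf / pdf.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from the responsibilities
        nk = resp.sum(axis=0)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-6
        w = nk / len(x)
    return mu, sigma, w

rng = np.random.default_rng(0)
# Toy "brightness temperatures": open water near 100 K, old ice near 250 K
x = np.concatenate([rng.normal(100, 5, 300), rng.normal(250, 8, 200)])
mu, sigma, w = fit_gmm_1d(x)
labels = np.argmin(np.abs(x[:, None] - mu), axis=1)  # hard class assignment
```

In the full framework the class posteriors would additionally be weighted by a neighborhood prior, so that spatially adjacent pixels prefer the same label.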

    Mathematical Approaches for Image Enhancement Problems

    This thesis develops novel techniques that solve several image enhancement problems using mathematical tools with proven value in image processing, such as wavelet transforms, partial differential equations, and variational models. Three subtopics are covered. First, a color image denoising framework is introduced that achieves high-quality denoising by considering correlations between color components, while existing denoising approaches can be plugged in flexibly. Second, a new and efficient framework for image contrast and color enhancement in the compressed wavelet domain is proposed. The proposed approach is capable of enhancing both global and local contrast and brightness while preserving color consistency. The framework does not require an inverse transform for image enhancement, since linear scale factors are applied directly to both scaling and wavelet coefficients in the compressed domain, which results in high computational efficiency. Noise in the image can also be efficiently reduced by introducing wavelet shrinkage terms adaptively at different scales. The proposed method can thus enhance a wavelet-coded image efficiently, with high image quality and fewer artifacts. The experimental results show that the proposed method produces encouraging results both visually and numerically compared to existing approaches. Finally, the image inpainting problem is discussed. A literature review, perceptual analysis, and the challenges of image inpainting and related topics are described. An inpainting algorithm using energy minimization and texture mapping is proposed. A Mumford-Shah energy minimization model detects and preserves edges in the inpainting domain by capturing both the main structure and the detailed edges. This approach utilizes a faster hierarchical level set method and guarantees convergence independent of initial conditions. The estimated segmentation results in the inpainting domain are stored in a segmentation map, which is consulted by a texture mapping algorithm for filling textured regions. We also propose an inpainting algorithm using the wavelet transform that can be expected to yield better global structure estimation of the unknown region, in addition to shape and texture properties, since wavelet transforms have been used for various image analysis problems due to their multi-resolution properties and decoupling characteristics.
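The compressed-domain idea above, applying linear scale factors directly to scaling and wavelet coefficients and shrinking the details for denoising, can be illustrated in miniature with a one-level 1-D Haar transform. The gain and threshold values are arbitrary placeholders, and the thesis operates on full 2-D wavelet-coded images rather than this toy signal.

```python
# One-level 1-D Haar transform sketch: enhance contrast by scaling the
# approximation coefficients and denoise by soft-thresholding the details,
# mirroring (in miniature) the compressed-domain enhancement idea.
import numpy as np

def haar(x):
    """One-level orthonormal Haar DWT: approximation and detail bands."""
    x = np.asarray(x, dtype=float)
    a = (x[0::2] + x[1::2]) / np.sqrt(2)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)
    return a, d

def ihaar(a, d):
    """Inverse one-level Haar DWT."""
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x

def enhance(x, gain=1.2, thresh=0.1):
    a, d = haar(x)
    a = gain * a                                        # brightness/contrast
    d = np.sign(d) * np.maximum(np.abs(d) - thresh, 0)  # wavelet shrinkage
    return ihaar(a, d)

x = np.array([0.2, 0.21, 0.8, 0.79, 0.5, 0.52, 0.1, 0.12])
y = enhance(x)
```

With gain 1.0 and threshold 0.0 the round trip reproduces the input exactly, which is the sense in which the enhancement lives entirely in the coefficient domain.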

    A Computer Vision Story on Video Sequences: From Face Detection to Face Super-Resolution using Face Quality Assessment


    Extraction and representation of semantic information in digital media


    Feature-based image patch classification for moving shadow detection

    Moving object detection is a first step towards many computer vision applications, such as human interaction and tracking, video surveillance, and traffic monitoring systems. Accurate estimation of the target object’s size and shape is often required before higher-level tasks (e.g., object tracking or recognition) can be performed. However, these properties can be derived only when the foreground object is detected precisely. Background subtraction is a common technique to extract foreground objects from image sequences. The purpose of background subtraction is to detect changes in pixel values within a given frame. The main problem with background subtraction and other related object detection techniques is that cast shadows tend to be misclassified as either parts of the foreground objects (if objects and their cast shadows are bonded together) or independent foreground objects (if objects and shadows are separated). The reason for this phenomenon is the presence of similar characteristics between the target object and its cast shadow, i.e., shadows have similar motion, attitude, and intensity changes as the moving objects that cast them. Detecting shadows of moving objects is challenging because of problematic situations related to shadows, for example, chromatic shadows, shadow color blending, foreground-background camouflage, nontextured surfaces and dark surfaces. Various methods for shadow detection have been proposed in the literature to address these problems. Many of these methods use general-purpose image feature descriptors to detect shadows. These feature descriptors may be effective in distinguishing shadow points from the foreground object in a specific problematic situation; however, such methods often fail to distinguish shadow points from the foreground object in other situations.
In addition, many of these moving shadow detection methods require prior knowledge of the scene conditions and/or impose strong assumptions, which make them excessively restrictive in practice. The aim of this research is to develop an efficient method capable of addressing possible environmental problems associated with shadow detection while simultaneously improving the overall accuracy and detection stability. In this research study, possible problematic situations for dynamic shadows are addressed and discussed in detail. On the basis of the analysis, a robust method, including change detection and shadow detection, is proposed to address these environmental problems. A new set of two local feature descriptors, namely, binary patterns of local color constancy (BPLCC) and light-based gradient orientation (LGO), is introduced to address the identified problematic situations by incorporating intensity, color, texture, and gradient information. The feature vectors are concatenated in a column-by-column manner to construct one dictionary for the objects and another dictionary for the shadows. A new sparse representation framework is then applied to find the nearest neighbor of the test image segment by computing a weighted linear combination of the reference dictionary. Image segment classification is then performed based on the similarity between the test image and the sparse representations of the two classes. The performance of the proposed framework on common shadow detection datasets is evaluated, and the method shows improved performance compared with state-of-the-art methods in terms of the shadow detection rate, discrimination rate, accuracy, and stability. By achieving these significant improvements, the proposed method demonstrates its ability to handle various problems associated with image processing and accomplishes the aim of this thesis.
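The dictionary-based classification step above can be sketched as follows: represent a test feature vector as a linear combination of each class dictionary's columns and assign the class whose dictionary reconstructs it with the smaller residual. For simplicity, a plain least-squares code stands in for the sparse representation used in the thesis, and the 8-dimensional "object" and "shadow" feature directions are invented.

```python
# Sketch of dictionary-based segment classification: a class dictionary's
# columns are reference feature vectors (atoms); the test vector is assigned
# to the class with the smaller reconstruction residual. Least squares is an
# illustrative stand-in for the thesis's sparse representation framework.
import numpy as np

def residual(D, y):
    """Reconstruction error of y using dictionary D (columns = atoms)."""
    coef, *_ = np.linalg.lstsq(D, y, rcond=None)
    return np.linalg.norm(y - D @ coef)

def classify(dicts, y):
    """Return the index of the dictionary that reconstructs y best."""
    return int(np.argmin([residual(D, y) for D in dicts]))

rng = np.random.default_rng(1)
# Toy feature space: object atoms cluster near one direction, shadow atoms
# near another (both directions are invented for this illustration).
obj_dir = np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=float)
shd_dir = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
D_obj = np.stack([obj_dir + 0.05 * rng.normal(size=8) for _ in range(3)], axis=1)
D_shd = np.stack([shd_dir + 0.05 * rng.normal(size=8) for _ in range(3)], axis=1)

test_vec = 0.9 * obj_dir + 0.02 * rng.normal(size=8)
label = classify([D_obj, D_shd], test_vec)   # 0 = object, 1 = shadow
```

Replacing the least-squares solve with an L1-regularized solver would recover the sparse coding behavior, at the cost of an external optimization routine.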

    Pre-processing, classification and semantic querying of large-scale Earth observation spaceborne/airborne/terrestrial image databases: Process and product innovations.

    As Wikipedia defines it, “big data is the term adopted for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The big data challenges typically include capture, curation, storage, search, sharing, transfer, analysis and visualization”. Proposed by the intergovernmental Group on Earth Observations (GEO), the visionary goal of the Global Earth Observation System of Systems (GEOSS) implementation plan for the years 2005-2015 is the systematic transformation of multi-source Earth Observation (EO) “big data” into timely, comprehensive and operational EO value-adding products and services, subject to the GEO Quality Assurance Framework for Earth Observation (QA4EO) calibration/validation (Cal/Val) requirements. To date the GEOSS mission cannot be considered fulfilled by the remote sensing (RS) community. This is tantamount to saying that past and existing EO image understanding systems (EO-IUSs) have been outpaced by the rate of collection of EO sensory big data, whose quality and quantity are ever-increasing. This is supported by several observations. For example, no European Space Agency (ESA) EO Level 2 product has ever been systematically generated at the ground segment. By definition, an ESA EO Level 2 product comprises a single-date multi-spectral (MS) image radiometrically calibrated into surface reflectance (SURF) values corrected for geometric, atmospheric, adjacency and topographic effects, stacked with its data-derived scene classification map (SCM), whose thematic legend is general-purpose, user- and application-independent and includes quality layers, such as cloud and cloud-shadow. Since no GEOSS exists to date, present EO content-based image retrieval (CBIR) systems lack EO image understanding capabilities.
Hence, no semantic CBIR (SCBIR) system exists to date either, where semantic querying is a synonym of semantics-enabled knowledge/information discovery in multi-source big image databases. In set theory, if set A is a strict superset of (or strictly includes) set B, then A ⊃ B. This doctoral project moved from the working hypothesis that SCBIR ⊃ computer vision (CV), where vision is a synonym of scene-from-image reconstruction and understanding, ⊃ EO image understanding (EO-IU) in operating mode, a synonym of GEOSS, ⊃ ESA EO Level 2 product, with CV bounded from below by human vision. Meaning that a necessary but not sufficient pre-condition for SCBIR is CV in operating mode, this working hypothesis has two corollaries. First, human visual perception, encompassing well-known visual illusions such as the Mach bands illusion, acts as a lower bound of CV within the multi-disciplinary domain of cognitive science, i.e., CV is required to include a computational model of human vision. Second, a necessary but not sufficient pre-condition for the yet-unfulfilled GEOSS development is systematic generation at the ground segment of the ESA EO Level 2 product. Starting from this working hypothesis, the overarching goal of this doctoral project was to contribute to research and technical development (R&D) toward filling an analytic and pragmatic information gap from EO big sensory data to EO value-adding information products and services. This R&D objective was conceived to be twofold. First, to develop an original EO-IUS in operating mode, synonym of GEOSS, capable of systematic ESA EO Level 2 product generation from multi-source EO imagery.
EO imaging sources vary in terms of: (i) platform, either spaceborne, airborne or terrestrial; (ii) imaging sensor, either (a) optical, encompassing radiometrically calibrated or uncalibrated images, panchromatic or color images, either true- or false-color red-green-blue (RGB), multi-spectral (MS), super-spectral (SS) or hyper-spectral (HS) images, featuring spatial resolution from low (> 1 km) to very high (< 1 m), or (b) synthetic aperture radar (SAR), specifically bi-temporal RGB SAR imagery. The second R&D objective was to design and develop a prototypical implementation of an integrated closed-loop EO-IU for semantic querying (EO-IU4SQ) system as a GEOSS proof-of-concept in support of SCBIR. The proposed closed-loop EO-IU4SQ system prototype consists of two subsystems for incremental learning. A primary (dominant, necessary but not sufficient) hybrid (combined deductive/top-down/physical model-based and inductive/bottom-up/statistical model-based) feedback EO-IU subsystem in operating mode requires no human-machine interaction to automatically transform, in linear time, a single-date MS image into an ESA EO Level 2 product as initial condition. A secondary (dependent) hybrid feedback EO Semantic Querying (EO-SQ) subsystem is provided with a graphic user interface (GUI) to streamline human-machine interaction in support of spatiotemporal EO big data analytics and SCBIR operations. EO information products generated as output by the closed-loop EO-IU4SQ system monotonically increase their value-added with closed-loop iterations.
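The ESA EO Level 2 definition above begins with radiometric calibration of raw digital numbers into reflectance. The sketch below shows the standard two-step conversion (DN to radiance, radiance to top-of-atmosphere reflectance); the gain, offset, ESUN and solar-elevation values are invented placeholders for a hypothetical sensor band, and the atmospheric, adjacency and topographic corrections the product requires are not shown.

```python
# Minimal sketch of the first Level 2 step: radiometric calibration of raw
# digital numbers (DN) into top-of-atmosphere (TOA) reflectance. All numeric
# constants here are illustrative placeholders, not a real sensor's values.
import math

def dn_to_radiance(dn, gain, offset):
    """Sensor calibration: spectral radiance L = gain * DN + offset."""
    return gain * dn + offset

def radiance_to_toa_reflectance(L, esun, sun_elev_deg, d_au=1.0):
    """Standard TOA reflectance: rho = pi * L * d^2 / (ESUN * cos(theta_s))."""
    theta_s = math.radians(90.0 - sun_elev_deg)   # solar zenith angle
    return math.pi * L * d_au ** 2 / (esun * math.cos(theta_s))

# Illustrative numbers for one band of a hypothetical sensor
L = dn_to_radiance(dn=512, gain=0.04, offset=1.5)   # 0.04 * 512 + 1.5 = 21.98
rho = radiance_to_toa_reflectance(L, esun=1536.0, sun_elev_deg=45.0)
```

A full Level 2 chain would then convert TOA reflectance into surface reflectance and stack the result with its scene classification map and quality layers.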

    Recent Advances in Signal Processing

    Signal processing is a critical component of most new technological inventions and challenges, across a wide variety of applications in both science and engineering. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary, and Gaussian, and have always favored closed-form tractability over real-world accuracy. These constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments, and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics, and engineering. This book is targeted primarily toward students and researchers who want to be exposed to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be categorized into five areas depending on the application at hand, ordered to address image processing, speech processing, communication systems, time-series analysis, and educational packages, respectively. The book has the advantage of providing a collection of applications that are completely independent and self-contained; thus, the interested reader can choose any chapter and skip to another without losing continuity.

    Interactive models for latent information discovery in satellite images

    The recent increase in Earth Observation (EO) missions has resulted in unprecedented volumes of multi-modal data to be processed, understood, used and stored in archives. The advanced capabilities of satellite sensors become useful only when translated into accurate, focused information, ready to be used by decision makers from various fields. Two key problems emerge when trying to bridge the gap between research, science and multi-user platforms: (1) current systems for data access permit queries only by geographic location, time of acquisition and type of sensor, yet this information is often less important than the latent, conceptual content of the scenes; (2) at the same time, many new applications relying on EO data require knowledge of complex image processing and computer vision methods for understanding and extracting information from the data. This dissertation designs two important concept modules of a theoretical image information mining (IIM) system for EO: semantic knowledge discovery in large databases and data visualization techniques. These modules allow users to discover and extract relevant conceptual information directly from satellite images and generate an optimum visualization for this information. The first contribution of this dissertation is a theoretical solution that bridges the gap and discovers the semantic rules between the output of state-of-the-art classification algorithms and the semantic, human-defined, manually applied terminology of cartographic data. The set of rules explains in latent, linguistic concepts the contents of satellite images and links the low-level machine language to the high-level human understanding. The second contribution of this dissertation is an adaptive visualization methodology used to assist the image analyst in understanding the satellite image through optimum representations and to offer cognitive support in discovering relevant information in the scenes.
It is an interactive technique applied to discover the optimum combination of three spectral features of a multi-band satellite image that enhances visualization of learned targets and phenomena of interest. The visual mining module is essential for an IIM system because all EO-based applications involve several steps of visual inspection, and the final decision about the information derived from satellite data is always made by a human operator. To ensure maximum correlation between the requirements of the analyst and the possibilities of the computer, the visualization tool models the human visual system and ensures that a change in the image space is equivalent to a change in the perception space of the operator. This thesis presents novel concepts and methods that help users access and discover latent information in archives and visualize satellite scenes in an interactive, human-centered and information-driven workflow.
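The band-selection idea behind the visualization module can be caricatured in a few lines: score every band of a multi-band image and map three chosen bands to RGB for display. The thesis's interactive, perception-aware method is far richer; this sketch only illustrates "choose 3 of N spectral features for visualization", using plain per-band variance as an invented scoring rule.

```python
# Crude band-selection sketch: pick the three highest-variance bands of a
# multi-band cube and stretch them to [0, 1] for an RGB display. Variance
# as the score is an illustrative assumption, not the thesis's criterion.
import numpy as np

def top3_bands(cube):
    """cube: (bands, rows, cols); return indices of the 3 most varied bands."""
    scores = cube.reshape(cube.shape[0], -1).var(axis=1)
    return np.argsort(scores)[::-1][:3]

def to_rgb(cube, bands):
    """Stack the chosen bands and stretch each to [0, 1] for display."""
    rgb = cube[list(bands)].astype(float)
    mins = rgb.min(axis=(1, 2), keepdims=True)
    maxs = rgb.max(axis=(1, 2), keepdims=True)
    return np.moveaxis((rgb - mins) / (maxs - mins + 1e-12), 0, -1)

rng = np.random.default_rng(2)
# Synthetic 4-band cube: bands 1 and 3 carry most of the variation
cube = rng.normal(0, [[[0.1]], [[2.0]], [[0.5]], [[1.5]]], size=(4, 8, 8))
bands = top3_bands(cube)      # the near-constant band 0 is left out
rgb = to_rgb(cube, bands)     # shape (8, 8, 3), values in [0, 1]
```

An interactive version would let the analyst re-rank candidate triplets while a perceptual model predicts how distinguishable the targets remain under each choice.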

    Task- and Knowledge-Driven Scene Representation: A Flexible On-Demand System Architecture for Vision

    The visual environment of humans is full of details. This incredible amount of data can neither be processed nor stored when assuming limited computational power and memory capacity. Consequently, selective processing is necessary, which leads to different representations of the same scene depending on the given task. Psychophysical experiments show that both the spatial domain and the feature domain are parsed selectively: only the information required to solve a given task is extracted from the visual scene. This thesis proposes a flexible system architecture along with a control mechanism that allows for a task-dependent representation of a visual scene. Contrary to existing approaches, which measure all properties of an object, the resulting system is able to acquire information selectively according to the demands of the given task. The system comprises both a short-term and a long-term memory, a spatial saliency algorithm and multiple visual processing modules used to extract visual properties of a focused object. The different visual processing modules operate independently and are specialized in extracting a single visual property. However, the dynamic coupling of multiple processing modules allows for the extraction of more complex features that are relevant for solving the given task. The proposed control mechanism decides which properties need to be extracted and which processing modules should be coupled. This decision is based on the knowledge stored in the long-term memory of the system. Additionally, the control mechanism ensures that algorithmic dependencies between processing modules are resolved automatically, utilizing procedural knowledge which is also stored in the long-term memory. A proof-of-concept system is implemented according to the system architecture and the control mechanism presented in this thesis. The experimental evaluation using a real-world table scene shows that, while solving the given task, the amount of data processed and stored by the system is considerably lower compared to processing regimes used in state-of-the-art systems. This in turn leads to a noticeable reduction of the computational load and memory demand. In doing so, the present thesis contributes to a task-dependent representation of visual scenes, because only the information that is actually relevant for solving the given task is acquired and stored.
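The on-demand control idea can be sketched as a tiny dependency-resolving dispatcher: property extractors are independent routines with declared dependencies, and the controller runs only those needed for the requested task, caching results in a short-term memory. The routine names and the dependency graph below are invented for illustration and have no counterpart in the thesis's actual module set.

```python
# Toy sketch of task-driven, on-demand processing: only the extractors
# (and their dependencies) required by the task are executed; results are
# cached in a short-term memory so nothing is computed twice.

# "Long-term memory": each property maps to (dependencies, extractor)
ROUTINES = {
    "patch": ((),         lambda deps, img: img),
    "color": (("patch",), lambda deps, img: "red"),
    "edges": (("patch",), lambda deps, img: [(0, 1), (1, 2)]),
    "shape": (("edges",), lambda deps, img: "cylinder"),
}

def solve(task_properties, img, memory=None, trace=None):
    """Run only the extractors (and dependencies) the task requires."""
    memory = {} if memory is None else memory   # short-term memory (cache)
    trace = [] if trace is None else trace      # which routines actually ran
    for prop in task_properties:
        if prop in memory:
            continue                            # already extracted, reuse it
        deps, fn = ROUTINES[prop]
        solve(deps, img, memory, trace)         # resolve dependencies first
        memory[prop] = fn({d: memory[d] for d in deps}, img)
        trace.append(prop)
    return memory, trace

# A task needing only "shape" triggers patch -> edges -> shape;
# the color routine is never executed.
memory, trace = solve(("shape",), img="fake-image")
```

The measurable saving the thesis reports corresponds here to the routines absent from the trace: unneeded properties are neither computed nor stored.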