Search CORE

23 research outputs found

Visual attention in primates and for machines - neuronal mechanisms

Author: Beuth Frederik
Publication venue: Universitätsverlag Chemnitz
Publication date: 09/12/2020

Visual attention is an important cognitive concept for the daily life of humans, but it is still not fully understood. As a result, it is also rarely used in computer vision systems. Understanding visual attention is challenging because it has many, seemingly different aspects at both the neuronal and the behavioral level. It is therefore very hard to give a uniform explanation of visual attention that accounts for all aspects. To tackle this problem, this thesis aims to identify a common set of neuronal mechanisms that underlie both the neuronal and the behavioral aspects. The mechanisms are simulated by neuro-computational models, resulting in a single modeling approach that explains a wide range of phenomena at once. The aspects chosen in the thesis are multiple neurophysiological effects, real-world object localization, and a visual masking paradigm (OSM). In each of these fields, the work also advances the current state of the art to better understand that aspect of attention itself. The three chosen aspects highlight that the approach can account for crucial neurophysiological, functional, and behavioral properties, so the mechanisms might constitute the general neuronal substrate of visual attention in the cortex. As an outlook, our work provides computer vision with a deeper understanding and a concrete prototype of attention, so that this crucial aspect of human perception can be incorporated in future systems.

Contents: 1. General introduction 2. The state-of-the-art in modeling visual attention 3. Microcircuit model of attention 4. Object localization with a model of visual attention 5. Object substitution masking 6. General conclusion

Visual attention is an important cognitive concept for the daily life of humans. It is still not completely understood, however, so that fundamentally understanding the phenomenon has been a long-standing goal of neuroscience. At the same time, owing to this lack of understanding, it is only rarely used in machine vision systems in computer science. Understanding visual attention is a complex challenge, because attention has extremely diverse and seemingly different aspects. It alters both neuronal firing rates and human behavior in multiple ways. It is therefore very difficult to find a uniform explanation of visual attention that holds equally for all aspects. To address this problem, this thesis aims to identify a common set of neuronal mechanisms that underlie both the neuronal and the behavioral aspects. The mechanisms are simulated in neuro-computational models, yielding a single modeling framework that, for the first time, can explain many and highly diverse phenomena of visual attention at once. The aspects chosen in this dissertation are multiple neurophysiological effects, real-world object localization, and a visual masking paradigm (OSM). In each of these fields, the state of the art is advanced at the same time, in order to better understand that sub-area of attention itself. The three chosen areas show that the approach can explain fundamental neurophysiological, functional, and behavioral properties of visual attention. Since the mechanisms found are thus sufficient to explain the phenomenon so comprehensively, they might even constitute the essential neuronal substrate of visual attention in the cortex. For computer science, the work thereby provides a deeper understanding of visual attention. Beyond that, the framework with its neuronal mechanisms even supplies a reference implementation for integrating attention into future systems. According to the present research, attention could be very useful for these systems, since in the brain it provides a task-specific optimization of the visual system. This aspect of human perception is mostly missing from current, powerful computer vision systems, so that integrating it into current systems should sharply increase their performance and define a new class.

Qucosa

HSSS - Hochschulschriftenserver der SLUB

Multimedia ONline ARchiv CHemnitz

Object Recognition and Visual Search with a Physiologically Grounded Model of Visual Attention

Author: Beuth Frederik
Hamker Fred H
Publication venue: 'Purdue University (bepress)'
Publication date: 13/05/2015

Visual attention models can explain a rich set of physiological data (Reynolds & Heeger, 2009, Neuron), but can rarely link these findings to real-world tasks. Here, we would like to narrow this gap with a novel, physiologically grounded model of visual attention by demonstrating its object recognition abilities in noisy scenes. To base the model on physiological data, we used a recently developed microcircuit model of visual attention (Beuth & Hamker, in revision, Vision Res) which explains a large set of attention experiments, e.g. biased competition, modulation of contrast response functions, tuning curves, and surround suppression. Objects are represented by object-view-specific neurons, learned via a trace learning approach (Antonelli et al., 2014, IEEE TAMD). A visual cortex model combines the microcircuit with neuroanatomical properties such as top-down attentional processing, hierarchically increasing receptive field sizes, and synaptic transmission delays. The visual cortex model is complemented by a model of the frontal eye field (Zirnsak et al., 2011, Eur J Neurosci). We evaluated the model on a realistic object recognition task in which a given target has to be localized in a scene (guided visual search task), using 100 different target objects, 1000 scenes, and two backgrounds. The model achieves an accuracy of 92% on black backgrounds and 71% on white-noise backgrounds. We found that two of the underlying neuronal attention mechanisms are particularly relevant for guided visual search: amplification of neurons preferring the target, and suppression of neurons encoding distractors or background noise.
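The two mechanisms named at the end of this abstract, amplification of target-preferring neurons and suppression of distractors, can be illustrated with a toy example. The sketch below is not the authors' microcircuit model; it assumes a simple multiplicative gain followed by divisive normalization, loosely in the spirit of the normalization model cited in the abstract. The function names `attend` and `locate`, the parameters `gain` and `sigma`, and the two-location scene are all hypothetical.

```python
import numpy as np


def attend(drive, target_pref, gain=2.0, sigma=0.1):
    """Apply a toy attention step to feedforward responses.

    drive:       array (n_locations, n_features), non-negative responses
    target_pref: array (n_features,), 1 for target features, 0 otherwise
    gain, sigma: hypothetical parameters chosen for illustration
    """
    drive = np.asarray(drive, dtype=float)
    target_pref = np.asarray(target_pref, dtype=float)
    # Amplification: scale up responses of neurons that prefer the target.
    excit = drive * (1.0 + gain * target_pref)
    # Suppression: divide by activity pooled over all locations and features,
    # so strongly driven target neurons push down everything else.
    pool = excit.sum()
    return excit / (sigma + pool)


def locate(drive, target_pref, **params):
    """Return the index of the location with the largest summed response."""
    response = attend(drive, target_pref, **params)
    return int(np.argmax(response.sum(axis=1)))


# Two locations, two feature channels: a strong distractor at location 0
# and a weaker target (feature channel 1) at location 1.
scene = [[0.9, 0.0],
         [0.0, 0.6]]
target = [0.0, 1.0]

without_attention = locate(scene, target, gain=0.0)  # distractor wins
with_attention = locate(scene, target, gain=2.0)     # target wins
```

In this toy scene the stronger distractor is selected when the gain is zero, and the target is selected once its feature channel is amplified; the distractor's normalized response also drops because the pooled activity grows.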

Purdue E-Pubs

Biologically Inspired Hexagonal Deep Learning for Hexagonal Image Generation

Author: Beuth Frederik
Kowerko Danny
Schlosser Tobias
Publication date: 07/06/2024

Whereas conventional state-of-the-art image processing systems for recording and output devices almost exclusively use square-arranged methods, biological models suggest an alternative, evolutionarily based structure. Inspired by the human visual perception system, hexagonal image processing in the context of machine learning offers a number of key advantages that can benefit researchers and users alike. The hexagonal deep learning framework Hexnet, leveraged in this contribution, is therefore used to generate hexagonal images with hexagonal deep neural networks (H-DNN). As the results from our test environment show, the proposed models can surpass current approaches to conventional image generation. While reducing model complexity in terms of trainable parameters, they also allow an increase in test rates in comparison to their square counterparts.

Comment: Accepted for: 2020 27th IEEE International Conference on Image Processing (ICIP). arXiv admin note: text overlap with arXiv:1911.1125

arXiv.org e-Print Archive

Learning Object Representations for Modeling Attention in Real World Scenes

Author: Beuth Frederik
Hamker Fred H
Schwarz Alex
Publication venue: 'Purdue University (bepress)'
Publication date: 11/05/2016

Models of visual attention have rarely been used in real-world tasks, as they have typically been developed for psychophysical setups using simple stimuli. Thus, the question remains how objects must be represented to allow such models to operate in real-world scenarios. We have previously presented an attention model capable of operating on real-world scenes (Beuth, F., and Hamker, F. H. 2015, NCNC, which is a successor of Hamker, F. H., 2005, Cerebral Cortex), and show here how its object representations have been learned. We have used a learning rule based on temporal continuity (Földiák, P., 1991, Neural Computation) to ensure biological plausibility. Yet temporal continuity learning rules have not been used in a real-world context, so we made an improvement: we increased the postsynaptic threshold to make the learning more specific, resulting in object-encoding cells that respond mainly to their preferred objects. Furthermore, we present a novelty in relation to Beuth, F. and Hamker, F. H., 2015: the learning of object representations that are invariant to the background. It is currently unknown how such representations are learned by the human brain. Suggestions have been made to use disparity or motion, whereas we propose temporal continuity learning. This principle learns connections from presynaptic features which are stable over time. As the object changes much less than the background over time, strong connections are primarily learned to the object and no connections to the background. Such learned representations allow the attention model to identify and locate objects in real-world scenes.
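The idea that temporal continuity favors stable object features over a changing background can be sketched with a trace rule of the kind introduced by Földiák (1991). This is a minimal illustration, not the authors' learning rule: the update `w += alpha * trace * (x - w)`, the threshold `theta`, the parameter values, and the synthetic input (four always-active "object" features, four randomly flickering "background" features) are assumptions made for the example.

```python
import numpy as np


def trace_learning(inputs, theta=1.0, alpha=0.005, delta=0.2, w0=0.5):
    """Train one cell with a trace (temporal continuity) rule.

    inputs: array (n_steps, n_features) of presynaptic activity over time
    theta:  postsynaptic threshold (hypothetical value)
    alpha:  learning rate; delta: trace update rate; w0: initial weight
    """
    inputs = np.asarray(inputs, dtype=float)
    w = np.full(inputs.shape[1], w0)
    trace = 0.0
    for x in inputs:
        # Thresholded postsynaptic activity.
        y = max(float(w @ x) - theta, 0.0)
        # The trace is a running average of recent postsynaptic activity.
        trace = (1.0 - delta) * trace + delta * y
        # Weights move toward inputs that are present while the trace is high.
        w += alpha * trace * (x - w)
    return w


rng = np.random.default_rng(0)
steps = 2000
object_part = np.ones((steps, 4))                       # stable over time
background_part = rng.integers(0, 2, size=(steps, 4))   # changes every step
frames = np.hstack([object_part, background_part])

weights = trace_learning(frames)
object_w, background_w = weights[:4], weights[4:]
```

In this toy setup the weights onto the stable features approach 1, while the weights onto the flickering features hover around their average activity. The sketch reproduces only the ordering (object weights above background weights), not the complete absence of background connections described in the abstract.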

Purdue E-Pubs

A Novel Visual Fault Detection and Classification System for Semiconductor Manufacturing Using Stacked Hybrid Convolutional Neural Networks

Author: Beuth Frederik
Friedrich Michael
Kowerko Danny
Schlosser Tobias
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 18/03/2021

Automated visual inspection in the semiconductor industry aims to detect and classify manufacturing defects using modern image processing techniques. Detecting defect patterns as early as possible enables quality control and automation of manufacturing chains, and manufacturers benefit from increased yield and reduced manufacturing costs. Since classical image processing systems are limited in their ability to detect novel defect patterns, and machine learning approaches often involve a tremendous amount of computational effort, this contribution introduces a novel hybrid approach based on deep neural networks. Unlike classical deep neural networks, a multi-stage system allows the detection and classification of the finest structures, down to pixel size, within high-resolution imagery. Consisting of stacked hybrid convolutional neural networks (SH-CNN) and inspired by current approaches to visual attention, the realized system draws the focus over the level of detail from its structures to more task-relevant areas of interest. The results of our test environment show that the SH-CNN outperforms current approaches to learning-based automated visual inspection, while a distinction by level of detail enables the elimination of defect patterns in earlier stages of the manufacturing process.

Comment: Accepted for: 2019 IEEE 24th International Conference on Emerging Technologies and Factory Automation (ETFA); the latest versions of this contribution cover minor typo correction

arXiv.org e-Print Archive

A hierarchical system for a distributed representation of the peripersonal space of a humanoid robot

Author: Antonelli Marco
Beuth Frederik
Canessa Andrea
Chessa Manuela
Chinellato Eris
Del Pobil Angel P.
Duran Angel J.
Gibaldi Agostino
Hamker Fred
Sabatini Silvio P.
Solari Fabio
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014

Reaching a target object in an unknown and unstructured environment is easily performed by human beings. However, designing a humanoid robot that executes the same task requires the implementation of complex abilities, such as identifying the target in the visual field, estimating its spatial location, and precisely driving the motors of the arm to reach it. While research usually tackles the development of such abilities individually, in this work we integrate a number of computational models into a unified framework, and demonstrate in a humanoid torso the feasibility of an integrated working representation of its peripersonal space. To achieve this goal, we propose a cognitive architecture that connects several models inspired by neural circuits of the visual, frontal and posterior parietal cortices of the brain. The outcome of the integration process is a system that allows the robot to create its internal model and its representation of the surrounding space by interacting with the environment directly, through a mutual adaptation of perception and action. The robot is eventually capable of executing a set of tasks, such as recognizing, gazing at and reaching target objects, which can work separately or cooperate to support more structured and effective behaviors.

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositori Institucional de la Universitat Jaume I

Middlesex University Research Repository

Hong Kong University of Science and Technology Institutional Repository

Archivio istituzionale della ricerca - Università di Genova

Utilizing Generative Adversarial Networks for Image Data Augmentation and Classification of Semiconductor Wafer Dicing Induced Defects

Author: Beuth Frederik
Friedrich Michael
Hu Zhining
Kowerko Danny
Schlosser Tobias
Silva André Luiz Vieira e
Publication date: 24/07/2024

In semiconductor manufacturing, the wafer dicing process is central yet vulnerable to defects that significantly impair yield, the proportion of defect-free chips. Deep neural networks are the current state of the art in (semi-)automated visual inspection. However, they are known to require a particularly large amount of data for model training. To address these challenges, we explore the application of generative adversarial networks (GAN) for image data augmentation and classification of semiconductor wafer dicing induced defects to enhance the variety and balance of training data for visual inspection systems. With this approach, synthetic yet realistic images are generated that mimic real-world dicing defects. We employ three different GAN variants for high-resolution image synthesis: Deep Convolutional GAN (DCGAN), CycleGAN, and StyleGAN3. Our work-in-progress results demonstrate that improved classification accuracies can be obtained, showing an average improvement of up to 23.1 percentage points, from 65.1 % (baseline experiment) to 88.2 % (DCGAN experiment) in balanced accuracy, which may enable yield optimization in production.

Comment: Accepted for: 2024 IEEE 29th International Conference on Emerging Technologies and Factory Automation (ETFA)
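The results in this abstract are reported in balanced accuracy, which is commonly defined as the mean of the per-class recalls and is therefore less dominated by the majority class than plain accuracy. The sketch below shows that definition on made-up labels; the function name `balanced_accuracy` and the toy data are illustrative and are not taken from the paper.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recalls over the classes present in y_true."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        hits[truth] += int(truth == pred)
    recalls = [hits[label] / totals[label] for label in totals]
    return sum(recalls) / len(recalls)


# Toy, imbalanced example: 8 defect-free samples and 2 defective ones,
# with one defective sample missed by the classifier.
y_true = ["ok"] * 8 + ["defect"] * 2
y_pred = ["ok"] * 8 + ["ok", "defect"]

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred)

# Difference between the two figures reported in the abstract,
# in percentage points.
reported_gain = round(88.2 - 65.1, 1)
```

On this toy data the plain accuracy is 0.9, while the balanced accuracy is 0.75 because the rare defect class is only half recovered.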

arXiv.org e-Print Archive
