13 research outputs found
Biologically-inspired hierarchical architectures for object recognition
PhD Thesis
Existing methods for machine vision translate three-dimensional objects in the real world into two-dimensional images, and have achieved acceptable performance in recognising objects. However, recognition performance drops dramatically when objects are transformed, for instance in background, orientation, position in the image, or scale. The human visual cortex has evolved to form an efficient invariant representation of objects within a scene. The superior performance of humans can be explained by the feed-forward multi-layer hierarchical structure of the human visual cortex, together with the utilisation of different fields of vision depending on the recognition task. The research community has therefore investigated, as an ultimate objective, building systems that mimic the hierarchical architecture of the human visual cortex.
The aim of this thesis can be summarised as developing hierarchical
models of the visual processing that tackle the remaining challenges of
object recognition. To enhance the existing models of object recognition
and to overcome the above-mentioned issues, three major contributions
are made, summarised as follows:
1. building a hierarchical model within an abstract architecture that
achieves good performances in challenging image object datasets;
2. investigating the contribution for each region of vision for object
and scene images in order to increase the recognition performance
and decrease the size of the processed data;
3. further enhancing the performance of existing models of object
recognition by introducing hierarchical topologies that utilise the
context in which the object is found to determine the identity of
the object.
Funded by the Higher Committee For Education Development in Iraq (HCED)
Methods and Apparatus for Autonomous Robotic Control
Sensory processing of visual, auditory, and other sensor information (e.g., visual imagery, LIDAR, RADAR) is conventionally based on "stovepiped," or isolated, processing, with little interaction between modules. Biological systems, on the other hand, fuse multi-sensory information to identify nearby objects of interest more quickly, more efficiently, and with higher signal-to-noise ratios. Similarly, examples of the OpenSense technology disclosed herein use neurally inspired processing to identify and locate objects in a robot's environment. This enables the robot to navigate its environment more quickly and with lower computational and power requirements.
Humanoid Robots
For many years, humans have tried in many ways to recreate the complex mechanisms that form the human body. This task is extremely complicated and the results are not yet fully satisfactory. However, with increasing technological advances grounded in theoretical and experimental research, we have managed, to some extent, to copy or imitate some systems of the human body. This research is intended not only to create humanoid robots, a great part of them autonomous systems, but also to deepen our knowledge of the systems that form the human body, with a view to possible applications in rehabilitation technology for human beings, drawing together studies related not only to robotics but also to biomechanics, biomimetics, cybernetics, and other areas. This book presents a series of studies inspired by this ideal, carried out by researchers worldwide, that analyse and discuss diverse subjects related to humanoid robots. The contributions explore aspects of robotic hands, learning, language, vision and locomotion.
Computational models of the human visual cortex: on individual differences and ecologically valid input statistics
Perception relies on cortical processes in response to sensory stimuli. Visual input entering the
eyes ascends a cascade of processing steps from the retina to high-level regions of the cortex.
Vision science investigates these transformations that give rise to high-level processing of
visual objects, such as object recognition. In this thesis I investigate computational models
of the human visual cortex with regard to their ability to predict cortical responses to visual
objects. In particular, I describe two factors playing an important role in using deep neural
networks (DNNs) to better understand cortical functioning: the initial weight state and
ecologically more valid input statistics.
In Chapter 1 of this thesis I will introduce relevant literature pertaining to deep neural
networks as a modeling framework for the visual cortex. Next, I will lay out the motivation
for the research questions investigated in this thesis and described in detail in Chapters 2, 3,
and 4.
Chapter 2 focuses on the impact of the initial weight state of a model on its ability
to predict cortical representations. I describe work in which we demonstrate that two DNN instances identical in every aspect but their initial weights yield very dissimilar representations. Relying on single network instances to predict cortical activation patterns in response to sensory stimuli poses a problem for computational neuroscience: depending on the initial set of weights, the ability to mirror the cortical representations of these stimuli might vary. Thus, results based on single ("off-the-shelf") model instances - as commonly used in computational neuroscience - may not generalize. In contrast, using multiple DNN instances might alleviate this problem, as they allow insights into the variability of a given model architecture in predicting cortical representations. These individual differences between model instances suggest that, to allow results to generalize more easily, model instances should be treated similarly to human experimental participants.
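The instability described in this chapter can be illustrated with a toy sketch (not the thesis's actual DNNs or data): two untrained one-layer networks that are identical except for their random seed, compared via representational dissimilarity matrices (RDMs), a standard tool in computational neuroscience. All names and sizes here are illustrative assumptions.

```python
import numpy as np

def init_weights(seed, d_in=64, d_hidden=32):
    # Identical architecture for every instance; only the seed differs
    rng = np.random.default_rng(seed)
    return rng.normal(0.0, np.sqrt(2.0 / d_in), size=(d_in, d_hidden))

def representation(W, X):
    # ReLU hidden-layer activations serve as the "representation"
    return np.maximum(X @ W, 0.0)

def rdm(acts):
    # Representational dissimilarity matrix: 1 - Pearson correlation
    return 1.0 - np.corrcoef(acts)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 64))  # 20 synthetic "stimuli"

rdm_a = rdm(representation(init_weights(seed=1), X))
rdm_b = rdm(representation(init_weights(seed=2), X))

# Compare the two instances' RDMs on the upper triangle
iu = np.triu_indices(20, k=1)
sim = np.corrcoef(rdm_a[iu], rdm_b[iu])[0, 1]
print(f"between-instance RDM correlation: {sim:.2f}")
```

The correlation is below 1.0 even for this trivial model; for trained deep networks the divergence between seeds can be considerably larger, which is the phenomenon the chapter quantifies.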
In Chapter 3 I focus on ecologically more valid input statistics (in the form of training
images) aiming to improve a model’s ability to predict cortical representations. The most
successful models of the human visual cortex to date are DNNs trained on object recognition
tasks designed with machine learning goals in mind. However, the image sets used for training
these DNNs are often not ecologically realistic. For example, training on the most widely used image set in computational neuroscience (ImageNet Large Scale Visual Recognition
Challenge (ILSVRC) 2012) requires the fine-grained distinction of 120 dog breeds, but does
not contain visual object categories encountered frequently in everyday human life (e.g.
woman, man, or child). This suggests that taking into account the human visual experience
when training models of the human visual cortex on a categorization task might help to
predict cortical representations. In this Chapter I describe the creation of a set of images
aimed at mimicking the human visual diet: ecoset. Ecoset contains more than 1.5 million
images from 565 basic level categories and is the largest image set specifically designed for
computational neuroscience to date. Ecoset is freely available to allow the community to test
their own hypotheses of models trained with input statistics matched to the human visual
environment.
In Chapter 4 we build on the results from the previous two Chapters. Using multiple
DNN instances I investigate whether a brain-inspired model architecture (vNet) trained on
ecologically more valid input statistics (ecoset) might improve its ability to predict cortical
representations. I first demonstrate that ecoset might improve an architecture’s ability to
mirror cortical representations. Furthermore, ecoset-trained vNet also outperforms state-of-the-art computer vision and computational neuroscience models in terms of mirroring cortical
representations in the human brain. Thus, incorporating biological and ecological aspects,
such as brain-inspired architectural features and ecologically more valid input statistics, into
computational models may yield better predictions of response patterns in the human visual
cortex.
Treating DNN instances similar to human experimental participants and considering
ecological and biological factors for building these DNNs may be an important step towards
better models of the human visual cortex. Such models might allow a better understanding of
the cortical processes underlying high-level vision in the human brain.

Funding: Cambridge Trust - Vice Chancellor's Award 2015; Cambridge Philosophical Society; MRC Cognition and Brain Sciences Unit
Pre-processing, classification and semantic querying of large-scale Earth observation spaceborne/airborne/terrestrial image databases: Process and product innovations.
By Wikipedia's definition, "big data is the term adopted for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The big data challenges typically include capture, curation, storage, search, sharing, transfer, analysis and visualization".
Proposed by the intergovernmental Group on Earth Observations (GEO), the visionary goal of the Global Earth Observation System of Systems (GEOSS) implementation plan for years 2005-2015 is systematic transformation of multi-source Earth Observation (EO) "big data" into timely, comprehensive and operational EO value-adding products and services, submitted to the GEO Quality Assurance Framework for Earth Observation (QA4EO) calibration/validation (Cal/Val) requirements. To date the GEOSS mission cannot be considered fulfilled by the remote sensing (RS) community. This is tantamount to saying that past and existing EO image understanding systems (EO-IUSs) have been outpaced by the rate of collection of EO sensory big data, whose quality and quantity are ever-increasing. This fact is supported by several observations. For example, no European Space Agency (ESA) EO Level 2 product has ever been systematically generated at the ground segment. By definition, an ESA EO Level 2 product comprises a single-date multi-spectral (MS) image radiometrically calibrated into surface reflectance (SURF) values corrected for geometric, atmospheric, adjacency and topographic effects, stacked with its data-derived scene classification map (SCM), whose thematic legend is general-purpose, user- and application-independent and includes quality layers, such as cloud and cloud-shadow. Since no GEOSS exists to date, present EO content-based image retrieval (CBIR) systems lack EO image understanding capabilities. Hence, no semantic CBIR (SCBIR) system exists to date either, where semantic querying is a synonym of semantics-enabled knowledge/information discovery in multi-source big image databases.
In set theory, if set A is a strict superset of (or strictly includes) set B, then A ⊃ B. This doctoral project moved from the working hypothesis that SCBIR ⊃ computer vision (CV), where vision is a synonym of scene-from-image reconstruction and understanding, ⊃ EO image understanding (EO-IU) in operating mode, synonym of GEOSS, ⊃ ESA EO Level 2 product, ⊃ human vision. Meaning that a necessary but not sufficient pre-condition for SCBIR is CV in operating mode, this working hypothesis has two corollaries. First, human visual perception, encompassing well-known visual illusions such as the Mach bands illusion, acts as a lower bound of CV within the multi-disciplinary domain of cognitive science, i.e., CV is conditioned to include a computational model of human vision. Second, a necessary but not sufficient pre-condition for the yet-unfulfilled GEOSS development is systematic generation at the ground segment of the ESA EO Level 2 product.
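The strict-inclusion relation this working hypothesis is built on maps directly onto, for example, Python's built-in set comparison operators; the sets below are purely illustrative:

```python
A = {"vision", "recognition", "segmentation"}
B = {"vision", "recognition"}

# A > B is Python's strict-superset test, i.e. A ⊃ B
print(A > B)   # True: A strictly includes B
print(A > A)   # False: a set is not a strict superset of itself
print(A >= A)  # True: it is a (non-strict) superset of itself
```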
Starting from this working hypothesis, the overarching goal of this doctoral project was to contribute research and technical development (R&D) toward filling an analytic and pragmatic information gap between EO big sensory data and EO value-adding information products and services. This R&D objective was conceived to be twofold. First, to develop an original EO-IUS in operating mode, synonym of GEOSS, capable of systematic ESA EO Level 2 product generation from multi-source EO imagery. EO imaging sources vary in terms of: (i) platform, either spaceborne, airborne or terrestrial; (ii) imaging sensor, either (a) optical, encompassing radiometrically calibrated or uncalibrated images, panchromatic or color images, either true- or false-color red-green-blue (RGB), multi-spectral (MS), super-spectral (SS) or hyper-spectral (HS) images, featuring spatial resolution from low (> 1 km) to very high (< 1 m), or (b) synthetic aperture radar (SAR), specifically bi-temporal RGB SAR imagery.
The second R&D objective was to design and develop a prototypical implementation of an integrated closed-loop EO-IU for semantic querying (EO-IU4SQ) system as a GEOSS proof-of-concept in support of SCBIR. The proposed closed-loop EO-IU4SQ system prototype consists of two subsystems for incremental learning. A primary (dominant, necessary but not sufficient) hybrid (combined deductive/top-down/physical model-based and inductive/bottom-up/statistical model-based) feedback EO-IU subsystem in operating mode requires no human-machine interaction to automatically transform in linear time a single-date MS image into an ESA EO Level 2 product as initial condition. A secondary (dependent) hybrid feedback EO Semantic Querying (EO-SQ) subsystem is provided with a graphic user interface (GUI) to streamline human-machine interaction in support of spatiotemporal EO big data analytics and SCBIR operations. EO information products generated as output by the closed-loop EO-IU4SQ system monotonically increase their added value with closed-loop iterations.
The Future of Humanoid Robots
This book provides state-of-the-art scientific and engineering research findings and developments in the field of humanoid robotics and its applications. It is expected that humanoids will change the way we interact with machines, and will have the ability to blend perfectly into an environment already designed for humans. The book contains chapters that aim to discover the future abilities of humanoid robots by presenting a variety of integrated research in various scientific and engineering fields, such as locomotion, perception, adaptive behavior, human-robot interaction, neuroscience and machine learning. The book is designed to be accessible and practical, with an emphasis on useful information for those working in the fields of robotics, cognitive science, artificial intelligence, computational methods and other fields of science directly or indirectly related to the development and usage of future humanoid robots. The editor of the book has extensive R&D experience, patents, and publications in the area of humanoid robotics, and his experience is reflected in the content of the book.
BEYOND MULTI-TARGET TRACKING: STATISTICAL PATTERN ANALYSIS OF PEOPLE AND GROUPS
Every day, millions of surveillance cameras monitor the world, recording and collecting huge amounts of data. The collected data can be extremely useful: from behavior analysis to prevent unpleasant events, to the analysis of urban traffic. However, these valuable data are seldom used, because of the amount of information that a human operator would have to attend to and examine manually. It would be like looking for a needle in a haystack.
The automatic analysis of data is becoming mandatory for extracting summarized high-level information (e.g., John, Sam and Anne are walking together in group at the playground near the station) from the available redundant low-level data (e.g., an image sequence).
The main goal of this thesis is to propose solutions and automatic algorithms that perform high-level analysis of a camera-monitored environment. In this way, the data are summarized in a high-level representation for a better understanding.
In particular, this work is focused on the analysis of moving people and their collective behaviors.
The title of the thesis, beyond multi-target tracking, mirrors the purpose of the work: we will propose methods that have the target tracking as common denominator, and go beyond the standard techniques in order to provide a high-level description of the data.
First, we investigate the target tracking problem, as it is the basis of all subsequent work. Target tracking estimates the position of each target in the image and its trajectory over time. We analyze the problem from two complementary perspectives: 1) the engineering point of view, where we deal with the problem in order to obtain the best results in terms of accuracy and performance; 2) the neuroscience point of view, where we propose an attentional model for tracking and recognition of objects and people, motivated by theories of the human perceptual system.
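The position-and-trajectory estimation described above is classically done with a Kalman filter. The following is a minimal, generic constant-velocity sketch for a single target in 1-D (an illustrative baseline under assumed noise parameters, not the thesis's attentional or multi-target models):

```python
import numpy as np

def kalman_track(measurements, dt=1.0, q=1e-3, r=0.25):
    """Constant-velocity Kalman filter for one target in 1-D.
    State x = [position, velocity]; measurements are noisy positions."""
    F = np.array([[1.0, dt], [0.0, 1.0]])  # state transition
    H = np.array([[1.0, 0.0]])             # we observe position only
    Q = q * np.eye(2)                      # process noise covariance
    R = np.array([[r]])                    # measurement noise covariance
    x = np.array([[measurements[0]], [0.0]])
    P = np.eye(2)
    estimates = []
    for z in measurements:
        # predict
        x = F @ x
        P = F @ P @ F.T + Q
        # update with the new measurement
        y = np.array([[z]]) - H @ x
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ y
        P = (np.eye(2) - K @ H) @ P
        estimates.append(x[0, 0])
    return estimates

# Target moving at constant velocity 1.0, observed with noisy positions
rng = np.random.default_rng(0)
true_pos = np.arange(30, dtype=float)
noisy = true_pos + rng.normal(0.0, 0.5, size=30)
est = kalman_track(noisy)
print(f"final estimate: {est[-1]:.1f} (true: {true_pos[-1]:.1f})")
```

Multi-target tracking adds data association (which measurement belongs to which target) on top of such per-target filters, which is where the methods in this thesis depart from the standard machinery.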
Second, target tracking is extended to the camera network case, where the goal is to keep a unique identifier for each person across the whole network, i.e., to perform person re-identification. The goal is to recognize individuals in diverse locations across different non-overlapping camera views, or even within the same camera, considering a large set of candidates.
In this context, we propose a pipeline and appearance-based descriptors that enable us to define the problem properly and to reach state-of-the-art results.
Finally, the highest level of description investigated in this thesis is the analysis (discovery and tracking) of social interactions between people. In particular, we focus on finding small groups of people. We introduce methods that embed notions of social psychology into computer vision algorithms. Then, we extend the detection of social interactions over time, proposing novel probabilistic models that deal with (joint) individual-group tracking.
Aspects of algorithms and dynamics of cellular paradigms
Cellular paradigms, like Cellular Neural Networks (CNNs) and Cellular Automata (CA), are an excellent tool to perform computation, since they are equivalent to a Universal Turing Machine. The introduction of the Cellular Neural Network - Universal Machine (CNN-UM) allowed us to develop hardware whose computational core works according to the principles of cellular paradigms; such hardware has found application in a number of fields throughout the last decade. Nevertheless, there are still many open questions about how to define algorithms for a CNN-UM, and how to study the dynamics of Cellular Automata. In this dissertation both problems are tackled: first, we prove that it is possible to bound the space of all algorithms of the CNN-UM and explore it through genetic techniques; second, we explain the fundamentals of the nonlinear perspective on CA (according to Chua's definition), and we illustrate how this technique has allowed us to find novel results.
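The Turing-equivalence mentioned above is carried by specific rules; for elementary (1-D, binary, radius-1) cellular automata, Rule 110 is the classic example proven universal by Cook. A minimal generic simulator (a software sketch, unrelated to the CNN-UM hardware discussed in the thesis) looks like this:

```python
def step(cells, rule=110):
    """One synchronous update of an elementary cellular automaton
    with periodic boundaries. Rule 110 is Turing-universal."""
    n = len(cells)
    out = []
    for i in range(n):
        left, center, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        idx = (left << 2) | (center << 1) | right  # neighbourhood as 3-bit index
        out.append((rule >> idx) & 1)              # look up that bit of the rule number
    return out

# Single seed cell; print a few generations
cells = [0] * 15 + [1] + [0] * 15
for _ in range(8):
    print("".join("#" if c else "." for c in cells))
    cells = step(cells)
```

The 8-bit rule number directly encodes the output for each of the 8 possible neighbourhoods, which is why a single integer (here 110) fully specifies the dynamics.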
Engineering for a Changing World: 59th IWK, Ilmenau Scientific Colloquium, Technische Universität Ilmenau, September 11-15, 2017 : programme
In 2017, the Ilmenau Scientific Colloquium is again organised by the Department of Mechanical Engineering. The title of this year's conference, "Engineering for a Changing World", refers to the limited natural resources of our planet and to massive changes in cooperation between continents, countries, institutions and people, enabled by the increased implementation of information technology as probably the most dominant driver in many fields. The Colloquium, complemented by workshops, is characterised by the following topics, among others:
– Precision Engineering and Metrology
– Industry 4.0 and Digitalisation in Mechanical Engineering
– Mechatronics, Biomechatronics and Mechanism Technology
– Systems Technology
– Innovative Metallic Materials
The topics are oriented towards key strategic aspects of research and teaching in Mechanical Engineering at our university.
Smart vision in system-on-chip applications
In the last decade the ability to design and manufacture integrated circuits with higher transistor densities has led to the integration of complete systems on a single silicon die. These are commonly referred to as System-on-Chip (SoC). As SoC processes can incorporate multiple technologies, it is now feasible to produce single-chip camera systems with embedded image processing, known as Imager-on-Chips (IoC). The development of IoCs is complicated due to the mixture of digital and analog components and the high cost of prototyping these designs using silicon processes. There are currently no re-usable prototyping platforms that specifically address the needs of IoC development. This thesis details a new prototyping platform specifically for use in the development of low-cost mass-market IoC applications. FPGA technology was utilised to implement a frame-based processing architecture suitable for supporting a range of real-time imaging and machine vision applications. To demonstrate the effectiveness of the prototyping platform, an example object counting and highlighting application was developed and functionally verified in real-time. A high-level IoC cost model was formulated to calculate the cost of manufacturing prototyped applications as a single IoC. This highlighted the requirement for careful analysis of optical issues, embedded imager array size and the silicon process used to ensure the desired IoC unit cost was achieved. A modified version of the FPGA architecture, which would improve DSP performance, is also proposed.
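As an illustration of the kind of frame-based processing such a platform supports, object counting on a binary frame can be sketched with connected-component labelling. This is a generic software sketch under assumed 4-connectivity, not the thesis's FPGA implementation:

```python
def count_objects(img):
    """Count 4-connected foreground blobs in a binary image
    (list of lists of 0/1) using iterative flood fill."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if img[y][x] and not seen[y][x]:
                count += 1                      # found a new blob
                stack = [(y, x)]
                while stack:                    # flood-fill the whole blob
                    cy, cx = stack.pop()
                    if 0 <= cy < h and 0 <= cx < w and img[cy][cx] and not seen[cy][cx]:
                        seen[cy][cx] = True
                        stack += [(cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)]
    return count

frame = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 1, 0, 0],
]
print(count_objects(frame))  # → 3
```

On an FPGA the same task is typically done with a streaming single-pass labelling scheme rather than flood fill, since random access to the whole frame is expensive in hardware.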