46 research outputs found

    Close range three-dimensional position sensing using stereo matching with Hopfield neural networks.

    In recent years, vision systems have found their way into many real-world applications, including surveillance and tracking, computer graphics, and factory settings such as assembly-line inspection and object manipulation. The application of computer vision techniques to factory automation, known as machine vision, is a growing field. Most machine vision systems, however, need an algorithm to infer 3D information about the objects in the field of view, a task that a stereo vision algorithm can accomplish. This thesis presents a new machine vision algorithm for close-range position sensing in which a Hopfield neural network performs the stereo matching stage: stereo matching is formulated as an energy minimization task, which the Hopfield network solves. Other important aspects of the vision system, including camera calibration and object localization, are also discussed. Source: Masters Abstracts International, Volume: 45-01, page: 0423. Thesis (M.A.Sc.)--University of Windsor (Canada), 2006
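As a rough illustration of how stereo matching can be posed as energy minimization with binary neurons, here is a toy sketch on a single scanline. The abstract does not give the thesis's energy function, so the three terms (match cost, one-disparity-per-pixel penalty, neighbour smoothness), the weights `A` and `B`, and every function name are assumptions made for this example.

```python
def match_cost(left, right, i, d, invalid=20):
    """Absolute intensity difference between left[i] and right[i - d]."""
    j = i - d
    return abs(left[i] - right[j]) if j >= 0 else invalid


def energy(v, cost, A, B):
    """Hopfield-style quadratic energy over binary neurons v[i][d] (assumed form)."""
    n, D = len(v), len(v[0])
    e = 0.0
    for i in range(n):
        # Data term: cost of every active (pixel, disparity) neuron.
        e += sum(cost[i][d] * v[i][d] for d in range(D))
        # Uniqueness term: each pixel should have exactly one active disparity.
        e += A * (sum(v[i]) - 1) ** 2
        # Smoothness term: neighbouring pixels should have similar disparities.
        if i + 1 < n:
            e += B * sum(abs(d - k) * v[i][d] * v[i + 1][k]
                         for d in range(D) for k in range(D))
    return e


def hopfield_stereo(left, right, D=3, A=3.0, B=1.0, max_sweeps=20):
    """Asynchronous updates: flip one neuron at a time, keep the flip only if energy drops."""
    n = len(left)
    cost = [[match_cost(left, right, i, d) for d in range(D)] for i in range(n)]
    v = [[0] * D for _ in range(n)]
    history = [energy(v, cost, A, B)]
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            for d in range(D):
                before = energy(v, cost, A, B)
                v[i][d] ^= 1                      # tentative flip
                after = energy(v, cost, A, B)
                if after < before:
                    changed = True
                    history.append(after)
                else:
                    v[i][d] ^= 1                  # undo the flip
        if not changed:                           # fixed point reached
            break
    return v, history


# Toy scanlines: the textured patch (9, 5) is shifted by one pixel.
left = [0, 0, 9, 5, 0, 0, 0]
right = [0, 9, 5, 0, 0, 0, 0]
v, history = hopfield_stereo(left, right)
disparities = [row.index(1) for row in v]
print(disparities, history[-1])
```

Because a flip is kept only when it lowers the energy, the energy sequence is strictly decreasing and the loop stops at a local minimum. Single-neuron flips can get stuck in poor minima, so the result depends on the weights and the update order.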

    Convolutional Neural Network in Pattern Recognition

    Since the convolutional neural network (CNN) was first implemented by Yann LeCun et al. in 1989, CNNs and their variants have been widely applied to numerous topics in pattern recognition and are considered among the most crucial techniques in artificial intelligence and computer vision. This dissertation not only demonstrates the implementation aspects of CNNs but also emphasizes the methodology of neural network (NN) based classifiers. A general pipeline for an NN-based classifier has three stages: pre-processing, inference by models, and post-processing. To demonstrate the importance of pre-processing, this dissertation shows how to model real problems in medical pattern recognition and image processing by introducing conceptual abstraction and fuzzification. In particular, a transformer based on the self-attention mechanism, the beat-rhythm transformer, benefits greatly from correct R-peak detection results and conceptual fuzzification. The recently proposed self-attention mechanism has proven to be a top performer in computer vision and natural language processing. Despite its accuracy and precision, self-attention usually consumes substantial computational resources. A realtime global attention network is therefore proposed to strike a better trade-off between efficiency and performance in image segmentation. To illustrate the inference stage further, we also propose models that detect polyps via Faster R-CNN, one of the most popular CNN-based 2D detectors, as well as a CNN-powered 3D object detection pipeline that regresses 3D bounding boxes from LiDAR points and stereo image pairs. The goal of the post-processing stage is to refine artifacts inferred by the models. For the semantic segmentation task, the dilated continuous random field is proposed as a better fit for CNN-based models than the widely implemented fully-connected continuous random field. The proposed approaches can be further integrated into a reinforcement learning architecture for robotics

    Framework of hierarchy for neural theory

    Joint Goal Human Robot collaboration-From Remembering to Inferring

    The ability to infer goals and the consequences of one’s own and others’ actions is a critical, desirable feature for robots to truly become our companions, thereby opening up applications in several domains. This article proposes the viewpoint that the ability to remember our own past experiences based on present context enables us to infer future consequences of both our own actions/goals and the observed actions/goals of others (by analogy). In this context, a biomimetic episodic memory architecture that encodes diverse learning experiences of the iCub humanoid is presented. The critical feature is that partial cues from the present environment, such as perceived objects or observed actions of a human, trigger recall of context-relevant past experiences, thereby enabling the robot to infer rewarding future states and engage in cooperative goal-oriented behaviors. An assembly task performed jointly by a human and the iCub humanoid illustrates the framework. The link between the proposed framework and emerging results from the neurosciences on the shared cortical basis for ‘remembering, imagining and perspective taking’ is discussed

    Attractors, memory and perception

    In this thesis, the first three introductory chapters review the literature on contextual perception, its neural basis, and network modeling of memory. In chapter 4, the first two sections define our model, and the next two sections, 4.3 and 4.4, report my original work on the retrieval properties of different network structures and on the network dynamics underlying the response to ambiguous patterns, respectively. The work reported in chapter 5 was done in collaboration with Prof. Bharathi Jagadeesh at the University of Washington and has already been published in the journal "Cerebral Cortex". In this collaboration, Yan Liu, from the group in Seattle, carried out the recording experiments, and I did the data analysis and network simulations. Chapter 6, which presents a network model for "priming" and "adaptation aftereffect", is my own work. The work reported in 4.3, 4.5, and the whole of chapter 6 is in preparation for publication

    Pertanika Journal of Science & Technology

    Pertanika Journal of Science & Technology

    Complex Neural Networks for Audio

    Audio is represented in two mathematically equivalent ways: the real-valued time domain (i.e., waveform) and the complex-valued frequency domain (i.e., spectrum). There are advantages to the frequency-domain representation, e.g., the human auditory system is known to process sound in the frequency domain. Furthermore, linear time-invariant systems are convolved with sources in the time domain, whereas they may be factorized in the frequency domain. Neural networks have become rather useful when applied to audio tasks such as machine listening and audio synthesis, which are related by their dependencies on high-quality acoustic models. They ideally encapsulate fine-scale temporal structure, such as that encoded in the phase of frequency-domain audio, yet there are no authoritative deep learning methods for complex audio. This manuscript is dedicated to addressing the shortcoming. Chapter 2 motivates complex networks by their affinity with complex-domain audio, while Chapter 3 contributes methods for building and optimizing complex networks. We show that the naive implementation of Adam optimization is incorrect for complex random variables and show that selection of input and output representation has a significant impact on the performance of a complex network. Experimental results with novel complex neural architectures are provided in the second half of this manuscript. Chapter 4 introduces a complex model for binaural audio source localization. We show that, like humans, the complex model can generalize to different anatomical filters, which is important in the context of machine listening. The complex model's performance is better than that of the real-valued models, as well as real- and complex-valued baselines. Chapter 5 proposes a two-stage method for speech enhancement. In the first stage, a complex-valued stochastic autoencoder projects complex vectors to a discrete space. In the second stage, long-term temporal dependencies are modeled in the discrete space. The autoencoder raises the performance ceiling for state-of-the-art speech enhancement, but the dynamic enhancement model does not outperform other baselines. We discuss areas for improvement and note that the complex Adam optimizer improves training convergence over the naive implementation
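To make the remark about Adam and complex variables concrete, the sketch below contrasts a naive second-moment estimate `g * g` with one based on the squared magnitude `g * conj(g)`. The abstract does not state the manuscript's actual correction, so treating the squared magnitude as the fix is an assumption here, as are the function names and hyperparameters.

```python
def naive_second_moment(grad):
    """Naive estimate: squaring a complex gradient gives a complex number."""
    return grad * grad


def complex_adam_step(z, grad, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step on a complex parameter z (assumed variant, not the thesis's code).

    The second moment tracks |grad|^2 = grad * conj(grad), which is real and
    non-negative, so its square root is a valid per-parameter scale.
    """
    m = beta1 * m + (1 - beta1) * grad                         # complex first moment
    v = beta2 * v + (1 - beta2) * (grad * grad.conjugate()).real
    m_hat = m / (1 - beta1 ** t)                               # bias correction
    v_hat = v / (1 - beta2 ** t)
    z = z - lr * m_hat / (v_hat ** 0.5 + eps)
    return z, m, v


# Minimise |z - target|^2; the descent direction for a real-valued loss of a
# complex variable is the conjugate Wirtinger derivative, here z - target.
target = 1j
z, m, v = 0j, 0j, 0.0
grad = z - target
z, m, v = complex_adam_step(z, grad, m, v, t=1)
print(z, v, naive_second_moment(grad))
```

For the purely imaginary gradient used here, the naive square is negative, so its square root would not be a real scale factor, whereas the magnitude-based estimate stays non-negative.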

    Técnicas de visión por computador para la detección del verdor y la detección de obstáculos en campos de maíz [Computer vision techniques for greenness detection and obstacle detection in maize fields]

    Unpublished thesis of the Universidad Complutense de Madrid, Facultad de Informática, Departamento de Ingeniería del Software e Inteligencia Artificial, defended on 22/06/2017. There is an increasing demand for Computer Vision techniques in Precision Agriculture (PA) based on images captured with cameras on board autonomous vehicles. Two techniques have been developed in this research: the first for greenness identification and the second for obstacle detection in maize fields, including people and animals, for tractors in the RHEA (Robot Fleets for Highly Effective Agriculture and Forestry Management) project, which are equipped with on-board monocular cameras. For vegetation identification in agricultural images, the usual strategy combines colour vegetation indices (CVIs) with thresholding techniques, where the remaining elements in the image are also extracted. The main goal of this research line is the development of an alternative strategy for vegetation detection. To achieve this goal, we propose a methodology based on two well-known techniques in computer vision: Bag of Words representation (BoW) and Support Vector Machines (SVM). Each image is partitioned into several Regions Of Interest (ROIs). A feature descriptor is then obtained for each ROI, and the descriptor is evaluated with a classifier model (previously trained to discriminate between vegetation and background) to determine whether or not the ROI is vegetation... There is a growing demand for Computer Vision techniques in Precision Agriculture through the processing of images captured by cameras installed on autonomous vehicles. Two types of techniques have been developed in this research: one for the identification of green plants and another for the detection of obstacles in maize fields, including people and animals, for tractors of the RHEA project. The final objective of the autonomous vehicles was the identification and elimination of weeds in maize fields. In agricultural images, vegetation is generally detected by means of vegetation indices and thresholding methods. The indices are computed from the spectral properties of the colour images. This thesis proposes a new method for that purpose, which constitutes a primary objective of the research. The proposal is based on a strategy known as "bag of words" together with a supervised learning model. Both techniques are widely used in image recognition and classification. The image is initially divided into homogeneous regions or regions of interest (ROIs). Given a collection of ROIs obtained from a set of agricultural images, their local features are computed and grouped by similarity. Each group represents a "visual word", and the set of visual words found forms a "visual dictionary". Each ROI is represented by a set of visual words, which are quantified according to their occurrence within the region, yielding a code vector, or "codebook", that serves as the descriptor of the ROI. Finally, Support Vector Machines are used to evaluate the code vectors and thus discriminate between ROIs that are vegetation and the rest... Depto. de Ingeniería de Software e Inteligencia Artificial (ISIA), Fac. de Informática
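The bag-of-words encoding of a region described in this abstract can be sketched as follows. This is a minimal illustration that assumes a visual dictionary has already been learned (for example, by clustering local features); the two-dimensional descriptors, the dictionary values, and the function names are invented for the example, and the SVM classification step is not shown.

```python
def nearest_word(descriptor, dictionary):
    """Index of the visual word closest to a local descriptor (squared Euclidean distance)."""
    return min(
        range(len(dictionary)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(descriptor, dictionary[k])),
    )


def bow_histogram(descriptors, dictionary):
    """Count visual-word occurrences in one region and normalise to frequencies."""
    counts = [0] * len(dictionary)
    for desc in descriptors:
        counts[nearest_word(desc, dictionary)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(dictionary)


# Hypothetical dictionary with two visual words and four local descriptors from one region.
dictionary = [(0.0, 0.0), (10.0, 10.0)]
region_descriptors = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5), (9.0, 11.0)]
histogram = bow_histogram(region_descriptors, dictionary)
print(histogram)  # [0.75, 0.25]
```

The resulting fixed-length histogram is the kind of per-region vector that a trained classifier such as an SVM would then label as vegetation or background.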