1,174 research outputs found

    An overview of artificial intelligence and robotics. Volume 1: Artificial intelligence. Part B: Applications

    Get PDF
    Artificial Intelligence (AI) is an emerging technology that has recently attracted considerable attention. Many applications are now under development. This report, Part B of a three part report on AI, presents overviews of the key application areas: Expert Systems, Computer Vision, Natural Language Processing, Speech Interfaces, and Problem Solving and Planning. The basic approaches to such systems, the state-of-the-art, existing systems and future trends and expectations are covered

    Computer vision algorithms on reconfigurable logic arrays

    Full text link

    Toward color image segmentation in analog VLSI: Algorithm and hardware

    Get PDF
    Standard techniques for segmenting color images are based on finding normalized RGB discontinuities, color histogramming, or clustering techniques in RGB or CIE color spaces. The use of the psychophysical variable hue in HSI space has not been popular due to its numerical instability at low saturations. In this article, we propose the use of a simplified hue description suitable for implementation in analog VLSI. We demonstrate that if theintegrated white condition holds, hue is invariant to certain types of highlights, shading, and shadows. This is due to theadditive/shift invariance property, a property that other color variables lack. The more restrictive uniformly varying lighting model associated with themultiplicative/scale invariance property shared by both hue and normalized RGB allows invariance to transparencies, and to simple models of shading and shadows. Using binary hue discontinuities in conjunction with first-order type of surface interpolation, we demonstrate these invariant properties and compare them against the performance of RGB, normalized RGB, and CIE color spaces. We argue that working in HSI space offers an effective method for segmenting scenes in the presence of confounding cues due to shading, transparency, highlights, and shadows. Based on this work, we designed and fabricated for the first time an analog CMOS VLSI circuit with on-board phototransistor input that computes normalized color and hue

    Event-based Vision: A Survey

    Get PDF
    Event cameras are bio-inspired sensors that differ from conventional frame cameras: Instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes, and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz) resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in challenging scenarios for traditional cameras, such as low-latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world

    FASTER: Facilitating Analysis and Synthesis Technologies for Effective Reconfiguration

    Get PDF
    The FASTER (Facilitating Analysis and Synthesis Technologies for Effective Reconfiguration) EU FP7 project, aims to ease the design and implementation of dynamically changing hardware systems. Our motivation stems from the promise reconfigurable systems hold for achieving high performance and extending product functionality and lifetime via the addition of new features that operate at hardware speed. However, designing a changing hardware system is both challenging and time-consuming. FASTER facilitates the use of reconfigurable technology by providing a complete methodology enabling designers to easily specify, analyze, implement and verify applications on platforms with general-purpose processors and acceleration modules implemented in the latest reconfigurable technology. Our tool-chain supports both coarse- and fine-grain FPGA reconfiguration, while during execution a flexible run-time system manages the reconfigurable resources. We target three applications from different domains. We explore the way each application benefits from reconfiguration, and then we asses them and the FASTER tools, in terms of performance, area consumption and accuracy of analysis

    Image Understanding by Hierarchical Symbolic Representation and Inexact Matching of Attributed Graphs

    Get PDF
    We study the symbolic representation of imagery information by a powerful global representation scheme in the form of Attributed Relational Graph (ARG), and propose new techniques for the extraction of such representation from spatial-domain images, and for performing the task of image understanding through the analysis of the extracted ARG representation. To achieve practical image understanding tasks, the system needs to comprehend the imagery information in a global form. Therefore, we propose a multi-layer hierarchical scheme for the extraction of global symbolic representation from spatial-domain images. The proposed scheme produces a symbolic mapping of the input data in terms of an output alphabet, whose elements are defined over global subimages. The proposed scheme uses a combination of model-driven and data-driven concepts. The model- driven principle is represented by a graph transducer, which is used to specify the alphabet at each layer in the scheme. A symbolic mapping is driven by the input data to map the input local alphabet into the output global alphabet. Through the iterative application of the symbolic transformational mapping at different levels of hierarchy, the system extracts a global representation from the image in the form of attributed relational graphs. Further processing and interpretation of the imagery information can, then, be performed on their ARG representation. We also propose an efficient approach for calculating a distance measure and finding the best inexact matching configuration between attributed relational graphs. For two ARGs, we define sequences of weighted error-transformations which when performed on one ARG (or a subgraph of it), will produce the other ARG. A distance measure between two ARGs is defined as the weight of the sequence which possesses minimum total-weight. Moreover, this minimum-total weight sequence defines the best inexact matching configuration between the two ARGs. The global minimization over the possible sequences is performed by a dynamic programming technique, the approach shows good results for ARGs of practical sizes. The proposed system possesses the capability to inference the alphabets of the ARG representation which it uses. In the inference phase, the hierarchical scheme is usually driven by the input data only, which normally consist of images of model objects. It extracts the global alphabet of the ARG representation of the models. The extracted model representation is then used in the operation phase of the system to: perform the mapping in the multi-layer scheme. We present our experimental results for utilizing the proposed system for locating objects in complex scenes

    Energy efficient enabling technologies for semantic video processing on mobile devices

    Get PDF
    Semantic object-based processing will play an increasingly important role in future multimedia systems due to the ubiquity of digital multimedia capture/playback technologies and increasing storage capacity. Although the object based paradigm has many undeniable benefits, numerous technical challenges remain before the applications becomes pervasive, particularly on computational constrained mobile devices. A fundamental issue is the ill-posed problem of semantic object segmentation. Furthermore, on battery powered mobile computing devices, the additional algorithmic complexity of semantic object based processing compared to conventional video processing is highly undesirable both from a real-time operation and battery life perspective. This thesis attempts to tackle these issues by firstly constraining the solution space and focusing on the human face as a primary semantic concept of use to users of mobile devices. A novel face detection algorithm is proposed, which from the outset was designed to be amenable to be offloaded from the host microprocessor to dedicated hardware, thereby providing real-time performance and reducing power consumption. The algorithm uses an Artificial Neural Network (ANN), whose topology and weights are evolved via a genetic algorithm (GA). The computational burden of the ANN evaluation is offloaded to a dedicated hardware accelerator, which is capable of processing any evolved network topology. Efficient arithmetic circuitry, which leverages modified Booth recoding, column compressors and carry save adders, is adopted throughout the design. To tackle the increased computational costs associated with object tracking or object based shape encoding, a novel energy efficient binary motion estimation architecture is proposed. Energy is reduced in the proposed motion estimation architecture by minimising the redundant operations inherent in the binary data. Both architectures are shown to compare favourable with the relevant prior art
    corecore