A low complexity image compression algorithm for Bayer color filter array
Digital images in their raw form require an excessive amount of storage capacity. Image compression reduces the cost of storing and transmitting image data by shrinking the file size so that it requires less storage or transmission bandwidth. This work presents a new color transformation and compression algorithm for Bayer color filter array (CFA) images. In a full color image, each pixel contains R, G, and B components. A CFA image contains single-channel information at each pixel position, so demosaicking is required to construct a full color image: for each pixel, demosaicking reconstructs the two missing color components from neighbouring pixels. After demosaicking, each pixel contains R, G, and B information, and a full color image is obtained. Conventional CFA compression occurs after demosaicking. However, the Bayer CFA image can also be compressed before demosaicking, which is called the compression-first method; the algorithm proposed in this research follows this compression-first, or direct compression, approach. The compression-first method applies the compression algorithm directly to the CFA data and shifts demosaicking to the other end of the transmission and storage process. Its advantage is that it requires one third of the transmission bandwidth per pixel compared with conventional compression.
Compressing CFA data directly introduces spatial redundancy, artifacts, and false high frequencies. The process therefore requires a color transformation with less correlation among the color components than the Bayer RGB color space. This work analyzes the correlation coefficient, standard deviation, entropy, and intensity range of the Bayer RGB color components. The analysis yields two efficient color transformations in terms of these features. The proposed color components show lower correlation coefficients than the Bayer RGB color components. The color transformations reduce both the spatial and spectral redundancies of the Bayer CFA image. After color transformation, the components are independently encoded using differential pulse-code modulation (DPCM) in raster order. The DPCM residual error is mapped to a positive integer for the adaptive Golomb-Rice code. The compression algorithm combines adaptive Golomb-Rice and unary coding to generate the bit stream. Extensive simulation analysis is performed on both simulated and real CFA datasets, and is extended to wireless capsule endoscopic (WCE) images; the compression algorithm is also evaluated on a simulated WCE CFA dataset. The results show that the proposed algorithm requires fewer bits per pixel than conventional CFA compression, and it also outperforms recent CFA compression algorithms on both real and simulated CFA datasets.
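The encoding chain described above (raster-order DPCM, residuals mapped to non-negative integers, Golomb-Rice coding) can be sketched as follows. This is an illustrative textbook sketch, not the thesis's implementation: the zig-zag residual mapping and the fixed Rice parameter `k` are common choices assumed here, whereas the thesis adapts the parameter.

```python
# Sketch of DPCM + Golomb-Rice encoding for one raster-order pixel row.
# Assumptions (not from the thesis): zig-zag mapping, fixed parameter k.

def zigzag_map(e):
    # Map signed residual to non-negative integer: 0,-1,1,-2,2 -> 0,1,2,3,4
    return 2 * e if e >= 0 else -2 * e - 1

def golomb_rice(n, k):
    # Unary-coded quotient, terminating 0, then k-bit binary remainder.
    q, r = n >> k, n & ((1 << k) - 1)
    return "1" * q + "0" + format(r, f"0{k}b")

def dpcm_encode(row, k=2):
    # Each pixel is predicted by its left neighbour; the residual is coded.
    bits, prev = [], 0
    for x in row:
        e = x - prev
        prev = x
        bits.append(golomb_rice(zigzag_map(e), k))
    return "".join(bits)
```

For example, a flat row such as `[5, 5]` produces one large residual for the first pixel and a near-zero residual afterwards, which is where the bit savings come from.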
eCNN: A Block-Based and Highly-Parallel CNN Accelerator for Edge Inference
Convolutional neural networks (CNNs) have recently demonstrated superior
quality for computational imaging applications. Therefore, they have great
potential to revolutionize the image pipelines on cameras and displays.
However, it is difficult for conventional CNN accelerators to support
ultra-high-resolution videos at the edge due to their considerable DRAM
bandwidth and power consumption. Therefore, finding a further memory- and
computation-efficient microarchitecture is crucial to speed up this coming
revolution.
In this paper, we approach this goal by considering the inference flow,
network model, instruction set, and processor design jointly to optimize
hardware performance and image quality. We apply a block-based inference flow
which can eliminate all the DRAM bandwidth for feature maps and accordingly
propose a hardware-oriented network model, ERNet, to optimize image quality
based on hardware constraints. Then we devise a coarse-grained instruction set
architecture, FBISA, to support power-hungry convolution by massive
parallelism. Finally, we implement an embedded processor, eCNN, which
accommodates ERNet and FBISA with a flexible processing architecture. Layout
results show that it can support high-quality ERNets for super-resolution and
denoising at up to 4K Ultra-HD 30 fps while using only DDR-400 and consuming
6.94W on average. By comparison, the state-of-the-art Diffy uses dual-channel
DDR3-2133 and consumes 54.3W to support lower-quality VDSR at Full HD 30 fps.
Lastly, we will also present application examples of high-performance style
transfer and object recognition to demonstrate the flexibility of eCNN.
Comment: 14 pages; appearing in IEEE/ACM International Symposium on
Microarchitecture (MICRO), 201
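The block-based inference flow the paper builds on can be illustrated with a toy example: the image is split into tiles that each carry a small halo of overlap, so every block is convolved independently on-chip and the stitched result matches the monolithic computation, with no intermediate feature maps spilled to DRAM. Tile and halo sizes below are illustrative assumptions, not eCNN's actual parameters.

```python
# Toy block-based inference: tiled 3x3 convolution with a 1-pixel halo,
# illustrating why per-block processing needs no off-chip feature-map traffic.
import numpy as np

def conv3x3(x, k):
    # 'valid' 3x3 convolution; output shrinks by 2 in each dimension.
    h, w = x.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(x[i:i+3, j:j+3] * k)
    return out

def blockwise(x, k, tile=8):
    # Each tile carries a 1-pixel halo so block outputs stitch seamlessly.
    h, w = x.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(0, h - 2, tile):
        for j in range(0, w - 2, tile):
            blk = x[i:i + tile + 2, j:j + tile + 2]
            out[i:i + tile, j:j + tile] = conv3x3(blk, k)
    return out
```

The same idea extends to multi-layer networks by enlarging the halo per layer, which is the trade-off a hardware-oriented model like ERNet must balance against image quality.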
VLSI Design
This book provides some recent advances in designing nanometer VLSI chips. The selected topics present open problems and challenges in areas ranging from design tools, new post-silicon devices, GPU-based parallel computing, and emerging 3D integration to antenna design. The book consists of two parts, with chapters such as: VLSI design for multi-sensor smart systems on a chip; Three-dimensional integrated circuits design for thousand-core processors; Parallel symbolic analysis of large analog circuits on GPU platforms; Algorithms for CAD tools in VLSI design; A multilevel memetic algorithm for large SAT-encoded problems; etc.
Towards the development of flexible, reliable, reconfigurable, and high-performance imaging systems
Current FPGAs can implement large systems because of the high density of
reconfigurable logic resources in a single chip. FPGAs are comprehensive devices
that combine flexibility and high performance in the same platform compared to
other platforms such as General-Purpose Processors (GPPs) and Application-Specific
Integrated Circuits (ASICs). The flexibility of modern FPGAs is further enhanced
by the Dynamic Partial Reconfiguration (DPR) feature, which allows the
functionality of part of the system to be changed while other parts keep running.
FPGAs became an important platform for digital image processing applications
because of the aforementioned features. They can fulfil the need for efficient
and flexible platforms that execute imaging tasks reliably with low power, high
performance, and high flexibility. The use of FPGAs as accelerators
for image processing outperforms most current solutions. Current FPGA
solutions can offload the parts of an imaging application that need high
computational power onto dedicated reconfigurable hardware accelerators while
other parts run on the traditional solution, increasing system performance. Moreover,
the use of the DPR feature enhances the flexibility of image processing further by
swapping accelerators in and out at run-time. The use of fault mitigation
techniques in FPGAs enables imaging applications to operate in harsh
environments, which is essential because FPGAs are sensitive to radiation and
extreme conditions.
The aim of this thesis is to present a platform for efficient implementations of
imaging tasks. The research uses FPGAs as the key component of this platform and
uses the concept of DPR to increase performance and flexibility, to reduce power
dissipation, and to expand the range of possible imaging applications. In this context,
it proposes the use of FPGAs to accelerate the Image Processing Pipeline (IPP)
stages, the core part of most imaging devices. The thesis has a number of novel
concepts. The first novel concept is the use of the FPGA hardware environment
and the DPR feature to increase parallelism and achieve high flexibility. The
concept also increases performance and reduces power consumption and area
utilisation. Based on this concept, the following implementations are presented
in this thesis: an implementation of the Adams-Hamilton demosaicing algorithm
for camera colour interpolation, which exploits FPGA parallelism to outperform
other equivalents.
In addition, an implementation of Automatic White Balance (AWB), another IPP
stage, employs the DPR feature to demonstrate the aforementioned novelty
aspects. Another novel concept in this thesis, presented in chapter 6, uses the
DPR feature to develop a flexible imaging system that requires less logic and
can be implemented in small FPGAs. The system can be employed as a template
for any imaging application without limitation. Moreover, this thesis presents a novel
reliable version of the imaging system that adopts novel techniques including
scrubbing, Built-In Self Test (BIST), and Triple Modular Redundancy (TMR) to
detect and correct errors using the Internal Configuration Access Port (ICAP)
primitive. These techniques exploit the datapath-based nature of the implemented
imaging system to improve the system's overall reliability. The thesis presents a
proposal for integrating the imaging system with the Robust Reliable Reconfigurable
Real-Time Heterogeneous Operating System (R4THOS) to get the best out of the
system. The proposal shows the suitability of the proposed DPR imaging system to
be used as part of the core system of autonomous cars because of its unbounded
flexibility. These novel works are presented in a number of publications, as
shown in section 1.3 of this thesis.
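Of the fault-mitigation techniques listed above, TMR is the simplest to sketch: three replicas of a datapath stage run in parallel and a majority voter masks any single upset. The snippet below is a generic bitwise majority voter for illustration, not the thesis's datapath-specific implementation.

```python
# Generic Triple Modular Redundancy (TMR) majority voter (illustrative only).
# a, b, c are the integer outputs of three identical module replicas.

def tmr_vote(a, b, c):
    # Bitwise majority: an output bit is 1 iff at least two replicas agree on 1,
    # so a single-bit upset in any one replica is out-voted.
    return (a & b) | (a & c) | (b & c)
```

In an FPGA flow such as the one described, the voter would be instantiated in fabric and combined with ICAP scrubbing so that the upset replica is also repaired, not just masked.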
Sensor Signal and Information Processing II
In the current age of information explosion, newly invented technological sensors and software are now tightly integrated with our everyday lives. Many sensor processing algorithms have incorporated some form of computational intelligence as part of their core framework in problem solving. These algorithms have the capacity to generalize, discover knowledge for themselves, and learn new information whenever unseen data are captured. The primary aim of sensor processing is to develop techniques to interpret, understand, and act on the information contained in the data. The interest of this book is in developing intelligent signal processing in order to pave the way for smart sensors. This involves mathematical advancement of nonlinear signal processing theory and its applications that extend far beyond traditional techniques. It bridges the boundary between theory and application, developing novel theoretically inspired methodologies targeting both longstanding and emergent signal processing applications. The topics range from phishing detection to the integration of terrestrial laser scanning, and from fault diagnosis to bio-inspired filtering. The book will appeal to established practitioners, along with researchers and students in the emerging field of smart sensor processing.
Dynamically reconfigurable asynchronous processor
The main design requirements for today's mobile applications are:
· high throughput performance.
· high energy efficiency.
· high programmability.
Until now, the choice of platform has often been limited to Application-Specific
Integrated Circuits (ASICs), due to their best-of-breed performance and power
consumption. The economies of scale possible with these high-volume markets have
traditionally been able to hide the high Non-Recurring Engineering (NRE) costs
required for designing and fabricating new ASICs. However, with the NREs and
design time escalating with each generation of mobile applications, this practice may
be reaching its limit.
Designers today are looking at programmable solutions, so that they can respond
more rapidly to changes in the market and spread costs over several generations of
mobile applications. However, there have been few feasible alternatives to ASICs:
Digital Signal Processors (DSPs) and microprocessors cannot meet the throughput
requirements, whereas Field-Programmable Gate Arrays (FPGAs) require too much
area and power.
Coarse-grained dynamically reconfigurable architectures offer better solutions for
high throughput applications, when power and area considerations are taken into
account. One promising example is the Reconfigurable Instruction Cell Array
(RICA). RICA consists of an array of cells with an interconnect that can be
dynamically reconfigured on every cycle. This allows quite complex datapaths to be
rendered onto the fabric and executed in a single configuration - making these
architectures particularly suitable to stream processing. Furthermore, RICA can be
programmed from C, making it a good fit with existing design methodologies.
However, the RICA architecture has a drawback: poor scalability in terms of area and
power. As the core gets bigger, the number of sequential elements in the array must
be increased significantly to maintain the ability to achieve high throughputs through
pipelining. As a result, a larger clock tree is required to synchronise the increased
number of sequential elements. The clock tree therefore takes up a larger percentage
of the area and power consumption of the core.
This thesis presents a novel Dynamically Reconfigurable Asynchronous Processor
(DRAP), aimed at high-throughput mobile applications. DRAP is based on the RICA
architecture, but uses asynchronous design techniques - methods of designing digital
systems without clocks. The absence of a global clock signal makes DRAP more
scalable in terms of power and area overhead than its synchronous counterpart.
The DRAP architecture maintains most of the benefits of custom asynchronous
design, whilst also providing programmability via conventional high-level languages.
Results show that the DRAP processor delivers considerably lower power
consumption when compared to a market-leading Very Long Instruction Word
(VLIW) processor and a low-power ARM processor. For example, DRAP resulted in
a reduction in power consumption of 20 times compared to the ARM7 processor, and
29 times compared to the TIC64x VLIW, when running the same benchmark capped
to the same throughput and for the same process technology (0.13μm). When
compared to an equivalent RICA design, DRAP was up to 22% larger than RICA but
resulted in a power reduction of up to 1.9 times. It was also capable of achieving up
to 2.8 times higher throughputs than RICA for the same benchmarks.
Algorithms for the enhancement of dynamic range and colour constancy of digital images & video
One of the main objectives in digital imaging is to mimic the capabilities of the human eye, and perhaps go beyond them in certain aspects. However, the human visual system is so versatile, complex, and only partially understood that no imaging technology to date has been able to reproduce its capabilities accurately. The extraordinary capabilities of the human eye have thus become a crucial shortcoming for digital imaging, since digital photography, video recording, and computer vision applications continue to demand more realistic and accurate image reproduction and analytic capabilities.
For decades, researchers have tried to solve the colour constancy problem and to extend the dynamic range of digital imaging devices by proposing a number of algorithms and instrumentation approaches. Nevertheless, no unique solution has been identified; this is partially due to the wide range of computer vision applications that require colour constancy and high dynamic range imaging, and to the complexity with which the human visual system achieves effective colour constancy and dynamic range capabilities.
The aim of the research presented in this thesis is to enhance overall image quality within the image signal processor of digital cameras by achieving colour constancy and extending dynamic range capabilities. This is achieved by developing a set of advanced image-processing algorithms that are robust to a number of practical challenges and feasible to implement within an image signal processor used in consumer electronics imaging devices.
The experiments conducted in this research show that the proposed algorithms surpass state-of-the-art methods in the fields of dynamic range and colour constancy. Moreover, this unique set of image processing algorithms shows that, if used within an image signal processor, it enables digital camera devices to mimic the human visual system's dynamic range and colour constancy capabilities; the ultimate goal of any state-of-the-art technique or commercial imaging device.
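As a point of reference for the colour constancy problem discussed above, the classical gray-world baseline can be written in a few lines. This is a well-known textbook method against which such algorithms are typically compared, not the thesis's own algorithm: it assumes the average scene colour is achromatic and scales each channel so the channel means match.

```python
# Gray-world colour constancy baseline (illustrative, not the thesis's method).
import numpy as np

def gray_world(img):
    # img: float array of shape (H, W, 3), linear RGB.
    means = img.reshape(-1, 3).mean(axis=0)
    gains = means.mean() / means  # scale each channel toward the gray mean
    return img * gains
```

Applied to an image with a colour cast, the per-channel gains pull the channel averages to a common value, which is the behaviour more sophisticated constancy algorithms refine under difficult illuminants.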
Propuesta de arquitectura y circuitos para la mejora del rango dinámico de sistemas de visión en un chip diseñados en tecnologías CMOS profundamente submicrométrica
The work presented in this thesis proposes new techniques for dynamic range expansion in electronic image sensors. In this case, we have directed our studies toward providing this functionality on a single chip, that is, without any external hardware or software support, forming what is called a Vision System on Chip (VSoC). The dynamic range of electronic image sensors is defined as the ratio between the maximum and the minimum measurable illumination. Two options arise to improve this figure: the first, reducing the minimum measurable light by decreasing the noise in the image sensor; the second, increasing the maximum measurable light by extending the sensor's saturation limit.

Chronologically, our first option for improving the dynamic range was noise reduction. Several options can improve the system's noise figure of merit: reducing noise by using a CIS technology, or using dedicated circuits such as calibration or auto-zeroing. However, circuit techniques imply limitations that can only be resolved with non-standard technologies specifically designed for this purpose. The CIS technology used is aimed at improving the quality and the possibilities of the photosensing process, such as sensitivity, noise, colour imaging, and so on. To study the characteristics of the technology in more detail, a test chip was designed, which allowed the best options for future pixels to be identified. Nevertheless, despite satisfactory overall behaviour, the dynamic range measurements indicated that improvement through CIS technology alone is very limited; that is, improving the sensor's dark current is not sufficient for our purpose. For a greater dynamic range improvement, circuits must be included inside the pixel. However, CIS technologies usually allow nothing but NMOS transistors next to the photosensor, which seriously restricts the usable circuitry. As a result, the design of a dynamic-range-enhanced image sensor in CIS technology was discarded in favour of a standard technology, which gives more flexibility to the pixel design.

In standard technologies it is possible to introduce rich functionality through in-pixel circuits, which enables advanced techniques for extending the saturation limit of image sensors. Two options arise for this goal: linear or compressive acquisition. Linear acquisition generates a large amount of data per pixel. For example, if the scene's dynamic range is 120 dB, at least 20 bits/pixel, log2(10^(120/20)) = 19.93, would be needed for a binary representation of that range. This would require extensive resources to process such a large amount of data, and a large bandwidth to move it to the processing circuitry. To avoid these problems, high-dynamic-range image sensors usually opt for compressive light acquisition. This implies two tasks: capturing and compressing the image. Image capture is performed at the pixel level, in the photosensing device, while image compression can be performed at the pixel level, at the system level, or by external post-processing. On the post-processing side, there is a research field that studies the compression of high-dynamic-range scenes while preserving detail, producing a result suited to human perception on conventional low-dynamic-range displays. This is called Tone Mapping and usually employs only 8 bits/pixel for image representation, since this is the standard for low-dynamic-range images.

Compressive-acquisition pixels, for their part, perform a compression that does not depend on the high-dynamic-range scene being captured, which implies low compression or a loss of detail and contrast. To avoid these drawbacks, this work presents a compressive-acquisition pixel that applies a tone-mapping technique, allowing the capture of already-compressed images in a way optimized to preserve detail and contrast while producing a very small amount of data. Tone-mapping techniques are normally executed as software post-processing on a computer, over images captured without compression, which contain a large amount of data; they have traditionally belonged to the field of computer graphics because of the great computational effort they require. However, we have developed a new tone-mapping algorithm specially adapted to exploit in-pixel circuits, requiring little computation outside the pixel array, which enables a vision system on a single chip. The new tone-mapping algorithm, which is a mathematical concept that can be simulated in software, has also been implemented on a chip. For this hardware implementation, however, some adaptations and advanced design techniques are necessary, which constitute in themselves another contribution of this work. Furthermore, owing to the new functionality, modifications of the typical methods for characterization and image capture have been developed.
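The bits-per-pixel argument above can be checked numerically, alongside a toy global logarithmic tone-mapping curve of the kind post-processing operators use. The log curve is an illustrative textbook operator, not the in-pixel algorithm developed in the thesis.

```python
# Worked check of the dynamic-range bandwidth argument, plus a toy global
# log tone-mapping curve (illustrative; not the thesis's in-pixel algorithm).
import math

def bits_for_dynamic_range(db):
    # A scene spanning `db` decibels covers a 10**(db/20) linear ratio,
    # so a linear binary representation needs log2 of that many levels.
    return math.log2(10 ** (db / 20))

def log_tone_map(l, l_max):
    # Compress linear luminance l in [0, l_max] into the display range [0, 1].
    return math.log1p(l) / math.log1p(l_max)
```

For a 120 dB scene, `bits_for_dynamic_range(120)` gives about 19.93, which is where the "at least 20 bits/pixel" figure comes from, versus the 8 bits/pixel of a tone-mapped low-dynamic-range image.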
Neuromorphic perception for greenhouse technology using event-based sensors
Event-Based Cameras (EBCs), unlike conventional cameras, feature independent pixels that asynchronously generate outputs upon detecting changes in their field of view. Short calculations are performed on each event to mimic the brain. The output is a sparse sequence of events with high temporal precision. Conventional computer vision algorithms do not leverage these properties, so a new paradigm has been devised. While event cameras are very efficient at representing sparse sequences of events with high temporal precision, many approaches struggle in applications where a large amount of spatially and temporally rich information must be processed in real time. In reality, most tasks in everyday life take place in complex and uncontrollable environments, which require sophisticated models and intelligent reasoning. Typical hard problems in real-world scenes are detecting various non-uniform objects or navigating an unknown and complex environment. In addition, colour perception is a fundamental property for distinguishing objects in natural scenes. Colour is a new aspect of event-based sensors, which work fundamentally differently from standard cameras, measuring per-pixel brightness changes per colour filter asynchronously rather than measuring "absolute" brightness at a constant rate. This thesis explores neuromorphic event-based processing methods for high-noise and cluttered environments with imbalanced classes. A fully event-driven processing pipeline was developed for agricultural applications to perform fruit detection and classification, unlocking the outstanding properties of event cameras. The nature of features in such data was explored, and methods to represent and detect features were demonstrated. A framework for detecting and classifying features was developed and evaluated on the N-MNIST and Dynamic Vision Sensor (DVS) gesture datasets.
The same network was evaluated on laboratory-recorded and real-world data with various internal variations for fruit detection, such as overlap and variation in size and appearance. In addition, a method to handle highly imbalanced data was developed. We examined the characteristics of spatio-temporal patterns for each colour filter to help expand our understanding of this novel data, and explored their applications in classification tasks where colours were more relevant features than shapes and appearances. The results presented in this thesis demonstrate the potential and efficacy of event-based systems by demonstrating the applicability of colour event data and the viability of event-driven classification.
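The event stream described above can be made concrete with a minimal sketch: each event is an (x, y, timestamp, polarity) tuple, and a simple accumulator integrates the sparse stream into a 2D surface. The tuple layout and field order here are illustrative assumptions, not a specific DVS driver's API.

```python
# Minimal event accumulation: integrate a sparse (x, y, t, polarity) event
# stream into a signed 2D frame. Tuple layout is an illustrative assumption.

def accumulate_events(events, width, height):
    # Sum signed polarities per pixel; positive = brightness increase.
    frame = [[0] * width for _ in range(height)]
    for x, y, t, p in events:
        frame[y][x] += 1 if p else -1
    return frame
```

Event-driven pipelines like the one in the thesis avoid building dense frames where possible, but an accumulation surface of this kind is a common intermediate representation for feeding conventional detectors.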