
    Processeurs embarqués configurables pour la reproduction de tons

    High dynamic range (HDR) images can capture the details of a scene in both highlights and shadows, imitating the capabilities of the human visual system. Tone mapping (TM) aims to adapt HDR images to conventional display devices. The first part of this work deals with an application of tone mapping algorithms: contrast enhancement. We compared several state-of-the-art contrast adjustment methods, including two TM operators. This comparative analysis was conducted in the context of surveillance applications in which videos are taken in poor lighting conditions. Image quality was evaluated by means of objective metrics such as intensity contrast and brightness error, and by subjective assessment. Moreover, performance was measured based on execution time. Experimental results show that a recent technique based on histogram modification presents the better trade-off when both criteria are considered. TM algorithms usually impose high demands on computational resources. As a result, they are usually implemented on powerful general-purpose processors and graphics processing units. Such platforms may not meet the performance, area, power consumption and flexibility constraints imposed by the embedded systems domain. Although these requirements are often contradictory, application-specific instruction-set processors (ASIPs) become an interesting implementation alternative: ASIPs can provide a trade-off between the efficiency of a dedicated hardware solution and the flexibility associated with a software-programmable solution. The second part of this master's thesis presents the design and implementation of a customized processor for a global TM algorithm. We analyzed the whole algorithm to estimate its data and computational requirements. Three custom instructions were proposed, to compute the luminance, the logarithm and the maximum luminance. Using an architecture description language, the custom instructions were added to a 32-bit RISC-like processor. The logarithm was computed using a low-cost technique based on an improved Mitchell approximation. Experimental results demonstrate a 169% performance improvement when all three instructions are added, with a hardware overhead of only 22%. Finally, since global TM algorithms may not preserve important local contrasts, we designed and implemented another ASIP for a local algorithm. Custom instructions to accelerate a modified Gaussian pyramid were added to a configurable and extensible 32-bit RISC-like processor. The different pyramid levels were computed using a single 2D Gaussian kernel in an iterative process. Results show a speedup factor of 12.3× for the pyramid computation, which implies a 50% performance improvement for the local algorithm. This customized processor requires a 19% area increase compared to the base configuration.
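Since the exact tone-mapping operator is not given in the abstract, the C sketch below only illustrates the two ingredients the abstract does name: Mitchell's binary-logarithm approximation (the base method that the thesis improves) and a generic global log-mapping pass built from the three accelerated operations (luminance, logarithm, maximum luminance). The luminance weights, the Q16.16 output format and the mapping formula are illustrative assumptions, not the implemented algorithm.

    #include <stdint.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <math.h>

    /* Mitchell's binary-logarithm approximation: writing x = 2^k * (1 + f),
     * 0 <= f < 1, log2(x) is approximated by k + f.  Result in Q16.16 format.
     * The thesis uses an improved Mitchell approximation whose correction
     * term is not given in the abstract, so only the base method is shown. */
    static uint32_t log2_mitchell_q16(uint32_t x)
    {
        if (x == 0) return 0;                        /* log2(0) undefined: clamp */
        uint32_t k = 31;
        while (!(x & 0x80000000u)) { x <<= 1; --k; } /* normalize: MSB to bit 31 */
        uint32_t frac = (x & 0x7FFFFFFFu) >> 15;     /* mantissa bits as Q0.16  */
        return (k << 16) | frac;
    }

    /* A generic global log-mapping pass built from the three operations the
     * custom instructions accelerate: luminance, logarithm, maximum luminance.
     * Weights and formula are illustrative, not the thesis's operator. */
    static void tonemap_global(const float *r, const float *g, const float *b,
                               uint8_t *out, int n)
    {
        float *lum = malloc((size_t)n * sizeof *lum);
        float lmax = 0.0f;
        for (int i = 0; i < n; ++i) {
            lum[i] = 0.27f * r[i] + 0.67f * g[i] + 0.06f * b[i]; /* luminance     */
            if (lum[i] > lmax) lmax = lum[i];                    /* max luminance */
        }
        if (lmax <= 0.0f) lmax = 1.0f;                           /* all-black case  */
        for (int i = 0; i < n; ++i)                              /* log compression */
            out[i] = (uint8_t)(255.0f * log2f(1.0f + lum[i]) / log2f(1.0f + lmax));
        free(lum);
    }

    int main(void)
    {
        printf("Mitchell log2(1000) = %f (exact %f)\n",
               log2_mitchell_q16(1000) / 65536.0, log2(1000.0));
        float r[2] = {0.2f, 40.0f}, g[2] = {0.3f, 60.0f}, b[2] = {0.1f, 10.0f};
        uint8_t d[2];
        tonemap_global(r, g, b, d, 2);
        printf("display values: %d %d\n", d[0], d[1]);
        return 0;
    }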

    Accuracy-Guaranteed Fixed-Point Optimization in Hardware Synthesis and Processor Customization

    Nowadays, computation with fractional numbers is essential in a wide range of signal and image processing applications. For numerical computation, a fractional number can be represented using either fixed-point or floating-point arithmetic. Fixed-point arithmetic is broadly preferred to floating-point for dedicated hardware architectures because of the lower implementation complexity of fixed-point circuits. In a hardware implementation, the bitwidth allocated to the different signals has a significant impact on metrics such as resource usage (transistors), speed and power consumption. Fixed-point word-length optimization (WLO) is a well-known research area that aims to optimize fixed-point datapaths by adjusting the word-lengths allocated to their internal and output signals. A fixed-point number is composed of an integer part and a fractional part. There is a minimum number of bits for the integer part that guarantees overflow and underflow avoidance for each signal; this value depends on the range of values that the signal may take. The fractional word-length, in turn, determines the amount of finite-precision error introduced in the computations, and there is a trade-off between accuracy and hardware cost in its selection. Allocating the fractional word-length involves two important procedures: finite-precision (quantization) error modeling and fractional word-length selection. Existing work on WLO has focused on hardwired circuits as the target implementation platform. In this thesis, we introduce new methodologies, techniques and algorithms to improve the implementation of fixed-point computations in hardwired circuits and customizable processors. The thesis proposes an enhanced error modeling approach, based on affine arithmetic, that addresses some shortcomings of existing methods and improves their accuracy. The thesis also introduces an acceleration technique and two semi-analytical fractional bitwidth selection algorithms for WLO in hardwired circuit design. While the first algorithm follows a progressive search strategy, the second uses a tree-shaped search method for fractional width optimization. The algorithms offer two different trade-offs between computational complexity and resulting cost: the first has polynomial complexity and achieves results comparable to existing heuristic approaches, while the second has exponential complexity but achieves near-optimal results compared to an exhaustive search. The thesis further proposes a method to combine word-length optimization with application-specific processor customization. The supported data-type word-lengths, the width and depth of the register files, and the architecture of the functional units are the main objectives targeted by this optimization. A new optimization algorithm was developed to find the best combination of word-lengths and other customizable parameters in the proposed method. Accuracy requirements, defined as a worst-case error bound, must be met by any solution. To facilitate the evaluation and implementation of the selected solutions, a new processor design environment, called PolyCuSP, was also developed. PolyCuSP supports a wide range of parameters, including those needed to evaluate the solutions proposed by the optimization algorithm; it enables rapid design-space exploration and can model different instruction-set architectures to allow effective comparisons.
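As a concrete illustration of the accuracy versus hardware-cost trade-off described above, the C sketch below quantizes a value to a fixed-point format with a chosen fractional word-length and compares the truncation error against the worst-case bound 2^-fbits. It is a minimal demonstration of the error a fractional-width selector reasons about; it does not implement the affine-arithmetic error model or the selection algorithms proposed in the thesis.

    #include <stdio.h>
    #include <math.h>

    /* Quantize v to a two's-complement fixed-point format with `ibits` integer
     * bits (sign included) and `fbits` fractional bits, using truncation.
     * The worst-case truncation error is bounded by 2^-fbits, the quantity a
     * word-length optimizer trades against hardware cost. */
    static double fixed_point_quantize(double v, int ibits, int fbits)
    {
        double scale = ldexp(1.0, fbits);                /* 2^fbits              */
        double lo = -ldexp(1.0, ibits - 1);              /* most negative value  */
        double hi = ldexp(1.0, ibits - 1) - 1.0 / scale; /* most positive value  */
        double q = floor(v * scale) / scale;             /* drop fractional bits */
        if (q < lo) q = lo;                              /* saturate on overflow */
        if (q > hi) q = hi;
        return q;
    }

    int main(void)
    {
        double v = 0.7853981634;                         /* pi/4 */
        for (int f = 4; f <= 16; f += 4) {
            double q = fixed_point_quantize(v, 2, f);
            printf("fbits=%2d quantized=%.8f error=%.8f bound=%.8f\n",
                   f, q, fabs(v - q), ldexp(1.0, -f));
        }
        return 0;
    }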

    Programmable Image-Based Light Capture for Previsualization

    Previsualization is a class of techniques for creating approximate previews of a movie sequence in order to visualize a scene prior to shooting it on the set. Often these techniques are used to convey the artistic direction of the story in terms of cinematic elements such as camera movement, angle, lighting, dialogue, and character motion. Essentially, a movie director uses previsualization (previs) to convey movie visuals as he sees them in his mind's eye. Traditional methods for previs include hand-drawn sketches, storyboards, scaled models, and photographs, which are created by artists to convey how a scene or character might look or move. A recent trend has been to use 3D graphics applications such as video game engines to perform previs, which is called 3D previs. This type of previs is generally used prior to shooting a scene in order to choreograph camera or character movements. To visualize a scene while it is being recorded on set, directors and cinematographers use a technique called on-set previs, which provides a real-time view with little to no processing. Other types of previs, such as technical previs, emphasize accurately capturing scene properties but lack any interactive manipulation and are usually employed by visual effects crews rather than cinematographers or directors. This dissertation's focus is on creating a new method for interactive visualization that automatically captures the on-set lighting and provides interactive manipulation of cinematic elements to facilitate the movie maker's artistic expression, validate cinematic choices, and provide guidance to production crews. Our method overcomes the drawbacks of all previous previs methods by combining photorealistic rendering with accurately captured scene details, interactively displayed on a mobile capture and rendering platform. This dissertation describes a new hardware and software previs framework that enables interactive visualization of on-set post-production elements. The main contribution of this dissertation is a three-tiered framework: 1) a novel programmable camera architecture that provides programmability of low-level features and a visual programming interface, 2) new algorithms that analyze and decompose the scene photometrically, and 3) a previs interface that leverages the previous two to perform interactive rendering and manipulation of the photometric and computer-generated elements. For this dissertation we implemented a programmable camera with a novel visual programming interface. We developed the photometric theory and implementation of our novel relighting technique, called Symmetric lighting, which can be used on our programmable camera to relight a scene with multiple illuminants with respect to color, intensity and location. We analyzed the performance of Symmetric lighting on synthetic and real scenes to evaluate its benefits and limitations with respect to the reflectance composition of the scene and the number and color of lights within the scene. We found that, since our method is based on a Lambertian reflectance assumption, it works well under that assumption, but scenes with high amounts of specular reflection can show larger relighting errors, and additional steps are required to mitigate this limitation. Also, scenes containing lights whose colors are too similar can lead to degenerate cases in relighting. Despite these limitations, an important contribution of our work is that Symmetric lighting can also be leveraged as a solution for performing multi-illuminant white balancing and light color estimation within a scene with multiple illuminants, without limits on the color range or number of lights. We compared our method to other white-balance methods and show that it is superior when at least one of the light colors is known a priori.
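The Symmetric lighting formulation itself is not described in this abstract, so the C sketch below shows only the textbook single-illuminant baseline against which multi-illuminant methods are usually compared: a diagonal (von Kries) white-balance correction applied once the illuminant color is known. The light color and pixel values are made-up examples, not data from the dissertation.

    #include <stdio.h>

    /* Diagonal (von Kries) white balance: given the RGB color of a single
     * illuminant, scale each channel so that the illuminant maps to neutral
     * grey.  This is the standard single-light baseline, not the multi-
     * illuminant Symmetric lighting method developed in the dissertation. */
    static void white_balance_diagonal(float *rgb, int npix, const float light[3])
    {
        for (int i = 0; i < npix; ++i) {
            for (int c = 0; c < 3; ++c) {
                float v = rgb[3 * i + c] / light[c];   /* divide out light color    */
                rgb[3 * i + c] = v > 1.0f ? 1.0f : v;  /* clip to displayable range */
            }
        }
    }

    int main(void)
    {
        const float light[3] = {1.0f, 0.8f, 0.6f};     /* warm, tungsten-like light   */
        float img[6] = {0.5f, 0.4f, 0.3f,              /* grey patch under that light */
                        0.9f, 0.2f, 0.1f};             /* a reddish surface           */
        white_balance_diagonal(img, 2, light);
        printf("corrected grey patch: %.2f %.2f %.2f\n", img[0], img[1], img[2]);
        return 0;
    }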

    Penware Panther: An embedded computer system for real-time applications

    Embedded computer systems target different tasks and serve in various environments. This thesis relates to the design of Panther, an embedded computer system for real-time applications. Panther, a product of Mobinetix System Company, is a transaction and signature capture terminal aimed at a paperless environment. The product provides a variety of functions, from electronically capturing signatures for receipts, contracts or identification, to touch-screen communication for PIN entry, advertising and customer surveys. The emphasis of the thesis is on the design of the firmware module of Panther. Instead of using the traditional flow-chart and state-machine approaches, an object-oriented (OO) modeling approach is taken, which improves problem-domain abstraction as well as the system's stability in the presence of changes. The Unified Modeling Language (UML) is used throughout the design to express the constructs and the relationships among them. This work contributes mostly to the Panther firmware architecture design and to the firmware implementation of the communication processor, the interpreter, and the command applications.
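The Panther firmware interfaces are not exposed in the abstract, so the following C sketch is purely hypothetical: it shows the kind of command-dispatch table an embedded transaction terminal's interpreter might use to route host commands to command applications. The command names (SIGN, PIN) and handler functions are invented for illustration and are not the actual Panther API.

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical command-dispatch table for an embedded terminal interpreter.
     * Each entry maps a command name received from the host to the command
     * application that handles it. */
    typedef int (*cmd_handler)(const char *args);

    static int cmd_capture_signature(const char *args)
    {
        printf("capturing signature for %s\n", args);   /* placeholder action */
        return 0;
    }

    static int cmd_pin_entry(const char *args)
    {
        printf("starting PIN entry (%s)\n", args);      /* placeholder action */
        return 0;
    }

    static const struct { const char *name; cmd_handler handler; } command_table[] = {
        { "SIGN", cmd_capture_signature },
        { "PIN",  cmd_pin_entry },
    };

    static int dispatch(const char *name, const char *args)
    {
        for (size_t i = 0; i < sizeof command_table / sizeof command_table[0]; ++i)
            if (strcmp(command_table[i].name, name) == 0)
                return command_table[i].handler(args);
        return -1;                                      /* unknown command */
    }

    int main(void)
    {
        dispatch("SIGN", "receipt-0001");
        dispatch("PIN", "masked entry");
        return 0;
    }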

    AEVUM: Personalized Health Monitoring System

    Advances in sensors and other portable technologies have resulted in a bevy of health monitoring devices, such as Bluetooth- and Wi-Fi-enabled weighing scales and wearables, which help individuals monitor their personal health. The collected information provides a plethora of data points over intervals of time that a primary care physician can use to gain a holistic understanding of an individual's health and provide more effective and personalized treatment. A drawback of existing health monitoring devices is that they are not integrated with the professional medical infrastructure. With the wealth of information collected, it is also not feasible for a physician to look through all the data from multiple health monitoring systems to extract relevant information or patterns. It would therefore be beneficial to have a single platform of hardware devices to monitor and collect data, together with a software application to securely store the collected information, identify patterns for analysis, and summarize the data for the physician and the patient. The aim of this study was to design and develop an unobtrusive, user-friendly system, Aevum, that integrates technology, adapts itself to changes in consumer behavior and connects with the existing healthcare infrastructure to help individuals monitor their health in a customized manner. Aevum is a multi-device system consisting of a smart, puck-shaped hardware product, a wristband and a software application available to the patient as well as the physician. In addition to monitoring vitals such as heart rate, blood pressure, body temperature and weight, Aevum can monitor environmental factors that affect an individual's health, and it uses personalized metrics such as precise calorie intake and medication management to monitor health. This allows users to personalize Aevum based on their health condition. Finally, Aevum identifies patterns of anomalies in the collected data and compiles the information, which can be accessed by the physician to assist in treatment.
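Aevum's anomaly-detection method is not described in the abstract, so the C sketch below is a hypothetical illustration of one simple way to flag vital-sign samples that deviate from a personalized rolling baseline. The smoothing factor, deviation threshold and heart-rate values are invented for the example.

    #include <stdio.h>
    #include <math.h>

    /* Flag samples that deviate from an exponentially smoothed baseline by
     * more than max_dev.  Purely illustrative; not Aevum's actual algorithm. */
    static void flag_anomalies(const double *samples, int n,
                               double alpha, double max_dev)
    {
        double baseline = samples[0];                   /* seed the baseline */
        for (int i = 1; i < n; ++i) {
            if (fabs(samples[i] - baseline) > max_dev)
                printf("sample %d (%.1f) deviates from baseline %.1f\n",
                       i, samples[i], baseline);
            baseline = alpha * samples[i] + (1.0 - alpha) * baseline;
        }
    }

    int main(void)
    {
        double heart_rate[] = {72, 74, 71, 73, 118, 75, 72};  /* bpm samples */
        flag_anomalies(heart_rate, 7, 0.2, 25.0);
        return 0;
    }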

    High-performance hardware accelerators for image processing in space applications

    Mars is a hard place to reach. While there have been many notable success stories in getting probes to the Red Planet, the historical record is full of bad news, and the success rate for actually landing on the Martian surface is even worse, roughly 30%. This low success rate must be credited mainly to the characteristics of the Mars environment. Strong winds frequently blow in the Martian atmosphere; this phenomenon usually alters the lander's descent trajectory, diverting it from the target one. Moreover, the Martian surface is not an easy place to land safely: it is pitted by many closely spaced craters and large rocks, and characterized by huge mountains and hills (e.g., Olympus Mons is 648 km in diameter and 27 km tall). For these reasons, a mission failure due to landing in a large crater, on big rocks, or on a steeply sloped part of the surface is highly probable. In recent years, all space agencies have increased their research efforts to enhance the success rate of Mars missions. In particular, the two most active research topics are active debris removal and guided landing on Mars. The former aims at finding new methods to remove space debris using unmanned spacecraft, which must be able to autonomously detect a piece of debris, analyze it to extract its characteristics in terms of weight, speed and dimensions, and eventually rendezvous with it. To perform these tasks, the spacecraft must have strong vision capabilities; in other words, it must be able to take pictures and process them with very complex image processing algorithms in order to detect, track and analyze the debris. The latter aims at increasing the landing-point precision (i.e., reducing the landing ellipse) on Mars. Future space missions will increasingly adopt video-based navigation systems to assist the entry, descent and landing (EDL) phase of space modules (e.g., spacecraft), enhancing the precision of automatic EDL navigation systems. For instance, recent space exploration missions such as Spirit, Opportunity, and Curiosity used an EDL procedure that follows a fixed, precomputed descent trajectory to reach a precise landing point. This approach guarantees a maximum landing-point precision of 20 km. Comparing this figure with the characteristics of the Mars environment makes it clear that the mission failure probability remains very high. A very challenging problem is therefore to design an autonomously guided EDL system able to further reduce the landing ellipse, guaranteeing avoidance of dangerous areas of the Martian surface (e.g., large craters or big rocks) that could lead to mission failure. Autonomous behaviour is mandatory because a manually driven approach is not feasible given the distance between Earth and Mars: since this distance varies from approximately 56 to 100 million km due to orbital eccentricity, even with signal transmission at the speed of light the transmission time would in the best case be around 31 minutes, exceeding the overall duration of the EDL phase. In both applications, the algorithms must guarantee self-adaptability to the environmental conditions: since the harsh conditions of Mars (and of space in general) are difficult to predict at design time, the algorithms must be able to automatically tune their internal parameters according to the current conditions. Moreover, real-time performance is another key factor.
Since a software implementation of these computationally intensive tasks cannot reach the required performance, the algorithms must be accelerated in hardware. For these reasons, this thesis presents my research work on advanced image processing algorithms for space applications and on the associated hardware accelerators. My research activity has focused on both the algorithms and their hardware implementations. Concerning the first aspect, I mainly focused my research effort on integrating self-adaptability features into existing algorithms. Concerning the second, I studied and validated a methodology to efficiently develop, verify and validate hardware components aimed at accelerating video-based applications. This approach allowed me to develop and test high-performance hardware accelerators that strongly outperform the current state-of-the-art implementations. The thesis is organized in four main chapters. Chapter 2 starts with a brief introduction to the history of digital image processing; its main content is a description of space missions in which digital image processing plays a key role. A major effort has been spent on the missions on which my research activity has a substantial impact, and for these missions the chapter analyzes and evaluates the state-of-the-art approaches and algorithms in depth. Chapter 3 analyzes and compares the two technologies used to implement high-performance hardware accelerators, i.e., Application-Specific Integrated Circuits (ASICs) and Field Programmable Gate Arrays (FPGAs). This information helps the reader understand the main reasons behind the decision of space agencies to exploit FPGAs instead of ASICs for high-performance hardware accelerators in space missions, even though FPGAs are more sensitive to Single Event Upsets (SEUs, i.e., transient errors induced in hardware components by alpha particles and solar radiation in space). Moreover, this chapter describes in detail the three available space-grade FPGA technologies (i.e., one-time programmable, flash-based, and SRAM-based) and the main fault-mitigation techniques against SEUs that are mandatory for employing space-grade FPGAs in actual missions. Chapter 4 describes one of the main contributions of my research work: a library of high-performance hardware accelerators for image processing in space applications. The basic idea behind this library is to offer designers a set of validated hardware components able to strongly speed up the basic image processing operations commonly used in an image processing chain. In other words, these components can be used directly as elementary building blocks to easily create a complex image processing system, without wasting time in the debug and validation phases. The library groups the proposed hardware accelerators into IP-core families; the components in the same family share the same functionality and input/output interface. This harmonization of the I/O interface makes it possible to substitute components of the same family inside a complex image processing system without requiring modifications to the system communication infrastructure. In addition to the analysis of the internal architecture of the proposed components, another important aspect of this chapter is the methodology used to develop, verify and validate the proposed high-performance image processing hardware accelerators.
This methodology involves the use of different programming and hardware description languages to support the designer from algorithm modelling up to hardware implementation and validation. Chapter 5 presents the proposed complex image processing systems. In particular, it exploits a set of real case studies, associated with the most recent space agency needs, to show how the hardware accelerator components can be assembled to build a complex image processing system. In addition to the hardware accelerators contained in the library, the described complex systems embed innovative ad hoc hardware components and software routines able to provide high-performance and self-adaptable image processing functionalities. To prove the benefits of the proposed methodology, each case study concludes with a comparison against the current state-of-the-art implementations, highlighting the benefits in terms of performance and self-adaptability to the environmental conditions.
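The library's IP cores are described only as accelerating basic image-processing operations, so as an illustration of such an elementary building block the C function below is a plain behavioral reference model of a 3x3 convolution with a binomial (Gaussian-like) kernel, the kind of software model that the described methodology would validate before mapping the operation to an FPGA accelerator. The kernel values and the border policy (the one-pixel frame is left untouched) are assumptions.

    #include <stdint.h>
    #include <stdio.h>

    /* Behavioral reference model of a 3x3 convolution, one of the elementary
     * image-processing operations that hardware accelerator libraries provide.
     * Results are renormalized by 2^shift and clamped to the 8-bit range. */
    static void conv3x3(const uint8_t *in, uint8_t *out, int w, int h,
                        const int k[3][3], int shift)
    {
        for (int y = 1; y < h - 1; ++y) {
            for (int x = 1; x < w - 1; ++x) {
                int acc = 0;
                for (int dy = -1; dy <= 1; ++dy)
                    for (int dx = -1; dx <= 1; ++dx)
                        acc += k[dy + 1][dx + 1] * in[(y + dy) * w + (x + dx)];
                acc >>= shift;
                if (acc < 0)   acc = 0;
                if (acc > 255) acc = 255;
                out[y * w + x] = (uint8_t)acc;
            }
        }
    }

    int main(void)
    {
        const int gauss[3][3] = { {1, 2, 1}, {2, 4, 2}, {1, 2, 1} }; /* sum = 16 */
        uint8_t in[5 * 5] = {0}, out[5 * 5] = {0};
        in[2 * 5 + 2] = 255;                          /* single bright pixel      */
        conv3x3(in, out, 5, 5, gauss, 4);             /* shift 4 = divide by 16   */
        printf("center after smoothing: %d\n", out[2 * 5 + 2]);
        return 0;
    }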

    CORSE-81: The 1981 Conference on Remote Sensing Education

    Summaries of the presentations and tutorial workshops addressing various strategies in remote sensing education are presented. Course design from different discipline perspectives, equipment requirements for image interpretation and processing, and the role of universities, private industry, and government agencies in the education process are covered.

    Digital imaging technology assessment: Digital document storage project

    An ongoing technical assessment and requirements definition project is examining the potential role of digital imaging technology at NASA's STI facility. The focus is on the basic components of imaging technology in today's marketplace as well as the components anticipated in the near future. Presented is a requirement specification for a prototype project, an initial examination of current image processing at the STI facility, and an initial summary of image processing projects at other sites. Operational imaging systems incorporate scanners, optical storage, high-resolution monitors, processing nodes, magnetic storage, jukeboxes, specialized boards, optical character recognition gear, pixel-addressable printers, communications, and complex software processes.

    Establishing a Framework for the development of Multimodal Virtual Reality Interfaces with Applicability in Education and Clinical Practice

    The development of Virtual Reality (VR) and Augmented Reality (AR) content with multiple sources of both input and output has led to countless contributions in a great many fields, among them medicine and education. Nevertheless, the actual process of integrating the existing VR/AR media and subsequently setting it to purpose is still a highly scattered and esoteric undertaking. Moreover, the architectures that derive from such ventures seldom include haptic feedback in their implementation, which deprives users of one of the paramount aspects of human interaction, the sense of touch. Determined to circumvent these issues, the present dissertation proposes a centralized albeit modularized framework that enables the conception of multimodal VR/AR applications in a novel and straightforward manner. To accomplish this, the framework makes use of a stereoscopic VR head-mounted display (HMD) from Oculus Rift, a hand-tracking controller from Leap Motion, a custom-made VR mount that allows the two preceding peripherals to be assembled together, and a wearable device of our own design. The latter is a glove that encompasses two core modules: one that conveys haptic feedback to its wearer, and another that deals with the non-intrusive acquisition, processing and recording of the wearer's Electrocardiogram (ECG), Electromyogram (EMG) and Electrodermal Activity (EDA). The software elements of the aforementioned features were all interfaced through Unity3D, a powerful game engine whose popularity in academic and scientific endeavors is ever increasing. Upon completion of our system, we set out to substantiate our initial claim with thoroughly developed experiences that would attest to its worth. With this premise in mind, we devised a comprehensive repository of interfaces, among which three merit special consideration: Brain Connectivity Leap (BCL), Ode to Passive Haptic Learning (PHL) and a Surgical Simulator.