55 research outputs found

    On-chip hullámfront érzékelés és processzálás parallel implementációja = Parallel implementation of on-chip wavefront sensing and processing

    Get PDF
    To measure dynamic aberrations caused by turbulent media, wavefront sensors are needed that combine advanced photosensor-array technology with high-speed, real-time processing, which can be achieved with parallel, close-to-sensor devices. We developed three methods: 1. We fitted a lens array (Shack-Hartmann sensor, SH) to a programmable, mixed-mode cellular array sensor-processor (Eye-Ris) that combines analog and digital parallel processing with on-chip sensing, and developed an appropriate parallel, on-chip correlation-based wavefront measurement algorithm for it. Although the achievable resolution is limited, very high speed is reached because the algorithm runs in parallel on the chip itself. 2. We combined a special high-speed, high-resolution CMOS sensor (SH) with a high-performance FPGA that, in addition to controlling the sensor, carries out the required processing with high parallelism. The device's wavefront corrector is a high-speed LCOS micro-display whose control functions are also handled by the built-in FPGA. In this way the device compensates the measured wavefront distortions at much higher speed than comparable devices available today, and it outperforms contemporary counterparts in resolution, speed, noise and price. 3. A fast wavefront sensing method based on spatial phase-shifting interferometry (FINCH): several zero-path-length-difference interferograms of the wavefront with itself are recorded simultaneously with different phase shifts (0, π/2, π, 3π/2). From these lower-resolution interferograms the wavefront phase can be reconstructed using only elementary algebraic operations
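    The four-step phase recovery described above (shifts 0, π/2, π, 3π/2, combined with elementary algebra) can be sketched as follows; the function name and the synthetic phase map are illustrative, not taken from the project:

```python
import numpy as np

def four_step_phase(i0, i90, i180, i270):
    """Recover the wrapped wavefront phase from four zero-path-difference
    interferograms taken with phase shifts 0, pi/2, pi, 3pi/2.
    With I_k = A + B*cos(phi + delta_k), elementary algebra gives
    tan(phi) = (I_270 - I_90) / (I_0 - I_180)."""
    return np.arctan2(i270 - i90, i0 - i180)

# Synthetic check: build interferograms from a known phase map.
x = np.linspace(-1, 1, 64)
phi = 0.8 * np.pi * np.outer(x, x)      # a known aberration phase, |phi| < pi
a, b = 1.0, 0.5                         # background and modulation depth
frames = [a + b * np.cos(phi + d) for d in (0, np.pi/2, np.pi, 3*np.pi/2)]
recovered = four_step_phase(*frames)
```

Because the test phase stays within (−π, π), the recovered map equals the input directly; in general the result is wrapped and would need unwrapping.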

    NASA Tech Briefs, September 2011

    Get PDF
    Topics covered include: Fused Reality for Enhanced Flight Test Capabilities; Thermography to Inspect Insulation of Large Cryogenic Tanks; Crush Test Abuse Stand; Test Generator for MATLAB Simulations; Dynamic Monitoring of Cleanroom Fallout Using an Air Particle Counter; Enhancement to Non-Contacting Stress Measurement of Blade Vibration Frequency; Positively Verifying Mating of Previously Unverifiable Flight Connectors; Radiation-Tolerant Intelligent Memory Stack - RTIMS; Ultra-Low-Dropout Linear Regulator; Excitation of a Parallel Plate Waveguide by an Array of Rectangular Waveguides; FPGA for Power Control of MSL Avionics; UAVSAR Active Electronically Scanned Array; Lockout/Tagout (LOTO) Simulator; Silicon Carbide Mounts for Fabry-Perot Interferometers; Measuring the In-Process Figure, Final Prescription, and System Alignment of Large Optics and Segmented Mirrors Using Lidar Metrology; Fiber-Reinforced Reactive Nano-Epoxy Composites; Polymerization Initiated at the Sidewalls of Carbon Nanotubes; Metal-Matrix/Hollow-Ceramic-Sphere Composites; Piezoelectrically Enhanced Photocathodes; Iridium-Doped Ruthenium Oxide Catalyst for Oxygen Evolution; Improved Mo-Re VPS Alloys for High-Temperature Uses; Data Service Provider Cost Estimation Tool; Hybrid Power Management-Based Vehicle Architecture; Force Limit System; Levitated Duct Fan (LDF) Aircraft Auxiliary Generator; Compact, Two-Sided Structural Cold Plate Configuration; AN Fitting Reconditioning Tool; Active Response Gravity Offload System; Method and Apparatus for Forming Nanodroplets; Rapid Detection of the Varicella Zoster Virus in Saliva; Improved Devices for Collecting Sweat for Chemical Analysis; Phase-Controlled Magnetic Mirror for Wavefront Correction; and Frame-Transfer Gating Raman Spectroscopy for Time-Resolved Multiscalar Combustion Diagnostics

    Subpixel real-time jitter detection algorithm and implementation for polarimetric and helioseismic imager

    Get PDF
    The polarimetric and helioseismic imager instrument for the Solar Orbiter mission of the European Space Agency requires high stability while capturing images, especially polarimetric ones. For this reason, an image stabilization system has been included in the instrument. It uses global motion estimation techniques to estimate the jitter in real time with subpixel resolution. Due to instrument requirements, the algorithm has to be implemented in a Xilinx Virtex-4QV field programmable gate array. The algorithm includes a 2-D paraboloid interpolation algorithm based on 2-D bisection. We describe the algorithm implementation and the tests that have been made to verify its performance. The jitter estimation has a mean error of 1/25 pixel of the correlation tracking camera. The paraboloid interpolation algorithm also provides better results in terms of resources and time required for the calculation (at least a 20% improvement in both cases) than those based on direct calculation
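    The subpixel peak interpolation can be illustrated with a direct-calculation sketch (the baseline the paper compares against; its FPGA implementation replaces the divisions with 2-D bisection, and the names below are hypothetical):

```python
import numpy as np

def subpixel_peak(corr):
    """Estimate the correlation peak with subpixel resolution by fitting
    a separable paraboloid around the integer maximum. This is the
    direct-calculation baseline; a bisection scheme trades the divisions
    for iterated comparisons, which maps better to FPGA logic."""
    iy, ix = np.unravel_index(np.argmax(corr), corr.shape)

    def axis_offset(cm1, c0, cp1):
        # Vertex of the parabola through (-1, cm1), (0, c0), (+1, cp1).
        denom = cm1 - 2.0 * c0 + cp1
        return 0.0 if denom == 0 else 0.5 * (cm1 - cp1) / denom

    dx = axis_offset(corr[iy, ix - 1], corr[iy, ix], corr[iy, ix + 1])
    dy = axis_offset(corr[iy - 1, ix], corr[iy, ix], corr[iy + 1, ix])
    return iy + dy, ix + dx
```

On an exactly quadratic correlation surface the fit recovers the true peak; on real data the 3-sample fit is a local approximation.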

    Feasibility Study of High-Level Synthesis: Implementation of a Real-Time HEVC Intra Encoder on FPGA

    Get PDF
    High-Level Synthesis (HLS) is an automated design process that seeks to improve productivity over traditional design methods by raising the design abstraction from register transfer level (RTL) to behavioural level. Various commercial HLS tools have been on the market since the 1990s, but only recently have they started to gain adoption across industry and academia. The slow adoption rate has mainly stemmed from lower quality of results (QoR) than obtained with conventional hardware description languages (HDLs). However, the latest HLS tool generations have substantially narrowed the QoR gap. This thesis studies the feasibility of HLS in video codec development. It introduces several HLS implementations for High Efficiency Video Coding (HEVC), the key enabling technology for numerous modern media applications. HEVC doubles the coding efficiency over its predecessor, the Advanced Video Coding (AVC) standard, for the same subjective visual quality, but typically at the cost of considerably higher computational complexity. Therefore, real-time HEVC calls for automated design methodologies that can be used to minimize the hardware (HW) implementation and verification effort. This thesis proposes to use HLS throughout the whole encoder design process, from data-intensive coding tools, such as intra prediction and discrete transforms, to more control-oriented tools, such as entropy coding. The C source code of the open-source Kvazaar HEVC encoder serves as the design entry point for the HLS flow, and it is also utilized in design verification. Performance results are gathered with, and reported for, field programmable gate array (FPGA) technology. The main contribution of this thesis is an HEVC intra encoder prototype built on a Nokia AirFrame Cloud Server, equipped with dual 2.4 GHz 14-core Intel Xeon processors and two Intel Arria 10 GX FPGA Development Kits that can be connected to the server via peripheral component interconnect express (PCIe) generation 3 or 40 Gigabit Ethernet. The proof-of-concept system achieves real-time 4K coding speed of up to 120 fps, which can be further scaled up by adding practically any number of network-connected FPGA cards. Overcoming the complexity of HEVC and customizing its rich features for a real-time HEVC encoder implementation on hardware is not a trivial task, as hardware development has traditionally proven very time-consuming. This thesis shows that HLS is able to shorten the development time, provide previously unseen design scalability, and still deliver competitive performance and QoR compared with state-of-the-art hardware implementations
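    As a small illustration of one data-intensive coding tool named above, the following is a simplified sketch of HEVC-style DC intra prediction (the boundary smoothing HEVC applies to small luma blocks is omitted, and the function name is ours, not Kvazaar's):

```python
def hevc_dc_prediction(top, left):
    """Simplified DC intra prediction for an NxN block: the predictor is
    the rounded mean of the N reference samples above the block and the
    N samples to its left, computed with the integer shift used in HEVC:
    dc = (sum(top) + sum(left) + N) >> (log2(N) + 1)."""
    n = len(top)
    assert len(left) == n and n & (n - 1) == 0, "N must be a power of two"
    shift = n.bit_length()          # equals log2(n) + 1 for power-of-two n
    dc = (sum(top) + sum(left) + n) >> shift
    # Every sample of the predicted block gets the same DC value.
    return [[dc] * n for _ in range(n)]
```

The integer shift-based rounding is what makes the operation cheap to replicate across parallel hardware units.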

    Towards quantitative high-throughput 3D localization microscopy

    Get PDF
    Advances in light microscopy have allowed circumventing the diffraction barrier, once thought to be the ultimate resolution limit in optical microscopy, and given rise to various superresolution microscopy techniques. Among them, localization microscopy exploits the blinking of fluorescent molecules to precisely pinpoint the positions of many emitters individually, and subsequently reconstruct a superresolved image from these positions. While localization microscopy enables the study of cellular structures and protein complexes in unprecedented detail, severe technical bottlenecks still reduce the scope of possible applications. In my PhD work, I developed several technical improvements at the level of the microscope to overcome limitations related to the photophysical behaviour of fluorescent molecules, slow acquisition rates and three-dimensional imaging. I built an illumination system that achieves uniform intensity across the field of view using a multi-mode fiber and a commercial speckle reducer. I showed that it provides uniform photophysics within the illuminated area and is far superior to the common illumination system. It is easy to build and to add to any microscope, and thus greatly facilitates quantitative approaches in localization microscopy. Furthermore, I developed a fully automated superresolution microscope using an open-source software framework. I developed advanced electronics and user-friendly software solutions to enable the design and unsupervised acquisition of complex experimental series. Optimized for long-term stability, the automated microscope is able to image hundreds to thousands of regions over the course of days to weeks. First applied in a system-wide study of clathrin-mediated endocytosis in yeast, the automated microscope allowed the collection of a data set of a size and scope unprecedented in localization microscopy. Finally, I established a fundamentally new approach to obtain three-dimensional superresolution images. Supercritical angle localization microscopy (SALM) exploits the phenomenon of surface-generated fluorescence arising from fluorophores close to the coverslip. SALM offers the theoretical prospect of isotropic spatial resolution with simple instrumentation. Following a first proof-of-concept implementation, I re-engineered the microscope to include adaptive optics in order to reach the full potential of the method. Taken together, I established simple yet powerful solutions for three fundamental technical limitations in localization microscopy regarding illumination, throughput and resolution. All of them can be combined within the same instrument and can substantially improve cutting-edge microscopes. This will help to push the limit of the most challenging applications of localization microscopy, including system-wide imaging experiments and structural studies

    High Performance Multiview Video Coding

    Get PDF
    Following the standardization of the latest video coding standard, High Efficiency Video Coding (HEVC), in 2013, its multiview extension (MV-HEVC) was published in 2014, bringing a compression gain of around 50% for multiview and 3D videos compared to multiple independent single-view HEVC encodings. However, the extremely high computational complexity of MV-HEVC demands significant optimization of the encoder. To tackle this problem, this work investigates the possibilities of using modern parallel computing platforms and tools, such as single-instruction-multiple-data (SIMD) instructions, multi-core CPUs, massively parallel GPUs, and computer clusters, to significantly enhance the performance of the multiview video coding (MVC) encoder. These computing tools have very different characteristics, and their misuse may result in poor performance improvement and sometimes even degradation. To achieve the best possible encoding performance from modern computing tools, different levels of parallelism inside a typical MVC encoder are identified and analyzed, and novel optimization techniques at various levels of abstraction are proposed: non-aggregation massively parallel motion estimation (ME) and disparity estimation (DE) at the prediction unit (PU) level, fractional and bi-directional ME/DE acceleration through SIMD, quantization parameter (QP)-based early termination for coding tree units (CTUs), optimized resource-scheduled wavefront parallel processing for CTUs, and workload-balanced, cluster-based multi-view parallel encoding. The results show that the proposed parallel optimization techniques significantly improve execution time with insignificant loss of coding efficiency. This, in turn, proves that modern parallel computing platforms, with appropriate platform-specific algorithm design, are valuable tools for improving the performance of computationally intensive applications
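    The data parallelism exploited for ME/DE can be illustrated with a toy full-search sum-of-absolute-differences (SAD) sketch; this is not the thesis' actual kernel, and the names are illustrative. Each candidate displacement is an independent SAD reduction, which is exactly what SIMD lanes and GPU threads can evaluate concurrently:

```python
import numpy as np

def full_search_sad(block, ref, search=4):
    """Toy full-search motion estimation: for every candidate displacement
    within +/-search pixels, compute the SAD between the current block and
    the corresponding reference window, and return the best (dy, dx, sad).
    The per-candidate reductions are independent of one another, so they
    parallelize trivially across SIMD lanes, cores, or GPU threads."""
    bh, bw = block.shape
    best = (None, None, np.inf)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            win = ref[search + dy: search + dy + bh,
                      search + dx: search + dx + bw]
            sad = int(np.abs(block.astype(int) - win.astype(int)).sum())
            if sad < best[2]:
                best = (dy, dx, sad)
    return best
```

A production encoder would add early termination (e.g. the QP-based scheme mentioned above) to skip candidates that cannot beat the current best.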

    Tuning the Computational Effort: An Adaptive Accuracy-aware Approach Across System Layers

    Get PDF
    This thesis introduces a novel methodology for realizing accuracy-aware systems, helping designers integrate accuracy awareness into their designs. It proposes an adaptive accuracy-aware approach that addresses current challenges in the domain by combining and tuning accuracy-aware methods across different system layers. To widen the scope of accuracy-aware computing, including approximate computing, to further domains, the thesis presents innovative accuracy-aware methods and techniques for the different system layers. The required tuning of these methods is handled by a configuration layer that adjusts the available knobs of the accuracy-aware methods integrated into a system

    The Department of Electrical and Computer Engineering Newsletter

    Get PDF
    Spring 2012 News and notes for the University of Dayton's Department of Electrical and Computer Engineering. https://ecommons.udayton.edu/ece_newsletter/1002/thumbnail.jp

    The standard plenoptic camera: applications of a geometrical light field model

    Get PDF
    A thesis submitted to the University of Bedfordshire, in partial fulfilment of the requirements for the degree of Doctor of Philosophy. The plenoptic camera is an emerging technology in computer vision able to capture a light field image from a single exposure, which allows a computational change of the perspective view as well as of the optical focus, known as refocusing. Until now, there has been no general method to pinpoint the object planes that have been brought to focus, or the stereo baselines of the perspective views, provided by a plenoptic camera. Previous research presented simplified ray models to prove the concept of refocusing and to enhance image and depth map quality, but lacked reliable distance estimates and an efficient refocusing hardware implementation. In this thesis, a pair of light rays is treated as a system of linear functions whose solution yields ray intersections indicating distances to refocused object planes or positions of virtual cameras that project perspective views. A refocusing image synthesis is derived from the proposed ray model and further developed into an array of switch-controlled semi-systolic FIR convolution filters. Their real-time performance is verified through simulation and through implementation on an FPGA using VHDL. A series of experiments is carried out with different lenses and focus settings, where prediction results are compared with those of a real ray simulation tool and with processed light field photographs for which a blur metric has been considered. Predictions accurately match measurements in light field photographs and deviate by less than 0.35% in real ray simulation. A benchmark assessment of the proposed refocusing hardware implementation suggests a computation-time speed-up of 99.91% in comparison with a state-of-the-art technique. It is expected that this research will support the prototyping stage of plenoptic cameras and microscopes, as it helps specify depth sampling planes and thus localise objects, and provides a power-efficient refocusing hardware design for full-video applications such as broadcasting or motion picture production
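    The core idea, treating a pair of rays as a system of linear equations whose solution is their intersection, can be sketched in 2-D; this is a simplification of the thesis' geometrical light field model, and the function name is illustrative:

```python
import numpy as np

def ray_intersection(m1, c1, m2, c2):
    """Treat two light rays y = m*z + c as a pair of linear equations and
    solve for their intersection (z, y). In a geometrical light field
    model, such intersections locate refocused object planes (distance z)
    or the positions of virtual cameras that project perspective views.
    Written in matrix form:
        [ -m1  1 ] [z]   [c1]
        [ -m2  1 ] [y] = [c2]"""
    a = np.array([[-m1, 1.0], [-m2, 1.0]])
    b = np.array([c1, c2])
    z, y = np.linalg.solve(a, b)
    return z, y
```

Parallel rays (m1 == m2) make the system singular, which corresponds to an object plane at infinity; a robust implementation would detect that case before solving.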

    AI/ML Algorithms and Applications in VLSI Design and Technology

    Full text link
    An evident challenge ahead for the integrated circuit (IC) industry in the nanometer regime is the investigation and development of methods that can reduce the design complexity ensuing from growing process variations and curtail the turnaround time of chip manufacturing. Conventional methodologies employed for such tasks are largely manual, and thus time-consuming and resource-intensive. In contrast, the unique learning strategies of artificial intelligence (AI) provide numerous exciting automated approaches for handling complex and data-intensive tasks in very-large-scale integration (VLSI) design and testing. Employing AI and machine learning (ML) algorithms in VLSI design and manufacturing reduces the time and effort needed to understand and process data within and across different abstraction levels via automated learning algorithms. This, in turn, improves IC yield and reduces manufacturing turnaround time. This paper thoroughly reviews the automated AI/ML approaches previously applied to VLSI design and manufacturing. Moreover, we discuss the scope of future AI/ML applications at various abstraction levels to revolutionize the field of VLSI design, aiming for high-speed, highly intelligent, and efficient implementations