5 research outputs found

    Tools and techniques for implementing real-time video processing algorithms (Gerçek zamanlı video işleme algoritmalarının uygulanması için araç ve teknikler)

    Thesis (M.A.)--Özyeğin University, Graduate School of Sciences and Engineering, Department of Computer Science, September 2018. Hardware implementations of video processing algorithms, which are usually real-time by nature, need architectural exploration so that the required performance is achieved at minimal cost. In addition, the video algorithm to be implemented may need to run at different frames-per-second rates and resolutions in different applications. Hence, we usually need to design a parameterized IP block instead of a fixed design. Also, during the hardware design process, the requirements fed from the algorithms team may change, as may the algorithm itself. As a result, hardware implementation iterations need to be as fast as algorithm development iterations. This is only possible with tools and techniques specifically geared towards hardware design generation for video processing. The tools and techniques discussed in this dissertation include host software, FPGA interface IP, HLS, RTL generation tools, an architectural estimation tool, a flow-based verification approach, and logic synthesis automation, as well as architectural concepts (e.g., nested pipelining). The architectural estimation tool estimates many design metrics: area, throughput, latency, DRAM usage, interface bandwidth, temperature, and compilation time. While we explain these tools and techniques within a specific use case, namely optical flow, we also present results from another use case, image fusion. Using our methodology and tools, we were able to design and bring up 11 versions of optical flow and 3 versions of image fusion on 3 different FPGAs from 2 different vendors. The first version of these designs (hence the generators) took several months; however, each subsequent design version took a few days with a few people. In the case where only an architectural trade-off is needed, we were able to generate and synthesize around one thousand designs in a single day on a 48-core server.
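    The abstract above motivates parameterized IP blocks by noting that frame rate and resolution vary between applications. As a rough illustration only (this is not the thesis's estimation tool), the sketch below computes the minimum clock frequency a streaming pipeline would need for a given video format. The function name `required_clock_mhz` and the `pixels_per_cycle` parameter are hypothetical, and blanking intervals and memory stalls are ignored.

```python
def required_clock_mhz(width, height, fps, pixels_per_cycle=1):
    """Lower bound on the clock (MHz) needed to keep up with a video stream.

    Assumption: the pipeline accepts `pixels_per_cycle` pixels every clock
    cycle with no stalls, and blanking intervals are ignored.
    """
    pixel_rate = width * height * fps          # pixels per second
    return pixel_rate / pixels_per_cycle / 1e6


if __name__ == "__main__":
    # Sweep a few illustrative formats and degrees of parallelism.
    for width, height, fps in [(1280, 720, 30), (1920, 1080, 60)]:
        for ppc in (1, 2, 4):
            mhz = required_clock_mhz(width, height, fps, ppc)
            print(f"{width}x{height}@{fps} fps, {ppc} px/cycle: {mhz:.3f} MHz")
```

    A sweep like this only bounds throughput; metrics such as area or DRAM usage, which the abstract also lists, would need a real architectural model.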

    FPGA implementation of a dense optical flow algorithm using Altera OpenCL SDK

    Due to copyright restrictions, access to the full text of this article is available only via subscription. FPGA acceleration of compute-intensive algorithms is usually not regarded as feasible because of the long Verilog or VHDL RTL design effort it requires. Data-parallel algorithms have an alternative platform for acceleration, namely the GPU. Two languages are widely used for GPU programming: CUDA and OpenCL. OpenCL is the choice of many coders due to its portability to most multi-core CPUs and most GPUs. The OpenCL SDK for FPGAs, and High-Level Synthesis (HLS) in general, make FPGA acceleration truly feasible. In data-parallel applications, OpenCL-based synthesis is preferred over traditional HLS, as it can be seamlessly targeted to both GPUs and FPGAs. This paper shares our experience in targeting a demanding optical flow algorithm to a high-end FPGA as well as a high-end GPU using OpenCL. We offer throughput and power consumption results on both platforms. TÜBİTAK; European Union Artemi

    Using high-level synthesis for rapid design of video processing pipes

    Due to copyright restrictions, access to the full text of this article is available only via subscription. In this work, we share our experience in using High-Level Synthesis (HLS) for rapid development of an optical flow design on FPGA. We performed HLS using Vivado HLS as well as an HLS tool we developed for the optical flow design at hand and similar video processing problems. The paper first describes our design problem and then discusses our own HLS tool. The tool has turned out to be fairly general-purpose, except that it cannot handle cyclic inter-iteration dependencies. It also introduces some novel concepts to HLS, such as “pipelined multiplexers”. The synthesis results show that we can achieve better timing or better area results compared to Vivado HLS. Furthermore, the Verilog RTL our HLS tool outputs is much more readable than that from Vivado HLS, which makes it much easier for the designer to debug and modify the RTL. TÜBİTAK; Artemis-JU project ALMARV

    FPGA-Based Implementation of an Underwater Quantum Key Distribution System With BB84 Protocol

    As threats in the maritime domain diversify, securing data transmission becomes critical for underwater wireless networks designed for the surveillance of critical infrastructure and maritime border protection. This has sparked interest in underwater Quantum Key Distribution (QKD). In this paper, we present an FPGA-based real-time implementation of an underwater QKD system based on the BB84 protocol. The QKD unit is built on a hybrid computation system consisting of an FPGA and an on-board computer (OBC) interfaced with optical front-ends. A real-time photon counting module is implemented on the FPGA. The transmitter and receiver units are powered by an external UPS, and all system parameters can be monitored from the connected computers. The system is equipped with a visible laser and an alignment indicator to validate successful manual alignment. Secure key distribution at a rate of 100 qubits per second was successfully tested over a link distance of 7 meters.
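    For readers unfamiliar with BB84, the basis-sifting step at the heart of the protocol can be sketched in a few lines. This is a toy software simulation under strong assumptions (ideal channel, no photon loss, no eavesdropper, no error correction or privacy amplification) and does not represent the paper's FPGA implementation; the function name `bb84_sift` is hypothetical.

```python
import random


def bb84_sift(n_qubits, seed=0):
    """Simulate BB84 sifting over an ideal, lossless, eavesdropper-free channel.

    Bases are encoded as 0 (rectilinear) or 1 (diagonal). Returns the sifted
    keys held by Alice and Bob.
    """
    rng = random.Random(seed)
    alice_bits = [rng.randint(0, 1) for _ in range(n_qubits)]
    alice_bases = [rng.randint(0, 1) for _ in range(n_qubits)]
    bob_bases = [rng.randint(0, 1) for _ in range(n_qubits)]

    # Measuring in the matching basis reproduces Alice's bit; measuring in
    # the other basis yields a uniformly random outcome.
    bob_bits = [
        bit if a_basis == b_basis else rng.randint(0, 1)
        for bit, a_basis, b_basis in zip(alice_bits, alice_bases, bob_bases)
    ]

    # Alice and Bob publicly compare bases (never bits) and keep only the
    # positions where the bases agree.
    keep = [i for i in range(n_qubits) if alice_bases[i] == bob_bases[i]]
    alice_key = [alice_bits[i] for i in keep]
    bob_key = [bob_bits[i] for i in keep]
    return alice_key, bob_key


if __name__ == "__main__":
    a_key, b_key = bb84_sift(100)
    print(len(a_key), a_key == b_key)
```

    On average, about half of the transmitted qubits survive sifting, so the usable key rate of a real link is lower than its raw qubit rate.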