2,071 research outputs found

    MRI Data Processing Acceleration on GPU

    This BSc thesis was carried out during a study stay at the Universita della Svizzera italiana in Switzerland. Identifying the trajectories of neural fibres within the human brain is of great importance in many medical applications, such as neurological diagnostics, neuronavigation, treatment of epilepsy, and surgical removal of tumors. Using diffusion MRI data as input and employing methods based on Markov chains and Monte Carlo sampling, possible trajectories are generated and the most likely ones can be visualized. These trajectories can serve as input for advanced medical diagnosis and treatment. Because of the huge amount of data to be analyzed and the large number of iterations, this is a time-consuming process. For purposes such as statistical analysis and comparison across several datasets or several patients, the computational time requirements are enormous. Faster diagnosis can improve routine throughput and allow treatment to begin earlier. At present, only very few implementations of neural tractography software exist, and the list of software for probabilistic neural tractography is even shorter. Today's implementations, which run serially on the CPU, suffer from long run times. The goal is to provide an efficient implementation that makes use of GPGPUs and exploits the parallelism in the method. For the GPU implementation, a comparison of the CUDA and OpenCL technologies is provided, and the more suitable one is used.
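The abstract above describes generating many possible trajectories by random sampling and then keeping the most likely ones. The following is a minimal toy sketch of that idea in 2D, not the thesis implementation: the function names, the Gaussian noise model, the grid rounding, and the constant direction field are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def sample_streamline(seed, direction_at, steps, step_len, noise_sd, rng):
    """Trace one random trajectory from `seed` through a 2D direction field."""
    x, y = seed
    path = [(x, y)]
    for _ in range(steps):
        # Assumed noise model: perturb the local orientation with Gaussian noise.
        angle = direction_at(x, y) + rng.gauss(0.0, noise_sd)
        x += step_len * math.cos(angle)
        y += step_len * math.sin(angle)
        path.append((x, y))
    return path


def visit_counts(seed, direction_at, n_samples=500, steps=20,
                 step_len=1.0, noise_sd=0.2, rng_seed=0):
    """Count, per grid cell, how many sampled trajectories pass through it."""
    rng = random.Random(rng_seed)
    counts = Counter()
    for _ in range(n_samples):
        path = sample_streamline(seed, direction_at, steps, step_len, noise_sd, rng)
        # Each cell is counted at most once per sampled trajectory.
        cells = {(int(round(px)), int(round(py))) for px, py in path}
        for cell in cells:
            counts[cell] += 1
    return counts


# Hypothetical direction field: orientation 0 radians (along the x axis) everywhere.
counts = visit_counts((0.0, 0.0), lambda x, y: 0.0)
```

Cells with the highest counts approximate the most likely pathways from the seed. In this sketch each sampled trajectory only reads the direction field and does not depend on the other samples, which is the kind of independence a GPU implementation could exploit.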

    Event-based Vision: A Survey

    Event cameras are bio-inspired sensors that differ from conventional frame cameras: instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz) resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in scenarios that are challenging for traditional cameras, such as low-latency, high-speed, and high-dynamic-range settings. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
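To make the event stream described above concrete, here is a small sketch of one possible in-memory representation: each event carries a timestamp, a pixel location, and a sign. The `Event` type, its field names, and the `accumulate` helper are hypothetical and are not taken from the survey. Summing signed events per pixel over a time window is shown only as one simple way to turn a stream into an image-like array.

```python
from typing import Iterable, List, NamedTuple


class Event(NamedTuple):
    """One brightness-change event (field names are illustrative)."""
    t_us: int      # timestamp in microseconds
    x: int         # pixel column
    y: int         # pixel row
    polarity: int  # +1 for a brightness increase, -1 for a decrease


def accumulate(events: Iterable[Event], width: int, height: int,
               t_start: int, t_end: int) -> List[List[int]]:
    """Sum event polarities per pixel for events with t_start <= t < t_end."""
    frame = [[0] * width for _ in range(height)]
    for ev in events:
        if t_start <= ev.t_us < t_end:
            frame[ev.y][ev.x] += ev.polarity
    return frame


events = [
    Event(10, 0, 0, +1),
    Event(25, 1, 0, -1),
    Event(40, 0, 0, +1),
    Event(120, 1, 1, +1),  # falls outside the window used below
]
frame = accumulate(events, width=2, height=2, t_start=0, t_end=100)
print(frame)  # [[2, -1], [0, 0]]
```

Unlike a frame camera, nothing is recorded for pixels whose brightness does not change, so the event list stays short for static scenes.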

    HAEC News


    DESIGN AUTOMATION FOR LOW POWER RFID TAGS

    Radio Frequency Identification (RFID) tags are small, wireless devices capable of automated item identification, used in a variety of applications including supply chain management, asset management, and automatic toll collection (EZ Pass). However, designing these custom systems with traditional methods can take a hardware engineer months to develop and debug. In this dissertation, an automated, low-power flow for the design of RFID tags has been developed, implemented and validated. This dissertation presents the RFID Compiler, which permits high-level design entry using a simple description of the desired primitives and their behavior in ANSI-C. The compiler has different back-ends capable of targeting microprocessor-based or custom hardware-based tags. For the hardware-based tag, the back-end automatically converts the user-supplied behavior in C to low-power synthesizable VHDL optimized for RFID applications. The compiler also integrates a fast, high-level power macromodeling flow, which can be used to generate power estimates accurate to within 15% of industry CAD tools and to optimize the primitives and/or the behaviors, compared to conventional practices. Using the RFID Compiler, the user can develop the entire design in a matter of days or weeks. The compiler has been used to implement standards such as ANSI, ISO 18000-7, 18000-6C and 18185-7. The automatically generated tag designs were validated by targeting microprocessors such as the AD Chips EISC and FPGAs such as Xilinx Spartan 3. The corresponding ASIC implementation is comparable to conventionally designed commercial tags in terms of energy and area. Thus, the RFID Compiler permits the design of power-efficient, custom RFID tags by a wider audience with a dramatically reduced design cycle.

    Doctor of Philosophy

    Stochastic methods, dense free-form mapping, atlas construction, and total variation are examples of advanced image processing techniques which are robust but computationally demanding. These algorithms often require a large amount of computational power as well as massive memory bandwidth. These requirements used to be fulfilled only by supercomputers. The development of heterogeneous parallel subsystems and computation-specialized devices such as Graphic Processing Units (GPUs) has brought the requisite power to commodity hardware, opening up opportunities for scientists to experiment and evaluate the influence of these techniques on their research and practical applications. However, harnessing the processing power from modern hardware is challenging. The differences between multicore parallel processing systems and conventional models are significant, often requiring algorithms and data structures to be redesigned significantly for efficiency. It also demands in-depth knowledge about modern hardware architectures to optimize these implementations, sometimes on a per-architecture basis. The goal of this dissertation is to introduce a solution for this problem based on a 3D image processing framework, using high performance APIs at the core level to utilize parallel processing power of the GPUs. The design of the framework facilitates an efficient application development process, which does not require scientists to have extensive knowledge about GPU systems, and encourages them to harness this power to solve their computationally challenging problems.
To present the development of this framework, four main problems are described, and the solutions are discussed and evaluated: (1) essential components of a general 3D image processing library: data structures and algorithms, as well as how to implement these building blocks on the GPU architecture for optimal performance; (2) an implementation of unbiased atlas construction algorithms, an illustration of how to solve a highly complex and computationally expensive algorithm using this framework; (3) an extension of the framework to account for geometry descriptors to solve registration challenges with large scale shape changes and high intensity-contrast differences; and (4) an out-of-core streaming model, which enables developers to implement multi-image processing techniques on commodity hardware.