274 research outputs found

    Parallel waveform extraction algorithms for the Cherenkov Telescope Array Real-Time Analysis

    The Cherenkov Telescope Array (CTA) is the next-generation observatory for the study of very high-energy gamma rays from about 20 GeV up to 300 TeV. Thanks to its large effective area and field of view, the CTA observatory will have an unprecedented sensitivity to transient flaring gamma-ray phenomena compared to both current ground-based (e.g. MAGIC, VERITAS, H.E.S.S.) and space-based (e.g. Fermi) gamma-ray telescopes. To alert the astrophysics community for follow-up observations, or to respond quickly to external science alerts, a fast analysis pipeline is crucial. This will be accomplished by means of a Real-Time Analysis (RTA) pipeline, a fast and automated science alert trigger system that will be a key system of the CTA observatory. Among the key CTA design requirements for the RTA system, the most challenging is the generation of alerts within 30 seconds of the last acquired event, while obtaining a flux sensitivity not worse than that of the final analysis by more than a factor of 3. A dedicated software and hardware architecture for the RTA pipeline must be designed and tested. We present a comparison of OpenCL solutions using different kinds of devices, such as CPUs, Graphics Processing Units (GPUs) and Field Programmable Gate Array (FPGA) cards, for the real-time data reduction of CTA triggered data. Comment: In Proceedings of the 34th International Cosmic Ray Conference (ICRC2015), The Hague, The Netherlands. All CTA contributions at arXiv:1508.0589

    Real-Time Hand Shape Classification

    The problem of hand shape classification is challenging because a hand has a large number of degrees of freedom. Numerous shape descriptors have been proposed and applied over the years to estimate and classify hand poses in reasonable time. In this paper we discuss our parallel framework for hand shape classification, which is suitable for real-time applications. We show how the number of gallery images influences the classification accuracy and execution time of the parallel algorithm. We present speedup and efficiency analyses that demonstrate the efficacy of the parallel implementation. Notably, different methods can be used at each step of our parallel framework. Here, we combine shape contexts with appearance-based techniques to enhance the robustness of the algorithm and to increase the classification score. An extensive experimental study shows the superiority of the proposed approach over existing state-of-the-art methods. Comment: 11 page
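
The speedup and efficiency analyses mentioned in this abstract are usually based on two ratios: speedup is the serial time divided by the parallel time, and efficiency is the speedup divided by the number of workers. The sketch below only illustrates that arithmetic. The function names and the timings are hypothetical and are not taken from the paper.

```python
def speedup(serial_time, parallel_time):
    """Ratio of serial run time to parallel run time."""
    return serial_time / parallel_time


def efficiency(serial_time, parallel_time, workers):
    """Speedup per worker; 1.0 would mean perfect scaling."""
    return speedup(serial_time, parallel_time) / workers


# Hypothetical timings in seconds, chosen only to illustrate the arithmetic.
example_speedup = speedup(8.0, 2.5)
example_efficiency = efficiency(8.0, 2.5, workers=4)
```

An efficiency well below 1.0 typically points to overheads such as synchronization or load imbalance.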

    A Cascade Neural Network Architecture investigating Surface Plasmon Polaritons propagation for thin metals in OpenMP

    Surface plasmon polaritons (SPPs) confined along a metal-dielectric interface have attracted considerable interest in the areas of ultracompact photonic circuits, photovoltaic devices and other applications due to their strong field confinement and enhancement. This paper investigates a novel cascade neural network (NN) architecture to find the dependence of SPP propagation on metal thickness. Additionally, a novel training procedure for the proposed cascade NN has been developed using an OpenMP-based framework, thus greatly reducing training time. The experiments performed confirm the effectiveness of the proposed NN architecture for the problem at hand.

    Sentetik açıklıklı radar görüntülerinde alan tabanlı hedef tespiti ve paralel gerçekleştirmesi (Region based target detection in synthetic aperture radar images and its parallel implementation)

    Automatic target detection methods for synthetic aperture radar (SAR) images are sensitive to the image resolution, the target size, the clutter complexity and the speckle noise level. A robust target detection method should be less sensitive to such factors. The proposed method works on a version of the image that has been processed by feature preserving despeckling (FPD): it finds candidate target regions and the clutter complexity around them, and then applies a threshold chosen to give a constant false alarm rate. Computational efficiency is increased using OpenMP and NVIDIA CUDA, and the resulting speedups are reported.

    Performance enhancement of an immersed boundary method based FSI solver using OpenMP

    This work presents a high-fidelity in-house Fluid Structure Interaction (FSI) solver developed by combining a discrete forcing Immersed Boundary Method (IBM) with an RK-4 based structural solver. Classification of the grid points as fluid, solid and IB points in the IBM framework and the solution of the pressure correction equations are the two most computationally expensive sections of the numerical solver. This computational effort can be significantly reduced by implementing OpenMP techniques. However, the successive over-relaxation (SOR) iterative method used in the serial code is not suitable for OpenMP parallelization, because each update depends on values already updated earlier in the same sweep. Therefore, Red-Black (RB) SOR is implemented to avoid these data dependencies.
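
The Red-Black ordering mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' solver: it assumes a model 2D Poisson problem on a uniform grid with fixed boundary values, is written in plain Python rather than OpenMP code, and the function name `red_black_sor` and its parameters are hypothetical. The point it shows is that every grid point of one colour reads only neighbours of the other colour, so all updates within a colour sweep are independent.

```python
def red_black_sor(u, f, h, omega=1.5, sweeps=200):
    """Relax the 2D Poisson problem laplacian(u) = f with Red-Black SOR.

    u : list of lists with the current guess; the outer rows and columns
        hold fixed boundary values and are never modified.
    f : right-hand side on the same grid.
    h : grid spacing.
    """
    n_rows, n_cols = len(u), len(u[0])
    for _ in range(sweeps):
        # colour 0 = "red" points (i + j even), colour 1 = "black" points.
        for colour in (0, 1):
            # Points of one colour read only neighbours of the other colour,
            # so the iterations of this double loop do not depend on each
            # other. In compiled code this is the loop a parallel-for
            # directive could be applied to.
            for i in range(1, n_rows - 1):
                for j in range(1, n_cols - 1):
                    if (i + j) % 2 != colour:
                        continue
                    gauss_seidel = 0.25 * (
                        u[i - 1][j] + u[i + 1][j]
                        + u[i][j - 1] + u[i][j + 1]
                        - h * h * f[i][j]
                    )
                    # Over-relaxed update towards the Gauss-Seidel value.
                    u[i][j] += omega * (gauss_seidel - u[i][j])
    return u


# Usage: Laplace problem (f = 0) with boundary values u = i + j.
n = 9
grid = [[float(i + j) if i in (0, n - 1) or j in (0, n - 1) else 0.0
         for j in range(n)] for i in range(n)]
rhs = [[0.0] * n for _ in range(n)]
solution = red_black_sor(grid, rhs, h=1.0 / (n - 1))
```

In a lexicographic SOR sweep, by contrast, the update at a point uses the already updated values of its left and lower neighbours, which is the dependency that prevents a straightforward parallel loop.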

    Эффективная реализация ЕМ-алгоритма с использованием технологии GPGPU (Efficient implementation of the EM algorithm using GPGPU technology)

    Reducing the running time of data processing algorithms is very important, especially when they are used in real time, for example in real-time image processing, process control systems and speech recognition. The paper considers how to reduce the running time of the expectation maximization (EM) algorithm on modern computing systems. The proposed modified EM algorithm is aimed at a higher degree of parallelism on a general-purpose graphics processing unit (GPGPU). The experimental results are obtained by solving the classical problem of separating a mixture of Gaussian random variables. The algorithm was implemented on one and on two 8-core processors (CPUs), as well as on a GPGPU. Because of its capacity for parallel computation and the properties of the EM algorithm considered, the graphics processor was substantially more effective in all computational experiments. For large samples (5 million values and more), the modified EM algorithm ran almost twice as fast on the GPGPU as on one or two CPUs. The lower price of graphics processors is an additional advantage of the proposed approach.
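
For orientation, the standard EM iteration for a one-dimensional Gaussian mixture can be sketched as follows. This is the textbook algorithm, not the modified version proposed in the paper, whose details the abstract does not give. The function names, the starting values and the synthetic data are assumptions made for illustration. The E-step treats every sample independently, which is the part that lends itself to data-parallel execution.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gaussian_mixture(samples, means, variances, weights, iterations=50):
    """Fit a 1D Gaussian mixture by plain EM; returns (means, variances, weights)."""
    k = len(means)
    means, variances, weights = list(means), list(variances), list(weights)
    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        # Each sample is processed independently of all the others.
        resp = []
        for x in samples:
            p = [weights[c] * normal_pdf(x, means[c], variances[c]) for c in range(k)]
            total = sum(p)
            resp.append([pc / total for pc in p])
        # M-step: weighted sums over all samples (one reduction per component).
        for c in range(k):
            nc = sum(r[c] for r in resp)
            means[c] = sum(r[c] * x for r, x in zip(resp, samples)) / nc
            var = sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, samples)) / nc
            variances[c] = max(var, 1e-9)  # guard against a collapsing component
            weights[c] = nc / len(samples)
    return means, variances, weights


# Usage on synthetic data: two well-separated components (hypothetical values).
random.seed(0)
data = ([random.gauss(-2.0, 1.0) for _ in range(1000)]
        + [random.gauss(3.0, 1.0) for _ in range(1000)])
fitted_means, fitted_vars, fitted_weights = em_gaussian_mixture(
    data, means=[-1.0, 1.0], variances=[1.0, 1.0], weights=[0.5, 0.5]
)
```

The M-step consists of sums over all samples, so on parallel hardware it would typically be expressed as reductions.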