952 research outputs found

    NaNet: a Low-Latency, Real-Time, Multi-Standard Network Interface Card with GPUDirect Features

    While the GPGPU paradigm is widely recognized as an effective approach to high performance computing, its adoption in low-latency, real-time systems is still in its early stages. Although GPUs typically show deterministic latency in executing computational kernels once data is available in their internal memories, assessing the real-time features of a standard GPGPU system requires careful characterization of all subsystems along the data stream path. The networking subsystem turns out to be the most critical one in terms of both the absolute value and the fluctuations of its response latency. Our envisioned solution to this issue is NaNet, an FPGA-based PCIe Network Interface Card (NIC) design featuring a configurable and extensible set of network channels with direct access through GPUDirect to NVIDIA Fermi/Kepler GPU memories. The NaNet design currently supports both standard channels, GbE (1000BASE-T) and 10GbE (10GBASE-R), and custom channels, 34 Gbps APElink and 2.5 Gbps deterministic-latency KM3link, and its modularity allows for a straightforward inclusion of other link technologies. To avoid host OS intervention on the data stream and remove a possible source of jitter, the design includes a network/transport layer offload module with cycle-accurate, upper-bound latency, supporting UDP, KM3link Time Division Multiplexing and APElink protocols. After a description of the NaNet architecture and its latency/bandwidth characterization for all supported links, two real-world use cases are presented: the GPU-based low level trigger for the RICH detector in the NA62 experiment at CERN and the on-/off-shore data link for the KM3 underwater neutrino telescope.

    High-speed data transfer with FPGAs and QSFP+ modules

    We present test results and characterization of a data transmission system based on a latest-generation FPGA and a commercial QSFP+ (Quad Small Form Pluggable +) module. The QSFP+ standard defines a hot-pluggable transceiver available in copper or optical cable assemblies for an aggregated bandwidth of up to 40 Gbps. We implemented a complete testbench based on a commercial development card mounting an Altera Stratix IV FPGA with 24 serial transceivers at 8.5 Gbps, together with a custom mezzanine hosting three QSFP+ modules. We present test results and signal integrity measurements up to an aggregated bandwidth of 12 Gbps. Comment: 5 pages, 3 figures, published in JINST (Journal of Instrumentation), proceedings of the Topical Workshop on Electronics for Particle Physics 2010, 20-24 September 2010, Aachen, Germany (R Ammendola et al 2010 JINST 5 C12019)

    High resolution infrared and Raman spectra of 13C12CD2: The CD stretching fundamentals and associated combination and hot bands

    11 pages; 4 figs.; 9 tabs. © 2015 AIP Publishing LLC. Infrared and Raman spectra of mono 13C fully deuterated acetylene, 13C12CD2, have been recorded and analysed to obtain detailed information on the C—D stretching fundamentals and associated combination, overtone, and hot bands. Infrared spectra were recorded at an instrumental resolution ranging between 0.006 and 0.01 cm−1 in the region 1800–7800 cm−1. Sixty new bands involving the ν1 and ν3 C—D stretching modes also associated with the ν4 and ν5 bending vibrations have been observed and analysed. In total, 5881 transitions have been assigned in the investigated spectral region. In addition, the Q branch of the ν1 fundamental was recorded using inverse Raman spectroscopy, with an instrumental resolution of about 0.003 cm−1. The transitions relative to each stretching mode, i.e., the fundamental band, its first overtone, and associated hot and combination bands involving bending states with υ4 + υ5 up to 2 were fitted simultaneously. The usual Hamiltonian appropriate to a linear molecule, including vibration and rotation l-type and the Darling–Dennison interaction between υ4 = 2 and υ5 = 2 levels associated with the stretching states, was adopted for the analysis. The standard deviation for each global fit is ≤0.0004 cm−1, of the same order of magnitude as the measurement precision. Slightly improved parameters for the bending and the ν2 manifold have also been determined. Precise values of spectroscopic parameters deperturbed from the resonance interactions have been obtained. They provide quantitative information on the anharmonic character of the potential energy surface, which can be useful, in addition to those reported in the literature, for the determination of a general anharmonic force field for the molecule.
Finally, the obtained values of the Darling–Dennison constants can be valuable for understanding energy flows between independent vibrations. The Bologna authors acknowledge the Università di Bologna and the financial support of the Ministero dell’Istruzione dell’Università e della Ricerca (PRIN 2012 “Spettroscopia e Tecniche computazionali per la ricerca Astrofisica, atmosferica e Radioastronomica”). D.B. and R.Z.M. acknowledge the financial support of the Ministry of Economy and Competitiveness through Research Grant No. FIS2012-38175. Peer Reviewe

    apeNEXT: A multi-TFlops Computer for Simulations in Lattice Gauge Theory

    We present the APE (Array Processor Experiment) project for the development of dedicated parallel computers for numerical simulations in lattice gauge theories. While APEmille is a production machine in today's physics simulations at various sites in Europe, a new machine, apeNEXT, is currently being developed to provide multi-Tflops computing performance. Like previous APE machines, the new supercomputer is largely custom designed and specifically optimized for simulations of Lattice QCD. Comment: Poster at the XXIII Physics in Collisions Conference (PIC03), Zeuthen, Germany, June 2003, 3 pages, Latex. PSN FRAP15. Replaced for adding forgotten autho

    Toll-like receptor kinetics in septic shock patients: a preliminary study.

    The aim of this study is to evaluate changes in some inflammatory parameters in septic shock patients and their possible correlation with clinical outcome, in particular when continuous veno-venous hemofiltration (CVVH) treatment is required. Given the objective difficulty of enrolling this kind of patient, a preliminary study was initiated on seventeen septic shock patients admitted to a medical and surgical ICU. The mRNA expression of Toll-like receptor (TLR)-1, TLR-2, TLR-4, TLR-5, TLR-9, TNFalpha, IL-8 and IL-1beta was assessed, and the plasma concentrations of IL-18, IL-2, IL-10 and TNFalpha were measured, on the day of sepsis diagnosis and after 72 h. In those patients who developed acute renal failure unresponsive to medical treatment and who underwent CVVH treatment, the same parameters were measured every 24 h during CVVH and after completion of the treatment. On sepsis diagnosis, gene expression of TLRs was up-regulated compared to the housekeeping gene in all the patients. After 72 h, a down-regulation of these genes compared to day 1 was found in 35% of the patients, but it was not associated with a reduction of cytokine serum levels, improved clinical signs, better outcome or reduced mortality. After high volume hemofiltration treatment, cytokine serum levels and TLR expression were not significantly modified. In conclusion, given the small number of cases in our preliminary study, we cannot conclusively correlate TLR over-expression in septic shock patients with severity or outcome scores.

    GPU-based Real-time Triggering in the NA62 Experiment

    Over the last few years the GPGPU (General-Purpose computing on Graphics Processing Units) paradigm has represented a remarkable development in the world of computing. Computing for High-Energy Physics is no exception: several works have demonstrated the effectiveness of integrating GPU-based systems in the high level trigger of different experiments. On the other hand, the use of GPUs in low level trigger systems, characterized by stringent real-time constraints such as a tight time budget and high throughput, poses several challenges. In this paper we focus on the low level trigger in the CERN NA62 experiment, investigating the use of real-time computing on GPUs in this synchronous system. Our approach aims at harvesting the GPU computing power to build, in real time, refined physics-related trigger primitives for the RICH detector, as the knowledge of Cerenkov ring parameters makes it possible to build stringent conditions for data selection at trigger level. Latencies of all components of the trigger chain have been analyzed, pointing out that networking is the most critical one. To keep the latency of the data transfer task under control, we devised NaNet, an FPGA-based PCIe Network Interface Card (NIC) with GPUDirect capabilities. For the processing task, we developed specific multiple ring trigger algorithms to leverage the parallel architecture of GPUs and increase the processing throughput to keep up with the high event rate. Results obtained during the first months of the 2016 NA62 run are presented and discussed.
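As an illustration of what "Cerenkov ring parameters" means in practice, the following is a minimal sketch of a single-ring fit, assuming detector hits are given as (x, y) pairs. It uses a generic algebraic least-squares circle fit (the Kåsa method), which is a swapped-in textbook technique and not the multiple ring GPU algorithm of the paper; the function name `fit_ring` and the input layout are hypothetical.

```python
import math


def fit_ring(hits):
    """Algebraic (Kasa) least-squares circle fit.

    hits: iterable of (x, y) hit coordinates (hypothetical input layout).
    Returns (cx, cy, r): ring centre and radius.
    """
    hits = list(hits)
    n = len(hits)
    if n < 3:
        raise ValueError("need at least three hits to fit a ring")

    # Accumulate the moments needed for the normal equations of
    # x^2 + y^2 + D*x + E*y + F = 0 (linear in D, E, F).
    sx = sy = sxx = syy = sxy = sxz = syz = sz = 0.0
    for x, y in hits:
        z = x * x + y * y
        sx += x
        sy += y
        sxx += x * x
        syy += y * y
        sxy += x * y
        sxz += x * z
        syz += y * z
        sz += z

    a = [[sxx, sxy, sx],
         [sxy, syy, sy],
         [sx, sy, float(n)]]
    b = [-sxz, -syz, -sz]

    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("hits are (nearly) collinear; no unique ring")

    # Cramer's rule: replace column i of the matrix with b.
    coeffs = []
    for i in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][i] = b[r]
        coeffs.append(det3(m) / d)

    big_d, big_e, big_f = coeffs
    cx, cy = -big_d / 2.0, -big_e / 2.0
    radius = math.sqrt(cx * cx + cy * cy - big_f)
    return cx, cy, radius
```

The fit is linear in its unknowns and needs only running sums over the hits, which is one reason algebraic fits of this kind are often considered for parallel hardware; whether the NA62 trigger uses this particular formulation is not stated in the abstract.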

    APEnet+: high bandwidth 3D torus direct network for petaflops scale commodity clusters

    We describe herein the APElink+ board, a PCIe interconnect adapter featuring the latest advances in wire speed and interface technology, plus hardware support for an RDMA programming model and experimental acceleration of GPU networking. This design allows us to build a low latency, high bandwidth PC cluster, the APEnet+ network, the new generation of our cost-effective, tens-of-thousands-scalable cluster network architecture. Some test results and characterization of data transmission of a complete testbench, based on a commercial development card mounting an Altera FPGA, are provided. Comment: 6 pages, 7 figures, proceeding of CHEP 2010, Taiwan, October 18-2

    Progress and status of APEmille

    We report on the progress and status of the APEmille project: a SIMD parallel computer with a peak performance in the TeraFlops range which is now in an advanced development phase. We discuss the hardware and software architecture, and present some performance estimates for Lattice Gauge Theory (LGT) applications. Comment: Talk presented at LATTICE97, 3 pages, Late

    A simple method for generating full length cDNA from low abundance partial genomic clones

    BACKGROUND: PCR amplification of target molecules involves sequence-specific primers that flank the region to be amplified. While this technique is generally routine, it may not be sufficient to generate a desired target molecule from two separate regions involving intron/exon boundaries. In these situations, which arise for the family of low abundance genes, full-length complementary DNAs must be generated from two partial genomic clones. RESULTS: The first approach we used for the isolation of full-length cDNA from two known genomic clones of Hox genes was based on fusion PCR. Here we describe a simple and efficient method for amplifying homeobox D13 (HOXD13) full-length cDNA from two partial genomic clones. Specific 5' and 3' untranslated region (UTR) primer pairs and a website program (primer3_www.cgv0.2) were key to this process. CONCLUSIONS: We have devised a simple, rapid and easy method for generating a cDNA clone from genomic sequences. The full-length HOXD13 clone (1.1 kb) generated with this technique was confirmed by sequence analysis. This simple approach can be used to generate full-length cDNA clones from available partial genomic sequences.