41 research outputs found

EXTRA: Towards the exploitation of eXascale technology for reconfigurable architectures

Author: Al Kadi M
Becker T
Brokalakis A
Charitopoulos G
Ciobanu CB
Gaydadjiev G
Huebner M
Kulkarni A
Luk W
Nikitakis A
Niu X
Pnevmatikatos D
Santambrogio MD
Sciuto D
Stroobandt D
Thom AJW
Todman T
Vansteenkiste E
Varbanescu AL
Publication venue: 2016 11th International Symposium on Reconfigurable Communication-Centric Systems-on-Chip, ReCoSoC 2016
Publication date: 01/01/2016

© 2016 IEEE. To handle the stringent performance requirements of future exascale-class applications, High Performance Computing (HPC) systems need ultra-efficient heterogeneous compute nodes. To reduce power and increase performance, such compute nodes will require hardware accelerators with a high degree of specialization. Ideally, dynamic reconfiguration will be an intrinsic feature, so that specific HPC application features can be optimally accelerated, even if they regularly change over time. In the EXTRA project, we create a new and flexible exploration platform for developing reconfigurable architectures, design tools and HPC applications with run-time reconfiguration built in as a core feature instead of an add-on. EXTRA covers the entire stack from architecture up to the application, focusing on the fundamental building blocks for run-time reconfigurable exascale HPC systems: new chip architectures with very low reconfiguration overhead, new tools that truly take reconfiguration as a central design concept, and applications that are tuned to maximally benefit from the proposed run-time reconfiguration techniques. Ultimately, this open platform will improve Europe's competitive advantage and leadership in the field.

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

Ghent University Academic Bibliography

Spiral - Imperial College Digital Repository

Apollo (Cambridge)

Meta-analysis of radiofrequency ablation versus hepatic resection for small hepatocellular carcinoma

Author: A Guglielmi
A Hiraoka
AS Befeler
Bin Li
C Bouza
C Kawamoto
CM Cho
Donghui Xu
E Adachi
Feng Xie
FY Yao
G Pelletier
J Bruix
Jiamei Yang
JM Llovet
K Hasegawa
K Hasegawa
K Ikeda
L Lupo
M Abu-Hilal
M Danila
M Kudo
M Montorsi
M Ogihara
M Vivarelli
MD Lü
MS Chen
P Mathurin
R Santambrogio
RE Schwarz
S Ueno
SN Hong
T Wakai
TI Huo
WY Lau
XD Zhou
Y Okuwaki
Yanfang Zhao
Yanming Zhou
Zhengfeng Yin
Publication venue: BioMed Central
Publication date: 01/07/2010

Background: There is no clear consensus on the better therapy [radiofrequency ablation (RFA) versus hepatic resection (HR)] for small hepatocellular carcinoma (HCC) eligible for surgical treatments. This study is a meta-analysis of the available evidence. Methods: Systematic review and meta-analysis of trials comparing RFA with HR for small HCC published from 1997 to 2009 in PubMed and Medline. Pooled odds ratios (OR) with 95% confidence intervals (95% CI) were calculated using either the fixed effects model or random effects model. Results: One randomized controlled trial and 9 nonrandomized controlled trials (NRCTs) were included in this analysis. These studies included a total of 1411 patients: 744 treated with RFA and 667 treated with HR. The overall survival was significantly higher in patients treated with HR than in those treated with RFA at 3 years (OR: 0.56, 95% CI: 0.44-0.71), and at 5 years (OR: 0.60, 95% CI: 0.36-1.01). RFA had a higher rate of local intrahepatic recurrence compared to HR (OR: 4.50, 95% CI: 2.45-8.27). In the HR group the 1-, 3-, and 5-year disease-free survival rates were significantly better than in the RFA-treated patients (respectively: OR: 0.54, 95% CI: 0.35-0.84; OR: 0.44, 95% CI: 0.28-0.68; OR: 0.64, 95% CI: 0.42-0.99). The postoperative morbidity was higher with HR (OR: 0.29, 95% CI: 0.13-0.65), but no significant differences were found concerning mortality. For tumors ≤ 3 cm HR did not differ significantly from RFA for survival, as reported in three NRCTs. Conclusions: HR was superior to RFA in the treatment of patients with small HCC eligible for surgical treatments, particularly for tumors > 3 cm. However, the findings have to be carefully interpreted due to the lower level of evidence.
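The abstract names fixed and random effects models but does not say which pooling estimator was used. As a minimal sketch of one common fixed-effect approach, inverse-variance weighting of log odds ratios, the Python below pools a set of 2x2 tables. The function name `pooled_odds_ratio`, the table layout, and the counts are hypothetical and chosen only for illustration; they are not data from the study.

```python
import math


def pooled_odds_ratio(tables, z=1.96):
    """Fixed-effect (inverse-variance) pooled odds ratio.

    Each table is (a, b, c, d): events and non-events in group 1,
    then events and non-events in group 2. Returns the pooled OR
    and the bounds of its confidence interval (95% when z=1.96).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for a, b, c, d in tables:
        if min(a, b, c, d) <= 0:
            # A zero cell makes the log OR undefined; a continuity
            # correction would be needed, which this sketch omits.
            raise ValueError("all cells must be positive")
        log_or = math.log((a * d) / (b * c))
        variance = 1 / a + 1 / b + 1 / c + 1 / d  # variance of log OR
        weight = 1 / variance
        weighted_sum += weight * log_or
        total_weight += weight
    pooled_log_or = weighted_sum / total_weight
    se = math.sqrt(1 / total_weight)
    return (
        math.exp(pooled_log_or),
        math.exp(pooled_log_or - z * se),
        math.exp(pooled_log_or + z * se),
    )


# Hypothetical counts, not taken from the paper.
example_tables = [(20, 10, 10, 20), (20, 10, 10, 20)]
or_, ci_low, ci_high = pooled_odds_ratio(example_tables)
print(f"pooled OR = {or_:.2f}, 95% CI = {ci_low:.2f}-{ci_high:.2f}")
```

A confidence interval that excludes 1 indicates a statistically significant difference at the chosen level, which is how the intervals reported in the abstract are usually read.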

Crossref

Directory of Open Access Journals

PubMed Central

Xiamen University Institutional Repository

The RALCoach: a virtual coach technology for recreational runners

Author: Galli A
Ghisio F
Ginestretti L
Salaris M
Santambrogio MD
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2022

Running with a coach or a set goal has been shown to improve performance and training effectiveness. However, these are not viable solutions for most recreational runners. The RALCoach is designed as a system that users wear while running, providing a music pacesetter and constant visual feedback about their performance. It assists runners in achieving their exercise goals by monitoring their pace and playing music with specific features selected to nudge them to either speed up or slow down to match their specified goal. Early results have been auspicious: RALCoach enabled 12 out of 15 recreational runners to improve their performance, and, even more importantly, 13 out of 15 users declared they felt better while training.

Archivio istituzionale della ricerca - Politecnico di Milano

Hephaestus: Codesigning and Automating 3D Image Registration on Reconfigurable Architectures

Author: Conficconi D
D'Arnese E
Santambrogio MD
Sorrentino G
Venere M
Publication date: 01/01/2023

Healthcare is a pivotal research field, and medical imaging is crucial in many applications. Therefore, finding new architectural and algorithmic solutions would benefit highly repetitive image processing procedures. One of the most complex tasks in this sense is image registration, which finds the optimal geometric alignment among 3D image stacks and is widely employed in healthcare and robotics. Given the high computational demand of such a procedure, hardware accelerators are promising real-time and energy-efficient solutions, but they are complex to design and integrate within software pipelines. Therefore, this work presents an automation framework called Hephaestus that generates efficient 3D image registration pipelines combined with reconfigurable accelerators. Moreover, to alleviate the burden from the software, we codesign software-programmable accelerators that can adapt at run-time to the image volume dimensions. Hephaestus features a cross-platform abstraction layer that transparently enables deployment on both high-performance and embedded systems. However, given the computational complexity of 3D image registration, embedded devices are a relevant but difficult setting because they are constrained in memory; thus, they require further attention and tailoring of the accelerators and registration application to reach satisfactory results. Therefore, with Hephaestus, we also propose an approximation mechanism that enables such devices to perform 3D image registration and even achieve, in some cases, the accuracy of the high-performance ones. Overall, Hephaestus demonstrates a maximum speedup of 1.85x and an efficiency improvement of 2.35x with respect to the state of the art, and a maximum speedup of 2.51x and an efficiency improvement of 2.76x against our software, while attaining state-of-the-art accuracy on 3D registrations.

Archivio istituzionale della ricerca - Politecnico di Milano

Software Implementation and Hardware Acceleration of Retinal Vessel Segmentation for Diabetic Retinopathy Screening Tests

Author: Bacis M
Cavinato Lara
Del Sozzo E
Durelli GC
Fidone I
Santambrogio MD
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2017

Screening tests are an effective tool for the diagnosis and prevention of several diseases. Unfortunately, in order to produce an early diagnosis, the huge number of collected samples has to be processed faster than before. In particular, this issue concerns image processing procedures, as they demand computational power that modern software architectures cannot deliver. To this end, Field Programmable Gate Arrays (FPGAs) can be used to accelerate the computation partially or entirely. In this work, we demonstrate that FPGAs are suitable for biomedical applications by proposing a case study on the implementation of a vessel segmentation algorithm. The experimental results, computed on the DRIVE and STARE databases, show remarkable improvements in terms of both execution time and power efficiency (6X and 5.7X respectively) compared to the software implementation. Moreover, the proposed hardware approach outperforms literature works (3X speedup) without affecting the overall accuracy and sensitivity measures.

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

A Comprehensive Methodology to Optimize FPGA Designs via the Roofline Model

Author: Del Sozzo E
Di Tucci L
Rabozzi M
Santambrogio MD
Sciuto D
Siracusa M
Williams S
Publication venue: eScholarship, University of California
Publication date: 01/08/2022

With reconfigurable fabrics delivering increasing performance over the years, Field-Programmable Gate Arrays (FPGAs) are becoming an appealing solution for next-generation High-Performance Computing (HPC) systems. However, in order to gain traction among traditional von Neumann architectures, the optimization process of FPGA designs should be further abstracted to a higher level. In fact, while High-Level Synthesis (HLS) already provides a handy way to write FPGA code with common high-level languages, substantial effort and expertise are still required to optimize the resulting FPGA design for the underlying hardware. To overcome this problem, we propose a semi-automated performance optimization methodology based on a Hierarchical Roofline model for FPGAs. System-wide and application-specific optimizations, such as off-chip memory transfer and data locality optimizations, are guided by the FPGA Roofline model, whereas FPGA-specific optimizations are automatically searched by a Design Space Exploration (DSE) engine. We demonstrate how this methodology makes it easy to analyze a wide set of applications, including particle methods, wavefront algorithms, and sparse arithmetic computations, and to optimize them to peak system performance. In addition, we show that the integrated DSE engine achieves a 14.36x maximum speedup compared to previous automated solutions in the literature.
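For readers unfamiliar with the Roofline model the abstract builds on, the following Python is a minimal sketch of the classic single-level bound, in which attainable performance is the smaller of peak compute and memory bandwidth times arithmetic intensity. It is not the hierarchical FPGA model proposed in the paper; the function names and the hardware numbers are hypothetical and chosen only for illustration.

```python
def roofline_attainable(peak_gflops, bandwidth_gbs, arithmetic_intensity):
    """Attainable performance in GFLOP/s under the classic Roofline bound.

    arithmetic_intensity is in FLOP per byte moved to or from memory.
    """
    if peak_gflops <= 0 or bandwidth_gbs <= 0 or arithmetic_intensity <= 0:
        raise ValueError("all inputs must be positive")
    # Memory-bound kernels sit on the sloped part of the roof,
    # compute-bound kernels on the flat part.
    return min(peak_gflops, bandwidth_gbs * arithmetic_intensity)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity at which a kernel stops being memory-bound."""
    return peak_gflops / bandwidth_gbs


# Hypothetical device: 100 GFLOP/s peak, 10 GB/s off-chip bandwidth.
PEAK, BANDWIDTH = 100.0, 10.0
for intensity in (0.5, 2.0, 50.0):
    bound = roofline_attainable(PEAK, BANDWIDTH, intensity)
    regime = "memory" if intensity < ridge_point(PEAK, BANDWIDTH) else "compute"
    print(f"AI={intensity:5.1f} FLOP/B -> {bound:6.1f} GFLOP/s ({regime}-bound)")
```

A kernel below the ridge point benefits most from data locality and memory transfer optimizations, while a kernel above it benefits from more compute parallelism, which matches the split between model-guided and DSE-driven optimizations described in the abstract.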

eScholarship - University of California

GPU accelerated partial order multiple sequence alignment for long reads self-correction

Author: Buluc A
Ding N
Hofmeyr S
Oliker L
Peverelli F
Santambrogio MD
Tucci LD
Yelick K
Publication venue: eScholarship, University of California
Publication date: 01/05/2020

As third generation sequencing technologies become more reliable and widely used to solve several genome-related problems, self-correction of long reads is becoming the preferred method to reduce the error rate of Pacific Biosciences and Oxford Nanopore long reads, which is now around 10-12%. Several of these self-correction methods rely on some form of Multiple Sequence Alignment (MSA) to obtain a consensus sequence for the original reads. In particular, error-correction tools such as RACON and CONSENT use Partial Order (PO) graph alignment to accomplish this task. PO graph alignment, which is computationally more expensive than optimal global pairwise alignment between two sequences, needs to be performed several times for each read during the error correction process. GPUs have proven very effective in accelerating several compute-intensive tasks in different scientific fields. We harnessed the power of these architectures to accelerate the error correction process of existing self-correction tools, to improve the efficiency of this step of genome analysis. In this paper, we introduce a GPU-accelerated version of the PO alignment presented in the POA v2 software library, implemented on an NVIDIA Tesla V100 GPU. We obtain up to a 6.5x speedup compared to 64 CPU threads run on two 2.3 GHz 16-core Intel Xeon Processors E5-2698 v3. In our implementation we focused on the alignment of smaller sequences, as the CONSENT segmentation strategy based on k-mer chaining provides an optimal opportunity to exploit the parallel-processing power of GPUs. To demonstrate this, we have integrated our kernel in the CONSENT software. This accelerated version of CONSENT provides a speedup for the whole error correction step that ranges from 1.95x to 8.5x depending on the input reads.

eScholarship - University of California

Crossref