
    Nonlinear-Cost Random Walk: exact statistics of the distance covered for fixed budget

    We consider the Nonlinear-Cost Random Walk model in discrete time introduced in [Phys. Rev. Lett. 130, 237102 (2023)], where a fee is charged for each jump of the walker. The nonlinear cost function is such that slow/short jumps incur a flat fee, while for fast/long jumps the cost is proportional to the distance covered. In this paper we compute analytically the average and variance of the distance covered in n steps when the total budget C is fixed, as well as the statistics of the number of long/short jumps in a trajectory of length n, for the exponential jump distribution. These observables exhibit a very rich and non-monotonic scaling behavior as a function of the variable C/n, which is traced back to the makeup of a typical trajectory in terms of long/short jumps, and the resulting "entropy" thereof. As a byproduct, we compute the asymptotic behavior of ratios of Kummer hypergeometric functions when both the first and last arguments are large. All our analytical results are corroborated by numerical simulations. Comment: 31 pages, 8 figures.
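    The model lends itself to a quick numerical check. Below is a minimal Monte Carlo sketch of one reading of the setup (the exact cost function and units are assumptions, not taken from the paper): jump lengths are unit-mean exponential, the cost of a jump of length x is max(1, x), and the exact conditioning on a fixed total budget C is approximated by keeping trajectories whose total cost lands in a narrow window around C.

```python
# Monte Carlo sketch of the Nonlinear-Cost Random Walk (hypothetical
# parametrization, not the paper's exact one): jump lengths |eta| ~ Exp(1),
# cost c(eta) = max(1, |eta|): flat fee for short jumps, linear for long ones.
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, n_traj=200_000):
    """Total distance covered and total cost for n_traj walks of n jumps."""
    jumps = rng.exponential(scale=1.0, size=(n_traj, n))
    cost = np.maximum(1.0, jumps).sum(axis=1)   # flat fee vs. linear cost
    dist = jumps.sum(axis=1)                    # distance covered
    return dist, cost

n, C, width = 20, 30.0, 0.5
dist, cost = simulate(n)
near_C = np.abs(cost - C) < width               # crude conditioning on budget C
print(f"E[distance | cost ~ C] ~ {dist[near_C].mean():.3f} "
      f"({near_C.sum()} trajectories kept)")
```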

    HW-Flow-Fusion: Inter-Layer Scheduling for Convolutional Neural Network Accelerators with Dataflow Architectures

    Energy- and throughput-efficient acceleration of convolutional neural networks (CNNs) on devices with a strict power budget is achieved by leveraging different scheduling techniques to minimize data movement and maximize data reuse. Several dataflow mapping frameworks have been developed to explore the optimal scheduling of CNN layers on reconfigurable accelerators. However, previous works usually optimize each layer individually, without leveraging the data reuse between the layers of CNNs. In this work, we present an analytical model to achieve efficient data reuse by searching for efficient scheduling of communication and computation across layers. We call this inter-layer scheduling framework HW-Flow-Fusion, as we explore the fused map-space of multiple layers sharing the available resources of the same accelerator, investigating the constraints and trade-offs of mapping the execution of multiple workloads with data dependencies. We propose a memory-efficient data reuse model, tiling, and resource partitioning strategies to fuse multiple layers without recomputation. Compared to standard single-layer scheduling, inter-layer scheduling can reduce the communication volume by 51% and 53% for selected VGG16-E and ResNet18 layers on a spatial array accelerator, and reduce the latency by 39% and 34% respectively, while also increasing the computation-to-communication ratio, which improves the memory bandwidth efficiency.
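    The arithmetic that makes fusion without recomputation possible can be sketched independently of the paper's framework: the output tile of the last fused layer pins down, via the standard receptive-field recurrence, how large a tile each earlier layer must produce and keep on-chip. A hypothetical helper (the function name and the two-layer example are ours):

```python
# Hypothetical helper (not the paper's framework): size of the input tile a
# stack of conv layers needs to produce an out_tile x out_tile output tile
# without recomputation, via the standard receptive-field recurrence.
def input_tile_size(out_tile: int, layers: list[tuple[int, int]]) -> int:
    """layers: (kernel, stride) per layer, listed input-to-output."""
    size = out_tile
    for k, s in reversed(layers):       # walk backwards from the output tile
        size = (size - 1) * s + k
    return size

# Two 3x3 stride-1 convs fused: an 8x8 output tile needs a 12x12 input tile,
# and the intermediate (layer-1 output) tile kept on-chip must be 10x10.
print(input_tile_size(8, [(3, 1), (3, 1)]))  # -> 12
print(input_tile_size(8, [(3, 1)]))          # -> 10 (intermediate tile)
```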

    Pruning as a Binarization Technique

    Convolutional neural networks (CNNs) can be quantized to reduce the bit-width of their weights and activations. Pruning is another compression technique, where entire structures are removed from a CNN’s computation graph. Multi-bit networks (MBNs) encode the operands (weights and activations) of the convolution into multiple binary bases, where the bit-width of a particular operand equals its number of binary bases. Therefore, this work views pruning an individual binary base in an MBN as a reduction in the bit-width of its operands, i.e. quantization. Although many binarization methods have improved the accuracy of binary neural networks (BNNs) by, e.g., minimizing quantization error, improving training strategies, or proposing different network architecture designs, we reveal a new viewpoint to achieve high-accuracy BNNs, which leverages pruning as a binarization technique (PaBT). We exploit gradient information that exposes the importance of each binary convolution and its contribution to the loss. We prune entire binary convolutions, reducing the effective bit-widths of the MBN during training. This ultimately results in a smooth convergence to accurate BNNs. PaBT achieves 2.9 p.p., 1.6 p.p., and 0.9 p.p. better accuracy than the SotA BNNs IR-Net, LNS, and SiMaN on the ImageNet dataset, respectively. Further, PaBT scales to the more complex task of semantic segmentation, outperforming ABC-Net on the CityScapes dataset. This positions PaBT as a novel high-accuracy binarization scheme, and makes it the first to expose the potential of latent-weight-free training for compression techniques.
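    The pruning-as-quantization viewpoint can be illustrated with a small sketch. It uses an ABC-Net-style greedy residual binarization (our choice of MBN decomposition) and ranks bases by the magnitude of their scale, a simplification of PaBT's gradient-based importance: dropping the weakest base lowers the effective bit-width from 3 to 2.

```python
# Sketch of the multi-bit-network view (assumptions: ABC-Net-style residual
# binarization, importance = |alpha| rather than PaBT's gradient criterion).
# A weight tensor is approximated as a sum of scaled binary bases; pruning
# the weakest base reduces the effective bit-width by one.
import numpy as np

def residual_binarize(w, n_bases):
    """Greedy residual decomposition: w ~ sum_i alpha_i * sign(residual_i)."""
    bases, alphas, r = [], [], w.copy()
    for _ in range(n_bases):
        b = np.sign(r); b[b == 0] = 1.0
        a = np.abs(r).mean()            # least-squares scale for a sign basis
        bases.append(b); alphas.append(a)
        r = r - a * b
    return np.array(alphas), np.array(bases)

w = np.random.default_rng(1).normal(size=1000)
alphas, bases = residual_binarize(w, n_bases=3)
keep = np.argsort(alphas)[1:]           # prune the least-important binary base
w_pruned = (alphas[keep, None] * bases[keep]).sum(axis=0)
print("3-base err:", np.abs(w - (alphas[:, None] * bases).sum(0)).mean(),
      "2-base err:", np.abs(w - w_pruned).mean())
```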

    MATAR: Multi-Quantization-Aware Training for Accurate and Fast Hardware Retargeting

    Quantization of deep neural networks (DNNs) reduces their memory footprint and simplifies their hardware arithmetic logic, enabling efficient inference on edge devices. Different hardware targets can support different forms of quantization, e.g. full 8-bit, 8/4/2-bit mixed-precision combinations, or fully-flexible bit-serial solutions. This makes standard quantization-aware training (QAT) of a DNN for different targets challenging, as the supported quantization levels of each target must be carefully considered at training time. In this paper, we propose a generalized QAT solution that results in a DNN which can be retargeted to different hardware without any retraining or prior knowledge of the hardware’s supported quantization policy. First, we present a novel training scheme which makes the model aware of multiple quantization strategies. Then we demonstrate the retargeting capabilities of the resulting DNN by using a genetic algorithm to search for layer-wise, mixed-precision solutions that maximize performance and/or accuracy on the hardware target, without the need for fine-tuning. By making the DNN agnostic of the final hardware target, our method allows DNNs to be distributed to many users on different hardware platforms, without the DNN developers having to share their training loop or dataset, and without the end-users of the efficient quantized solution having to detail their hardware capabilities ahead of time. Models trained with our approach generalize to multiple quantization policies with minimal accuracy degradation compared to target-specific quantization counterparts.
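    A minimal sketch of the general idea of making a model aware of multiple quantization strategies (our reconstruction, not the paper's code; the names BITS, fake_quant, and MultiQuantLinear are hypothetical): each forward pass samples a bit-width from the supported set and applies straight-through fake quantization to the weights, so a single set of latent weights stays usable across policies.

```python
# Sketch of multi-quantization-aware training (our reading, not MATAR's exact
# scheme): sample a bit-width per forward pass, fake-quantize the weights with
# a straight-through estimator, keep full-precision latent weights throughout.
import torch
import torch.nn as nn

BITS = (8, 4, 2)  # hypothetical set of supported bit-widths

def fake_quant(x, bits):
    """Symmetric uniform fake quantization with a straight-through gradient."""
    qmax = 2 ** (bits - 1) - 1
    scale = x.detach().abs().amax().clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale
    return x + (q - x).detach()  # STE: quantized forward, identity backward

class MultiQuantLinear(nn.Linear):
    """Linear layer whose weights see a randomly sampled bit-width each pass."""
    def forward(self, x):
        # At eval time default to the widest policy (a simplification).
        bits = BITS[torch.randint(len(BITS), (1,)).item()] if self.training else BITS[0]
        return nn.functional.linear(x, fake_quant(self.weight, bits), self.bias)

layer = MultiQuantLinear(16, 4)
layer.train()
print(layer(torch.randn(2, 16)).shape)  # forward under a sampled bit-width
```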

    Accelerating and pruning CNNs for semantic segmentation on FPGA

    Semantic segmentation is one of the popular tasks in computer vision, providing pixel-wise annotations for scene understanding. However, segmentation-based convolutional neural networks require tremendous computational power. In this work, a fully-pipelined hardware accelerator with support for dilated convolution is introduced, which cuts down the redundant zero multiplications. Furthermore, we propose a genetic-algorithm-based automated channel pruning technique to jointly optimize computational complexity and model accuracy. Finally, hardware heuristics and an accurate model of the custom accelerator design enable a hardware-aware pruning framework. We achieve 2.44x lower latency with minimal degradation in semantic prediction quality (1.98 pp lower mean intersection over union) compared to the baseline DeepLabV3+ model, evaluated on an Arria-10 FPGA. The binary files of the FPGA design, and the baseline and pruned models, can be found at github.com/pierpaolomori/SemanticSegmentationFPGA.
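    The genetic-algorithm pruning loop can be sketched in toy form; the latency model and accuracy proxy below are stand-ins for the paper's accelerator model, and every constant is illustrative.

```python
# Toy sketch of GA-based channel pruning (illustrative only): a genome holds a
# keep-ratio per layer; fitness trades an invented stand-in latency model
# against a crude proxy for accuracy loss. Not the paper's accelerator model.
import random

random.seed(0)
LAYERS = [64, 128, 256]                 # hypothetical per-layer channel counts

def latency(genome):                    # stand-in: latency ~ surviving channels
    return sum(int(c * r) for c, r in zip(LAYERS, genome))

def fitness(genome):
    acc_penalty = sum((1 - r) ** 2 for r in genome)  # proxy: pruning hurts
    return -(latency(genome) / sum(LAYERS) + 5.0 * acc_penalty)

def mutate(g):
    return [min(1.0, max(0.1, r + random.uniform(-0.1, 0.1))) for r in g]

pop = [[random.uniform(0.3, 1.0) for _ in LAYERS] for _ in range(20)]
for _ in range(50):                     # evolve: keep the best half, mutate it
    pop.sort(key=fitness, reverse=True)
    pop = pop[:10] + [mutate(random.choice(pop[:10])) for _ in range(10)]
print("best keep-ratios:", [round(r, 2) for r in max(pop, key=fitness)])
```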

    Wino Vidi Vici: Conquering Numerical Instability of 8-Bit Winograd Convolution for Accurate Inference Acceleration on Edge

    Winograd-based convolution can reduce the total number of operations needed for convolutional neural network (CNN) inference on edge devices. Most edge hardware accelerators use low-precision, 8-bit integer arithmetic units to improve energy efficiency and latency. This makes CNN quantization a critical step before deploying a model on such an edge device. To extract the benefits of fast Winograd-based convolution and efficient integer quantization, the two approaches must be combined. Research has shown that the transform required to execute convolutions in the Winograd domain results in numerical instability and severe accuracy degradation when combined with quantization, making the two techniques incompatible on edge hardware. This paper proposes a novel training scheme to achieve efficient Winograd-accelerated, quantized CNNs. 8-bit quantization is applied to all the intermediate results of the Winograd convolution without sacrificing task-related accuracy. This is achieved by introducing clipping factors in the intermediate quantization stages, as well as by using a complex number system to improve the transform. We achieve 2.8x and 2.1x reductions in MAC operations on ResNet-20-CIFAR-10 and ResNet-18-ImageNet, respectively, with no accuracy degradation.
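    The pipeline can be illustrated with the standard 1D F(2,3) Winograd transform; the per-stage clipping factors applied before each 8-bit quantization are our rendering of the idea, not the paper's exact scheme (which also uses a complex number system for the transform).

```python
# 1D Winograd F(2,3) with 8-bit fake quantization of all intermediates.
# The transform matrices are the standard F(2,3) ones; the per-stage clipping
# factor is our illustration of the paper's idea, not its exact scheme.
import numpy as np

BT = np.array([[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]], float)
G  = np.array([[1, 0, 0], [.5, .5, .5], [.5, -.5, .5], [0, 0, 1]], float)
AT = np.array([[1, 1, 1, 0], [0, 1, -1, -1]], float)

def q8(x, clip=1.0):
    """Quantize to an int8 grid after clipping the dynamic range."""
    s = clip * np.abs(x).max() / 127 + 1e-12
    return np.clip(np.round(x / s), -128, 127) * s

d = np.array([1.0, 2.0, 3.0, 4.0])       # 4 inputs -> 2 outputs
g = np.array([0.5, -1.0, 0.25])          # 3-tap filter

U = q8(G @ g, clip=0.9)                  # transformed, quantized weights
V = q8(BT @ d, clip=0.9)                 # transformed, quantized inputs
Y = AT @ q8(U * V, clip=0.9)             # elementwise product, inverse transform

print(Y, np.convolve(d, g[::-1], mode="valid"))  # compare with direct conv
```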

    Bladder metastases of appendiceal mucinous adenocarcinoma: a case presentation

    Background: Appendiceal adenocarcinoma is rare, with a frequency of 0.08% of all surgically removed appendices. Few cases of appendiceal carcinoma infiltrating the bladder wall by spatial contiguity have been documented.

    Case Presentation: A case is reported of a 45-year-old woman with mucinous cystadenocarcinoma of the appendix with bladder metastasis. Although ultrasonography and voided urinary cytology were negative, an abdominal computed tomography (CT) scan, cystoscopy, and the subsequent pathological examination revealed a mass exclusively located in the anterior wall of the bladder. Histopathology of the transurethral bladder resection revealed a bladder adenocarcinoma (6 cm maximum diameter x 2.5 cm; approximate weight 10 g) with focal mucinous aspects penetrating the muscle and perivisceral fat. Laparotomy evidenced the presence of a solid mass of the appendix (2.5 cm x 3 cm x 2 cm) extending to the loco-regional lymph nodes. Appendectomy, right hemicolectomy, lymphadenectomy, and partial cystectomy were performed. The subsequent pathological examination revealed a mucinous cystadenocarcinoma of the appendix with metastatic cells colonising the anterior bladder wall and several colic lymph nodes.

    Conclusions: The rarity of appendiceal carcinoma invading the urinary bladder, which usually involves adjacent organs and the posterior bladder wall, led us to describe this case, which demonstrates the ability of appendiceal cancer to metastasize to different regions of the urinary bladder.

    Molecular insights to the bioactive form of BV02, a reference inhibitor of 14-3-3σ protein-protein interactions

    BV02 is a reference inhibitor of 14-3-3 protein-protein interactions (PPIs), currently used as a chemical biology tool to understand the role of 14-3-3 proteins in pathological contexts. Owing to its chemical instability under certain conditions, its bioactive form has remained unclear. Here, we use NMR spectroscopy to prove for the first time the direct interaction between the molecule and 14-3-3σ, and to depict its bioactive form, namely the phthalimide derivative 9. Our work provides molecular insights into the bioactive form of this 14-3-3 PPI inhibitor and facilitates its further development as a candidate therapeutic agent.