Communication software code generation
This report describes the implementation of system-level communication on a programmable processor. First, the issues are introduced using the example of communication software on a Motorola DSP. Then, the problem is generalized and defined for system-level communication on any programmable processor.
Allergen immunotherapy in children and adolescents: current aspects 2024
Allergen immunotherapy (AIT) is a proven treatment for allergic diseases such as allergic rhinoconjunctivitis (ARC), allergic asthma (AA) and insect sting allergy. Particularly in children and adolescents, who have a high prevalence of these diseases, AIT plays a crucial role in not only alleviating symptoms but also influencing the natural course of the disease. This article examines the use and importance of AIT in children and adolescents in Germany in the final phase of the Therapy Allergen Ordinance (TAV). The focus is on the efficacy and safety of the therapy, as well as the approval status of the individual therapeutic allergens for each age group.
A Survey of Distributed Learning in Cloud, Mobile, and Edge Settings
In the era of deep learning (DL), convolutional neural networks (CNNs), and
large language models (LLMs), machine learning (ML) models are becoming
increasingly complex, demanding significant computational resources for both
inference and training stages. To address this challenge, distributed learning
has emerged as a crucial approach, employing parallelization across various
devices and environments. This survey explores the landscape of distributed
learning, encompassing cloud and edge settings. We delve into the core concepts
of data and model parallelism, examining how models are partitioned across
different dimensions and layers to optimize resource utilization and
performance. We analyze various partitioning schemes for different layer types,
including fully connected, convolutional, and recurrent layers, highlighting
the trade-offs between computational efficiency, communication overhead, and
memory constraints. This survey provides valuable insights for future research
and development in this rapidly evolving field by comparing and contrasting
distributed learning approaches across diverse contexts.
Efficient Approaches for GEMM Acceleration on Leading AI-Optimized FPGAs
FPGAs are a promising platform for accelerating Deep Learning (DL)
applications, due to their high performance, low power consumption, and
reconfigurability. Recently, the leading FPGA vendors have enhanced their
architectures to more efficiently support the computational demands of DL
workloads. However, the two most prominent AI-optimized FPGAs, i.e., AMD/Xilinx
Versal ACAP and Intel Stratix 10 NX, employ significantly different
architectural approaches. This paper presents novel systematic frameworks to
optimize the performance of General Matrix Multiplication (GEMM), a fundamental
operation in DL workloads, by exploiting the unique and distinct architectural
characteristics of each FPGA. Our evaluation on GEMM workloads for int8
precision shows up to 77 and 68 TOPs (int8) throughput, with up to 0.94 and
1.35 TOPs/W energy efficiency for Versal VC1902 and Stratix 10 NX,
respectively. This work provides insights and guidelines for optimizing
GEMM-based applications on both platforms, while also delving into their
programmability trade-offs and associated challenges. Comment: Accepted as full paper at FCCM 202
Suitability of new Information and Communication Technologies for the increase of the internationality of research and development - opportunities and limits
This study examines the connection between the internationalization of corporate research and development activities on the one hand and new information and communication technologies on the other. Research and development has so far not been internationalized to the extent observed in other functional areas. New information and communication technologies, however, permit a largely unhindered cross-border exchange of information. Based on the information processing approach, statements are derived regarding the opportunities and limits of the new information and communication technologies with respect to increasing the internationality of research and development activities. To this end, barriers to internationalization are examined, as well as typical configuration patterns of international R&D activities.
Energy Optimization in NCFET-based Processors
Energy consumption is a key optimization goal for all modern processors. Negative Capacitance Field-Effect Transistors (NCFETs) are a leading emerging technology that promises outstanding performance in addition to better energy efficiency. Thickness of the additional ferroelectric layer, frequency, and voltage are the key parameters in NCFET technology that impact the power and frequency of processors. However, their joint impact on energy optimization has not been investigated yet. In this work, we are the first to demonstrate that conventional (i.e., NCFET-unaware) dynamic voltage/frequency scaling (DVFS) techniques to minimize energy are sub-optimal when applied to NCFET-based processors. We further demonstrate that state-of-the-art NCFET-aware voltage scaling for power minimization is also sub-optimal when it comes to energy. This work provides the first NCFET-aware DVFS technique that optimizes the processor's energy through optimal runtime frequency/voltage selection. In NCFETs, energy-optimal frequency and voltage are dependent on the workload and technology parameters. Our NCFET-aware DVFS technique considers these effects to perform optimal voltage/frequency selection at runtime depending on workload characteristics. Results show up to 90% energy savings compared to conventional DVFS techniques. Compared to state-of-the-art NCFET-aware power management, our technique provides up to 72% energy savings along with 3.7x higher performance.
