hls4ml: An Open-Source Codesign Workflow to Empower Scientific Low-Power Machine Learning Devices
Accessible machine learning algorithms, software, and diagnostic tools for
energy-efficient devices and systems are extremely valuable across a broad
range of application domains. In scientific domains, real-time near-sensor
processing can drastically improve experimental design and accelerate
scientific discoveries. To support domain scientists, we have developed hls4ml,
an open-source software-hardware codesign workflow to interpret and translate
machine learning algorithms for implementation with both FPGA and ASIC
technologies. We expand on previous hls4ml work by extending capabilities and
techniques towards low-power implementations and increased usability: new
Python APIs, quantization-aware pruning, end-to-end FPGA workflows, long
pipeline kernels for low power, and new device backends, including an ASIC
workflow. Taken together, these and continued efforts in hls4ml will arm a new
generation of domain scientists with accessible, efficient, and powerful tools
for machine-learning-accelerated discovery.
Comment: 10 pages, 8 figures, TinyML Research Symposium 202
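The quantization-aware pruning idea mentioned above can be sketched outside of hls4ml itself: quantize weights to a fixed-point grid (in the spirit of HLS ap_fixed types), then rank them by post-quantization magnitude so the pruning mask reflects the precision actually deployed. A minimal NumPy sketch, assuming an 8-bit, 1-integer-bit format; the function names and workflow are illustrative and not the hls4ml API:

```python
import numpy as np

def quantize(w, bits=8, int_bits=1):
    """Round to a fixed-point grid, in the spirit of ap_fixed<bits, int_bits>."""
    frac_bits = bits - int_bits
    scale = 2.0 ** frac_bits
    lo, hi = -2.0 ** (int_bits - 1), 2.0 ** (int_bits - 1) - 1.0 / scale
    return np.clip(np.round(w * scale) / scale, lo, hi)

def prune_mask(w, sparsity=0.5):
    """Zero out the `sparsity` fraction of weights with smallest magnitude."""
    k = int(w.size * sparsity)
    mask = np.ones(w.size)
    mask[np.argsort(np.abs(w).ravel())[:k]] = 0.0
    return mask.reshape(w.shape)

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.3, size=(4, 4))
# Quantization-aware: rank by *quantized* magnitude so the mask matches
# the precision the FPGA/ASIC will actually see at inference time.
mask = prune_mask(quantize(w), sparsity=0.5)
w_deployed = quantize(w) * mask
```

Pruning on the quantized magnitudes, rather than the float ones, avoids keeping weights whose importance vanishes once rounded to the deployed precision.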
Real-Time Object Tracking via Meta-Learning: Efficient Model Adaptation and One-Shot Channel Pruning
We propose a novel meta-learning framework for real-time object tracking with
efficient model adaptation and channel pruning. Given an object tracker, our
framework learns to fine-tune its model parameters in only a few iterations of
gradient-descent during tracking while pruning its network channels using the
target ground-truth at the first frame. Such a learning problem is formulated
as a meta-learning task, where a meta-tracker is trained by updating its
meta-parameters for initial weights, learning rates, and pruning masks through
carefully designed tracking simulations. The integrated meta-tracker greatly
improves tracking performance by accelerating the convergence of online
learning and reducing the cost of feature computation. Experimental evaluation
on the standard datasets demonstrates its outstanding accuracy and speed
compared to state-of-the-art methods.
Comment: 9 pages, 5 figures, AAAI 2020 accepted
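The fast-adaptation loop described above can be illustrated on a toy model: meta-learned initial weights and a meta-learned learning rate drive a few gradient-descent steps on first-frame data, and a one-shot mask keeps only the strongest channels. A hedged sketch in which a linear "tracker" with MSE loss stands in for the real network; all names and sizes are illustrative:

```python
import numpy as np

def adapt(theta0, lr, x, y, steps=3):
    """Few-step gradient descent on a linear model (MSE loss)."""
    theta = theta0.copy()
    for _ in range(steps):
        grad = 2 * x.T @ (x @ theta - y) / len(x)
        theta -= lr * grad  # lr itself would be a meta-learned parameter
    return theta

def channel_mask(scores, keep=0.5):
    """One-shot pruning: retain the highest-scoring fraction of channels."""
    k = int(len(scores) * keep)
    mask = np.zeros_like(scores)
    mask[np.argsort(scores)[-k:]] = 1.0
    return mask

rng = np.random.default_rng(1)
x = rng.normal(size=(32, 8))          # first-frame features (toy)
y = x @ rng.normal(size=8)            # target signal (toy)
theta0 = np.zeros(8)                  # meta-learned initialization (here: zeros)
theta = adapt(theta0, lr=0.1, x=x, y=y)
mask = channel_mask(np.abs(theta), keep=0.5)
pred = (x * mask) @ (theta * mask)    # pruned tracker uses retained channels only
```

The point of the meta-learning formulation is that initialization, step sizes, and pruning scores are all trained offline so that these few online steps suffice.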
Approximate Computing Survey, Part II: Application-Specific & Architectural Approximation Techniques and Applications
The challenging deployment of compute-intensive applications from domains
such as Artificial Intelligence (AI) and Digital Signal Processing (DSP) forces
the community of computing systems to explore new design approaches.
Approximate Computing appears as an emerging solution, allowing designers to
tune the quality of results in the design of a system in order to improve
energy efficiency and/or performance. This radical paradigm shift has attracted
interest from both academia and industry, resulting in significant research on
approximation techniques and methodologies at different design layers (from
system down to integrated circuits). Motivated by the wide appeal of
Approximate Computing over the last 10 years, we conduct a two-part survey to
cover key aspects (e.g., terminology and applications) and review the
state-of-the-art approximation techniques from all layers of the traditional
computing stack. In Part II of our survey, we classify and present the
technical details of application-specific and architectural approximation
techniques, which both target the design of resource-efficient
processors/accelerators and systems. Moreover, we present a detailed analysis of
the application spectrum of Approximate Computing and discuss open challenges
and future directions.
Comment: Under review at ACM Computing Surveys
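As a concrete taste of an application-level technique from this space, loop perforation skips a fraction of loop iterations to trade a small, quantifiable error for proportionally less work. A minimal sketch under simple assumptions (a mean over synthetic data); it is illustrative only, and the survey itself classifies many such techniques:

```python
def mean_exact(xs):
    """Baseline: process every element."""
    return sum(xs) / len(xs)

def mean_perforated(xs, stride=4):
    """Perforated loop: process every `stride`-th element (~1/stride the work)."""
    sample = xs[::stride]
    return sum(sample) / len(sample)

xs = [float(i % 100) for i in range(10_000)]
exact = mean_exact(xs)
approx = mean_perforated(xs, stride=4)
rel_err = abs(approx - exact) / exact
# Roughly 25% of the additions are performed, at the cost of a small error.
```

The quality/effort knob here is `stride`: raising it saves more work but widens the error, which is exactly the tunable trade-off the abstract describes.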
Medical imaging analysis with artificial neural networks
Given that neural networks have been widely reported in the research community of medical imaging, we provide a focused literature survey on recent neural network developments in computer-aided diagnosis, medical image segmentation and edge detection towards visual content analysis, and medical image registration for its pre-processing and post-processing, with the aims of increasing awareness of how neural networks can be applied to these areas and of providing a foundation for further research and practical development. Representative techniques and algorithms are explained in detail to provide inspiring examples illustrating: (i) how a known neural network with a fixed structure and training procedure could be applied to resolve a medical imaging problem; (ii) how medical images could be analysed, processed, and characterised by neural networks; and (iii) how neural networks could be expanded further to resolve problems relevant to medical imaging. In the concluding section, a highlight of comparisons among many neural network applications is included to provide a global view on computational intelligence with neural networks in medical imaging.
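Point (i) above, applying a network with a fixed structure to an imaging task, is easiest to see in the degenerate case of a single convolutional layer whose kernels are hand-set rather than trained. A hedged sketch using Sobel kernels for edge detection on a synthetic image; this example is illustrative and not drawn from the survey:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2-D convolution with a single fixed kernel."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# Fixed Sobel kernels acting as one hand-set convolutional layer.
sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
sobel_y = sobel_x.T

img = np.zeros((16, 16))
img[:, 8:] = 1.0                      # vertical intensity edge
gx = conv2d(img, sobel_x)
gy = conv2d(img, sobel_y)
edges = np.hypot(gx, gy)              # gradient magnitude peaks at the boundary
```

A trained network replaces these hand-set kernels with learned ones, which is the step from classical edge detection to the neural approaches the survey reviews.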
A Lite Distributed Semantic Communication System for Internet of Things
The rapid development of deep learning (DL) and the widespread application of
the Internet of Things (IoT) have made devices smarter than before and enabled
them to perform more intelligent tasks. However, it is challenging for any IoT
device to train and run DL models independently due to its limited computing
capability. In this paper, we consider an IoT network where the cloud/edge
platform performs the DL based semantic communication (DeepSC) model training
and updating while IoT devices perform data collection and transmission based
on the trained model. To make it affordable for IoT devices, we propose a lite
distributed semantic communication system based on DL, named L-DeepSC, for text
transmission with low complexity, where the data transmission from the IoT
devices to the cloud/edge works at the semantic level to improve transmission
efficiency. Particularly, by pruning the model redundancy and lowering the
weight resolution, the L-DeepSC becomes affordable for IoT devices and the
bandwidth required for model weight transmission between IoT devices and the
cloud/edge is reduced significantly. Through analyzing the effects of fading
channels in forward-propagation and back-propagation during the training of
L-DeepSC, we develop a channel state information (CSI) aided training process
to decrease the effects of fading channels on transmission.
Meanwhile, we tailor the semantic constellation to make it implementable on
capacity-limited IoT devices. Simulation demonstrates that the proposed
L-DeepSC achieves competitive performance compared with traditional methods,
especially in the low signal-to-noise ratio (SNR) region. In particular, it can
achieve a compression ratio as large as 40x without performance degradation.
Comment: Accpeted by JSA