Approximate Computing Survey, Part I: Terminology and Software & Hardware Approximation Techniques
The rapid growth of demanding applications in domains applying multimedia
processing and machine learning has marked a new era for edge and cloud
computing. These applications involve massive data and compute-intensive tasks,
and thus, typical computing paradigms in embedded systems and data centers are
stressed to meet the worldwide demand for high performance. Concurrently, the
landscape of the semiconductor field in the last 15 years has established power
as a first-class design concern. As a result, the community of computing
systems is forced to find alternative design approaches to facilitate
high-performance and/or power-efficient computing. Among the examined
solutions, Approximate Computing has attracted an ever-increasing interest,
with research works applying approximations across the entire traditional
computing stack, i.e., at the software, hardware, and architectural levels. Over
the last decade, a plethora of approximation techniques has emerged in software
(programs, frameworks, compilers, runtimes, languages), hardware (circuits,
accelerators), and architectures (processors, memories). The current article is
Part I of our comprehensive survey on Approximate Computing: it reviews the
motivation, terminology, and principles of the field, and it classifies and
presents the technical details of the state-of-the-art software and hardware
approximation techniques.
Comment: Under Review at ACM Computing Surveys
Beam scanning by liquid-crystal biasing in a modified SIW structure
A fixed-frequency beam-scanning 1D antenna based on Liquid Crystals (LCs) is designed for application in 2D scanning with lateral alignment. The 2D array environment imposes full decoupling of adjacent 1D antennas, which often conflicts with the LC requirement of DC biasing: the proposed design accommodates both. The LC medium is placed inside a Substrate Integrated Waveguide (SIW) modified to work as a Groove Gap Waveguide, with radiating slots etched on the upper broad wall, so that the structure radiates as a Leaky-Wave Antenna (LWA). This allows effective application of the DC bias voltage needed for tuning the LCs. At the same time, the RF field remains laterally confined, making it possible to lay several antennas in parallel and achieve 2D beam scanning. The design is validated by simulation employing the actual properties of a commercial LC medium.
ACOUSTIC SPEECH MARKERS FOR TRACKING CHANGES IN HYPOKINETIC DYSARTHRIA ASSOCIATED WITH PARKINSON’S DISEASE
Previous research has identified certain overarching features of hypokinetic dysarthria
associated with Parkinson’s Disease and found it manifests differently between
individuals. Acoustic analysis has often been used to find correlates of perceptual
features for differential diagnosis. However, acoustic parameters that are robust for
differential diagnosis may not be sensitive to tracking speech changes. Previous
longitudinal studies have had limited sample sizes or variable lengths between data
collection. This study focused on using acoustic correlates of perceptual features to
identify acoustic markers able to track speech changes in people with Parkinson’s
Disease (PwPD) over six months. The thesis presents how this study has addressed
limitations of previous studies to make a novel contribution to current knowledge.
Speech data was collected from 63 PwPD and 47 control speakers using online
podcast software at two time points, six months apart (T1 and T2). Recordings of a
standard reading passage, minimal pairs, sustained phonation, and spontaneous speech
were collected. Perceptual severity ratings were given by two speech and language
therapists for T1 and T2, and acoustic parameters of voice, articulation and prosody
were investigated. Two analyses were conducted: a) to identify which acoustic
parameters can track perceptual speech changes over time and b) to identify which
acoustic parameters can track changes in speech intelligibility over time. An additional
attempt was made to identify if these parameters showed group differences for
differential diagnosis between PwPD and control speakers at T1 and T2.
Results showed that specific acoustic parameters in voice quality, articulation and
prosody could differentiate between PwPD and controls, or detect speech changes
between T1 and T2, but not both factors. However, specific acoustic parameters within
articulation could detect significant group and speech change differences across T1 and
T2. The thesis discusses these results, their implications, and the potential for future
studies.
Sensing User's Activity, Channel, and Location with Near-Field Extra-Large-Scale MIMO
This paper proposes a grant-free massive access scheme based on the
millimeter wave (mmWave) extra-large-scale multiple-input multiple-output
(XL-MIMO) to support massive Internet-of-Things (IoT) devices with low latency,
high data rate, and high localization accuracy in the upcoming sixth-generation
(6G) networks. The XL-MIMO consists of multiple antenna subarrays that are
widely spaced over the service area to ensure line-of-sight (LoS)
transmissions. First, we establish the XL-MIMO-based massive access model
considering the near-field spatial non-stationary (SNS) property. Then, by
exploiting the block sparsity of subarrays and the SNS property, we propose a
structured block orthogonal matching pursuit algorithm for efficient active
user detection (AUD) and channel estimation (CE). Furthermore, different
sensing matrices are applied in different pilot subcarriers for exploiting the
diversity gains. Additionally, a multi-subarray collaborative localization
algorithm is designed. In particular, the angle of arrival
(AoA) and time difference of arrival (TDoA) of the LoS links between active
users and related subarrays are extracted from the estimated XL-MIMO channels,
and then the coordinates of active users are acquired by jointly utilizing the
AoAs and TDoAs. Simulation results show that the proposed algorithms outperform
existing algorithms in terms of AUD and CE performance and can achieve
centimeter-level localization accuracy.
Comment: Submitted to IEEE Transactions on Communications, Major revision.
Code will be open to all at https://gaozhen16.github.io/ soon.
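
The abstract above names a structured block orthogonal matching pursuit algorithm but does not give its details. As a rough illustration only, the following is a minimal sketch of generic block OMP in Python with NumPy. It is not the authors' structured variant and does not model the SNS property. The function name `block_omp`, its parameters, and the assumption that the number of active blocks is known in advance are all hypothetical choices made for this sketch.

```python
import numpy as np


def block_omp(A, y, block_size, n_active):
    """Greedy block-sparse recovery of x from y = A @ x (generic block OMP).

    Assumptions of this sketch: columns of A are grouped into consecutive
    blocks of `block_size`, A.shape[1] is a multiple of `block_size`, and the
    number of active blocks `n_active` is known.
    Returns (x_hat, sorted list of selected block indices).
    """
    n_blocks = A.shape[1] // block_size
    residual = y.copy()
    selected = []
    x_hat = np.zeros(A.shape[1], dtype=A.dtype)
    for _ in range(n_active):
        # Correlate the residual with every column, then score each block
        # by the norm of its correlations.
        corr = A.conj().T @ residual
        scores = np.linalg.norm(corr.reshape(n_blocks, block_size), axis=1)
        if selected:
            scores[selected] = -1.0  # never pick the same block twice
        selected.append(int(np.argmax(scores)))
        # Least-squares fit on all columns of the blocks chosen so far.
        cols = np.concatenate(
            [np.arange(b * block_size, (b + 1) * block_size) for b in selected]
        )
        coef, *_ = np.linalg.lstsq(A[:, cols], y, rcond=None)
        x_hat[:] = 0
        x_hat[cols] = coef
        residual = y - A[:, cols] @ coef
    return x_hat, sorted(selected)
```

In an active-user-detection reading of this sketch, each block would correspond to one candidate user, the selected block indices to the detected active users, and the fitted coefficients to their channel estimates.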
Multilink and AUV-Assisted Energy-Efficient Underwater Emergency Communications
Recent development in wireless communications has provided many reliable
solutions to emergency response issues, especially in scenarios with
dysfunctional or congested base stations. Underwater emergency
communications, however, remain under-studied, which creates a need for combining
the merits of different underwater communication links (UCLs) and the
manipulability of unmanned vehicles. To realize energy-efficient underwater
emergency communications, we develop a novel underwater emergency communication
network (UECN) assisted by multiple links, including underwater light,
acoustic, and radio frequency links, and autonomous underwater vehicles (AUVs)
for collecting and transmitting underwater emergency data. First, we determine
the optimal emergency response mode for an underwater sensor node (USN) using
greedy search and reinforcement learning (RL), so that isolated USNs (I-USNs)
can be identified. Second, according to the distribution of I-USNs, we dispatch
AUVs to assist I-USNs in data transmission, i.e., jointly optimizing the
locations and controls of AUVs to minimize the time for data collection and
underwater movement. Finally, an adaptive clustering-based multi-objective
evolutionary algorithm is proposed to jointly optimize the number of AUVs and
the transmit power of I-USNs, subject to a given set of constraints on transmit
power, signal-to-interference-plus-noise ratios (SINRs), outage probabilities,
and energy, which achieves the best tradeoff between the maximum emergency
response time (ERT) and the total energy consumption (EC). Simulation results
indicate that our proposed approach outperforms benchmark schemes in terms of
energy efficiency (EE), contributing to underwater emergency communications.
Comment: 15 pages
Robust Fully-Asynchronous Methods for Distributed Training over General Architecture
Perfect synchronization in distributed machine learning problems is
inefficient and even impossible due to the existence of latency, packet losses
and stragglers. We propose a Robust Fully-Asynchronous Stochastic Gradient
Tracking method (R-FAST), where each device performs local computation and
communication at its own pace without any form of synchronization. Unlike
existing asynchronous distributed algorithms, R-FAST can eliminate the
impact of data heterogeneity across devices and allow for packet losses by
employing a robust gradient tracking strategy that relies on properly designed
auxiliary variables for tracking and buffering the overall gradient vector.
More importantly, the proposed method utilizes two spanning-tree graphs for
communication so long as both share at least one common root, enabling flexible
designs in communication architectures. We show that R-FAST converges in
expectation to a neighborhood of the optimum with a geometric rate for smooth
and strongly convex objectives; and to a stationary point with a sublinear rate
for general non-convex settings. Extensive experiments demonstrate that R-FAST
runs 1.5-2 times faster than synchronous benchmark algorithms, such as
Ring-AllReduce and D-PSGD, while still achieving comparable accuracy, and
outperforms existing asynchronous SOTA algorithms, such as AD-PSGD and OSGP,
especially in the presence of stragglers.
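
For orientation, the following is a minimal sketch of the classical synchronous gradient-tracking update, in which each node keeps an auxiliary variable that tracks the network-average gradient. It is not R-FAST itself: the asynchrony, the buffering for packet losses, and the two spanning-tree graphs are all omitted. The function names, the ring topology, and the step size are assumptions made for this illustration.

```python
import numpy as np


def ring_mixing_matrix(n):
    """Symmetric doubly stochastic matrix for a ring of n >= 3 nodes."""
    eye = np.eye(n)
    return (0.5 * eye
            + 0.25 * np.roll(eye, 1, axis=1)
            + 0.25 * np.roll(eye, -1, axis=1))


def gradient_tracking(W, grad, x0, step, n_iters):
    """Synchronous gradient tracking (illustrative, not R-FAST).

    W    : (n, n) doubly stochastic mixing matrix
    grad : maps the (n, d) stack of local iterates to the (n, d) stack of
           local gradients (row i is the gradient of f_i at x_i)
    x0   : (n, d) initial iterates, one row per node
    """
    x = x0.copy()
    g = grad(x)
    y = g.copy()  # tracker is initialised with the local gradients
    for _ in range(n_iters):
        # Mix with neighbours, then step along the tracked gradient.
        x_new = W @ x - step * y
        g_new = grad(x_new)
        # Update the tracker with the change in the local gradient.
        y = W @ y + g_new - g
        x, g = x_new, g_new
    return x
```

With local objectives f_i(x) = 0.5 * (x - b_i)^2, the minimizer of the sum is the mean of the b_i. Each node reaches it even though its own data b_i differs, which is the sense in which gradient tracking removes the effect of data heterogeneity.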
Computation and Communication Efficient Federated Learning over Wireless Networks
Federated learning (FL) allows model training from local data by edge devices
while preserving data privacy. However, the learning accuracy decreases due to
the heterogeneity of device data, and the computation and communication
latency increase when updating large-scale learning models on devices with
limited computational capability and wireless resources. To overcome these
challenges, we consider a novel FL framework with partial model pruning and
personalization. This framework splits the learning model into a global part,
which is pruned and shared with all devices to learn data representations, and a
personalized part, which is fine-tuned for a specific device. This design adapts
the model size during FL to reduce both computation and communication overhead
and minimize the overall training time, and it increases the learning accuracy
for devices with non-independent and identically distributed (non-IID) data.
Then, the computation and communication latency and the convergence of the
proposed FL framework are mathematically analyzed. Based on the convergence
analysis, an optimization problem is formulated to maximize the convergence
rate under a latency threshold by jointly optimizing the pruning ratio and
wireless resource allocation. By decoupling the optimization problem and
deploying Karush-Kuhn-Tucker (KKT) conditions, we derive the closed-form
solutions of pruning ratio and wireless resource allocation. Finally,
experimental results demonstrate that the proposed FL framework achieves a
reduction of approximately 50 percent in computation and communication
latency compared with the scheme with model personalization only.
Comment: arXiv admin note: text overlap with arXiv:2305.0904
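
The split into a pruned global part and a local personalized part can be illustrated with a small sketch. The abstract does not state the pruning criterion, so magnitude pruning is assumed here purely for illustration. The dictionary keys `"global"` and `"personal"` and the function names are hypothetical.

```python
import numpy as np


def prune_by_magnitude(weights, pruning_ratio):
    """Zero the `pruning_ratio` fraction of entries with smallest magnitude.

    Magnitude pruning is an assumption of this sketch, not necessarily the
    criterion used in the paper. The input array is left unmodified.
    """
    pruned = np.array(weights, dtype=float)  # contiguous copy
    k = int(pruning_ratio * pruned.size)
    if k > 0:
        flat = pruned.reshape(-1)  # view into `pruned`
        smallest = np.argsort(np.abs(flat))[:k]
        flat[smallest] = 0.0
    return pruned


def device_upload(model, pruning_ratio):
    """Return what a device would send: the pruned global part only.

    The personalized part stays on the device and is never transmitted.
    """
    return prune_by_magnitude(model["global"], pruning_ratio)
```

A higher pruning ratio leaves fewer nonzero global weights to compute and transmit, which reduces latency at some cost in accuracy. This is the tradeoff that the abstract's joint optimization of pruning ratio and wireless resource allocation addresses.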