Task-Oriented Over-the-Air Computation for Multi-Device Edge AI
Departing from the classic paradigm of data-centric designs, 6G networks
for supporting edge AI feature task-oriented techniques that focus on the
effective and efficient execution of AI tasks. Targeting end-to-end system
performance, such techniques are sophisticated as they aim to seamlessly
integrate sensing (data acquisition), communication (data transmission), and
computation (data processing). Aligned with the paradigm shift, a task-oriented
over-the-air computation (AirComp) scheme is proposed in this paper for
a multi-device split-inference system. In the considered system, local feature
vectors, which are extracted from the real-time noisy sensory data on devices,
are aggregated over-the-air by exploiting the waveform superposition in a
multiuser channel. The aggregated features received at the server are then fed
into an inference model, whose result is used for decision making or control of
actuators. To design inference-oriented AirComp, the transmit precoders at the edge
devices and the receive beamforming at the edge server are jointly optimized to rein in
the aggregation error and maximize the inference accuracy. The problem is made
tractable by measuring the inference accuracy using a surrogate metric called
discriminant gain, which measures the discernibility of two object classes in
the application of object/event classification. It is found that the
conventional AirComp beamforming design, which minimizes the mean square error of
generic AirComp with respect to the noiseless case, may not lead to the optimal
classification accuracy. The reason is that such a design overlooks the fact that
feature dimensions have different sensitivities to aggregation errors and
are thus of different importance for classification. This issue is
addressed in this work via a new task-oriented AirComp scheme designed by
directly maximizing the derived discriminant gain.
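
The two ideas in this abstract, aggregation by waveform superposition and a per-dimension discernibility measure, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: real scalar channels, a channel-inversion precoder, and a discriminant gain of the form (centroid separation)^2 / (feature variance + aggregation noise variance) for two Gaussian classes.

```python
import numpy as np


def aircomp_aggregate(features, channels, noise_std=0.0, rng=None):
    """Average K feature vectors through a simulated superposition channel.

    features: (K, N) real array, one feature vector per device.
    channels: (K,) nonzero real channel gains (hypothetical scalar channels).
    """
    rng = np.random.default_rng() if rng is None else rng
    features = np.asarray(features, dtype=float)
    channels = np.asarray(channels, dtype=float)
    # Assumed precoder: channel inversion, so each device arrives with unit gain.
    precoders = 1.0 / channels
    # The channel itself adds the transmitted signals: the sum is formed "in the air".
    superposed = ((channels * precoders)[:, None] * features).sum(axis=0)
    received = superposed + noise_std * rng.standard_normal(features.shape[1])
    # Receive scaling turns the noisy sum into a noisy average.
    return received / features.shape[0]


def discriminant_gain(mu_a, mu_b, feature_var, noise_var=0.0):
    """Per-dimension separation of two class centroids (assumed Gaussian form)."""
    mu_a = np.asarray(mu_a, dtype=float)
    mu_b = np.asarray(mu_b, dtype=float)
    return (mu_a - mu_b) ** 2 / (np.asarray(feature_var, dtype=float) + noise_var)
```

Under these assumptions, two feature dimensions hit by the same aggregation noise contribute equally to the mean square error, yet `discriminant_gain` can differ widely between them when their class centroids are separated by different amounts. That is the sense in which dimensions can be of different importance for classification.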
Non-local Neural Networks
Both convolutional and recurrent operations are building blocks that process
one local neighborhood at a time. In this paper, we present non-local
operations as a generic family of building blocks for capturing long-range
dependencies. Inspired by the classical non-local means method in computer
vision, our non-local operation computes the response at a position as a
weighted sum of the features at all positions. This building block can be
plugged into many computer vision architectures. On the task of video
classification, even without any bells and whistles, our non-local models can
compete with or outperform current competition winners on both Kinetics and Charades
datasets. In static image recognition, our non-local models improve object
detection/segmentation and pose estimation on the COCO suite of tasks. Code is
available at https://github.com/facebookresearch/video-nonlocal-net.
Comment: CVPR 2018
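
The non-local operation described above, a response at each position computed as a weighted sum of the features at all positions, can be sketched in a few lines. This is only an illustration: the abstract does not fix how the weights are computed, so the dot-product similarity with softmax normalization used here is an assumption, and `non_local` is a hypothetical name rather than a function from the linked repository.

```python
import numpy as np


def non_local(x):
    """Apply a simple non-local operation.

    x: (T, C) array holding a C-dimensional feature at each of T positions.
    Returns a (T, C) array in which each row is a weighted sum of all rows of x.
    """
    x = np.asarray(x, dtype=float)
    # Pairwise similarity between every pair of positions (assumed: dot product).
    scores = x @ x.T
    # Subtract the row maximum before exponentiating, for numerical stability.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    # Normalize so the weights for each position sum to one.
    weights = weights / weights.sum(axis=1, keepdims=True)
    # Each output position aggregates features from all positions.
    return weights @ x
```

Because every output row depends on every input row, a single application relates positions that are arbitrarily far apart, in contrast to a convolutional or recurrent step that only sees a local neighborhood.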