Path Choice Matters for Clear Attribution in Path Methods
Rigorousness and clarity are both essential for interpretations of DNNs to
engender human trust. Path methods are commonly employed to generate rigorous
attributions that satisfy three axioms. However, the meaning of attributions
remains ambiguous due to distinct path choices. To address the ambiguity, we
introduce the Concentration Principle, which centrally allocates high
attributions to indispensable features, thereby endowing them with aesthetic
quality and sparsity. We then present SAMP, a model-agnostic interpreter, which
efficiently searches the near-optimal path from a pre-defined set of
manipulation paths. Moreover, we propose the infinitesimal constraint (IC) and
momentum strategy (MS) to improve the rigorousness and optimality.
Visualizations show that SAMP can precisely reveal DNNs by pinpointing salient
image pixels. We also perform quantitative experiments and observe that our
method significantly outperforms its counterparts. Code:
https://github.com/zbr17/SAMP. Comment: ICLR 2024 accepted
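As a point of reference for the path methods mentioned in this abstract, the sketch below approximates attributions along one fixed path: the straight line from a baseline to the input. It is not SAMP and does not search over paths; the function names, the toy model, and the finite-difference gradient are all assumptions made for illustration. It only shows that attributions along a path sum to the change in model output between baseline and input.

```python
def path_attributions(f, x, baseline, steps=1000, eps=1e-6):
    """Midpoint Riemann-sum attributions along the straight line baseline -> x."""
    n = len(x)
    attr = [0.0] * n
    for k in range(steps):
        # Midpoint of the k-th segment of the straight-line path.
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i in range(n):
            # Central finite-difference estimate of df/dx_i at this point.
            up = point.copy()
            down = point.copy()
            up[i] += eps
            down[i] -= eps
            grad_i = (f(up) - f(down)) / (2 * eps)
            # Accumulate gradient times the path step for feature i.
            attr[i] += grad_i * (x[i] - baseline[i]) / steps
    return attr


def toy_model(v):
    # Hypothetical model: features 0 and 1 interact, feature 2 is linear.
    return v[0] * v[1] + 3.0 * v[2]


x = [1.0, 2.0, 0.5]
baseline = [0.0, 0.0, 0.0]
attr = path_attributions(toy_model, x, baseline)
total_change = toy_model(x) - toy_model(baseline)
print(attr, sum(attr), total_change)
```

A different path between the same two endpoints would still sum to the same total but could split it differently across features, which is the ambiguity the abstract refers to.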
SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction
3D occupancy prediction is an important task for the robustness of
vision-centric autonomous driving, which aims to predict whether each point is
occupied in the surrounding 3D space. Existing methods usually require 3D
occupancy labels to produce meaningful results. However, it is very laborious
to annotate the occupancy status of each voxel. In this paper, we propose
SelfOcc to explore a self-supervised way to learn 3D occupancy using only video
sequences. We first transform the images into the 3D space (e.g., bird's eye
view) to obtain 3D representation of the scene. We directly impose constraints
on the 3D representations by treating them as signed distance fields. We can
then render 2D images of previous and future frames as self-supervision signals
to learn the 3D representations. We propose an MVS-embedded strategy to
directly optimize the SDF-induced weights with multiple depth proposals. Our
SelfOcc outperforms the previous best method SceneRF by 58.7% using a single
frame as input on SemanticKITTI and is the first self-supervised work that
produces reasonable 3D occupancy for surround cameras on nuScenes. SelfOcc
produces high-quality depth and achieves state-of-the-art results on novel
depth synthesis, monocular depth estimation, and surround-view depth estimation
on the SemanticKITTI, KITTI-2015, and nuScenes, respectively. Code:
https://github.com/huang-yh/SelfOcc.
Ultra-high-linearity integrated lithium niobate electro-optic modulators
Integrated lithium niobate (LN) photonics is a promising platform for future
chip-scale microwave photonics systems owing to its unique electro-optic
properties, low optical loss and excellent scalability. A key enabler for such
systems is a highly linear electro-optic modulator that could faithfully convert
analog electrical signals into optical signals. In this work, we demonstrate a
monolithic integrated LN modulator with an ultrahigh spurious-free dynamic
range (SFDR) of 120.04 dB·Hz^(4/5) at 1 GHz, using a ring-assisted Mach-Zehnder
interferometer configuration. The excellent synergy between the intrinsically
linear electro-optic response of LN and an optimized linearization strategy
allows us to fully suppress the cubic terms of third-order intermodulation
distortions (IMD3) without active feedback controls, leading to a ~20 dB
improvement over previous results in the thin-film LN platform. Our
ultra-high-linearity LN modulators could become a core building block for
future large-scale functional microwave photonic integrated circuits, by
further integration with other high-performance components like low-loss delay
lines, tunable filters and phase shifters available on the LN platform.
Exploring Unified Perspective For Fast Shapley Value Estimation
Shapley values have emerged as a widely accepted and trustworthy tool,
grounded in theoretical axioms, for addressing challenges posed by black-box
models like deep neural networks. However, computing Shapley values encounters
exponential complexity in the number of features. Various approaches, including
ApproSemivalue, KernelSHAP, and FastSHAP, have been explored to expedite the
computation. We analyze the consistency of existing works and conclude that
stochastic estimators can be unified as the linear transformation of importance
sampling of feature subsets. Based on this, we investigate the possibility of
designing simple amortized estimators and propose a straightforward and
efficient one, SimSHAP, by eliminating redundant techniques. Extensive
experiments conducted on tabular and image datasets validate the effectiveness
of our SimSHAP, which significantly accelerates the computation of accurate
Shapley values.
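To make the exponential cost and the idea of a stochastic estimator concrete, here is a minimal sketch that computes exact Shapley values by enumerating all feature subsets and compares them with a generic Monte Carlo estimate based on random permutations. This is not SimSHAP, KernelSHAP, or FastSHAP; the function names and the toy value function are assumptions chosen for illustration.

```python
import itertools
import math
import random


def exact_shapley(value, n):
    """Exact Shapley values; enumerates every subset, so cost grows as 2^n."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(n):
            # Shapley weight for coalitions of size r that exclude feature i.
            w = math.factorial(r) * math.factorial(n - r - 1) / math.factorial(n)
            for subset in itertools.combinations(others, r):
                s = frozenset(subset)
                phi[i] += w * (value(s | {i}) - value(s))
    return phi


def sampled_shapley(value, n, num_permutations, seed=0):
    """Monte Carlo estimate: average marginal contributions over random orders."""
    rng = random.Random(seed)
    phi = [0.0] * n
    for _ in range(num_permutations):
        order = list(range(n))
        rng.shuffle(order)
        coalition = frozenset()
        prev = value(coalition)
        for i in order:
            coalition = coalition | {i}
            cur = value(coalition)
            phi[i] += (cur - prev) / num_permutations
            prev = cur
    return phi


def toy_value(s):
    # Hypothetical value function: features 0 and 1 only pay off together,
    # feature 2 contributes on its own.
    return (2.0 if 0 in s and 1 in s else 0.0) + (1.0 if 2 in s else 0.0)


exact = exact_shapley(toy_value, 3)
approx = sampled_shapley(toy_value, 3, num_permutations=2000)
print(exact, approx)
```

The sampled estimate needs only `num_permutations * n` evaluations of the value function, whereas the exact computation touches every subset.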
A power-efficient integrated lithium niobate electro-optic comb generator
Integrated electro-optic (EO) frequency combs are essential components for
future applications in optical communications, light detection and ranging,
optical computation, sensing and spectroscopy. To date, broadband on-chip EO
combs are typically generated in high-quality-factor micro-resonators, while
the more straightforward and flexible non-resonant method, usually using single
or cascaded EO phase modulators, often requires high driving power to realize a
reasonably strong modulation index. Here, we show that the phase modulation
efficiency of an integrated lithium niobate modulator could be dramatically
enhanced by passing optical signals through the modulation electrodes for a
total of 4 round trips, via multiple low-loss TE0/TE1 mode multiplexers and
waveguide crossings, reducing electrical power consumption by more than one
order of magnitude. Using devices fabricated from a wafer-scale stepper
lithography process, we demonstrate a broadband optical frequency comb
featuring 47 comb lines at a 25-GHz repetition rate, using a moderate RF
driving power of 28 dBm (0.63 W). Leveraging the excellent tunability in
repetition rate and operation wavelength, our power-efficient EO comb generator
could serve as a compact low-cost solution for future high-speed data
transmission, sensing and spectroscopy, as well as classical and quantum
optical computation systems. Comment: 9 pages, 4 figures
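The abstract quotes the RF drive power as 28 dBm (0.63 W). The short sketch below shows the standard dBm-to-watt conversion behind that figure; the function name is an assumption for illustration.

```python
def dbm_to_watts(power_dbm):
    """Convert a power level in dBm (decibels relative to 1 mW) to watts."""
    milliwatts = 10 ** (power_dbm / 10)
    return milliwatts / 1000


drive_power_w = dbm_to_watts(28)
print(round(drive_power_w, 2))
```

Because the scale is logarithmic, a reduction of one order of magnitude in power corresponds to a drop of 10 dB.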
OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving
Understanding how the 3D scene evolves is vital for making decisions in
autonomous driving. Most existing methods achieve this by predicting the
movements of object boxes, which cannot capture more fine-grained scene
information. In this paper, we explore a new framework of learning a world
model, OccWorld, in the 3D Occupancy space to simultaneously predict the
movement of the ego car and the evolution of the surrounding scenes. We propose
to learn a world model based on 3D occupancy rather than 3D bounding boxes and
segmentation maps for three reasons: 1) expressiveness: 3D occupancy can
describe the more fine-grained 3D structure of the scene; 2) efficiency: 3D
occupancy is more economical to obtain (e.g., from sparse LiDAR points); 3)
versatility: 3D occupancy can adapt to both vision and LiDAR. To facilitate the
modeling of the world evolution, we learn a reconstruction-based scene
tokenizer on the 3D occupancy to obtain discrete scene tokens to describe the
surrounding scenes. We then adopt a GPT-like spatial-temporal generative
transformer to generate subsequent scene and ego tokens to decode the future
occupancy and ego trajectory. Extensive experiments on the widely used nuScenes
benchmark demonstrate the ability of OccWorld to effectively model the
evolution of the driving scenes. OccWorld also produces competitive planning
results without using instance and map supervision. Code:
https://github.com/wzzheng/OccWorld.