17 research outputs found
DAD vision: opto-electronic co-designed computer vision with division adjoint method
The miniaturization and mobility of computer vision systems are limited by
the heavy computational burden and the size of optical lenses. Here, we propose
to use an ultra-thin diffractive optical element to implement passive optical
convolution. A division adjoint opto-electronic co-design method is also
proposed. In our simulation experiments on a CIFAR-10 classification task, the
first few convolutional layers of the neural network can be replaced by optical
convolution, consuming no power while achieving comparable performance.
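The core idea, a convolution whose kernel is fixed in passive optics rather than computed electronically, can be sketched numerically. The kernel and the "valid" single-channel correlation below are illustrative assumptions, not the paper's co-designed diffractive element:

```python
import numpy as np

def fixed_optical_conv(image, kernel):
    """Emulate a passive optical convolution: the kernel is fixed
    (realized by a diffractive element), so the layer consumes no power
    and is not updated at inference time. Hypothetical sketch using a
    'valid'-mode 2D correlation on a single channel."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A 32x32 CIFAR-10-sized grayscale ramp and a fixed 3x3 Laplacian kernel
img = np.arange(32 * 32, dtype=float).reshape(32, 32)
k = np.array([[0., 1., 0.], [1., -4., 1.], [0., 1., 0.]])
feat = fixed_optical_conv(img, k)  # 30x30 map fed to the electronic network
```

In the co-designed system, only the downstream electronic layers would remain trainable; the optical kernel is frozen once fabricated.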
Deep learning-enabled framework for automatic lens design starting point generation
We present a simple, highly modular deep neural network (DNN) framework to
address the problem of automatically inferring lens design starting points tailored to the desired
specifications. In contrast to previous work, our model can handle varied and complex lens
structures suitable for real-world problems such as Cooke Triplets or Double Gauss lenses. Our
successfully trained dynamic model can infer lens designs with realistic glass materials whose
optical performance compares favorably to reference designs from the literature on 80 different
lens structures. Using our trained model as a backbone, we make available to the community a
web application that outputs a selection of varied, high-quality starting points directly from the
desired specifications, which we believe will complement any lens designer’s toolbox.
RawHDR: High Dynamic Range Image Reconstruction from a Single Raw Image
High dynamic range (HDR) images capture many more intensity levels than
standard ones. Current methods predominantly generate HDR images from 8-bit low
dynamic range (LDR) sRGB images that have been degraded by the camera
processing pipeline. However, it becomes a formidable task to retrieve
extremely high dynamic range scenes from such limited bit-depth data. Unlike
existing methods, the core idea of this work is to incorporate more informative
Raw sensor data to generate HDR images, aiming to recover scene information in
hard regions (the darkest and brightest areas of an HDR scene). To this end, we
propose a model tailor-made for Raw images, harnessing the unique features of
Raw data to facilitate the Raw-to-HDR mapping. Specifically, we learn exposure
masks to separate the hard and easy regions of a high dynamic range scene. Then, we
introduce two forms of guidance: dual intensity guidance, which guides less
informative channels with more informative ones, and global spatial guidance,
which extrapolates scene specifics over an extended spatial domain. To verify
our Raw-to-HDR approach, we collect a large Raw/HDR paired dataset for both
training and testing. Our empirical evaluations validate the superiority of the
proposed Raw-to-HDR reconstruction model, as well as the value of our newly
captured dataset.
Comment: ICCV 202
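The exposure masks that separate hard (darkest and brightest) from easy regions can be illustrated with a simple stand-in. In the paper these masks are learned; the fixed sigmoid thresholds below are an assumption used only to sketch the idea on a normalized Raw frame:

```python
import numpy as np

def exposure_masks(raw, lo=0.05, hi=0.95, sharpness=50.0):
    """Hypothetical stand-in for learned exposure masks: soft masks
    flagging under-exposed and over-exposed 'hard' regions of a Raw
    frame normalized to [0, 1]. Thresholds lo/hi and the sigmoid
    sharpness are illustrative, not the paper's learned parameters."""
    sig = lambda x: 1.0 / (1.0 + np.exp(-x))
    dark = sig(sharpness * (lo - raw))    # near 1 where raw << lo
    bright = sig(sharpness * (raw - hi))  # near 1 where raw >> hi
    easy = (1.0 - dark) * (1.0 - bright)  # well-exposed remainder
    return dark, bright, easy

raw = np.linspace(0.0, 1.0, 11)           # toy normalized sensor values
dark, bright, easy = exposure_masks(raw)
```

A Raw-to-HDR model would then apply its guidance mechanisms only where `dark` or `bright` is high, leaving easy regions to a lighter reconstruction path.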
Inferring the solution space of microscope objective lenses using deep learning
Lens design extrapolation (LDE) is a data-driven approach to optical design that aims to generate new optical systems inspired by reference designs. Here, we build on a deep learning-enabled LDE framework with the aim of generating a significant variety of microscope objective lenses (MOLs) that are similar in structure to the reference MOLs, but with varied sequences, defined as a particular arrangement of glass elements, air gaps, and aperture stop placement. We first formulate LDE as a one-to-many problem: generating varied lenses for any set of specifications and lens sequence. Next, by quantifying the structure of a MOL from the slopes of its marginal ray, we improve the training objective to capture the structures of the reference MOLs (e.g., Double-Gauss, Lister, retrofocus, etc.). From only 34 reference MOLs, we generate designs across 7432 lens sequences and show that the inferred designs accurately capture the structural diversity and performance of the dataset. Our contribution answers two current challenges of the LDE framework: incorporating a meaningful one-to-many mapping, and successfully extrapolating to lens sequences unseen in the dataset, a problem much harder than extrapolating to new specifications.
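Characterizing a lens structure from its marginal ray slopes rests on a standard paraxial ray trace. The sketch below is the textbook thin-lens version, not the paper's exact formulation:

```python
def trace_marginal_ray(powers, gaps, y0=1.0, u0=0.0):
    """Paraxial marginal ray trace through a sequence of thin lenses.
    Returns the ray slope after each element; a lens structure can be
    summarized from this slope sequence (textbook sketch, not the
    paper's exact formulation).
    powers: thin-lens powers (1/f) in order; gaps: air gaps between
    consecutive lenses; y0, u0: starting ray height and slope
    (u0 = 0 for an object at infinity)."""
    y, u = y0, u0
    slopes = []
    for i, phi in enumerate(powers):
        u = u - y * phi          # refraction: u' = u - y * phi
        slopes.append(u)
        if i < len(gaps):
            y = y + u * gaps[i]  # transfer: y' = y + u' * t
    return slopes

# Single thin lens, f = 100: exit slope is -y0 / f
slopes = trace_marginal_ray([1.0 / 100.0], [], y0=1.0)
```

Two designs with the same specifications but different slope sequences correspond to different structures (e.g., retrofocus vs. telephoto), which is what makes the slopes a useful structural descriptor.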
DISeR: Designing Imaging Systems with Reinforcement Learning
Imaging systems consist of cameras to encode visual information about the
world and perception models to interpret this encoding. Cameras contain (1)
illumination sources, (2) optical elements, and (3) sensors, while perception
models use (4) algorithms. Directly searching over all combinations of these
four building blocks to design an imaging system is challenging due to the size
of the search space. Moreover, cameras and perception models are often designed
independently, leading to sub-optimal task performance. In this paper, we
formulate these four building blocks of imaging systems as a context-free
grammar (CFG), which can be automatically searched over with a learned camera
designer to jointly optimize the imaging system with task-specific perception
models. By transforming the CFG to a state-action space, we then show how the
camera designer can be implemented with reinforcement learning to intelligently
search over the combinatorial space of possible imaging system configurations.
We demonstrate our approach on two tasks, depth estimation and camera rig
design for autonomous vehicles, showing that our method yields rigs that
outperform industry-wide standards. We believe that our proposed approach is an
important step towards automating imaging system design.
Comment: ICCV 2023. Project Page: https://tzofi.github.io/dise
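Searching a grammar of camera components with a learned designer can be sketched with a toy grammar and a tabular bandit-style policy. The productions and the reward function below are invented for illustration; the paper's grammar is far richer and its reward comes from running the perception model:

```python
import random

# Toy context-free grammar over the four building blocks of an imaging
# system (hypothetical productions, not the paper's grammar).
GRAMMAR = {
    "system": [("light", "optic", "sensor", "algo")],
    "light":  [("ambient",), ("active_ir",)],
    "optic":  [("pinhole",), ("lens",), ("lens", "lens")],
    "sensor": [("rgb",), ("mono",)],
    "algo":   [("cnn",), ("transformer",)],
}

def expand(symbol, policy):
    """Derive a terminal design, picking one production per non-terminal."""
    if symbol not in GRAMMAR:
        return [symbol]
    out = []
    for s in GRAMMAR[symbol][policy(symbol)]:
        out += expand(s, policy)
    return out

def reward(design):
    """Stand-in task reward (the paper would evaluate the perception model)."""
    score = 1.0 if "active_ir" in design else 0.0  # e.g. helps night depth
    score += 0.5 * design.count("lens")            # sharper images
    score -= 0.2 * len(design)                     # penalize complexity
    return score

random.seed(0)
q = {s: [0.0] * len(p) for s, p in GRAMMAR.items()}  # value per production

for step in range(500):                       # epsilon-greedy bandit search
    picks = {}
    def policy(sym):
        a = random.randrange(len(GRAMMAR[sym])) if random.random() < 0.2 \
            else max(range(len(GRAMMAR[sym])), key=lambda i: q[sym][i])
        picks[sym] = a
        return a
    r = reward(expand("system", policy))
    for sym, a in picks.items():              # incremental value update
        q[sym][a] += 0.1 * (r - q[sym][a])

best = expand("system",
              lambda s: max(range(len(GRAMMAR[s])), key=lambda i: q[s][i]))
```

The grammar guarantees every sampled configuration is a valid imaging system, so the policy only has to learn which productions score well, rather than how to avoid malformed designs.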