Today's commodity camera systems rely on compound optics to map light
originating from the scene to positions on the sensor where it gets recorded as
an image. To record images without optical aberrations, i.e., deviations from
Gauss' linear model of optics, typical lens systems introduce increasingly
complex stacks of optical elements, which are responsible for the height of
existing commodity cameras. In this work, we investigate flat nanophotonic
computational cameras as an alternative that employs an array of skewed
lenslets and a learned reconstruction approach. The optical array is embedded
on a metasurface that, at 700~nm height, is flat and sits on the sensor cover
glass at 2.5~mm focal distance from the sensor. To tackle the highly chromatic
response of a metasurface and design the array over the entire sensor, we
propose a differentiable optimization method that continuously samples over the
visible spectrum and factorizes the optical modulation for different incident
fields into individual lenses. We reconstruct a megapixel image from our flat
imager with a learned probabilistic reconstruction method that employs a
generative diffusion model to sample an implicit prior. To address
scene-dependent aberrations in broadband, we propose a method for acquiring
paired captured training data in varying illumination conditions. We assess the
proposed flat camera design in simulation and with an experimental prototype,
validating that the method is capable of recovering images from diverse scenes
in broadband with a single nanophotonic layer.

Comment: 18 pages, 12 figures, to be published in ACM Transactions on Graphic