The precise combination of an image sensor and a micro-lens array enables lenslet
light field cameras to record both the angular and spatial information of incoming
light, so disparity and depth can be calculated from light field
images. In turn, 3D models of the recorded objects can be recovered, which is a
great advantage over other imaging systems. However, reflective and texture-less
areas in light field images are difficult cases where existing algorithms
struggle to calculate disparity correctly. To tackle this problem,
we introduce VommaNet, a novel end-to-end network that retrieves multi-scale
features from reflective and texture-less regions for accurate disparity
estimation. In other regions, our network achieves performance similar to or
better than state-of-the-art algorithms on both synthetic light field images
and real-world data. Currently, we achieve the best
score for mean squared error (MSE) on the HCI 4D Light Field Benchmark.
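
For readers unfamiliar with the metric, the following is a minimal sketch of how a mean squared error between an estimated and a ground-truth disparity map could be computed. It is not the benchmark's evaluation code: the function name `disparity_mse`, the nested-list representation of the maps, and the optional `mask` argument are all assumptions made for illustration, and any scaling or masking conventions used by the benchmark itself are not reproduced here.

```python
def disparity_mse(estimated, ground_truth, mask=None):
    """Mean squared error between two disparity maps.

    Hypothetical helper for illustration only. Both maps are 2D lists of
    floats with identical shape. If `mask` is given, only pixels whose mask
    entry is truthy contribute to the error.
    """
    total = 0.0
    count = 0
    for r, (est_row, gt_row) in enumerate(zip(estimated, ground_truth)):
        for c, (est, gt) in enumerate(zip(est_row, gt_row)):
            # Skip pixels excluded by the optional evaluation mask.
            if mask is not None and not mask[r][c]:
                continue
            total += (est - gt) ** 2
            count += 1
    if count == 0:
        raise ValueError("no pixels selected for evaluation")
    return total / count


# Toy 2x2 disparity maps (made-up values, not benchmark data).
estimated = [[0.0, 1.0], [2.0, 3.0]]
ground_truth = [[0.0, 1.5], [2.0, 2.0]]
print(disparity_mse(estimated, ground_truth))
```

Passing a mask restricts the average to a chosen region, which is one way per-region errors (for example, over reflective or texture-less areas only) could be reported separately from the error over the whole image.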