A Dual Sensor Computational Camera for High Quality Dark Videography
Videos captured under low-light conditions suffer from severe noise. A
variety of efforts have been devoted to image/video noise suppression and have
made substantial progress. However, in extremely dark scenarios, extensive photon
starvation hampers precise noise modeling. Instead, developing an imaging
system that collects more photons is a more effective way to capture
high-quality video under low illumination. In this paper, we propose to build a
dual-sensor camera that additionally collects photons in the near-infrared (NIR)
band, and to exploit the correlation between the RGB and NIR spectra to
perform high-quality reconstruction from noisy dark video pairs. In hardware,
we build a compact dual-sensor camera capturing RGB and NIR videos
simultaneously. Computationally, we propose a dual-channel multi-frame
attention network (DCMAN) utilizing spatial-temporal-spectral priors to
reconstruct the low-light RGB and NIR videos. In addition, we build a
high-quality paired RGB and NIR video dataset, based on which the approach can
be applied to different sensors easily by training the DCMAN model with
simulated noisy input following a physical-process-based CMOS noise model.
Experiments on both synthetic and real videos validate the performance of this
compact dual-sensor camera design and the corresponding reconstruction
algorithm in dark videography.
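The physics-based noise simulation used for training could be sketched as follows. This is a minimal model assuming only Poisson shot noise and Gaussian read noise; the paper's full CMOS model and its parameters are not reproduced in the abstract, so the function name and all values below are illustrative:

```python
import numpy as np

def simulate_cmos_noise(clean, photons_per_unit=50.0, read_noise_std=2.0,
                        rng=None):
    """Corrupt a clean image with a simple physics-based CMOS noise model:
    Poisson shot noise on the photon counts plus additive Gaussian read
    noise. (Illustrative sketch; a full model may add row noise,
    quantization, and other terms.)"""
    rng = np.random.default_rng(rng)
    # Scale normalized intensities in [0, 1] to expected photon counts.
    photons = clean * photons_per_unit
    # Shot noise: photon arrival is a Poisson process.
    noisy = rng.poisson(photons).astype(np.float64)
    # Read noise: additive Gaussian from the readout electronics.
    noisy += rng.normal(0.0, read_noise_std, size=clean.shape)
    # Back to normalized intensity.
    return noisy / photons_per_unit

clean = np.full((8, 8), 0.5)
noisy = simulate_cmos_noise(clean, rng=0)
```

The key point of such a model is that the noise strength depends on the signal level (shot noise grows with brightness), which a plain additive-Gaussian model cannot reproduce.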
Development of a fundus camera for analysis of photoreceptor directionality in the healthy retina
The Stiles-Crawford effect (SCE) is the well-known phenomenon in which the brightness of light perceived by the human eye depends on its entrance point in the pupil. This physiological characteristic arises from the directional sensitivity of the cone photoreceptors in the retina, and it displays an approximately Gaussian dependency that is altered in a number of pathologies. Retinal imaging, a widespread clinical practice, may be used to evaluate the SCE and thus serve as a diagnostic tool. Nonetheless, its use for such a purpose is still underdeveloped and far from clinical reality.
In this project a fundus camera was built and used to assess the cone photoreceptor directionality by reflective imaging of the retina in healthy individuals. The physical and physiological implications of its development are addressed in detail in the text: the optical properties of the human eye, illumination issues, acquiring a retinal image formed by the eye, among others. A full description of the developmental process that led to the final measuring method and results is also given.
The developed setup was successfully used to obtain high-quality images of the eye fundus, and in particular of the parafoveal cone photoreceptors. The SCE was successfully observed and characterized. Even though considerable improvements could still be made to the measurement method, the project showed the feasibility of using retinal imaging to evaluate the SCE, thus motivating its usage in a clinical environment.
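The approximately Gaussian SCE falloff mentioned above is conventionally written as eta(r) = 10^(-rho * r^2), where r is the pupil entry offset. A small sketch, with an illustrative directionality parameter rho (the value below is a typical textbook figure for a healthy fovea, not one measured in this project):

```python
import numpy as np

def sce_relative_sensitivity(r_mm, rho=0.05):
    """Stiles-Crawford visibility falloff: eta(r) = 10**(-rho * r**2),
    where r is the pupil entry offset in mm and rho (in mm^-2) sets the
    width of the Gaussian. rho ~ 0.05 mm^-2 is an illustrative healthy
    value; pathologies can change it."""
    return 10.0 ** (-rho * np.asarray(r_mm, dtype=np.float64) ** 2)

eta_center = sce_relative_sensitivity(0.0)  # entry at pupil center
eta_edge = sce_relative_sensitivity(3.0)    # entry 3 mm off-center
```

Fitting this one-parameter curve to the measured brightness at several pupil entry points is what "characterizing the SCE" amounts to in practice.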
Detection of AI-created images using pixel-wise feature extraction and convolutional neural networks
Generative AI has gained enormous interest nowadays due to new applications like ChatGPT, DALL·E, Stable Diffusion, and Deepfake. In particular, DALL·E, Stable Diffusion, and others (Adobe Firefly, ImagineArt, etc.) can create images from a text prompt and are even able to create photorealistic images. Due to this fact, intense research has been performed to create new image forensics applications able to distinguish between real captured images and videos and artificial ones. Detecting forgeries made with Deepfake is one of the most researched issues. This paper is about another kind of forgery detection: its purpose is to detect photorealistic AI-created images versus real photos coming from a physical camera, that is, to make a binary decision over an image, asking whether it was artificially or naturally created. Artificial images do not need to represent any real object, person, or place. For this purpose, techniques that perform pixel-level feature extraction are used. The first one is Photo Response Non-Uniformity (PRNU), a characteristic noise caused by imperfections in the camera sensor that is commonly used for source camera identification. The underlying idea is that AI images will exhibit a different PRNU pattern. The second one is error level analysis (ELA), another type of feature extraction traditionally used for detecting image editing; ELA is used nowadays by photographers for the manual detection of AI-created images. Both kinds of features are used to train convolutional neural networks to differentiate between AI images and real photographs. Good results are obtained, achieving accuracy rates of over 95%. Both extraction methods are carefully assessed by computing precision/recall and F1-score measurements.
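The PRNU-style residual extraction described above can be sketched minimally: subtract a denoised version of the image to isolate the sensor noise pattern, and average residuals over many images to estimate a camera fingerprint. The 3x3 mean filter and the simplified fingerprint estimator below are assumptions for illustration; the paper's actual denoiser and estimator may differ:

```python
import numpy as np

def box_filter3(img):
    """3x3 mean filter with edge padding (a stand-in denoiser; PRNU work
    typically uses a stronger wavelet-based filter)."""
    img = np.asarray(img, dtype=np.float64)
    p = np.pad(img, 1, mode="edge")
    h, w = img.shape
    acc = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            acc += p[dy:dy + h, dx:dx + w]
    return acc / 9.0

def noise_residual(img):
    """Pixel-wise residual: image minus its denoised version. This is the
    noise pattern fed to the classifier (or averaged into a fingerprint)."""
    img = np.asarray(img, dtype=np.float64)
    return img - box_filter3(img)

def prnu_fingerprint(images):
    """Simplified maximum-likelihood-style PRNU estimate:
    K ~ sum_i(W_i * I_i) / sum_i(I_i**2), with W_i the residual of I_i."""
    num = np.zeros_like(np.asarray(images[0], dtype=np.float64))
    den = np.zeros_like(num)
    for img in images:
        img = np.asarray(img, dtype=np.float64)
        num += noise_residual(img) * img
        den += img ** 2
    return num / np.maximum(den, 1e-8)
```

On a perfectly flat image the residual is zero; the classifier's signal lives entirely in the fine-grained deviations from the denoised estimate.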
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in scenarios that are challenging for traditional cameras,
such as low-latency, high-speed, and high-dynamic-range applications. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle to the available sensors
and the tasks they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world.
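One of the simplest event-processing techniques of the kind the survey covers is to accumulate the asynchronous stream into a signed frame that conventional vision algorithms can consume: each event contributes +1 or -1 at its pixel according to the sign of the brightness change. A minimal sketch (the tuple layout (t, x, y, polarity) is an assumed convention, not a fixed standard):

```python
import numpy as np

def events_to_frame(events, height, width):
    """Accumulate an event stream into a signed 2D histogram. Each event
    is (timestamp, x, y, polarity); polarity > 0 means a brightness
    increase (+1), otherwise a decrease (-1). The result is one of the
    simplest frame-like event representations."""
    frame = np.zeros((height, width), dtype=np.int32)
    for t, x, y, p in events:
        frame[y, x] += 1 if p > 0 else -1
    return frame

# Three events: two positive at pixel (x=2, y=1), one negative at (0, 0).
events = [(0.001, 2, 1, 1), (0.002, 2, 1, 1), (0.003, 0, 0, -1)]
frame = events_to_frame(events, height=3, width=4)
```

Richer representations (time surfaces, voxel grids, learned embeddings) keep more of the temporal information, but this histogram illustrates the basic event-to-frame conversion.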