132 research outputs found
The Nonlinear Talbot Effect of Rogue Waves
Akhmediev and Kuznetsov-Ma breathers are rogue wave solutions of the
nonlinear Schr\"odinger equation (NLSE). The Talbot effect (TE) is an image
recurrence phenomenon in the diffraction of light waves. We report the
nonlinear (NL) TE of rogue waves in a cubic medium. It differs from the linear
TE in that the wave propagates in a NL medium and is an eigenmode of the NLSE.
Periodic rogue waves impinging on a NL medium exhibit recurrent behavior, but
only at the TE length and at the half-TE length with a \pi-phase shift; the
fractional TE is absent. The NL TE results from the NL interference of the
lobes of the rogue wave breathers. This interaction depends on the transverse
period and intensity of the breathers: the bigger the period and the higher
the intensity, the shorter the TE length.
Comment: 4 pages, 4 figures
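For context (our own sketch of the standard normalization, which may differ from the paper's), the focusing NLSE and the Akhmediev breather that provides the transversely periodic input read:

```latex
i\frac{\partial \psi}{\partial z} + \frac{1}{2}\frac{\partial^2 \psi}{\partial t^2} + |\psi|^2\psi = 0,
\qquad
\psi(z,t) = \frac{(1-4a)\cosh(bz) + \sqrt{2a}\,\cos(\Omega t) + i\,b\sinh(bz)}{\sqrt{2a}\,\cos(\Omega t) - \cosh(bz)}\; e^{iz},
```

with \Omega = 2\sqrt{1-2a}, b = \sqrt{8a(1-2a)}, and 0 < a < 1/2. The transverse period is T = 2\pi/\Omega, which is how the period and peak intensity noted above enter the dependence of the TE length.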
Fresnel diffraction patterns as accelerating beams
We demonstrate that beams originating from Fresnel diffraction patterns are
self-accelerating in free space. In addition to accelerating and self-healing,
they also exhibit parabolic deceleration property, which is in stark contrast
to other accelerating beams. We find that the trajectory of Fresnel paraxial
accelerating beams is similar to that of nonparaxial Weber beams. Decelerating
and accelerating regions are separated by a critical propagation distance, at
which no acceleration is present. During deceleration, the Fresnel diffraction
beams undergo self-smoothing, in which oscillations of the diffracted waves
gradually focus and smooth out at the critical distance.
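The initial fields discussed above can be produced numerically; the following is a minimal sketch of paraxial Fresnel propagation via the transfer-function method (our illustration, not the paper's procedure; the grid size, slit width, and wavelength are arbitrary choices):

```python
import numpy as np

def fresnel_propagate(u0, dx, wavelength, z):
    """Propagate a sampled 1-D field u0 over distance z using the paraxial
    Fresnel transfer function (the constant phase exp(ikz) is omitted)."""
    fx = np.fft.fftfreq(u0.size, d=dx)                 # spatial frequencies
    H = np.exp(-1j * np.pi * wavelength * z * fx**2)   # unit-modulus Fresnel kernel
    return np.fft.ifft(np.fft.fft(u0) * H)

# A hard-edged slit: its near field is a classic Fresnel diffraction pattern.
n, dx, lam = 2048, 1e-6, 633e-9                 # samples, 1 um pitch, HeNe line
x = (np.arange(n) - n // 2) * dx
u0 = (np.abs(x) < 100e-6).astype(complex)       # 200 um slit aperture
u1 = fresnel_propagate(u0, dx, lam, z=0.05)     # field 5 cm downstream

# The kernel has unit modulus, so total power is conserved (Parseval).
p0, p1 = np.sum(np.abs(u0)**2), np.sum(np.abs(u1)**2)
print(round(float(p1 / p0), 6))                 # power ratio, ~1.0
```

Tracking the position of the intensity maximum of `u1` over a range of `z` values is then one way to trace the parabolic acceleration/deceleration trajectory described above.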
Parrot Captions Teach CLIP to Spot Text
Despite CLIP being the foundation model in numerous vision-language
applications, CLIP suffers from a severe text spotting bias. This bias
causes CLIP models to `parrot' the visual text embedded within images while
disregarding the authentic visual semantics. We uncover that in the most
popular image-text dataset, LAION-2B, the captions also densely parrot (spell
out) the text embedded in images. Our analysis shows that around 50% of the
images embed visual text content, and around 30% of caption words appear in
this embedded visual content. Based on this observation, we thoroughly inspect
the different released versions of CLIP models and verify that visual text
is the dominant factor in measuring LAION-style image-text similarity for
these models. To examine whether these parrot captions shape the text spotting
bias, we train a series of CLIP models on LAION subsets curated by different
parrot-caption-oriented criteria. We show that training with parrot captions
readily induces such bias but harms the expected vision-language representation
learning in CLIP models. This suggests that it is urgent to revisit either the
design of CLIP-like models or the existing image-text dataset curation pipeline
built on CLIP score filtering.
Comment: project page: https://linyq17.github.io/CLIP-Parrot-Bias/. Add more
analysis and ablation studies. Update Figure 3 with a more precise metric.
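To make the "parrot caption" notion concrete, here is a small sketch of the kind of overlap statistic such an analysis suggests (our illustration, not the authors' released code; `parrot_fraction` is a hypothetical helper, and the spotted text is assumed to come from an OCR pass over the image):

```python
import re

def parrot_fraction(caption: str, spotted_text: str) -> float:
    """Fraction of caption words that also appear in the text spotted
    inside the image; high values flag a likely 'parrot caption'."""
    def tokenize(s):
        return re.findall(r"[a-z0-9]+", s.lower())
    caption_words = tokenize(caption)
    if not caption_words:
        return 0.0
    spotted = set(tokenize(spotted_text))
    hits = sum(w in spotted for w in caption_words)
    return hits / len(caption_words)

# A caption that mostly spells out the rendered text scores high ...
print(parrot_fraction("SALE 50% off today", "SALE 50% OFF"))   # 0.75
# ... while one describing the visual scene scores low.
print(parrot_fraction("a dog running on the beach", "SALE"))   # 0.0
```

Thresholding such a score over a dataset is one simple way to curate the parrot-caption-oriented subsets the abstract refers to.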
SEPT: Towards Scalable and Efficient Visual Pre-Training
Recently, the self-supervised pre-training paradigm has shown great potential
in leveraging large-scale unlabeled data to improve downstream task
performance. However, increasing the scale of unlabeled pre-training data in
real-world scenarios requires prohibitive computational costs and faces the
challenge of uncurated samples. To address these issues, we build a
task-specific self-supervised pre-training framework from a data selection
perspective, based on a simple hypothesis: pre-training on unlabeled
samples whose distribution is similar to that of the target task can bring
substantial performance gains. Building on this hypothesis, we propose a
novel framework for Scalable and Efficient visual Pre-Training (SEPT) that
introduces a retrieval pipeline for data selection. SEPT first leverages a
self-supervised pre-trained model to extract features of the entire unlabeled
dataset to initialize the retrieval pipeline. Then, for a specific target
task, SEPT retrieves the most similar samples from the unlabeled dataset,
based on feature similarity, for each target instance. Finally, SEPT
pre-trains the target model on the selected unlabeled samples in a
self-supervised manner before fine-tuning on the target data. By decoupling
the scale of pre-training from the upstream data available for a target task,
SEPT achieves high scalability of the upstream dataset and high efficiency of
pre-training, as well as high flexibility in model architecture. Results on
various downstream tasks demonstrate that SEPT achieves competitive or even
better performance compared with ImageNet pre-training while reducing the
number of training samples by one order of magnitude, without resorting to
any extra annotations.
Comment: Accepted by AAAI 202
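The retrieval step described above can be sketched as follows (an illustrative reimplementation under our own assumptions, not the authors' code; the function name, feature dimensions, and pool sizes are hypothetical):

```python
import numpy as np

def select_pretraining_subset(pool_feats, target_feats, k):
    """For each target instance, retrieve its k most similar unlabeled
    samples by cosine similarity; the union of all retrieved indices
    forms the task-specific pre-training subset."""
    # L2-normalize rows so that dot products equal cosine similarities.
    pool = pool_feats / np.linalg.norm(pool_feats, axis=1, keepdims=True)
    tgt = target_feats / np.linalg.norm(target_feats, axis=1, keepdims=True)
    sims = tgt @ pool.T                         # shape: (n_target, n_pool)
    topk = np.argsort(-sims, axis=1)[:, :k]     # k nearest pool indices per target
    return np.unique(topk)                      # deduplicated subset indices

rng = np.random.default_rng(0)
pool_feats = rng.normal(size=(1000, 64))    # stand-in for unlabeled-pool features
target_feats = rng.normal(size=(10, 64))    # stand-in for target-task features
subset = select_pretraining_subset(pool_feats, target_feats, k=5)
print(subset.size)                          # at most 10 * 5 = 50 distinct indices
```

Because the subset size is bounded by `n_target * k` rather than by the pool size, the pre-training cost stays fixed as the unlabeled pool grows, which is the decoupling the abstract emphasizes.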
Structured air lasing of N2+
Structured light has attracted great interest in scientific and technical
fields. Here, we demonstrate the first generation of structured air lasing in
N2+ driven by 800 nm femtosecond laser pulses. By focusing a vortex pump beam
at 800 nm in N2 gas, we generate vortex superfluorescent radiation of N2+ at
391 nm, which carries the same photon orbital angular momentum as the pump
beam. With the injection of a Gaussian seed beam at 391 nm, the coherent
radiation is amplified, but the vorticity is unchanged. A new physical
mechanism is revealed in the vortex N2+ superfluorescent radiation: the vortex
pump beam transfers the spatial spiral phase into the N2+ gain medium, and the
Gaussian seed beam picks up the spatial spiral phase and is then amplified into
a vortex beam. Moreover, when we employ a pump beam with a cylindrical vector
mode, the Gaussian seed beam is correspondingly amplified into a cylindrical
vector beam. Surprisingly, the spatial polarization state of the amplified
radiation is identical to that of the vector pump beam regardless of whether
the Gaussian seed beam is linearly, elliptically, or circularly polarized.
Solving three-dimensional coupled wave equations, we show how a Gaussian beam
becomes a cylindrical vector beam in a cylindrically symmetric gain medium.
This study provides a novel approach to generating structured light via N2+ air
lasing.
Comment: 18 pages, 5 figures, 3 equations
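As a reminder of the notation involved (our own note, not from the paper): a vortex beam of topological charge l carries an azimuthal phase factor, and each of its photons carries orbital angular momentum \hbar l,

```latex
E(r,\varphi,z) = A(r,z)\, e^{i l \varphi}, \qquad L_z = \hbar l \ \text{per photon},
```

so the statement that the 391 nm radiation carries the same photon orbital angular momentum as the pump means the e^{il\varphi} phase structure of the 800 nm pump is reproduced at the emission wavelength.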