Valley vortex states and degeneracy lifting via photonic higher-band excitation
We demonstrate valley-dependent vortex generation in a photonic graphene
lattice. Without breaking inversion symmetry, excitation of two equivalent
valleys leads to the formation of an optical vortex upon Bragg reflection
into the third valley, with its chirality determined by the valley degree of freedom.
Vortex-antivortex pairs with valley-dependent topological charge flipping are
also observed and corroborated by numerical simulations. Furthermore, we
develop a three-band effective Hamiltonian model to describe the dynamics of
the coupled valleys, and find that the commonly used two-band model is not
sufficient to explain the observed vortex degeneracy lifting. Such
valley-polarized vortex states arise from high-band excitation without
inversion symmetry breaking or synthetic-field-induced gap opening. Our results
from a photonic setting may provide insight into the study of valley-contrasting
and Berry-phase-mediated topological phenomena in other systems.
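As a rough illustration of why coupling three equivalent momentum states naturally produces vortex-carrying modes, consider a toy sketch (not the paper's actual three-band effective Hamiltonian): three Bragg-coupled valley states with equal energy E_0 and mutual coupling kappa.

```latex
% Toy sketch, not the paper's effective Hamiltonian: three equivalent,
% Bragg-coupled momentum states with equal energy E_0 and coupling \kappa.
\[
H_{\mathrm{toy}} =
\begin{pmatrix}
E_0 & \kappa & \kappa \\
\kappa & E_0 & \kappa \\
\kappa & \kappa & E_0
\end{pmatrix},
\qquad
|\psi_\pm\rangle = \frac{1}{\sqrt{3}}
\begin{pmatrix} 1 \\ \omega^{\pm 1} \\ \omega^{\pm 2} \end{pmatrix},
\quad \omega = e^{2\pi i/3}.
\]
```

The symmetric combination is shifted to E_0 + 2*kappa, while |psi_+> and |psi_-> accumulate a phase winding of +2*pi and -2*pi respectively, i.e. they are vortices of opposite chirality. The paper's three-band model refines this cartoon to account for the higher-band excitation and the observed vortex degeneracy lifting.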
YOLO-FaceV2: A Scale and Occlusion Aware Face Detector
In recent years, face detection algorithms based on deep learning have made
great progress. These algorithms can be generally divided into two categories,
i.e., two-stage detectors such as Faster R-CNN and one-stage detectors such as YOLO.
Because of the better balance between accuracy and speed, one-stage detectors
have been widely used in many applications. In this paper, we propose a
real-time face detector based on the one-stage detector YOLOv5, named
YOLO-FaceV2. We design a Receptive Field Enhancement module, called RFE, to
enlarge the receptive field for small faces, and use NWD loss to compensate for
the sensitivity of IoU to location deviations of tiny objects. To handle face
occlusion, we present an attention module named SEAM and introduce Repulsion
Loss. Moreover, we use a Slide weighting function to address the imbalance
between easy and hard samples, and use information about the effective
receptive field to design the anchors. Experimental results on the WiderFace
dataset show that our face detector outperforms YOLO and its variants on all of
the easy, medium, and hard subsets. Source code is available at
https://github.com/Krasjet-Yu/YOLO-FaceV
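To make the NWD idea concrete, here is a minimal sketch of a normalized Wasserstein distance loss between boxes modeled as 2D Gaussians, as described in the tiny-object-detection literature; it is not the authors' implementation, and the constant C is a dataset-dependent hyperparameter chosen here arbitrarily.

```python
import torch

def nwd_loss(pred, target, C=12.8):
    """pred, target: (N, 4) boxes given as (cx, cy, w, h)."""
    # Model each box as a 2D Gaussian N(mu, Sigma) with mu = (cx, cy) and
    # Sigma = diag((w/2)^2, (h/2)^2); the squared 2-Wasserstein distance between
    # two such Gaussians reduces to a Euclidean distance on (cx, cy, w/2, h/2).
    p = torch.cat([pred[:, :2], pred[:, 2:] / 2], dim=1)
    t = torch.cat([target[:, :2], target[:, 2:] / 2], dim=1)
    w2_sq = ((p - t) ** 2).sum(dim=1)
    nwd = torch.exp(-torch.sqrt(w2_sq.clamp(min=1e-12)) / C)
    return (1.0 - nwd).mean()

# Example: tiny boxes that barely overlap still yield a smooth, informative loss,
# unlike IoU, which collapses to zero once the overlap vanishes.
pred = torch.tensor([[10.0, 10.0, 4.0, 4.0]])
gt = torch.tensor([[12.0, 10.0, 4.0, 4.0]])
print(nwd_loss(pred, gt))
```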
QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models
Large Language Models (LLMs) excel in NLP, but their demands hinder their
widespread deployment. While Quantization-Aware Training (QAT) offers a
solution, its extensive training costs make Post-Training Quantization (PTQ) a
more practical approach for LLMs. Existing studies identify activation outliers
in particular channels as the bottleneck to PTQ accuracy and propose migrating
the outlier magnitudes from activations to weights, which, however, offers
limited alleviation or suffers from unstable gradients, resulting in a severe
performance drop at low bitwidths. In this paper, we propose QLLM, an
accurate and efficient low-bitwidth PTQ method designed for LLMs. QLLM
introduces an adaptive channel reassembly technique that reallocates the
magnitude of outliers to other channels, thereby mitigating their impact on the
quantization range. This is achieved by channel disassembly and channel
assembly: the former breaks outlier channels down into several sub-channels to
ensure a more balanced distribution of activation magnitudes, and the latter
merges similar channels to preserve the original channel count for
efficiency. Additionally, an adaptive strategy is designed to autonomously
determine the optimal number of sub-channels for channel disassembly. To
further compensate for the performance loss caused by quantization, we propose
an efficient tuning method that only learns a small number of low-rank weights
while freezing the pre-trained quantized model. After training, these low-rank
parameters can be fused into the frozen weights without affecting inference.
Extensive experiments on LLaMA-1 and LLaMA-2 show that QLLM can obtain accurate
quantized models efficiently. For example, QLLM quantizes LLaMA-2-70B to 4 bits
within 10 hours on a single A100-80G GPU, outperforming the previous
state-of-the-art method by 7.89% on the average accuracy across five zero-shot
tasks.
Comment: ICLR 2024 camera ready; code is available at
https://github.com/ziplab/QLLM and https://github.com/ModelTC/QLL
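The output-preserving nature of channel disassembly can be sketched as follows; the function name, the fixed expansion factor T, and the toy outlier are illustrative only, not QLLM's adaptive procedure or code.

```python
import torch

def disassemble_channel(x, W, ch, T):
    """x: (N, Cin) activations, W: (Cout, Cin) weights; split input channel `ch` into T parts."""
    # Each sub-channel carries 1/T of the outlier magnitude, shrinking the
    # per-channel range seen by the quantizer ...
    x_sub = x[:, ch:ch + 1].repeat(1, T) / T
    # ... while duplicating the matching weight column keeps the layer output identical.
    W_sub = W[:, ch:ch + 1].repeat(1, T)
    x_new = torch.cat([x[:, :ch], x[:, ch + 1:], x_sub], dim=1)
    W_new = torch.cat([W[:, :ch], W[:, ch + 1:], W_sub], dim=1)
    return x_new, W_new

x = torch.randn(8, 16); x[:, 3] *= 50            # channel 3 is an activation outlier
W = torch.randn(32, 16)
x2, W2 = disassemble_channel(x, W, ch=3, T=4)
assert torch.allclose(x2 @ W2.T, x @ W.T, rtol=1e-4, atol=1e-4)
```

Channel assembly would then merge similar channels so the layer keeps its original channel count; QLLM additionally chooses the number of sub-channels adaptively rather than using a fixed T as above.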
From Skepticism to Acceptance: Simulating the Attitude Dynamics Toward Fake News
In the digital era, the rapid propagation of fake news and rumors via social
networks brings notable societal challenges and impacts public opinion
regulation. Traditional fake news modeling typically forecasts the general
popularity trends of different groups or numerically represents opinion shifts.
However, these methods often oversimplify real-world complexities and overlook
the rich semantic information of news text. The advent of large language models
(LLMs) makes it possible to model the subtle dynamics of opinion.
Consequently, in this work, we introduce a Fake news Propagation Simulation
framework (FPS) based on LLMs, which studies the trends and control of fake news
propagation in detail. Specifically, each agent in the simulation represents an
individual with a distinct personality. They are equipped with both short-term
and long-term memory, as well as a reflective mechanism to mimic human-like
thinking. Every day, they engage in random opinion exchanges, reflect on their
thinking, and update their opinions. Our simulation results uncover patterns in
fake news propagation related to topic relevance and individual traits,
aligning with real-world observations. Additionally, we evaluate various
intervention strategies and demonstrate that early and appropriately frequent
interventions strike a balance between governance cost and effectiveness,
offering valuable insights for practical applications. Our study underscores
the significant utility and potential of LLMs in combating fake news.
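A minimal sketch of the daily exchange-and-reflect loop described above follows; the agent design is paraphrased from the abstract, and `llm_respond` is a hypothetical stand-in for a real LLM call, stubbed out here so the skeleton runs.

```python
import random

def llm_respond(persona, context, instruction):
    """Hypothetical stand-in for an LLM call; returns a canned string here."""
    return f"{persona}: reply to {len(context)} messages ({instruction})"

class Agent:
    def __init__(self, persona):
        self.persona = persona      # distinct personality per individual
        self.short_term = []        # today's exchanges
        self.long_term = []         # accumulated reflections
        self.opinion = "skeptical"

    def exchange(self, other, news):
        # Random pairwise opinion exchange about the news item.
        context = self.long_term[-3:] + self.short_term + [other.opinion, news]
        self.short_term.append(llm_respond(self.persona, context, "discuss"))

    def reflect(self):
        # Nightly reflection: summarize the day, update long-term memory and opinion.
        summary = llm_respond(self.persona, self.short_term, "reflect and update opinion")
        self.long_term.append(summary)
        self.opinion = summary
        self.short_term.clear()

def simulate(agents, news, days):
    for _ in range(days):
        for a in agents:
            a.exchange(random.choice([b for b in agents if b is not a]), news)
        for a in agents:
            a.reflect()

simulate([Agent(p) for p in ("cautious teacher", "credulous teen")], "fake claim X", days=3)
```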
Unconventional Flatband Line States in Photonic Lieb Lattices
Flatband systems typically host "compact localized states" (CLS) due to
destructive interference and macroscopic degeneracy of Bloch wave functions
associated with a dispersionless energy band. Using a photonic Lieb
lattice (LL), we show that conventional localized flatband states are inherently
incomplete, with the missing modes manifested as extended line states which
form non-contractible loops winding around the entire lattice. Experimentally,
we develop a continuous-wave laser writing technique to establish a
finite-sized photonic LL with specially tailored boundaries, thereby directly
observing the unusually extended flatband line states. Such unconventional line
states cannot be expressed as a linear combination of the previously observed
CLS but rather arise from the nontrivial real-space topology. The robustness of
the line states to imperfect excitation conditions is discussed, and their
potential applications are illustrated.
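For context, the standard nearest-neighbour tight-binding model of the Lieb lattice (hopping t, lattice constant set to 1; not the specific photonic parameters realized in the experiment) makes the flat band and its edge-centre character explicit.

```latex
% Illustrative sketch: basis = corner site A and edge-centre sites B, C.
\[
H(\mathbf{k}) = 2t
\begin{pmatrix}
0 & \cos(k_x/2) & \cos(k_y/2) \\
\cos(k_x/2) & 0 & 0 \\
\cos(k_y/2) & 0 & 0
\end{pmatrix},
\qquad
E_{\mathrm{flat}} = 0, \;
E_\pm(\mathbf{k}) = \pm 2t\sqrt{\cos^2(k_x/2) + \cos^2(k_y/2)}.
\]
```

The flat-band Bloch state, proportional to (0, cos(k_y/2), -cos(k_x/2)), lives only on the edge-centre sites; local superpositions of these states give the familiar single-plaquette CLS, while the non-contractible line states discussed above occupy an entire row or column of edge-centre sites with staggered signs.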
Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling
Quantization of transformer language models faces significant challenges due
to the existence of detrimental outliers in activations. We observe that these
outliers are asymmetric and concentrated in specific channels. To address this
issue, we propose the Outlier Suppression+ framework. First, we introduce
channel-wise shifting and scaling operations to eliminate the asymmetry and
scale down problematic channels. We demonstrate that these
operations can be seamlessly migrated into subsequent modules while maintaining
equivalence. Second, we quantitatively analyze the optimal values for shifting
and scaling, taking into account both the asymmetric property and quantization
errors of weights in the next layer. Our lightweight framework incurs only
minimal performance degradation under static and standard post-training
quantization settings. Comprehensive results across various tasks and models
reveal that our approach achieves near-floating-point performance on both small
models, such as BERT, and large language models (LLMs) including OPTs, BLOOM,
and BLOOMZ at 8-bit and 6-bit settings. Furthermore, we establish a new state
of the art for 4-bit BERT.
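The shifting-and-scaling equivalence can be sketched as follows; the particular choice of shift and scale (channel midpoint and half-range) is only illustrative, not the optimal values derived in the paper, and the point is that both are folded exactly into the next linear layer.

```python
import torch

def migrate(x, W, b):
    """x: (N, Cin) activations feeding a linear layer y = x @ W.T + b."""
    z = (x.max(dim=0).values + x.min(dim=0).values) / 2   # per-channel shift
    s = (x.max(dim=0).values - x.min(dim=0).values) / 2   # per-channel scale
    s = torch.clamp(s, min=1e-6)
    x_t = (x - z) / s          # centered, rescaled activations: easier to quantize
    W_t = W * s                # fold the scale into the weight columns
    b_t = b + W @ z            # fold the shift into the bias
    return x_t, W_t, b_t

x = torch.randn(4, 8); x[:, 2] += 30                      # one asymmetric outlier channel
W, b = torch.randn(16, 8), torch.randn(16)
x_t, W_t, b_t = migrate(x, W, b)
# The full-precision computation is unchanged after migration.
assert torch.allclose(x_t @ W_t.T + b_t, x @ W.T + b, rtol=1e-4, atol=1e-4)
```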