
    Valley vortex states and degeneracy lifting via photonic higher-band excitation

    We demonstrate valley-dependent vortex generation in photonic graphene. Without breaking inversion symmetry, excitation of two equivalent valleys leads to the formation of an optical vortex upon Bragg reflection to the third valley, with its chirality determined by the valley degree of freedom. Vortex-antivortex pairs with valley-dependent topological charge flipping are also observed and corroborated by numerical simulations. Furthermore, we develop a three-band effective Hamiltonian model to describe the dynamics of the coupled valleys, and find that the commonly used two-band model is not sufficient to explain the observed lifting of the vortex degeneracy. Such valley-polarized vortex states arise from higher-band excitation without inversion-symmetry breaking or synthetic-field-induced gap opening. Our results from a photonic setting may provide insight into the study of valley contrasting and Berry-phase-mediated topological phenomena in other systems.
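    For intuition only, a toy coupled-mode Hamiltonian for three equivalent valleys with equal Bragg coupling \kappa (an illustrative assumption, not the three-band model derived in the paper) already shows where a pair of degenerate vortex states can come from:

    H_{\mathrm{toy}} =
    \begin{pmatrix}
    \omega_0 & \kappa & \kappa \\
    \kappa & \omega_0 & \kappa \\
    \kappa & \kappa & \omega_0
    \end{pmatrix},
    \qquad
    \lambda_1 = \omega_0 + 2\kappa, \quad \lambda_{2,3} = \omega_0 - \kappa .

    The degenerate pair is spanned by (1, e^{\pm i 2\pi/3}, e^{\mp i 2\pi/3}); superposing the three valley beams with these \pm 2\pi/3 phase steps produces a vortex or antivortex of charge \pm 1, so in this toy picture any valley-dependent mechanism that selects one of the two combinations fixes the vortex chirality.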

    YOLO-FaceV2: A Scale and Occlusion Aware Face Detector

    In recent years, face detection algorithms based on deep learning have made great progress. These algorithms can generally be divided into two categories, i.e., two-stage detectors like Faster R-CNN and one-stage detectors like YOLO. Because of their better balance between accuracy and speed, one-stage detectors have been widely used in many applications. In this paper, we propose a real-time face detector based on the one-stage detector YOLOv5, named YOLO-FaceV2. We design a Receptive Field Enhancement module, called RFE, to enhance the receptive field of small faces, and use an NWD loss to compensate for the sensitivity of IoU to the location deviation of tiny objects. To handle face occlusion, we present an attention module named SEAM and introduce a Repulsion Loss. Moreover, we use a weighting function, Slide, to address the imbalance between easy and hard samples, and use information about the effective receptive field to design the anchors. Experimental results on the WiderFace dataset show that our face detector outperforms YOLO and its variants on all of the easy, medium, and hard subsets. Source code is available at https://github.com/Krasjet-Yu/YOLO-FaceV
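    As a rough sketch of the NWD idea referenced above (following the normalized Gaussian Wasserstein distance formulation for tiny objects; the constant C, the box parameterization, and the function name are illustrative assumptions, not the paper's implementation):

    import numpy as np

    def nwd_loss(pred, target, C=12.8, eps=1e-7):
        """Normalized-Wasserstein-distance loss between axis-aligned boxes.

        Boxes are (cx, cy, w, h). Each box is modeled as a 2-D Gaussian
        N([cx, cy], diag((w/2)^2, (h/2)^2)); the squared 2-Wasserstein
        distance between two such Gaussians has the closed form below.
        C is a dataset-dependent scale (an illustrative default here).
        """
        d_center = (pred[..., 0] - target[..., 0]) ** 2 + (pred[..., 1] - target[..., 1]) ** 2
        d_size = ((pred[..., 2] - target[..., 2]) ** 2 + (pred[..., 3] - target[..., 3]) ** 2) / 4
        w2 = np.sqrt(np.maximum(d_center + d_size, eps))  # 2-Wasserstein distance
        nwd = np.exp(-w2 / C)                             # similarity in (0, 1]
        return 1.0 - nwd                                  # 0 when the boxes coincide

    Unlike IoU, this distance stays smooth and informative even when two tiny boxes do not overlap at all, which is the sensitivity issue the abstract points to.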

    QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models

    Large Language Models (LLMs) excel in NLP, but their demands hinder their widespread deployment. While Quantization-Aware Training (QAT) offers a solution, its extensive training costs make Post-Training Quantization (PTQ) a more practical approach for LLMs. Existing studies identify activation outliers in particular channels as the bottleneck to PTQ accuracy and propose to migrate the outlier magnitudes from activations to weights; however, this offers limited alleviation or suffers from unstable gradients, resulting in a severe performance drop at low bitwidths. In this paper, we propose QLLM, an accurate and efficient low-bitwidth PTQ method designed for LLMs. QLLM introduces an adaptive channel reassembly technique that reallocates the magnitude of outliers to other channels, thereby mitigating their impact on the quantization range. This is achieved by channel disassembly and channel assembly: the outlier channels are first broken down into several sub-channels to ensure a more balanced distribution of activation magnitudes, and similar channels are then merged to maintain the original channel count for efficiency. Additionally, an adaptive strategy is designed to autonomously determine the optimal number of sub-channels for channel disassembly. To further compensate for the performance loss caused by quantization, we propose an efficient tuning method that only learns a small number of low-rank weights while freezing the pre-trained quantized model. After training, these low-rank parameters can be fused into the frozen weights without affecting inference. Extensive experiments on LLaMA-1 and LLaMA-2 show that QLLM can obtain accurate quantized models efficiently. For example, QLLM quantizes the 4-bit LLaMA-2-70B within 10 hours on a single A100-80G GPU, outperforming the previous state-of-the-art method by 7.89% on average accuracy across five zero-shot tasks. Comment: ICLR 2024 camera ready; code is available at https://github.com/ziplab/QLLM and https://github.com/ModelTC/QLL
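    A minimal numpy sketch of the channel disassembly/assembly idea for a linear layer y = x @ W (how outliers are detected, how many sub-channels to use, and which channels count as "similar" are the adaptive parts of QLLM and are not reproduced here; the helper names are hypothetical):

    import numpy as np

    def disassemble_channel(x, W, idx, T):
        """Split input channel `idx` into T sub-channels carrying x[:, idx] / T.

        Duplicating the matching row of W keeps x_new @ W_new == x @ W exactly,
        while shrinking the range of the outlier channel by a factor of T.
        """
        x_new = np.concatenate([x, np.repeat(x[:, idx:idx + 1] / T, T - 1, axis=1)], axis=1)
        x_new[:, idx] = x[:, idx] / T
        W_new = np.concatenate([W, np.repeat(W[idx:idx + 1, :], T - 1, axis=0)], axis=0)
        return x_new, W_new

    def assemble_channels(x, W, i, j):
        """Merge two similar input channels (assumes i < j) by averaging their
        activations and summing the corresponding weight rows; this restores the
        channel count at the cost of a small error when x[:, i] != x[:, j]."""
        x_m = np.delete(x, j, axis=1)
        x_m[:, i] = (x[:, i] + x[:, j]) / 2
        W_m = np.delete(W, j, axis=0)
        W_m[i, :] = W[i, :] + W[j, :]
        return x_m, W_m

    Disassembly alone is lossless but grows the channel dimension; assembly trades a small approximation error to keep the layer shape, which is why the two are applied together.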

    From Skepticism to Acceptance: Simulating the Attitude Dynamics Toward Fake News

    In the digital era, the rapid propagation of fake news and rumors via social networks brings notable societal challenges and impacts public opinion regulation. Traditional fake-news modeling typically forecasts the general popularity trends of different groups or numerically represents opinion shifts. However, these methods often oversimplify real-world complexities and overlook the rich semantic information of news text. The advent of large language models (LLMs) provides the possibility of modeling the subtle dynamics of opinion. Consequently, in this work we introduce a Fake news Propagation Simulation framework (FPS) based on LLMs, which studies the trends and control of fake news propagation in detail. Specifically, each agent in the simulation represents an individual with a distinct personality. Agents are equipped with both short-term and long-term memory, as well as a reflective mechanism to mimic human-like thinking. Every day, they engage in random opinion exchanges, reflect on their thinking, and update their opinions. Our simulation results uncover patterns in fake news propagation related to topic relevance and individual traits, aligning with real-world observations. Additionally, we evaluate various intervention strategies and demonstrate that early and appropriately frequent interventions strike a balance between governance cost and effectiveness, offering valuable insights for practical applications. Our study underscores the significant utility and potential of LLMs in combating fake news.
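    A minimal sketch of the agent loop described above, assuming a generic `llm` callable that maps a prompt string to a reply (the class layout, prompts, and memory sizes are hypothetical illustrations, not FPS's actual code):

    import random

    class Agent:
        """One simulated individual with persona, memory, and a daily reflection step."""

        def __init__(self, name, persona, llm):
            self.name, self.persona, self.llm = name, persona, llm
            self.short_term = []         # today's exchanges, cleared after reflection
            self.long_term = []          # distilled reflections
            self.attitude = "skeptical"  # current stance toward the news item

        def chat(self, other, news):
            """Exchange opinions with another agent about the news item."""
            msg = self.llm(f"You are {self.persona}. Your attitude: {self.attitude}. "
                           f"Discuss '{news}' with {other.name} ({other.attitude}).")
            self.short_term.append(msg)
            other.short_term.append(msg)

        def reflect(self, news):
            """Summarize the day's exchanges into long-term memory and update the attitude."""
            summary = self.llm("Reflect on today's conversations:\n" + "\n".join(self.short_term))
            self.long_term.append(summary)
            self.attitude = self.llm(f"Given your memories {self.long_term[-3:]}, "
                                     f"state your attitude toward '{news}' in one word.")
            self.short_term.clear()

    def simulate(agents, news, days):
        """Daily random pairwise exchanges followed by per-agent reflection."""
        for _ in range(days):
            for a in agents:
                a.chat(random.choice([b for b in agents if b is not a]), news)
            for a in agents:
                a.reflect(news)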

    Unconventional Flatband Line States in Photonic Lieb Lattices

    Flatband systems typically host "compact localized states" (CLS) due to destructive interference and macroscopic degeneracy of the Bloch wave functions associated with a dispersionless energy band. Using a photonic Lieb lattice (LL), we show that conventional localized flatband states are inherently incomplete, with the missing modes manifested as extended line states which form non-contractible loops winding around the entire lattice. Experimentally, we develop a continuous-wave laser writing technique to establish a finite-sized photonic LL with specially tailored boundaries, thereby directly observing the unusually extended flatband line states. Such unconventional line states cannot be expressed as a linear combination of the previously observed CLS but rather arise from the nontrivial real-space topology. The robustness of the line states to imperfect excitation conditions is discussed, and their potential applications are illustrated.
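    For background, the flat band in question can be illustrated with the standard nearest-neighbour tight-binding model of the Lieb lattice (a textbook discrete model, not the paper's continuum photonic simulation):

    import numpy as np

    def lieb_bands(kx, ky, t=1.0, a=1.0):
        """Bands of the nearest-neighbour tight-binding Lieb lattice
        (three sites per unit cell). One eigenvalue is exactly zero
        for every k: the dispersionless (flat) band."""
        fx = 2 * t * np.cos(kx * a / 2)
        fy = 2 * t * np.cos(ky * a / 2)
        H = np.array([[0, fx, fy],
                      [fx, 0, 0],
                      [fy, 0, 0]], dtype=float)
        return np.linalg.eigvalsh(H)  # ascending: -r, 0, +r

    # The middle band stays pinned at zero across the Brillouin zone.
    ks = np.linspace(-np.pi, np.pi, 7)
    flat = [lieb_bands(kx, ky)[1] for kx in ks for ky in ks]
    assert np.allclose(flat, 0.0)

    The CLS and the non-contractible line states discussed in the abstract are real-space combinations drawn from this macroscopically degenerate zero-energy band.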

    Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling

    Quantization of transformer language models faces significant challenges due to the existence of detrimental outliers in activations. We observe that these outliers are asymmetric and concentrated in specific channels. To address this issue, we propose the Outlier Suppression+ framework. First, we introduce channel-wise shifting and scaling operations to eliminate the asymmetric presentation and scale down problematic channels. We demonstrate that these operations can be seamlessly migrated into subsequent modules while maintaining equivalence. Second, we quantitatively analyze the optimal values for shifting and scaling, taking into account both the asymmetric property and the quantization errors of the weights in the next layer. Our lightweight framework incurs minimal performance degradation under static and standard post-training quantization settings. Comprehensive results across various tasks and models reveal that our approach achieves near-floating-point performance on both small models, such as BERT, and large language models (LLMs) including OPTs, BLOOM, and BLOOMZ at 8-bit and 6-bit settings. Furthermore, we establish a new state of the art for 4-bit BERT.
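    A minimal numpy sketch of the equivalence claim, assuming the layer that follows the shifted/scaled activations is a plain linear map y = x @ W + b (the choice of optimal z and s is the part the paper analyzes and is not reproduced here; the function name is hypothetical):

    import numpy as np

    def migrate_shift_scale(W, b, z, s):
        """Fold a channel-wise shift z and scale s of the input activations into
        the next linear layer, so that ((x - z) / s) @ W' + b' == x @ W + b."""
        W_prime = s[:, None] * W  # rescale each input row of W
        b_prime = b + z @ W       # absorb the shift into the bias
        return W_prime, b_prime

    # Quick check of the equivalence.
    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 8)); W = rng.normal(size=(8, 3)); b = rng.normal(size=3)
    z = x.mean(0)                 # illustrative shift: per-channel mean
    s = np.abs(x).max(0)          # illustrative scale: per-channel max magnitude
    W2, b2 = migrate_shift_scale(W, b, z, s)
    assert np.allclose(((x - z) / s) @ W2 + b2, x @ W + b)

    Because the transformed activations (x - z) / s are symmetric and have tamer per-channel ranges, they quantize with far less clipping error, while the folded W', b' leave the full-precision computation unchanged.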