182 research outputs found

    Quasiphoton at the Subcycle Level in Strong-Field Ionization

    Full text link
    A photon is an energy quantum of light and, as such, does not exist at the sub-optical-cycle level. Exploiting the dynamical rotational symmetry of circularly or elliptically polarized light pulses, however, we demonstrate the existence of quasiphotons down to the subcycle level. We illustrate the concept of quasiphotons in strong-field ionization through the correlated spectrum of angular momentum and energy (SAME) of photoelectrons, both at the tunnel exit and in the asymptotic region. Moreover, we propose a protocol based on electron vortices to directly visualize the existence of quasiphotons. Our work paves the way toward a deeper understanding of fundamental light-matter interactions with photonic characteristics on the subcycle scale. Comment: 6 pages, 4 figures
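
    The symmetry argument behind the SAME correlation can be made concrete: for a circularly polarized field, each absorbed quantum raises the quasienergy by ℏω and the magnetic quantum number by one, so energy and angular momentum are locked together even on subcycle timescales. A minimal sketch in our own notation, not the paper's derivation:

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
A circularly polarized pulse is invariant under a rotation by $\omega\tau$
combined with a time translation by $\tau$:
\[
  H(\phi - \omega\tau,\ t + \tau) = H(\phi, t).
\]
Each absorbed quantum therefore shifts quasienergy and canonical angular
momentum in lockstep,
\[
  \varepsilon_n = \varepsilon_0 + n\hbar\omega, \qquad m_n = m_0 + n,
\]
so the photoelectron spectrum concentrates along a line in the $(m,E)$ plane:
\[
  E \simeq \hbar\omega\, m + \text{const}.
\]
\end{document}
```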

    M2C: Towards Automatic Multimodal Manga Complement

    Full text link
    Multimodal manga analysis aims to enhance manga understanding with visual and textual features and has attracted considerable attention from both the natural language processing and computer vision communities. Currently, most comics are hand-drawn and prone to problems such as missing pages, text contamination, and aging, which result in missing comic text content and seriously hinder human comprehension. In other words, the Multimodal Manga Complement (M2C) task, which aims to handle these issues by providing a shared semantic space for vision and language understanding, has not yet been investigated. To this end, we first propose the M2C task and establish a new benchmark dataset covering two languages. We then design a manga augmentation method called MCoT to mine event knowledge in comics with large language models. Finally, we propose an effective baseline, FVP-M^2, which uses fine-grained visual prompts to support manga complement. Extensive experimental results show the effectiveness of the FVP-M^2 method for Multimodal Manga Complement. Comment: EMNLP 2023. arXiv admin note: text overlap with arXiv:2210.1546
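
    As a rough illustration of what fine-grained visual prompting can look like, the sketch below encodes cropped panel regions and prepends them as prompt tokens to the text stream. All module and parameter names are hypothetical; this is not the paper's FVP-M^2 implementation:

```python
import torch
import torch.nn as nn

class FineGrainedVisualPrompt(nn.Module):
    """Encode cropped panel/balloon regions and prepend them as prompt
    tokens to a text decoder (a sketch of the idea, not FVP-M^2 itself)."""
    def __init__(self, vision_encoder: nn.Module, d_vis: int, d_model: int):
        super().__init__()
        self.vision_encoder = vision_encoder   # any image encoder -> (*, d_vis)
        self.proj = nn.Linear(d_vis, d_model)  # map to the decoder width

    def forward(self, region_crops: torch.Tensor,
                text_embs: torch.Tensor) -> torch.Tensor:
        # region_crops: (batch, n_regions, C, H, W) fine-grained crops
        b, n = region_crops.shape[:2]
        feats = self.vision_encoder(region_crops.flatten(0, 1))  # (b*n, d_vis)
        prompts = self.proj(feats).view(b, n, -1)                # (b, n, d_model)
        # Prepend visual prompt tokens to the text embedding sequence.
        return torch.cat([prompts, text_embs], dim=1)
```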

    Uneven-Layered Coding Metamaterial Tile for Ultrawideband RCS Reduction and Diffuse Scattering

    Get PDF
    In this paper, a novel uneven-layered coding metamaterial tile is proposed for ultra-wideband radar cross section (RCS) reduction and diffuse scattering. The metamaterial tile is composed of two kinds of square-ring unit cells with different layer thicknesses. The reflection phase difference of 180° (±37°) between the two unit cells covers an ultra-wide frequency range. Due to the phase cancellation between the two unit cells, the metamaterial tile has a scattering pattern of four strong lobes deviating from the normal direction. The metamaterial tile and its 90° rotation can be encoded as the '0' and '1' elements to cover an object, and a diffuse scattering pattern can be realized by optimizing the phase distribution, reducing the monostatic and bistatic RCSs simultaneously. The metamaterial tile achieves −10 dB RCS reduction from 6.2 GHz to 25.7 GHz, a ratio bandwidth of 4.15:1, at normal incidence. The measured and simulated results are in good agreement and validate that the proposed uneven-layered coding metamaterial tile can greatly expand the bandwidth for RCS reduction and diffuse scattering.
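
    The 180° (±37°) criterion maps directly onto the −10 dB target: for an equal mix of the two cells with near-unity reflection amplitude, the specular field scales with the phasor average of the two reflections. A quick numerical check (our own sketch, not the authors' code):

```python
import numpy as np

def specular_reduction_db(dphi_deg: float) -> float:
    """Normal-incidence specular RCS reduction for a 50/50 mix of two unit
    cells with near-unity reflection amplitude and phase difference dphi."""
    dphi = np.radians(dphi_deg)
    amp = abs((1 + np.exp(1j * dphi)) / 2)   # phasor average of the two cells
    return 20 * np.log10(max(amp, 1e-12))    # floor avoids log(0) at 180 deg

for d in (143.0, 180.0, 217.0):
    print(f"{d:5.1f} deg -> {specular_reduction_db(d):7.1f} dB")
# 180 deg cancels exactly; 143 and 217 deg (i.e. 180 +/- 37) give about
# -10 dB, matching the tolerance quoted in the abstract.
```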

    Metasurface based on uneven-layered fractal elements for ultra-wideband RCS reduction

    Get PDF
    A novel metasurface based on uneven-layered fractal elements is designed and fabricated for ultra-wideband radar cross section (RCS) reduction in this paper. The proposed metasurface consists of two fractal subwavelength elements with different layer thicknesses. The reflection phase difference of 180° (±37°) between the two unit cells covers an ultra-wide frequency range. Ultra-wideband RCS reduction results from the phase cancellation between the two local waves produced by these unit cells. The diffuse scattering of electromagnetic (EM) waves is caused by the randomized phase distribution, leading to low monostatic and bistatic RCSs simultaneously. This metasurface achieves −10 dB RCS reduction over an ultra-wide frequency range from 6.6 to 23.9 GHz, a ratio bandwidth (fH/fL) of 3.62:1, under normal incidence for both x- and y-polarized waves. The simulation and measurement results are consistent, verifying the excellent RCS reduction performance of the proposed metasurface.
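
    The diffuse-scattering effect of a randomized phase layout can be illustrated with a toy array-factor calculation: random 0/π element phases spread the reflected energy over many weak lobes instead of one specular spike. A minimal sketch that ignores element patterns and mutual coupling:

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 16, 0.5                                # 16x16 tiles, lambda/2 spacing
coding = rng.integers(0, 2, size=(N, N))      # random '0'/'1' tile layout
phase = np.pi * coding                        # 0 or pi reflection phase

# Sample the far-field array factor over direction cosines (u, v).
u = np.linspace(-1.0, 1.0, 121)
m = np.arange(N)
exp_u = np.exp(1j * 2 * np.pi * d * np.outer(u, m))   # steering along rows
exp_v = np.exp(1j * 2 * np.pi * d * np.outer(u, m))   # steering along columns
af = np.einsum("um,mn,vn->uv", exp_u, np.exp(1j * phase), exp_v)

peak_db = 20 * np.log10(np.abs(af).max() / N**2)
print(f"strongest lobe vs. a uniform plate: {peak_db:.1f} dB")
# The 0 dB specular spike of a bare plate is replaced by many weak lobes;
# optimizing (rather than just randomizing) the layout lowers the peak further.
```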

    GripRank: Bridging the Gap between Retrieval and Generation via the Generative Knowledge Improved Passage Ranking

    Full text link
    Retrieval-enhanced text generation, which aims to leverage passages retrieved from a large passage corpus to deliver a proper answer given an input query, has shown remarkable progress on knowledge-intensive language tasks such as open-domain question answering and knowledge-enhanced dialogue generation. However, the retrieved passages are not ideal for guiding answer generation because of the discrepancy between retrieval and generation: candidate passages are all treated equally during retrieval, without considering their potential to generate the proper answer. This discrepancy makes a passage retriever deliver a sub-optimal collection of candidate passages for answer generation. In this paper, we propose the GeneRative Knowledge Improved Passage Ranking (GripRank) approach, which addresses this challenge by distilling knowledge from a generative passage estimator (GPE) into a passage ranker, where the GPE is a generative language model that measures how likely each candidate passage is to generate the proper answer. We realize the distillation by training the passage ranker to rank the passages in the order given by the GPE. Furthermore, we improve distillation quality by devising a curriculum knowledge distillation mechanism, which allows the knowledge provided by the GPE to be progressively distilled into the ranker through an easy-to-hard curriculum, enabling the passage ranker to correctly recognize the provenance of the answer among many plausible candidates. We conduct extensive experiments on four datasets across three knowledge-intensive language tasks. Experimental results show advantages over state-of-the-art methods for both passage ranking and answer generation on the KILT benchmark. Comment: 11 pages, 4 figures
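
    One concrete way to realize such distillation is a listwise objective that matches the ranker's score distribution over candidates to the GPE's answer likelihoods. A minimal sketch in our own notation; the paper's exact loss and curriculum schedule may differ:

```python
import torch
import torch.nn.functional as F

def listwise_distill_loss(ranker_scores: torch.Tensor,
                          gpe_scores: torch.Tensor,
                          tau: float = 1.0) -> torch.Tensor:
    """KL divergence between the ranker's softmax over candidate passages
    and the GPE's distribution (e.g., log-likelihood of the gold answer
    conditioned on each passage). Shapes: (batch, n_candidates)."""
    student = F.log_softmax(ranker_scores / tau, dim=-1)
    teacher = F.softmax(gpe_scores / tau, dim=-1)
    return F.kl_div(student, teacher, reduction="batchmean")

# Curriculum idea: start with small, clearly separated candidate lists and
# grow them over epochs so hard negatives enter the loss gradually.
```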

    HanoiT: Enhancing Context-aware Translation via Selective Context

    Full text link
    Context-aware neural machine translation aims to use document-level context to improve translation quality. However, not all words in the context are helpful: irrelevant or trivial words may introduce noise and distract the model from learning the relationship between the current sentence and its auxiliary context. To mitigate this problem, we propose a novel end-to-end encoder-decoder model with a layer-wise selection mechanism to sift and refine the long document context. To verify the effectiveness of our method, extensive experiments and extra quantitative analysis are conducted on four document-level machine translation benchmarks. The experimental results demonstrate that, via the soft selection mechanism, our model significantly outperforms previous models on all datasets.
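
    The layer-wise soft selection can be pictured as a learned gate applied to the context token states at each encoder layer, so later layers attend to a progressively sifted context. A sketch of the mechanism (our own code, not the paper's module):

```python
import torch
import torch.nn as nn

class LayerwiseContextGate(nn.Module):
    """Soft selection of context tokens: each encoder layer re-scores the
    context states and down-weights irrelevant tokens before attention."""
    def __init__(self, d_model: int):
        super().__init__()
        self.scorer = nn.Linear(d_model, 1)

    def forward(self, context_states: torch.Tensor) -> torch.Tensor:
        # context_states: (batch, ctx_len, d_model)
        gate = torch.sigmoid(self.scorer(context_states))  # (batch, ctx_len, 1)
        return context_states * gate   # differentiable, so trained end-to-end
```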

    SAM3D: Zero-Shot 3D Object Detection via Segment Anything Model

    Full text link
    With the development of large language models, many remarkable linguistic systems like ChatGPT have thrived and achieved astonishing success on many tasks, showing the incredible power of foundation models. In the spirit of unleashing the capability of foundation models on vision tasks, the Segment Anything Model (SAM), a vision foundation model for image segmentation, has been proposed recently and presents strong zero-shot ability on many downstream 2D tasks. However, whether SAM can be adapted to 3D vision tasks has yet to be explored, especially 3D object detection. Inspired by this, we explore adapting the zero-shot ability of SAM to 3D object detection in this paper. We propose a SAM-powered BEV processing pipeline to detect objects and obtain promising results on the large-scale Waymo Open Dataset. As an early attempt, our method takes a step toward 3D object detection with vision foundation models and presents the opportunity to unleash their power on 3D vision tasks. The code is released at https://github.com/DYZhang09/SAM3D. Comment: Technical report; code at https://github.com/DYZhang09/SAM3D
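
    A SAM-powered BEV pipeline of this kind could be wired up roughly as follows: rasterize the LiDAR sweep into a bird's-eye-view image, run SAM's automatic mask generator, and lift each mask's bounding box back to metric coordinates. This is hypothetical glue code with assumed resolutions and file names, not the authors' released pipeline (see their repository for that):

```python
import numpy as np
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

RES, EXTENT = 0.1, 50.0    # 0.1 m/pixel over a 100 m x 100 m BEV window

def lidar_to_bev(points: np.ndarray) -> np.ndarray:
    """Rasterize one sweep (N, 4: x, y, z, intensity) into a 3-channel BEV
    image; intensity assumed in [0, 1], last point per cell wins (sketch)."""
    size = int(2 * EXTENT / RES)
    img = np.zeros((size, size), dtype=np.float32)
    xs = ((points[:, 0] + EXTENT) / RES).astype(int)
    ys = ((points[:, 1] + EXTENT) / RES).astype(int)
    keep = (xs >= 0) & (xs < size) & (ys >= 0) & (ys < size)
    img[ys[keep], xs[keep]] = points[keep, 3]
    return np.stack([img, img, img], axis=-1)   # SAM expects 3 channels

points = np.load("sweep.npy")                   # hypothetical input file
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
bev = (lidar_to_bev(points) * 255).astype(np.uint8)
masks = SamAutomaticMaskGenerator(sam).generate(bev)

# Lift each mask's pixel bbox (x, y, w, h) back to metric BEV coordinates.
boxes = [(m["bbox"][0] * RES - EXTENT, m["bbox"][1] * RES - EXTENT,
          m["bbox"][2] * RES, m["bbox"][3] * RES) for m in masks]
```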

    m3P: Towards Multimodal Multilingual Translation with Multimodal Prompt

    Full text link
    Multilingual translation supports multiple translation directions by projecting all languages into a shared space, but translation quality is undermined by differences between languages in the text-only modality, especially when the number of languages is large. To bridge this gap, we introduce visual context as a universal, language-independent representation to facilitate multilingual translation. In this paper, we propose m3P, a framework that leverages multimodal prompts to guide Multimodal Multilingual neural Machine Translation, aligning the representations of different languages with the same meaning and generating a conditional vision-language memory for translation. We construct a multilingual multimodal instruction dataset (InstrMulti102) to support 102 languages. Our method aims to minimize the representation distance between languages by regarding the image as a central language. Experimental results show that m3P outperforms previous text-only baselines and multilingual multimodal methods by a large margin. Furthermore, probing experiments validate the effectiveness of our method in enhancing translation under low-resource and massively multilingual scenarios. Comment: COLING 202
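
    The "image as a central language" idea suggests a simple alignment objective: pull the sentence representation of every language toward the shared image representation. A minimal sketch of such a loss (our own formulation, not necessarily the paper's exact objective):

```python
import torch
import torch.nn.functional as F

def image_centered_alignment(text_embs: torch.Tensor,
                             image_emb: torch.Tensor) -> torch.Tensor:
    """Mean cosine distance from each language's sentence embedding to the
    shared image embedding, treating the image as the central language."""
    text = F.normalize(text_embs, dim=-1)    # (n_langs, d), same meaning
    image = F.normalize(image_emb, dim=-1)   # (d,), the paired image
    return (1.0 - text @ image).mean()       # 0 when all languages align
```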

    MT4CrossOIE: Multi-stage Tuning for Cross-lingual Open Information Extraction

    Full text link
    Cross-lingual open information extraction aims to extract structured information from raw text across multiple languages. Previous work uses a shared cross-lingual pre-trained model to handle the different languages but underuses the potential of language-specific representations. In this paper, we propose an effective multi-stage tuning framework called MT4CrossOIE, designed to enhance cross-lingual open information extraction by injecting language-specific knowledge into the shared model. Specifically, the cross-lingual pre-trained model is first tuned in a shared semantic space (e.g., the embedding matrix) with the encoder kept fixed, and the remaining components are optimized in a second stage. After sufficient training, we freeze the pre-trained model and tune multiple extra low-rank language-specific modules using a mixture-of-LoRAs for model-based cross-lingual transfer. In addition, we leverage two-stage prompting to encourage a large language model (LLM) to annotate multilingual raw data for data-based cross-lingual transfer. The model is trained with multilingual objectives on our proposed dataset OpenIE4++, combining the model-based and data-based transfer techniques. Experimental results on various benchmarks emphasize the importance of aggregating multiple plug-and-play language-specific modules and demonstrate the effectiveness of MT4CrossOIE in cross-lingual OIE\footnote{\url{https://github.com/CSJianYang/Multilingual-Multimodal-NLP}}. Comment: 10 pages
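
    The mixture-of-LoRAs component can be sketched as a frozen base projection plus several low-rank language-specific deltas combined by a router. The following is a generic implementation of that technique, not the paper's code:

```python
import torch
import torch.nn as nn

class MixtureOfLoRAs(nn.Module):
    """Frozen base linear layer plus per-language low-rank adapters (LoRAs),
    mixed by a learned router (a sketch of the general technique)."""
    def __init__(self, base: nn.Linear, n_langs: int = 4, rank: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # keep pre-trained weights fixed
        d_in, d_out = base.in_features, base.out_features
        self.A = nn.Parameter(torch.randn(n_langs, d_in, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(n_langs, rank, d_out))  # no-op init
        self.router = nn.Linear(d_in, n_langs)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = torch.softmax(self.router(x), dim=-1)           # (..., E)
        deltas = torch.einsum("...i,eir,ero->...eo", x, self.A, self.B)
        return self.base(x) + torch.einsum("...e,...eo->...o", weights, deltas)
```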

    LogLG: Weakly Supervised Log Anomaly Detection via Log-Event Graph Construction

    Full text link
    Fully supervised log anomaly detection methods suffer from the heavy burden of annotating massive unlabeled log data. Recently, many semi-supervised methods have been proposed to reduce annotation costs with the help of parsed templates. However, these methods consider each keyword independently, disregarding the correlation between keywords and the contextual relationships among log sequences. In this paper, we propose a novel weakly supervised log anomaly detection framework, named LogLG, to explore the semantic connections among keywords from sequences. Specifically, we design an end-to-end iterative process in which the keywords of unlabeled logs are first extracted to construct a log-event graph. Then, we build a subgraph annotator to generate pseudo labels for unlabeled log sequences. To improve annotation quality, we adopt a self-supervised task to pre-train the subgraph annotator. After that, a detection model is trained with the generated pseudo labels. Conditioned on the classification results, we re-extract the keywords from the log sequences and update the log-event graph for the next iteration. Experiments on five benchmarks validate the effectiveness of LogLG for detecting anomalies in unlabeled log data and demonstrate that LogLG, as a state-of-the-art weakly supervised method, achieves significant performance improvements over existing methods. Comment: 12 pages
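
    The first step of each iteration, constructing the log-event graph from keyword co-occurrence, might look roughly like this (assuming keywords are already extracted; the paper's construction may weight edges differently):

```python
import itertools
import networkx as nx

def build_log_event_graph(keyword_sequences):
    """Connect keywords that co-occur within the same log sequence,
    accumulating co-occurrence counts as edge weights."""
    g = nx.Graph()
    for seq in keyword_sequences:          # each seq: list of keyword strings
        for u, v in itertools.combinations(sorted(set(seq)), 2):
            if g.has_edge(u, v):
                g[u][v]["weight"] += 1
            else:
                g.add_edge(u, v, weight=1)
    return g

g = build_log_event_graph([["disk", "error", "timeout"],
                           ["error", "retry", "timeout"]])
print(g["error"]["timeout"])   # {'weight': 2}
```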