422 research outputs found

Efficient Data Learning for Open Information Extraction with Pre-trained Language Models

Authors: Fan Zhiyuan, He Shizhu
Publication date: 23/10/2023

Open Information Extraction (OpenIE) is a fundamental yet challenging task in Natural Language Processing, which involves extracting all triples (subject, predicate, object) from a given sentence. While labeling-based methods have their merits, generation-based techniques offer unique advantages, such as the ability to generate tokens not present in the original sentence. However, these generation-based methods often require a significant amount of training data to learn the task form of OpenIE and substantial training time to overcome slow model convergence due to the order penalty. In this paper, we introduce a novel framework, OK-IE, that transforms the task form of OpenIE into the pre-training task form of the T5 model, thereby reducing the need for extensive training data. Furthermore, we introduce the concept of an Anchor to control the sequence of model outputs, eliminating the impact of the order penalty on model convergence and significantly reducing training time. Experimental results indicate that, compared to previous state-of-the-art (SOTA) methods, OK-IE requires only 1/100 of the training data (900 instances) and 1/120 of the training time (3 minutes) to achieve comparable results.

arXiv.org e-Print Archive

Efficient Algorithms for Sparse Moment Problems without Separation

Authors: Fan Zhiyuan, Li Jian
Publication date: 23/07/2023

We consider the sparse moment problem of learning a k-spike mixture in high-dimensional space from its noisy moment information in any dimension. We measure the accuracy of the learned mixtures using transportation distance. Previous algorithms either assume certain separation assumptions, use more recovery moments, or run in (super) exponential time. Our algorithm for the one-dimensional problem (also called the sparse Hausdorff moment problem) is a robust version of the classic Prony's method, and our contribution mainly lies in the analysis. We adopt a global and much tighter analysis than previous work (which analyzes the perturbation of the intermediate results of Prony's method). A useful technical ingredient is a connection between the linear system defined by the Vandermonde matrix and the Schur polynomial, which allows us to provide a tight perturbation bound independent of the separation and may be useful in other contexts. To tackle the high-dimensional problem, we first solve the two-dimensional problem by extending the one-dimensional algorithm and analysis to complex numbers. Our algorithm for the high-dimensional case determines the coordinates of each spike by aligning a 1d projection of the mixture to a random vector and a set of 2d projections of the mixture. Our results have applications to learning topic models and Gaussian mixtures, implying improved sample complexity results or running time over prior work.
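For orientation, the classic (noise-free) Prony's method that the abstract refers to can be sketched as follows. This is a textbook illustration, not the robust variant analyzed in the paper: the function name `prony` and the example numbers are assumptions made here, and the sketch assumes exact moments and distinct spikes (the Hankel system becomes singular when spikes coincide, which is the regime the paper's analysis addresses).

```python
import numpy as np

def prony(moments, k):
    """Classic Prony's method (illustrative, noise-free).

    Given moments M_j = sum_i w_i * a_i**j for j = 0..2k-1, recover the
    k spike locations a_i and weights w_i. Assumes distinct spikes.
    """
    m = np.asarray(moments, dtype=float)
    # Hankel system: sum_l c_l * M_{j+l} = -M_{j+k} for j = 0..k-1.
    H = np.array([[m[j + l] for l in range(k)] for j in range(k)])
    c = np.linalg.solve(H, -m[k:2 * k])
    # The spikes are the roots of x^k + c_{k-1} x^{k-1} + ... + c_0.
    # np.roots expects coefficients from the highest degree down.
    spikes = np.roots(np.concatenate(([1.0], c[::-1])))
    # Vandermonde system V[j, i] = a_i**j gives the weights from M_0..M_{k-1}.
    V = np.vander(spikes, N=k, increasing=True).T
    weights = np.linalg.solve(V, m[:k])
    return spikes, weights

# Hypothetical example: spikes at 0.2 and 0.7 with weights 0.4 and 0.6.
spikes, weights = prony([1.0, 0.5, 0.31, 0.209], k=2)
```

With noisy moments the intermediate polynomial coefficients and roots can shift substantially, which is why a perturbation analysis is needed before this procedure can be trusted in practice.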

arXiv.org e-Print Archive

UCF: Uncovering Common Features for Generalizable Deepfake Detection

Authors: Fan Yanbo, Wu Baoyuan, Yan Zhiyuan, Zhang Yong
Publication date: 28/10/2023

Deepfake detection remains a challenging task due to the difficulty of generalizing to new types of forgeries. This problem primarily stems from the overfitting of existing detection methods to forgery-irrelevant features and method-specific patterns. The latter has rarely been studied and is not well addressed by previous works. This paper presents a novel approach to address the two types of overfitting issues by uncovering common forgery features. Specifically, we first propose a disentanglement framework that decomposes image information into three distinct components: forgery-irrelevant, method-specific forgery, and common forgery features. To ensure the decoupling of method-specific and common forgery features, a multi-task learning strategy is employed, including a multi-class classification that predicts the category of the forgery method and a binary classification that distinguishes the real from the fake. Additionally, a conditional decoder is designed to utilize forgery features as a condition along with forgery-irrelevant features to generate reconstructed images. Furthermore, a contrastive regularization technique is proposed to encourage the disentanglement of the common and specific forgery features. Ultimately, we utilize only the common forgery features for generalizable deepfake detection. Extensive evaluations demonstrate that our framework generalizes better than current state-of-the-art methods.

arXiv.org e-Print Archive

Dielectric response of soft mode in ferroelectric SrTiO3

Authors: Han Jiaguang, Wan Fan, Zhang Weili, Zhu Zhiyuan
Publication venue: AIP Publishing
Publication date: 12/12/2006

We report far-infrared dielectric properties of powder-form ferroelectric SrTiO3. Terahertz time-domain spectroscopy (THz-TDS) measurement reveals that the low-frequency dielectric response of SrTiO3 is a consequence of the lowest transverse optical (TO) soft mode TO1 at 2.70 THz (90.0 1/cm), which is directly verified by Raman spectroscopy. This result provides a better understanding of the relation of the low-frequency dielectric function with the optical phonon soft mode for ferroelectric materials. Combining THz-TDS with Raman spectra, the overall low-frequency optical phonon response of SrTiO3 is presented in an extended spectral range from 6.7 1/cm to 1000.0 1/cm.
Comment: 14 pages; 4 figures
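As a quick check on the units quoted in the abstract, a frequency in THz converts to a wavenumber in 1/cm by dividing by the speed of light expressed in cm/s. The helper name `thz_to_wavenumber` below is an assumption introduced for illustration only.

```python
# Speed of light in cm/s (exact SI value, 299 792 458 m/s).
C_CM_PER_S = 2.99792458e10

def thz_to_wavenumber(f_thz):
    """Convert a frequency in THz to a wavenumber in 1/cm."""
    return f_thz * 1e12 / C_CM_PER_S

# The TO1 soft mode quoted at 2.70 THz.
wavenumber = thz_to_wavenumber(2.70)
```

The result is about 90.06 1/cm, which agrees with the quoted 90.0 1/cm to within the rounding of the reported frequency.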

arXiv.org e-Print Archive

Crossref

SHAREOK repository