259 research outputs found

Towards the Transferable Audio Adversarial Attack via Ensemble Methods

Authors: Chen Yuxuan, Guo Feng, Ju Lei, Sun Zheng
Publication date: 18/04/2023

In recent years, deep learning (DL) models have achieved significant progress in many domains, such as autonomous driving, facial recognition, and speech recognition. However, the vulnerability of DL models to adversarial attacks has raised serious concerns in the community because of their insufficient robustness and generalization. Transferable attacks have also become a prominent method for black-box attacks. In this work, we explore the factors that may affect the transferability of adversarial examples (AEs) in DL-based speech recognition. We also discuss the vulnerability of different DL systems and the irregular nature of decision boundaries. Our results show a remarkable difference in AE transferability between speech and images: data relevance is low for images, but the opposite holds for speech recognition. Motivated by dropout-based ensemble approaches, we propose random gradient ensembles and dynamic gradient-weighted ensembles, and we evaluate the impact of ensembles on the transferability of AEs. The results show that the AEs created by both approaches are valid for transfer to the black-box API.

Comment: Submitted to Cybersecurity journal 202

arXiv.org e-Print Archive
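
The abstract above names "random gradient ensembles" without giving the algorithm, so the following is only a minimal sketch of one plausible reading: at each attack step, surrogate models are dropped at random (dropout-style) and the gradients of the remaining ones are averaged. The function name, the `keep_prob` parameter, the signed-gradient update, and the toy linear surrogates are all assumptions for illustration, not the authors' method.

```python
import numpy as np

def random_gradient_ensemble_step(x, grad_fns, keep_prob, step_size, rng):
    """One attack step using the mean gradient of a random subset of surrogates.

    Hypothetical helper: `grad_fns` is a list of callables, each returning the
    loss gradient of one surrogate model with respect to the input `x`.
    """
    # Dropout-style mask over models: keep each surrogate with probability keep_prob.
    keep = rng.random(len(grad_fns)) < keep_prob
    if not keep.any():
        # Keep at least one model so the averaged gradient is defined.
        keep[rng.integers(len(grad_fns))] = True
    grads = [fn(x) for fn, kept in zip(grad_fns, keep) if kept]
    mean_grad = np.mean(grads, axis=0)
    # Signed-gradient update (assumed here; the source does not specify the update rule).
    return x + step_size * np.sign(mean_grad)

# Toy usage: three linear "surrogates" whose loss gradient is a fixed weight vector.
rng = np.random.default_rng(0)
weights = rng.normal(size=(3, 16))
grad_fns = [lambda x, w=w: w for w in weights]
x = np.zeros(16)
for _ in range(10):
    x = random_gradient_ensemble_step(x, grad_fns, keep_prob=0.5, step_size=0.01, rng=rng)
```

Because each step moves every input coordinate by at most `step_size`, the total perturbation after ten steps is bounded by `10 * step_size` in the max norm, which is the usual way such attacks keep the perturbation small.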

Concepción, guion literario y desarrollo narrativo del cortometraje de ficción “Vecinos extraños”.

Authors: Feng Yuxuan, Qin Xuyang
Publication date: 01/01/2021

This work presents the conception, literary script, and narrative development of the original fiction short film “Vecinos extraños”. Throughout this master's thesis (TFM), the reader can consult the method followed to make the short film (including the assembly of the film crew and the filming equipment), as well as a detailed breakdown of the pre-production, shooting, and post-production phases. We then describe the personal experience that producing this TFM has meant for each of the authors (Feng Yuxuan and Qin Xuyang), as the culmination of a year immersed in the Máster de Cine, Comunicación e Industria Audiovisual at the Universidad de Valladolid. Finally, the reader can consult the complete literary script of “Vecinos extraños” in professional screenplay format and, of course, watch the short film via the link provided.

Departamento de Historia Moderna, Contemporánea y de América, Periodismo y Comunicación Audiovisual y Publicidad
Máster en Cine, comunicación e Industria Audiovisua

Repositorio Documental de la Universidad de Valladolid

LLaMA Rider: Spurring Large Language Models to Explore the Open World

Authors: Feng Yicheng, Liu Jiazheng, Lu Zongqing, Wang Yuxuan, Zheng Sipeng
Publication date: 13/10/2023

Recently, various studies have leveraged Large Language Models (LLMs) to support decision-making and planning in environments, and have tried to align the LLMs' knowledge with the world conditions. Nonetheless, the capacity of LLMs to continuously acquire environmental knowledge and adapt in an open world remains uncertain. In this paper, we propose an approach to spur LLMs to explore the open world, gather experiences, and learn to improve their task-solving capabilities. In this approach, a multi-round feedback-revision mechanism encourages LLMs to actively select appropriate revision actions guided by feedback from the environment. This facilitates exploration and enhances the model's performance. In addition, we integrate sub-task relabeling to help LLMs maintain consistency in sub-task planning and learn the combinatorial nature of tasks, enabling the model to complete a wider range of tasks through training on the acquired exploration experiences. Through evaluation in Minecraft, an open-ended sandbox world, we demonstrate that our approach, LLaMA-Rider, enhances the efficiency of the LLM in exploring the environment and effectively improves the LLM's ability to accomplish more tasks through fine-tuning with merely 1.3k instances of collected data, showing minimal training costs compared to the baseline using reinforcement learning.

Comment: 18 page

arXiv.org e-Print Archive

Integrating Relation Constraints with Neural Relation Extractors

Authors: Feng Yansong, Lai Yuxuan, Luo Bingfeng, Ye Yuan, Zhao Dongyan
Publication date: 26/11/2019

Recent years have seen rapid progress in identifying predefined relationships between entity pairs using neural networks (NNs). However, such models often make predictions for each entity pair individually and thus often fail to resolve inconsistencies among different predictions, which can be characterized by discrete relation constraints. These constraints are often defined over combinations of entity-relation-entity triples, since explicitly well-defined type and cardinality requirements for the relations are often lacking. In this paper, we propose a unified framework to integrate relation constraints with NNs by introducing a new loss term, ConstraintLoss. In particular, we develop two efficient methods to capture how well the local predictions from multiple instance pairs satisfy the relation constraints. Experiments on both English and Chinese datasets show that our approach can help NNs learn from discrete relation constraints to reduce inconsistency among local predictions, and can outperform popular neural relation extraction (NRE) models even when they are enhanced with extra post-processing. Our source code and datasets will be released at https://github.com/PKUYeYuan/Constraint-Loss-AAAI-2020.

Comment: Accepted to AAAI-202

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

How Many Answers Should I Give? An Empirical Study of Multi-Answer Reading Comprehension

Authors: Feng Yansong, Lai Yuxuan, Lin Jiuheng, Liu Xiao, Zhang Chen, Zhao Dongyan
Publication date: 01/06/2023

The multi-answer phenomenon, where a question may have multiple answers scattered across the document, is handled well by humans but remains challenging for machine reading comprehension (MRC) systems. Despite recent progress in multi-answer MRC, a systematic analysis of how this phenomenon arises and how to better address it is still lacking. In this work, we design a taxonomy to categorize commonly seen multi-answer MRC instances, with which we inspect three multi-answer datasets and analyze where the multi-answer challenge comes from. We further analyze how well different paradigms of current multi-answer MRC models deal with different types of multi-answer instances. We find that some paradigms capture the key information in the questions well, while others better model the relationship between questions and contexts. We thus explore strategies to make the best of the strengths of different paradigms. Experiments show that generation models can be a promising platform for incorporating different paradigms. Our annotations and code are released for further research.

Comment: Findings of ACL 202

arXiv.org e-Print Archive

Controlling Text-to-Image Diffusion by Orthogonal Finetuning

Authors: Feng Haiwen, Feng Yao, Liu Weiyang, Liu Zhen, Qiu Zeju, Schölkopf Bernhard, Weller Adrian, Xue Yuxuan, Zhang Dan
Publication date: 26/10/2023

Large text-to-image diffusion models have impressive capabilities in generating photorealistic images from text prompts. How to effectively guide or control these powerful models to perform different downstream tasks has become an important open problem. To tackle this challenge, we introduce a principled finetuning method, Orthogonal Finetuning (OFT), for adapting text-to-image diffusion models to downstream tasks. Unlike existing methods, OFT can provably preserve hyperspherical energy, which characterizes the pairwise neuron relationship on the unit hypersphere. We find that this property is crucial for preserving the semantic generation ability of text-to-image diffusion models. To improve finetuning stability, we further propose Constrained Orthogonal Finetuning (COFT), which imposes an additional radius constraint on the hypersphere. Specifically, we consider two important text-to-image finetuning tasks: subject-driven generation, where the goal is to generate subject-specific images given a few images of a subject and a text prompt, and controllable generation, where the goal is to enable the model to take in additional control signals. We empirically show that our OFT framework outperforms existing methods in generation quality and convergence speed.

Comment: NeurIPS 2023 (43 pages, 34 figures, project page: https://oft.wyliu.com/

arXiv.org e-Print Archive
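
The key property claimed in the abstract above, that an orthogonal transform leaves the pairwise relationship between neurons on the unit hypersphere unchanged, can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes neurons are stored as columns of a weight matrix and that the orthogonal matrix is built with a Cayley transform; the function names and matrix sizes are hypothetical.

```python
import numpy as np

def cayley_orthogonal(a):
    """Map an arbitrary square matrix to an orthogonal one via the Cayley transform."""
    s = a - a.T  # skew-symmetric part: s.T == -s
    eye = np.eye(a.shape[0])
    # R = (I + S)(I - S)^{-1}; I - S is invertible for any real skew-symmetric S.
    return (eye + s) @ np.linalg.inv(eye - s)

def pairwise_cosines(w):
    """Cosine similarity between every pair of neurons (columns of w)."""
    unit = w / np.linalg.norm(w, axis=0, keepdims=True)
    return unit.T @ unit

rng = np.random.default_rng(0)
w_pretrained = rng.normal(size=(8, 5))  # 5 neurons, each an 8-dimensional weight vector
r = cayley_orthogonal(0.1 * rng.normal(size=(8, 8)))  # stands in for the learned transform
w_finetuned = r @ w_pretrained  # all neurons are rotated by the same orthogonal matrix

# The rotation changes the weights but not the angles between neurons.
print(np.allclose(r.T @ r, np.eye(8)))
print(np.allclose(pairwise_cosines(w_pretrained), pairwise_cosines(w_finetuned)))
```

Since any quantity defined only through the angles or distances between normalized neurons is unchanged by `r`, this is the sense in which an orthogonal update can preserve hyperspherical energy while still changing the weights.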

Metasurface-Based Free-Space Multi-port Beam Splitter with Arbitrary Power Ratio

Authors: Cui Kaiyu, Feng Xue, Huang Yidong, Liao Yuxuan, Liu Fang, Tian Tian, Zhang Wei
Publication date: 22/03/2023

A beam splitter (BS) is one of the most critical building blocks in optical systems. Despite various reported attempts to miniaturize the conventional cube BS with flat-type BSs, it remains a challenge to realize an ultrathin optical BS with multi-port output, non-uniform splitting ratio, and steerable outgoing directions. Herein, we demonstrate a free-space optical multi-port beam splitter (MPBS) based on a polarization-independent all-dielectric metasurface. By applying an optimized phase-pattern paradigm, obtained via a gradient-descent-based iterative algorithm, to amorphous silicon (a-Si) metasurfaces, we have prepared a variety of MPBS samples with arbitrarily predetermined output port number (2 to 7), power ratio, and spatial distribution of output beams. The experimental results reveal that the fabricated MPBSs achieve high total splitting efficiency (TSE, above 74.7%) and beam-splitting fidelity (similarity, above 78.4%) within a bandwidth of 100 nm (1500 to 1600 nm). We envision that such MPBSs could provide great flexibility for integrated optical systems and diverse applications

arXiv.org e-Print Archive