120 research outputs found

    Echotune: A Modular Extractor Leveraging the Variable-Length Nature of Speech in ASR Tasks

    Full text link
The Transformer architecture has proven highly effective for Automatic Speech Recognition (ASR) tasks, becoming a foundational component for a plethora of research in the domain. Historically, many approaches have leaned on fixed-length attention windows, which become problematic for speech samples that vary in duration and complexity, leading to data over-smoothing and neglect of essential long-term connectivity. Addressing this limitation, we introduce Echo-MSA, a nimble module equipped with a variable-length attention mechanism that accommodates a range of speech sample complexities and durations. This module offers the flexibility to extract speech features across various granularities, spanning from frames and phonemes to words and discourse. The proposed design captures the variable-length nature of speech and addresses the limitations of fixed-length attention. Our evaluation leverages a parallel attention architecture complemented by a dynamic gating mechanism that amalgamates traditional attention with the output of the Echo-MSA module. Empirical evidence from our study reveals that integrating Echo-MSA into the primary model's training regime significantly improves word error rate (WER) performance, all while preserving the intrinsic stability of the original model.
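The module itself is not reproduced in the abstract; purely as an illustration of the gating idea (mixing standard attention with multi-span attention at several granularities), here is a minimal PyTorch sketch. Everything in it is an assumption: the class name, the banded masks standing in for the variable-length mechanism, and the per-token sigmoid gate.

```python
# Hypothetical sketch, loosely inspired by the Echo-MSA description: a global
# attention branch mixed with several local (banded) branches of different
# window sizes via a learned per-token gate. Not the paper's code.
import torch
import torch.nn as nn

class GatedMultiSpanAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int = 4, windows=(4, 16, 64)):
        super().__init__()
        self.global_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # One local branch per window size (roughly: frame/phoneme/word scale).
        self.local_attns = nn.ModuleList(
            nn.MultiheadAttention(dim, num_heads, batch_first=True)
            for _ in windows
        )
        self.windows = windows
        self.gate = nn.Linear(dim, 1)   # dynamic gate between the two branches

    def _band_mask(self, n: int, w: int, device) -> torch.Tensor:
        idx = torch.arange(n, device=device)
        # True = masked out; keeps only a band of width w around each position.
        return (idx[None, :] - idx[:, None]).abs() > w

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n = x.size(1)
        g, _ = self.global_attn(x, x, x)
        local = torch.zeros_like(x)
        for attn, w in zip(self.local_attns, self.windows):
            out, _ = attn(x, x, x, attn_mask=self._band_mask(n, w, x.device))
            local = local + out / len(self.windows)
        alpha = torch.sigmoid(self.gate(x))   # per-token mixing weight
        return alpha * g + (1 - alpha) * local

x = torch.randn(2, 100, 128)                  # (batch, frames, feature dim)
print(GatedMultiSpanAttention(128)(x).shape)  # torch.Size([2, 100, 128])
```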

Assessing Client Creditworthiness Using Classification Trees

    Get PDF
The credit-granting process leads to a choice between two actions: give the new applicant credit, or refuse it. In the thesis, we use the CART algorithm with 10-fold cross-validation to build a model able to predict the creditworthiness of a new applicant.
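The stated pipeline maps directly onto scikit-learn, whose DecisionTreeClassifier implements CART; a minimal sketch with placeholder data (the thesis's actual applicant features and labels are assumed):

```python
# Minimal sketch of the stated approach: CART evaluated with 10-fold
# cross-validation. X and y below are random placeholders for the thesis's
# applicant features and good/bad credit labels.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))        # placeholder applicant features
y = rng.integers(0, 2, size=500)     # placeholder creditworthiness labels

cart = DecisionTreeClassifier(max_depth=5, min_samples_leaf=20, random_state=0)
scores = cross_val_score(cart, X, y, cv=10, scoring="accuracy")
print(f"10-fold CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```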

Informational Efficiency of Stock Markets

    Get PDF
The efficient market hypothesis is one of the possible approaches to explaining the movements of asset prices in financial markets. In this thesis, the author examines whether the main indexes of selected Asian stock markets (mainland China, Hong Kong, Japan) follow the random walk hypothesis. The basic testing period runs from 2010 to 2020, using data of daily frequency. The predictability of returns in the form of random walk models is investigated with the help of selected statistical tests. The application framework combines mathematical models of efficient markets with linear and nonlinear testing procedures. The random walk hypothesis is rejected most often for the SSE50 index when the whole data sample is used. In addition, the dynamics of the BDS independence test and variance ratio test results over time are evaluated as well. To assess the significance of the applied tests, bootstrap methods are utilized. Based on the dynamic results of the statistical tests, an investment strategy proposal is also discussed.
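Among the tests mentioned, the variance ratio test has a compact form: under a random walk, the variance of q-period returns equals q times the variance of one-period returns, so VR(q) = Var(r_t(q)) / (q Var(r_t)) should be close to 1. A minimal sketch of the basic Lo-MacKinlay statistic under the i.i.d. null (the heteroskedasticity-robust variant and the bootstrap evaluation the thesis mentions are omitted):

```python
# Minimal sketch of the Lo-MacKinlay variance ratio test under the i.i.d.
# random-walk null. Uses overlapping q-period returns; not the robust or
# bootstrapped versions.
import numpy as np

def variance_ratio(log_prices: np.ndarray, q: int):
    r = np.diff(log_prices)                    # one-period log returns
    T = len(r)
    mu = r.mean()
    var_1 = np.sum((r - mu) ** 2) / T
    # Overlapping q-period returns.
    rq = np.array([r[i:i + q].sum() for i in range(T - q + 1)])
    var_q = np.sum((rq - q * mu) ** 2) / (q * (T - q + 1))
    vr = var_q / var_1
    # Asymptotic z-statistic under the i.i.d. null.
    z = (vr - 1) / np.sqrt(2 * (2 * q - 1) * (q - 1) / (3 * q * T))
    return vr, z

prices = np.cumsum(np.random.default_rng(1).normal(size=2500))  # fake index
print(variance_ratio(prices, q=5))   # VR near 1, |z| small for a random walk
```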

    On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection

    Full text link
Detecting adversarial samples that are carefully crafted to fool the model is a critical step for security-sensitive applications. However, existing adversarial detection methods require access to sufficient training data, which raises noteworthy concerns about privacy leakage and generalizability. In this work, we validate that the adversarial samples generated by attack algorithms are strongly related to a specific vector in the high-dimensional input space. Such vectors, namely UAPs (Universal Adversarial Perturbations), can be calculated without the original training data. Based on this discovery, we propose a data-agnostic adversarial detection framework, which induces different responses to UAPs between normal and adversarial samples. Experimental results show that our method achieves competitive detection performance on various text classification tasks and maintains time consumption equivalent to normal inference. Comment: Accepted by ACL2023 (Short Paper).
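A hedged sketch of the detection idea for an embedding-space UAP: compare a classifier's output distribution on an input with and without the precomputed perturbation, and flag inputs whose response is anomalously large. The model/embed interfaces (HuggingFace-style inputs_embeds), the uap tensor, and the threshold are all assumptions, not the paper's code:

```python
# Hypothetical sketch: probe how a classifier's predictions shift when a
# precomputed UAP is added in embedding space. `model`, `embed`, and `uap`
# are assumed; the decision threshold is illustrative only.
import torch
import torch.nn.functional as F

@torch.no_grad()
def uap_response(model, embed, input_ids, uap, eps=1.0):
    e = embed(input_ids)                                   # (batch, seq, dim)
    p_clean = F.softmax(model(inputs_embeds=e).logits, dim=-1)
    p_pert = F.softmax(model(inputs_embeds=e + eps * uap).logits, dim=-1)
    # KL divergence between clean and UAP-perturbed predictions, per input.
    return (p_clean * (p_clean / p_pert).log()).sum(-1)

def is_adversarial(response: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    # Flag inputs whose UAP response crosses a validation-set threshold.
    return response > tau
```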

    DSRM: Boost Textual Adversarial Training with Distribution Shift Risk Minimization

    Full text link
Adversarial training is one of the best-performing methods for improving the robustness of deep language models. However, robust models come at the cost of high time consumption, as they require multi-step gradient ascents or word substitutions to obtain adversarial samples. In addition, these generated samples are deficient in grammatical quality and semantic consistency, which impairs the effectiveness of adversarial training. To address these problems, we introduce a novel, effective procedure for adversarial training that instead uses only clean data. Our procedure, distribution shift risk minimization (DSRM), estimates the adversarial loss by perturbing the probability distribution of the input data rather than their embeddings. This formulation results in a robust model that minimizes the expected global loss under adversarial attacks. Our approach requires zero adversarial samples for training and reduces time consumption by up to 70% compared to current best-performing adversarial training methods. Experiments demonstrate that DSRM considerably improves BERT's resistance to textual adversarial attacks and achieves state-of-the-art robust accuracy on various benchmarks. Comment: Accepted by ACL2023.
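The exact DSRM estimator is in the paper; in spirit, perturbing the data distribution rather than the embeddings resembles distributionally robust optimization, where the worst-case reweighting within a KL ball has the closed form w_i ∝ exp(loss_i / τ). A hedged sketch of that generic form (not necessarily the paper's estimator):

```python
# Hedged sketch of training on an adversarially reweighted batch, in the
# spirit of shifting the input distribution rather than the embeddings.
# Exponential tilting is the generic DRO form, not the exact DSRM recipe.
import torch

def dsrm_style_loss(per_sample_loss: torch.Tensor, temperature: float = 1.0):
    # Worst-case reweighting within a KL ball: w_i ∝ exp(loss_i / temperature),
    # so hard (high-loss) samples are upweighted.
    w = torch.softmax(per_sample_loss.detach() / temperature, dim=0)
    return (w * per_sample_loss).sum()

losses = torch.tensor([0.2, 1.5, 0.4, 2.3], requires_grad=True)
dsrm_style_loss(losses).backward()   # gradients concentrate on hard samples
print(losses.grad)
```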

    TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models

    Full text link
Aligned large language models (LLMs) demonstrate exceptional capabilities in task-solving, following instructions, and ensuring safety. However, the continual learning aspect of these aligned LLMs has been largely overlooked. Existing continual learning benchmarks lack sufficient challenge for leading aligned LLMs, owing to both their simplicity and the models' potential exposure during instruction tuning. In this paper, we introduce TRACE, a novel benchmark designed to evaluate continual learning in LLMs. TRACE consists of 8 distinct datasets spanning challenging tasks including domain-specific tasks, multilingual capabilities, code generation, and mathematical reasoning. All datasets are standardized into a unified format, allowing for effortless automatic evaluation of LLMs. Our experiments show that after training on TRACE, aligned LLMs exhibit significant declines in both general ability and instruction-following capabilities. For example, the accuracy of llama2-chat-13B on the gsm8k dataset declined precipitously from 28.8% to 2% after training on our datasets. This highlights the challenge of finding a suitable tradeoff between achieving performance on specific tasks and preserving the original prowess of LLMs. Empirical findings suggest that tasks inherently equipped with reasoning paths contribute significantly to preserving certain capabilities of LLMs against potential declines. Motivated by this, we introduce the Reasoning-augmented Continual Learning (RCL) approach. RCL integrates task-specific cues with meta-rationales, effectively reducing catastrophic forgetting in LLMs while expediting convergence on novel tasks.
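The headline evaluation reduces to a before/after comparison on held-out general-ability benchmarks; a trivial sketch, with the gsm8k numbers taken from the abstract and the second benchmark invented as a placeholder:

```python
# Illustrative sketch of the decline measurement TRACE reports: score the
# model on general-ability benchmarks before and after continual training.
# Only the gsm8k figures come from the abstract; "mmlu" is a placeholder.
def general_ability_decline(before: dict, after: dict) -> dict:
    return {task: round(before[task] - after[task], 1) for task in before}

before = {"gsm8k": 28.8, "mmlu": 54.1}   # pre-training accuracies (%)
after = {"gsm8k": 2.0, "mmlu": 47.3}     # post-TRACE accuracies (%)
print(general_ability_decline(before, after))  # {'gsm8k': 26.8, 'mmlu': 6.8}
```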

    ARA-net: an attention-aware retinal atrophy segmentation network coping with fundus images

    Get PDF
Background: Accurately detecting and segmenting areas of retinal atrophy is paramount for early medical intervention in pathological myopia (PM). However, segmenting retinal atrophic areas based on a two-dimensional (2D) fundus image poses several challenges, such as blurred boundaries, irregular shapes, and size variation. To overcome these challenges, we have proposed an attention-aware retinal atrophy segmentation network (ARA-Net) to segment retinal atrophy areas from the 2D fundus image. Methods: In particular, ARA-Net adopts a UNet-like strategy to perform the area segmentation. A skip self-attention connection (SSA) block, comprising a shortcut and a parallel polarized self-attention (PPSA) block, has been proposed to deal with the challenges of blurred boundaries and irregular shapes of the retinal atrophic region. Further, we have proposed a multi-scale feature flow (MSFF) to address the size variation. We have added the flow between the SSA connection blocks, allowing considerable semantic information to be captured for detecting retinal atrophy across various area sizes. Results: The proposed method has been validated on the Pathological Myopia (PALM) dataset. Experimental results demonstrate that our method yields a high Dice coefficient (DICE) of 84.26%, Jaccard index (JAC) of 72.80%, and F1-score of 84.57%, significantly outperforming other methods. Conclusion: Our results demonstrate that ARA-Net is an effective and efficient approach for retinal atrophic area segmentation in PM.
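As a rough illustration of the skip self-attention idea (a shortcut plus an attention-refined branch on a UNet skip connection), here is a minimal PyTorch sketch; the PPSA block is simplified to plain channel attention, and all names are invented:

```python
# Hypothetical sketch of an SSA-style skip connection: the encoder feature
# passes through a shortcut and, in parallel, an attention branch; the two
# are summed. The PPSA block is simplified to channel attention here.
import torch
import torch.nn as nn

class SkipSelfAttention(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Simplified stand-in for the parallel polarized self-attention block.
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + x * self.attn(x)   # shortcut + attention-weighted branch

feat = torch.randn(1, 64, 56, 56)         # an encoder feature map
print(SkipSelfAttention(64)(feat).shape)  # torch.Size([1, 64, 56, 56])
```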

    Binding to Na(+) /H(+) exchanger regulatory factor 2 (NHERF2) affects trafficking and function of the enteropathogenic Escherichia coli type III secretion system effectors Map, EspI and NleH.

    Get PDF
Enteropathogenic Escherichia coli (EPEC) strains are diarrhoeal pathogens that use a type III secretion system to translocate effector proteins into host cells in order to colonize and multiply in the human gut. Map, EspI and NleH1 are conserved EPEC effectors that possess a C-terminal class I PSD-95/Discs Large/ZO-1 (PDZ)-binding motif. Using a PDZ array screen we identified Na(+)/H(+) exchanger regulatory factor 2 (NHERF2), a scaffold protein involved in tethering and recycling ion channels in polarized epithelia that contains two PDZ domains, as a common target of Map, EspI and NleH1. Using recombinant proteins and co-immunoprecipitation we confirmed that NHERF2 binds each of the effectors. We generated a HeLa cell line stably expressing HA-tagged NHERF2 and found that Map, EspI and NleH1 colocalize and interact with intracellular NHERF2 via their C-terminal PDZ-binding motif. Overexpression of NHERF2 enhanced the formation and persistence of Map-induced filopodia, accelerated the trafficking of EspI to the Golgi and diminished the anti-apoptotic activity of NleH1. The binding of multiple T3SS effectors to a single scaffold protein is unique. Our data suggest that NHERF2 may act as a plasma membrane sorting site, providing a novel regulatory mechanism to control the intracellular spatial and temporal activity of effector proteins.

    Secrets of RLHF in Large Language Models Part I: PPO

    Full text link
Large language models (LLMs) have formulated a blueprint for the advancement of artificial general intelligence. Their primary objective is to function as a human-centric (helpful, honest, and harmless) assistant. Alignment with humans assumes paramount significance, and reinforcement learning with human feedback (RLHF) emerges as the pivotal technological paradigm underpinning this pursuit. Current technical routes usually include reward models to measure human preferences, Proximal Policy Optimization (PPO) to optimize policy model outputs, and process supervision to improve step-by-step reasoning capabilities. However, the challenges of reward design, environment interaction, and agent training, coupled with the huge trial-and-error cost of large language models, pose a significant barrier for AI researchers seeking to develop technical alignment and a safe landing for LLMs. The stable training of RLHF has still been a puzzle. In this first report, we dissect the framework of RLHF, re-evaluate the inner workings of PPO, and explore how the parts comprising the PPO algorithm impact policy agent training. We identify policy constraints as the key factor for the effective implementation of the PPO algorithm. Therefore, we explore PPO-max, an advanced version of the PPO algorithm, to efficiently improve the training stability of the policy model. Based on our main results, we perform a comprehensive analysis of RLHF abilities compared with SFT models and ChatGPT. The absence of open-source implementations has posed significant challenges to the investigation of LLM alignment. Therefore, we are eager to release technical reports, reward models and PPO code.
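The policy-constraint finding corresponds, in generic PPO terms, to the clipped surrogate objective plus a KL penalty against the frozen SFT reference policy; a minimal sketch of that standard loss (not the specific PPO-max recipe):

```python
# Minimal sketch of the clipped PPO policy loss with a KL penalty toward the
# frozen SFT reference policy -- the generic form of the policy constraints
# the report identifies as key. Not the exact PPO-max implementation.
import torch

def ppo_policy_loss(logp_new, logp_old, logp_ref, advantages,
                    clip_eps: float = 0.2, kl_coef: float = 0.1):
    ratio = torch.exp(logp_new - logp_old)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    policy_loss = -torch.minimum(unclipped, clipped).mean()
    # Penalize drift of the policy away from the reference (SFT) model.
    kl_penalty = (logp_new - logp_ref).mean()
    return policy_loss + kl_coef * kl_penalty

logp_new = torch.randn(8, requires_grad=True)
loss = ppo_policy_loss(logp_new, logp_new.detach(),
                       logp_new.detach() - 0.1, torch.randn(8))
loss.backward()
```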