825 research outputs found

N-gram Language Model for Chinese Function-word-centered Patterns

Author: Liu Yixiao
Qu Yunhua
Song Jie
Publication venue: University of Zagreb, Faculty of Electrical Engineering and Computing
Publication date: 01/01/2023

N-gram language modelling, a proven and effective method in NLP, is widely used to calculate the probability of a sentence in natural language. A language pattern is a linguistic level between the word/character and the sentence, as described in pattern grammar. In this research, the language-model and language-pattern approaches are combined for the first time, and language patterns are studied using the N-gram model. Chinese function-word-centered patterns are extracted from the LCMC corpus and aligned into pattern chains. A language model is trained on these chains to investigate the properties and distribution of Chinese function words, the interaction of content words and function words, and the interaction between patterns. The results indicate that there are approximately 10,000 function-word-centered patterns in the texts, which are distributed exponentially. This research summarizes the most common function-word-centered patterns and content-word-centered patterns, and discusses the interactions of patterns based on corpus data. The bigram language model of these patterns reflects the restrictions of function words. In addition, the research adopts an innovative method to visualize the interactions between patterns. This research fills the research gap between word/character and sentence and reveals basic Chinese pattern categories and the interactions between patterns, which makes a significant contribution to Chinese linguistic research and improves the efficiency of NLP.
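
As a minimal sketch of the bigram idea described in this abstract, the following Python snippet estimates maximum-likelihood bigram probabilities over pattern chains. The pattern labels and chains are invented for illustration and are not taken from LCMC; the abstract does not state which smoothing or tooling the authors used, so none is assumed here.

```python
from collections import Counter

# Hypothetical pattern chains: each chain is a sequence of pattern labels.
# The labels are illustrative placeholders, not patterns extracted from LCMC.
chains = [
    ["N+de+N", "V+le", "N+de+N"],
    ["N+de+N", "V+le"],
    ["V+le", "N+de+N"],
]

START, END = "<s>", "</s>"


def train_bigram(chains):
    """Return maximum-likelihood bigram probabilities P(next | prev)."""
    pair_counts = Counter()
    prev_counts = Counter()
    for chain in chains:
        padded = [START] + list(chain) + [END]
        for prev, nxt in zip(padded, padded[1:]):
            pair_counts[(prev, nxt)] += 1
            prev_counts[prev] += 1
    return {pair: c / prev_counts[pair[0]] for pair, c in pair_counts.items()}


def chain_probability(model, chain):
    """Probability of a whole chain; 0.0 if any bigram was never observed."""
    padded = [START] + list(chain) + [END]
    prob = 1.0
    for prev, nxt in zip(padded, padded[1:]):
        prob *= model.get((prev, nxt), 0.0)
    return prob


model = train_bigram(chains)
p_chain = chain_probability(model, ["N+de+N", "V+le"])
```

A low or zero value of `P(next | prev)` is one simple way to read the "restrictions" a function-word pattern places on what may follow it; a real study would need smoothing for unseen pairs.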

HRČAK - Portal of Croatian Scientific and Professional Journals

Gender diversity in top management team and firm financial performance: empirical evidence from China

Author: Liu Yixiao

In the context of the rapid development of the global economy, a diversified top management team helps companies gain competitive advantage. Gender equality is also receiving increasing attention. In recent years, a growing number of studies have examined gender diversity in top management teams and firm financial performance. Findings on the relationship between gender diversity and firm financial performance are mixed. Most of these studies have been conducted in Europe and America, and there are few empirical studies on this topic in China. This dissertation examines the impact of female executives on firm financial performance. A fixed effects model is adopted to perform regression analysis on three-year panel data of Chinese listed companies. The results show a positive relationship. The findings of this dissertation contribute to the methods companies can use to optimize the structure of the top management team. Key words: gender diversity in top management team; firm financial performance; mediating and moderating factors; listed company in China
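
To illustrate what a fixed effects regression on firm panel data does, here is a minimal sketch of the within (demeaning) estimator with a single regressor. The firms, years, variable names, and numbers are all hypothetical; the abstract does not give the dissertation's actual variables, controls, or software, and this sketch omits controls and year effects.

```python
from collections import defaultdict

# Hypothetical three-year panel: (firm, year, share_female_executives, roa).
# Values are invented so that roa = firm effect + 0.5 * share exactly.
panel = [
    ("A", 2018, 0.10, 1.05), ("A", 2019, 0.20, 1.10), ("A", 2020, 0.30, 1.15),
    ("B", 2018, 0.00, 3.00), ("B", 2019, 0.10, 3.05), ("B", 2020, 0.40, 3.20),
]


def within_estimator(rows):
    """Slope of y on x after subtracting each firm's own means (firm fixed effects)."""
    by_firm = defaultdict(list)
    for firm, _year, x, y in rows:
        by_firm[firm].append((x, y))

    sxy = 0.0
    sxx = 0.0
    for obs in by_firm.values():
        x_mean = sum(x for x, _ in obs) / len(obs)
        y_mean = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - x_mean) * (y - y_mean)
            sxx += (x - x_mean) ** 2
    return sxy / sxx


beta = within_estimator(panel)
```

Demeaning removes anything constant within a firm (here, the very different baseline levels of firms A and B), so the slope is identified only from changes over time inside each firm.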

Nottingham ePrints

Launching a Robust Backdoor Attack under Capability Constrained Scenarios

Author: Ding Kangyi
Liu Xiaolei
Xu Yixiao
Yi Ming
Yin Mingyong
Publication date: 21/04/2023

As deep neural networks continue to be used in critical domains, concerns over their security have emerged. Deep learning models are vulnerable to backdoor attacks due to their lack of transparency. A poisoned backdoor model may perform normally in routine environments, but exhibit malicious behavior when the input contains a trigger. Current research on backdoor attacks focuses on improving the stealthiness of triggers, and most approaches require strong attacker capabilities, such as knowledge of the model structure or control over the training process. These attacks are impractical since in most cases the attacker's capabilities are limited. Additionally, the issue of model robustness has not received adequate attention. For instance, model distillation is commonly used to streamline model size as the number of parameters grows exponentially, and most previous backdoor attacks fail after model distillation; image augmentation operations can likewise destroy the trigger and thus disable the backdoor. This study explores the implementation of black-box backdoor attacks within capability constraints. An attacker can carry out such attacks by acting as either an image annotator or an image provider, without involvement in the training process or knowledge of the target model's structure. Through the design of a backdoor trigger, our attack remains effective after model distillation and image augmentation, making it more threatening and practical. Our experimental results demonstrate that our method achieves a high attack success rate in black-box scenarios and evades state-of-the-art backdoor defenses. Comment: 9 pages, 6 figures

arXiv.org e-Print Archive

Continual Learning for Abdominal Multi-Organ and Tumor Segmentation

Author: Chen Huimiao
Li Xinyi
Liu Yaoyao
Yuille Alan
Zhang Yixiao
Zhou Zongwei
Publication date: 21/07/2023

The ability to dynamically extend a model to new data and classes is critical for multiple organ and tumor segmentation. However, due to privacy regulations, accessing previous data and annotations can be problematic in the medical domain. This poses a significant barrier to preserving the high segmentation accuracy of the old classes when learning from new classes because of the catastrophic forgetting problem. In this paper, we first empirically demonstrate that simply using high-quality pseudo labels can fairly mitigate this problem in the setting of organ segmentation. Furthermore, we put forward an innovative architecture designed specifically for continuous organ and tumor segmentation, which incurs minimal computational overhead. Our proposed design involves replacing the conventional output layer with a suite of lightweight, class-specific heads, thereby offering the flexibility to accommodate newly emerging classes. These heads enable independent predictions for newly introduced and previously learned classes, effectively minimizing the impact of new classes on old ones during the course of continual learning. We further propose incorporating Contrastive Language-Image Pretraining (CLIP) embeddings into the organ-specific heads. These embeddings encapsulate the semantic information of each class, informed by extensive image-text co-training. The proposed method is evaluated on both in-house and public abdominal CT datasets under organ and tumor segmentation tasks. Empirical results suggest that the proposed design improves the segmentation performance of a baseline neural network on newly-introduced and previously-learned classes along the learning trajectory. Comment: MICCAI-202
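
The idea of class-specific heads can be sketched in a few lines: each class gets its own small binary predictor on shared features, so adding a head for a new class leaves the outputs of existing heads untouched. The class `ClassHeads`, the feature vector, and all weights below are hypothetical illustrations; this is not the paper's architecture and it omits the CLIP embeddings and the segmentation backbone entirely.

```python
import math


class ClassHeads:
    """One lightweight binary head (weights, bias) per class on shared features."""

    def __init__(self):
        self.heads = {}

    def add_class(self, name, weights, bias=0.0):
        # Registering a new class only adds an entry; existing heads are not modified.
        self.heads[name] = (list(weights), bias)

    def predict(self, features):
        # Each head produces an independent sigmoid score (no softmax across classes).
        scores = {}
        for name, (weights, bias) in self.heads.items():
            logit = sum(w * f for w, f in zip(weights, features)) + bias
            scores[name] = 1.0 / (1.0 + math.exp(-logit))
        return scores


# Hypothetical shared feature vector for one voxel.
voxel_features = [0.8, -0.3, 1.2]

model = ClassHeads()
model.add_class("liver", [1.0, 0.5, -0.2])
model.add_class("kidney", [-0.4, 0.9, 0.1])
before = model.predict(voxel_features)

# A later learning step introduces a new class with its own head.
model.add_class("liver_tumor", [0.3, -0.7, 0.6], bias=-0.5)
after = model.predict(voxel_features)
```

Because the heads are independent, `after["liver"]` equals `before["liver"]` here; in a shared softmax output layer, by contrast, adding a class would change every class's probability.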

arXiv.org e-Print Archive

LeCaRDv2: A Large-Scale Chinese Legal Case Retrieval Dataset

Author: Ai Qingyao
Li Haitao
Liu Yiqun
Ma Yixiao
Shao Yunqiu
Wu Yueyue
Publication date: 26/10/2023

As an important component of intelligent legal systems, legal case retrieval plays a critical role in ensuring judicial justice and fairness. However, the development of legal case retrieval technologies in the Chinese legal system is restricted by three problems in existing datasets: limited data size, narrow definitions of legal relevance, and naive candidate pooling strategies used in data sampling. To alleviate these issues, we introduce LeCaRDv2, a large-scale Legal Case Retrieval Dataset (version 2). It consists of 800 queries and 55,192 candidates extracted from 4.3 million criminal case documents. To the best of our knowledge, LeCaRDv2 is one of the largest Chinese legal case retrieval datasets, providing extensive coverage of criminal charges. Additionally, we enrich the existing relevance criteria by considering three key aspects: characterization, penalty, and procedure. These comprehensive criteria enrich the dataset and may provide a more holistic perspective. Furthermore, we propose a two-level candidate set pooling strategy that effectively identifies potential candidates for each query case. All cases in the dataset have been annotated by multiple legal experts specializing in criminal law, whose expertise ensures the accuracy and reliability of the annotations. We evaluate several state-of-the-art retrieval models on LeCaRDv2, demonstrating that there is still significant room for improvement in legal case retrieval. The details of LeCaRDv2 can be found at the anonymous website https://github.com/anonymous1113243/LeCaRDv2

arXiv.org e-Print Archive

Toward a density Corrádi–Hajnal theorem for degenerate hypergraphs

Author: Hou Jianfeng
Hu Caiyun
Li Heng
Liu Xizhi
Yang Caihong
Zhang Yixiao
Publication date: 25/11/2023

Given an $r$-graph $F$ with $r \ge 2$, let $\mathrm{ex}(n, (t+1) F)$ denote the maximum number of edges in an $n$-vertex $r$-graph with at most $t$ pairwise vertex-disjoint copies of $F$. Extending several old results and complementing prior work [J. Hou, H. Li, X. Liu, L.-T. Yuan, and Y. Zhang. A step towards a general density Corrádi–Hajnal theorem. arXiv:2302.09849, 2023.] on nondegenerate hypergraphs, we initiate a systematic study on $\mathrm{ex}(n, (t+1) F)$ for degenerate hypergraphs $F$. For a broad class of degenerate hypergraphs $F$, we present near-optimal upper bounds for $\mathrm{ex}(n, (t+1) F)$ when $n$ is sufficiently large and $t$ lies in intervals $\left[0, \frac{\varepsilon \cdot \mathrm{ex}(n,F)}{n^{r-1}}\right]$, $\left[\frac{\mathrm{ex}(n,F)}{\varepsilon n^{r-1}}, \varepsilon n \right]$, and $\left[ (1-\varepsilon)\frac{n}{v(F)}, \frac{n}{v(F)} \right]$, where $\varepsilon > 0$ is a constant depending only on $F$. Our results reveal very different structures for extremal constructions across the three intervals, and we provide characterizations of extremal constructions within the first interval. Additionally, for graphs, we offer a characterization of extremal constructions within the second interval. Our proof for the first interval also applies to a special class of nondegenerate hypergraphs, including those with undetermined Turán densities, partially improving a result in [J. Hou, H. Li, X. Liu, L.-T. Yuan, and Y. Zhang. A step towards a general density Corrádi–Hajnal theorem. arXiv:2302.09849, 2023.] Comment: 37 pages, 4 figures, comments are welcome
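
For reference, the quantity studied in this abstract can be written as a single display formula. The symbol $\nu_F(H)$, standing for the maximum number of pairwise vertex-disjoint copies of $F$ in $H$, is notation introduced here for illustration and does not appear in the abstract.

```latex
\[
  \mathrm{ex}(n, (t+1)F)
  \;=\;
  \max \bigl\{\, |E(H)| \;:\; H \text{ is an } n\text{-vertex } r\text{-graph with } \nu_F(H) \le t \,\bigr\}.
\]
```

Setting $t = 0$ recovers the ordinary Turán number $\mathrm{ex}(n, F)$, since $\nu_F(H) \le 0$ means $H$ contains no copy of $F$ at all.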

arXiv.org e-Print Archive

Highly efficient influenza virus production: A MDCK-based high-cell-density process

Author: Bissinger Thomas
Genzel Yvonne
Liu Xuping
Reichl Udo
Tan Wen-Song
Wu Yixiao
Publication venue: ECI Digital Archives
Publication date: 01/01/2018

Seasonal vaccination campaigns for influenza in developed and developing countries create a massive demand for 500 million (2015) vaccine doses every year [1]. Besides egg-based vaccine manufacturing, production platforms based on animal cell culture increasingly contribute to this overall growing market. In order to intensify cell culture-based influenza virus production, high-cell-density (HCD) cultivation of suspension cells can be applied to improve virus titer, process productivity and production costs [2]. For process optimization and evaluation of HCD conditions, cells cultivated using semi-perfusion approaches in small shakers can be used as a scale-down model for bioreactors operating in full perfusion mode [3]. In this study, a previously developed MDCK suspension cell line [4] was adapted to a new serum-free medium [5] to facilitate higher growth rate, cell density and virus titer both in batch and in HCD. To this end, MDCK cells cultivated in Smif-8 medium were slowly adapted to a new cultivation medium (Xeno™) by stepwise increasing the Xeno content. Fully adapted cells were cultivated in shaker flasks to evaluate the performance of influenza A virus production in batch and HCD. Cell densities exceeding 2∙10^7 cells/mL were achieved in shakers using semi-perfusion, where cell-free medium was manually replaced with fresh medium. Volume and time interval of media replacement were chosen to achieve a constant cell-specific perfusion rate of 2.5 pL/(cell h). Cell cultures were infected with influenza virus (A/PR/8/34 H1N1 RKI) with trypsin addition. Cell count, viability, main metabolites and virus titer (HA-assay & TCID50) were monitored pre and post infection. Medium adaptation resulted in an MDCK suspension cell line with morphological, growth, and metabolic characteristics different from parental cells. Cells fully adapted to Xeno medium grew to higher cell densities (1.4∙10^7 vs 6∙10^6 cells/mL) with a higher specific growth rate (µmax: 0.036 vs 0.026 1/h); the cells were bigger (15-16 vs 13-14 µm) and grew without aggregate formation. Due to higher cell densities at time of infection, virus titers up to 3.6 log10(HAU/100µL) were reached. In semi-perfusion, adapted MDCK cells were grown up to 6∙10^7 cells/mL; however, maximum virus titer and productivity were observed with 4∙10^7 cells/mL. In multiple harvests, very high virus titers exceeding 4 log10(HAU/100µL) and up to 9∙10^9 virions/mL (TCID50) were measured, which corresponded to an accumulated titer of 4.5 log10(HAU/100µL). Cell-specific virus titer was similar or higher compared to the reference batch infections, depending on perfusion and infection strategy. Overall, results in this semi-perfusion scale-down model for influenza A virus production suggest a highly efficient and productive upstream process for influenza virus production, with an up to six-fold improved space time yield compared to batch mode. [1] Palache A. et al., Vaccine 35 (2017): 4681–4686. doi: 10.1016/j.vaccine.2017.07.053 [2] Genzel Y. et al., Vaccine 32 (2014): 2770–2781. doi: 10.1016/j.vaccine.2014.02.016 [3] Vázquez-Ramírez D. et al., Vaccine (2018): article in press. doi: 10.1016/j.vaccine.2017.10.112 [4] Lohr V. et al., Vaccine 28 (2010): 6256–6264. doi: 10.1016/j.vaccine.2010.07.004 [5] Xeno™-S001S MDCK Cell Serum Free Medium (#FG0100402), Bioengine, Shanghai, China
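
The arithmetic behind a constant cell-specific perfusion rate (CSPR) in semi-perfusion can be sketched as follows. Only the CSPR of 2.5 pL/(cell h) and the cell density of 2∙10^7 cells/mL come from the abstract; the 50 mL working volume and 12 h exchange interval are assumed values for illustration, and the calculation treats the cell density as constant over the interval, which is a simplification.

```python
PL_TO_ML = 1e-9  # 1 pL = 1e-12 L = 1e-9 mL


def exchange_volume_ml(cspr_pl_per_cell_h, cells_per_ml, working_volume_ml, interval_h):
    """Medium volume (mL) to replace per interval to hold the given CSPR.

    Assumes the viable cell density stays constant over the interval.
    """
    # mL of fresh medium needed per mL of culture per hour
    perfusion_rate_per_h = cspr_pl_per_cell_h * PL_TO_ML * cells_per_ml
    return perfusion_rate_per_h * working_volume_ml * interval_h


# CSPR and cell density from the abstract; volume and interval are assumptions.
volume = exchange_volume_ml(
    cspr_pl_per_cell_h=2.5,
    cells_per_ml=2e7,
    working_volume_ml=50.0,
    interval_h=12.0,
)
```

Under these assumptions the perfusion rate is 0.05 culture volumes per hour, so about 30 mL of the 50 mL culture would be exchanged every 12 h; as cells keep growing, either the exchanged volume or the exchange frequency has to rise to keep the CSPR constant.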

Engineering Conferences International

MPG.PuRe