N-gram Language Model for Chinese Function-word-centered Patterns
N-gram language modelling, a proven and effective method in NLP, is widely used to calculate the probability of a sentence in natural language. A language pattern is a linguistic unit at a level between the word/character and the sentence, as described in pattern grammar. In this research, language modelling and language patterns are combined for the first time, and language patterns are studied using the N-gram model. Chinese function-word-centered patterns are extracted from the LCMC corpus and aligned into pattern chains. A language model is trained on these chains to investigate the properties and distribution of Chinese function words, the interaction between content words and function words, and the interaction between patterns. The results indicate that there are approximately 10,000 function-word-centered patterns in the texts, which are distributed exponentially. This research summarizes the most common function-word-centered and content-word-centered patterns, and discusses the interactions between patterns based on corpus data. The bigram language model of these patterns reflects the restrictions imposed by function words. In addition, the research adopts an innovative method to visualize the interactions between patterns. This research addresses the gap between the word/character level and the sentence level, and reveals basic Chinese pattern categories and the interactions between them, which contributes to Chinese linguistic research and may improve the efficiency of NLP.
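As an illustration of the bigram modelling over pattern chains described in this abstract, the following is a minimal sketch, not the authors' implementation. The pattern labels, the boundary markers `<s>` and `</s>`, and the use of add-one smoothing are all assumptions made for the example; the abstract does not state how the model was estimated.

```python
from collections import Counter


def train_bigram(chains):
    """Count contexts and bigrams over pattern chains with boundary markers."""
    contexts, bigrams = Counter(), Counter()
    for chain in chains:
        tokens = ["<s>"] + list(chain) + ["</s>"]
        contexts.update(tokens[:-1])            # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams


def bigram_prob(prev, cur, contexts, bigrams, vocab_size):
    """Add-one smoothed estimate of P(cur | prev) (smoothing is an assumption)."""
    return (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)


# Hypothetical pattern chains; the labels are placeholders, not corpus data.
chains = [
    ["N+的+N", "V+了", "在+N"],
    ["N+的+N", "V+了"],
    ["在+N", "V+了"],
]
contexts, bigrams = train_bigram(chains)

# Possible successors: every pattern plus the end-of-chain marker.
vocab = {p for chain in chains for p in chain} | {"</s>"}
p = bigram_prob("N+的+N", "V+了", contexts, bigrams, len(vocab))
total = sum(bigram_prob("N+的+N", w, contexts, bigrams, len(vocab)) for w in vocab)
```

With smoothing, the probabilities of all possible successors of a given pattern sum to one, so unseen pattern pairs receive a small non-zero probability instead of zero.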
Gender diversity in top management team and firm financial performance: empirical evidence from China
In the context of the rapid development of the global economy, a diversified top management team helps companies gain competitive advantage. Gender equality is also receiving increasing attention. In recent years, a growing number of studies have examined gender diversity in top management teams and firm financial performance. Findings on the relationship between gender diversity and firm financial performance are mixed. Most of these studies have been conducted in Europe and America, and there are few empirical studies on this topic in China. This dissertation examines the impact of female executives on firm financial performance. A fixed effects model is used to perform regression analysis on three-year panel data of Chinese listed companies. The results show a positive relationship between female executives and firm financial performance. The findings can help companies optimize the structure of their top management teams.
Key words: gender diversity in top management team; firm financial performance; mediating and moderating factors; listed company in China
Launching a Robust Backdoor Attack under Capability Constrained Scenarios
As deep neural networks continue to be used in critical domains, concerns
over their security have emerged. Deep learning models are vulnerable to
backdoor attacks due to the lack of transparency. A poisoned backdoor model may
perform normally in routine environments, but exhibit malicious behavior when
the input contains a trigger. Current research on backdoor attacks focuses on
improving the stealthiness of triggers, and most approaches require strong
attacker capabilities, such as knowledge of the model structure or control over
the training process. These attacks are impractical since in most cases the
attacker's capabilities are limited. Additionally, the issue of model
robustness has not received adequate attention. For instance, model
distillation is commonly used to streamline model size as the number of
parameters grows exponentially, and most previous backdoor attacks fail
after model distillation; image augmentation operations can also destroy the
trigger and thus disable the backdoor. This study explores the implementation
of black-box backdoor attacks within capability constraints. An attacker can
carry out such attacks by acting as either an image annotator or an image
provider, without involvement in the training process or knowledge of the
target model's structure. Through the design of a backdoor trigger, our attack
remains effective after model distillation and image augmentation, making it
more threatening and practical. Our experimental results demonstrate that our
method achieves a high attack success rate in black-box scenarios and evades
state-of-the-art backdoor defenses.
Comment: 9 pages, 6 figures
Continual Learning for Abdominal Multi-Organ and Tumor Segmentation
The ability to dynamically extend a model to new data and classes is critical
for multiple organ and tumor segmentation. However, due to privacy regulations,
accessing previous data and annotations can be problematic in the medical
domain. This poses a significant barrier to preserving the high segmentation
accuracy of the old classes when learning from new classes because of the
catastrophic forgetting problem. In this paper, we first empirically
demonstrate that simply using high-quality pseudo labels can fairly mitigate
this problem in the setting of organ segmentation. Furthermore, we put forward
an innovative architecture designed specifically for continuous organ and tumor
segmentation, which incurs minimal computational overhead. Our proposed design
involves replacing the conventional output layer with a suite of lightweight,
class-specific heads, thereby offering the flexibility to accommodate newly
emerging classes. These heads enable independent predictions for newly
introduced and previously learned classes, effectively minimizing the impact of
new classes on old ones during the course of continual learning. We further
propose incorporating Contrastive Language-Image Pretraining (CLIP) embeddings
into the organ-specific heads. These embeddings encapsulate the semantic
information of each class, informed by extensive image-text co-training. The
proposed method is evaluated on both in-house and public abdominal CT datasets
under organ and tumor segmentation tasks. Empirical results suggest that the
proposed design improves the segmentation performance of a baseline neural
network on newly-introduced and previously-learned classes along the learning
trajectory.
Comment: MICCAI-202
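The idea of replacing a single output layer with lightweight class-specific heads can be sketched as follows. This is a toy illustration under stated assumptions, not the architecture from the paper: the class `MultiHeadSegmenter`, the random linear heads, and the four-element feature vector are hypothetical, and the CLIP embeddings mentioned in the abstract are omitted entirely.

```python
import math
import random


class MultiHeadSegmenter:
    """Toy model: shared features feed one small binary head per class."""

    def __init__(self, feature_dim, seed=0):
        self.feature_dim = feature_dim
        self.rng = random.Random(seed)
        self.heads = {}  # class name -> (weights, bias)

    def add_class(self, name):
        # A new head is added; the parameters of existing heads are untouched.
        weights = [self.rng.uniform(-1, 1) for _ in range(self.feature_dim)]
        self.heads[name] = (weights, 0.0)

    def predict(self, features):
        # Each head produces an independent sigmoid probability for its class.
        out = {}
        for name, (weights, bias) in self.heads.items():
            logit = sum(w * x for w, x in zip(weights, features)) + bias
            out[name] = 1.0 / (1.0 + math.exp(-logit))
        return out


model = MultiHeadSegmenter(feature_dim=4)
for organ in ["liver", "kidney"]:
    model.add_class(organ)

voxel_features = [0.2, -0.5, 1.0, 0.3]  # hypothetical per-voxel features
before = model.predict(voxel_features)

model.add_class("liver_tumor")          # a new class arrives in a later step
after = model.predict(voxel_features)
```

Because each class has its own head, adding `liver_tumor` leaves the predictions for `liver` and `kidney` unchanged as long as the shared features are held fixed, which is the property the abstract attributes to independent heads.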
LeCaRDv2: A Large-Scale Chinese Legal Case Retrieval Dataset
As an important component of intelligent legal systems, legal case retrieval
plays a critical role in ensuring judicial justice and fairness. However, the
development of legal case retrieval technologies in the Chinese legal system is
restricted by three problems in existing datasets: limited data size, narrow
definitions of legal relevance, and naive candidate pooling strategies used in
data sampling. To alleviate these issues, we introduce LeCaRDv2, a large-scale
Legal Case Retrieval Dataset (version 2). It consists of 800 queries and 55,192
candidates extracted from 4.3 million criminal case documents. To the best of
our knowledge, LeCaRDv2 is one of the largest Chinese legal case retrieval
datasets, providing extensive coverage of criminal charges. Additionally, we
enrich the existing relevance criteria by considering three key aspects:
characterization, penalty, and procedure. These comprehensive criteria enrich the
dataset and may provide a more holistic perspective. Furthermore, we propose a
two-level candidate set pooling strategy that effectively identifies potential
candidates for each query case. It's important to note that all cases in the
dataset have been annotated by multiple legal experts specializing in criminal
law. Their expertise ensures the accuracy and reliability of the annotations.
We evaluate several state-of-the-art retrieval models on LeCaRDv2,
demonstrating that there is still significant room for improvement in legal
case retrieval. The details of LeCaRDv2 can be found at the anonymous website
https://github.com/anonymous1113243/LeCaRDv2
Toward a density Corr\'{a}di--Hajnal theorem for degenerate hypergraphs
Given an $r$-graph $F$ with $r \ge 2$, let $\mathrm{ex}(n, (t+1)F)$ denote
the maximum number of edges in an $n$-vertex $r$-graph with at most $t$
pairwise vertex-disjoint copies of $F$. Extending several old results and
complementing prior work [J. Hou, H. Li, X. Liu, L.-T. Yuan, and Y. Zhang. A
step towards a general density Corr\'{a}di--Hajnal theorem. arXiv:2302.09849,
2023.] on nondegenerate hypergraphs, we initiate a systematic study on
$\mathrm{ex}(n, (t+1)F)$ for degenerate hypergraphs $F$. For a broad class of
degenerate hypergraphs $F$, we present near-optimal upper bounds for
$\mathrm{ex}(n, (t+1)F)$ when $n$ is sufficiently large and $t$ lies in one of
three intervals whose endpoints involve a constant depending only on $F$.
Our results reveal very
different structures for extremal constructions across the three intervals, and
we provide characterizations of extremal constructions within the first
interval. Additionally, for graphs, we offer a characterization of extremal
constructions within the second interval. Our proof for the first interval also
applies to a special class of nondegenerate hypergraphs, including those with
undetermined Tur\'{a}n densities, partially improving a result in [J. Hou, H.
Li, X. Liu, L.-T. Yuan, and Y. Zhang. A step towards a general density
Corr\'{a}di--Hajnal theorem. arXiv:2302.09849, 2023.]
Comment: 37 pages, 4 figures, comments are welcome
Highly efficient influenza virus production: A MDCK-based high-cell-density process
Seasonal vaccination campaigns for influenza in developed and developing countries create a massive demand for 500 million (2015) vaccine doses every year [1]. Besides egg-based vaccine manufacturing, production platforms based on animal cell culture increasingly contribute to this overall growing market. In order to intensify cell culture-based influenza virus production, high-cell-density (HCD) cultivation of suspension cells can be applied to improve virus titer, process productivity and production costs [2]. For process optimization and evaluation of HCD conditions, cells cultivated using semi-perfusion approaches in small shakers can be used as a scale-down model for bioreactors operating in full perfusion mode [3].
In this study, a previously developed MDCK suspension cell line [4] was adapted to a new serum-free medium [5] to facilitate higher growth rate, cell density and virus titer both in batch and in HCD. To this end, MDCK cells cultivated in Smif-8 medium were slowly adapted to a new cultivation medium (Xeno™) by increasing the Xeno content stepwise. Fully adapted cells were cultivated in shaker flasks to evaluate the performance of influenza A virus production in batch and HCD. Cell densities exceeding 2∙10⁷ cells/mL were achieved in shakers using semi-perfusion, where cell-free medium was manually replaced with fresh medium. Volume and time interval of media replacement were chosen to achieve a constant cell-specific perfusion rate of 2.5 pL/(cell h). Cell cultures were infected with influenza virus (A/PR/8/34 H1N1 RKI) with trypsin addition. Cell count, viability, main metabolites and virus titer (HA-assay & TCID50) were monitored pre and post infection.
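The relation between a cell-specific perfusion rate and the volume of medium to replace can be illustrated with a short calculation. This is a sketch under stated assumptions: the 50 mL working volume and 12 h interval are hypothetical values chosen for the example, and the cell density is treated as constant over the interval, which ignores growth between exchanges.

```python
PL_TO_ML = 1e-9  # 1 pL = 1e-12 L = 1e-9 mL


def exchange_volume_ml(cspr_pl_per_cell_h, cells_per_ml, working_volume_ml, interval_h):
    """Medium to replace per interval for a target cell-specific perfusion rate.

    Assumes the cell density stays constant over the interval (a simplification).
    """
    # mL of fresh medium needed per mL of culture per hour
    rate_per_h = cspr_pl_per_cell_h * PL_TO_ML * cells_per_ml
    return rate_per_h * working_volume_ml * interval_h


# 2.5 pL/(cell h) is taken from the abstract; the other inputs are hypothetical.
volume = exchange_volume_ml(
    cspr_pl_per_cell_h=2.5,
    cells_per_ml=2e7,
    working_volume_ml=50.0,
    interval_h=12.0,
)
# Equivalent perfusion rate in culture volumes per day at this density.
volumes_per_day = 2.5 * PL_TO_ML * 2e7 * 24.0
```

Under these assumptions the required exchange is about 30 mL per 12 h interval, or roughly 1.2 culture volumes per day. If the computed volume exceeded the working volume, the interval would have to be shortened.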
Medium adaptation resulted in an MDCK suspension cell line with morphological, growth, and metabolic characteristics different from parental cells. Cells fully adapted to Xeno medium grew to higher cell densities (1.4∙10⁷ vs 6∙10⁶ cells/mL) with a higher specific growth rate (µmax: 0.036 vs 0.026 1/h); the cells were bigger (15-16 vs 13-14 µm) and grew without aggregate formation. Due to higher cell densities at time of infection, virus titers up to 3.6 log10(HAU/100µL) were reached. In semi-perfusion, adapted MDCK cells were grown up to 6∙10⁷ cells/mL; however, maximum virus titer and productivity were observed with 4∙10⁷ cells/mL. In multiple harvests, very high virus titers exceeding 4 log10(HAU/100µL) and up to 9∙10⁹ virions/mL (TCID50) were measured, which corresponded to an accumulated titer of 4.5 log10(HAU/100µL). Cell-specific virus titer was similar or higher compared to the reference batch infections, depending on perfusion and infection strategy.
Overall, results in this semi-perfusion scale-down model for influenza A virus production suggest a highly efficient and productive upstream process for influenza virus production, with an up to six-fold improved space time yield compared to batch mode.
[1] Palache A. et al., Vaccine 35 (2017): 4681–4686. doi: 10.1016/j.vaccine.2017.07.053
[2] Genzel Y. et al., Vaccine 32 (2014): 2770–2781. doi: 10.1016/j.vaccine.2014.02.016
[3] Vázquez-Ramírez D. et al., Vaccine (2018): article in press. doi: 10.1016/j.vaccine.2017.10.112
[4] Lohr V. et al., Vaccine 28 (2010): 6256–6264. doi: 10.1016/j.vaccine.2010.07.004
[5] Xeno™-S001S MDCK Cell Serum Free Medium (#FG0100402), Bioengine, Shanghai, China