244 research outputs found
Academic institutional repositories in China: A survey of CALIS member libraries
Purpose: The China Academic Library & Information System (CALIS) planned to launch an institutional repository (IR) project to promote IR development and open access at colleges and universities in China. To understand the current state of IRs in academic institutions, the CALIS Administrative Center conducted this survey with the help of Peking University Library. Design/methodology/approach: We conducted an online survey of CALIS member libraries. Findings: First, the development of IRs at China's colleges and universities is still in its infancy. Second, Chinese colleges and universities have reached a consensus on the objective of having an IR. Third, they have high expectations of IR functions. Fourth, they prefer to establish a centralized IR system at minimum cost. Finally, there are both similarities and differences between Chinese academic institutions and their counterparts in other countries in the state of IR development. Research limitations: The questionnaire needs improvement because it lacks sufficient questions for respondents who do not plan to build an IR. The comparatively low rate of valid questionnaire returns may affect the accuracy of the results. It is hard to conduct an in-depth discussion based only on the data collected from this questionnaire survey; consequently, the findings can hardly present an accurate and comprehensive picture of the current state of IR development in the academic sector in China. Practical implications: The survey results provide an essential foundation for the CALIS IR project, and the research can serve as a reference for future studies of the development of IRs at China's colleges and universities. Originality/value: It is the first national survey focused on the development of IRs in academic institutions in China.
ZeroQuant-FP: A Leap Forward in LLMs Post-Training W4A8 Quantization Using Floating-Point Formats
In the complex domain of large language models (LLMs), striking a balance
between computational efficiency and maintaining model quality is a formidable
challenge. Navigating the inherent limitations of uniform quantization,
particularly when dealing with outliers, and motivated by the launch of
NVIDIA's H100 hardware, this study delves into the viability of floating-point
(FP) quantization, particularly focusing on FP8 and FP4, as a potential
solution. Our comprehensive investigation reveals that for LLMs, FP8 activation
consistently outshines its integer (INT8) equivalent, with the performance edge
becoming more noticeable in models possessing parameters beyond one billion.
For weight quantization, our findings indicate that FP4 exhibits comparable, if
not superior, performance to INT4, simplifying deployment on FP-supported
hardware like H100. To mitigate the overhead from precision alignment caused by
the disparity between weights and activations, we propose two scaling
constraints for weight quantization that negligibly impact the performance
compared to the standard W4A8 model. We additionally enhance our quantization
methods by integrating the Low Rank Compensation (LoRC) strategy, yielding
improvements especially in smaller models. The results of our investigation
emphasize the immense potential of FP quantization for LLMs, paving the way for
high-efficiency deployment in resource-limited settings.
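To make the INT4 versus FP4 comparison concrete, the following is a minimal sketch and not the authors' implementation. It assumes an E2M1 layout for FP4 (1 sign, 2 exponent, 1 mantissa bit), which the abstract does not specify, and it models a scaling constraint as a power-of-two scale, which is only one plausible reading of "scaling constraints". The function names and the sample weights are hypothetical.

```python
import math

# Magnitudes representable by a 4-bit E2M1 float (assumed layout).
FP4_E2M1 = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_int4(weights):
    """Symmetric uniform INT4: integer levels in [-7, 7] sharing one scale."""
    scale = max(abs(w) for w in weights) / 7 or 1.0
    return [max(-7, min(7, round(w / scale))) * scale for w in weights]

def quantize_fp4(weights, power_of_two_scale=False):
    """Round each weight to the nearest scaled E2M1 value."""
    scale = max(abs(w) for w in weights) / 6.0 or 1.0
    if power_of_two_scale:
        # Assumed constraint: round the scale up to the next power of two,
        # so rescaling becomes an exponent shift and nothing is clipped.
        scale = 2.0 ** math.ceil(math.log2(scale))
    out = []
    for w in weights:
        mag = min(abs(w) / scale, 6.0)
        nearest = min(FP4_E2M1, key=lambda g: abs(g - mag))
        out.append(math.copysign(nearest * scale, w))
    return out

def mse(a, b):
    """Mean squared error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# Hypothetical weight group with one outlier (1.2).
weights = [0.02, -0.05, 0.11, -0.3, 0.07, 1.2]
print("INT4 MSE:", mse(weights, quantize_int4(weights)))
print("FP4  MSE:", mse(weights, quantize_fp4(weights)))
```

On this hand-picked group the non-uniform FP4 grid keeps more resolution near zero while still covering the outlier, so its error is lower; a single toy example like this illustrates the mechanism and is not evidence for the paper's results.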
Food-delivery behavior under crowd sourcing mobility services
The rapid development of the online food-delivery industry has led not only to an increase in the number of crowd-sourced, shared food-delivery service drivers on our roads, but also to growing concerns about urban traffic safety management. This study investigates the decision-making behaviors of delivery drivers and their food-delivery platforms, and the potential impact of those behaviors on traffic safety. Using evolutionary game theory, stakeholder decision-making behaviors involving traffic safety within the food-delivery industry were analyzed. From our analysis, several behavioral influencers were identified, including penalties for traffic violations, the opportunity cost of delivery drivers complying with traffic rules, the costs associated with risk and strict management approaches, reputation incentives, costs related to the delivery platform being punished, the probability of compliance with traffic rules, and the probability of adopting a strict management approach by the delivery platform. Our study demonstrates that the stabilization strategies used by the food service industry differ according to the type of government control measures. When the government takes a more aggressive approach to regulation and control, compliance with the traffic rules and the adoption of strict enforcement measures by management are the only evolutionarily stable strategies available to food-delivery platforms. As part of a strict management strategy, appropriate compensation or incentive measures should be provided by the distribution platform. Furthermore, the fines given for traffic violations should be increased to create a safer road environment that has fewer traffic accidents involving food-delivery drivers.
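The kind of analysis described above can be sketched with two-population replicator dynamics. This is an illustrative toy, not the study's model: the payoff structure, the parameter names, and all numeric values below are hypothetical, chosen only to show how stricter government penalties can change which strategy pair is stable.

```python
def replicator_step(x, y, p, dt=0.01):
    """One Euler step of two-population replicator dynamics.

    x: share of drivers complying with traffic rules
    y: share of platforms adopting strict management
    p: dict of hypothetical payoff parameters (not taken from the study)
    """
    # Payoff advantage of complying over violating for a driver.
    gain_comply = p["fine"] + y * p["platform_penalty"] - p["compliance_cost"]
    # Payoff advantage of strict over loose management for a platform.
    gain_strict = (p["reputation"] - p["management_cost"]
                   + (1 - x) * p["platform_fine"])
    new_x = x + dt * x * (1 - x) * gain_comply
    new_y = y + dt * y * (1 - y) * gain_strict
    return new_x, new_y

def simulate(p, x=0.5, y=0.5, steps=20000):
    """Iterate the dynamics from (x, y) and return the final shares."""
    for _ in range(steps):
        x, y = replicator_step(x, y, p)
    return x, y

# Hypothetical regimes: aggressive versus lenient government control.
strong = dict(fine=3.0, platform_penalty=1.0, compliance_cost=1.0,
              reputation=1.5, management_cost=1.0, platform_fine=2.0)
weak = dict(fine=0.2, platform_penalty=1.0, compliance_cost=1.0,
            reputation=0.5, management_cost=1.0, platform_fine=0.1)

print("aggressive regulation:", simulate(strong))
print("lenient regulation:   ", simulate(weak))
```

With these made-up payoffs, the aggressive regime drives both shares toward 1 (comply, strict management), while the lenient regime drives both toward 0.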
ZeroQuant-HERO: Hardware-Enhanced Robust Optimized Post-Training Quantization Framework for W8A8 Transformers
Quantization techniques are pivotal in reducing the memory and computational
demands of deep neural network inference. Existing solutions, such as
ZeroQuant, offer dynamic quantization for models like BERT and GPT but overlook
crucial memory-bounded operators and the complexities of per-token
quantization. Addressing these gaps, we present a novel, fully
hardware-enhanced robust optimized post-training W8A8 quantization framework,
ZeroQuant-HERO. This framework uniquely integrates both memory bandwidth and
compute-intensive operators, aiming for optimal hardware performance.
Additionally, it offers flexibility by allowing specific INT8 modules to switch
to FP16/BF16 mode, enhancing accuracy. Comment: 8 pages, 2 figures
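For readers unfamiliar with per-token quantization, here is a minimal sketch of the general idea, assuming symmetric INT8 with one scale per token (row). It is not the ZeroQuant-HERO kernels; the function names and sample activations are hypothetical.

```python
def quantize_per_token(activations):
    """Quantize each token (row) to INT8 with its own symmetric scale."""
    q_rows, scales = [], []
    for row in activations:
        # One scale per token, so a large-magnitude token does not
        # destroy the resolution of the small-magnitude ones.
        scale = max(abs(x) for x in row) / 127 or 1.0
        q_rows.append([max(-127, min(127, round(x / scale))) for x in row])
        scales.append(scale)
    return q_rows, scales

def dequantize_per_token(q_rows, scales):
    """Map INT8 codes back to floats using each token's scale."""
    return [[q * s for q in row] for row, s in zip(q_rows, scales)]

# Hypothetical activations: two tokens with very different magnitudes.
acts = [[0.1, -0.2, 0.05], [10.0, -40.0, 25.0]]
codes, scales = quantize_per_token(acts)
print(codes, scales)
print(dequantize_per_token(codes, scales))
```

The scales must be computed at run time for every token, which is part of what makes per-token quantization harder to implement efficiently than a single static scale.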
Perceived Discrimination and Life Satisfaction of Elderly Chinese People: The Chain Mediating Effects of National Identity and Sense of Community
In China, aging is becoming an increasingly serious issue, and the Chinese government is paying more attention to the life satisfaction of the elderly. Nevertheless, in their daily lives, the elderly are often discriminated against, which may have a negative impact on their life satisfaction. To enable a better understanding of these relationships, we discuss the factors affecting the macro-system (national identity) and micro-system (sense of community) of the elderly. Three hundred and ninety-one elderly people (60–101 years old; 121 males, 270 females) from three communities in the Anhui and Shandong provinces of China participated in our study. Each participant completed questionnaires measuring perceived discrimination, national identity, sense of community, and life satisfaction. The results of structural equation modeling revealed that perceived discrimination negatively influenced life satisfaction through national identity and sense of community. Perceived discrimination was found to negatively predict national identity, suggesting that perceived discrimination has a negative influence on national identity within Chinese culture. The relationship between perceived discrimination and life satisfaction was partially mediated by the chain of national identity and sense of community. The size of the total mediation effect was 32.17%. The relationship between perceived discrimination and life satisfaction, when mediated by national identity or sense of community alone, was not significant. This suggests that the application of the rejection-identification model to the elderly in China may produce different results. The limitations and implications of our study are considered in the discussion.
Random-LTD: Random and Layerwise Token Dropping Brings Efficient Training for Large-scale Transformers
Large-scale transformer models have become the de-facto architectures for
various machine learning applications, e.g., CV and NLP. However, those large
models also introduce prohibitive training costs. To mitigate this issue, we
propose a novel random and layerwise token dropping method (random-LTD), which
skips the computation of a subset of the input tokens at all middle layers.
In particular, random-LTD achieves considerable speedups with accuracy comparable
to the standard training baseline. Compared to other token dropping methods,
random-LTD does not require (1) any importance score-based metrics, (2) any
special token treatment (e.g., [CLS]), and (3) many layers in full sequence
length training except the first and the last layers. In addition, a new LayerToken
learning rate schedule is proposed for pretraining problems; it resolves the
heavy tuning requirement of our proposed training mechanism. Finally, we
demonstrate that random-LTD can be applied to broader applications, including
GPT and BERT pretraining as well as ViT and GPT finetuning tasks. Our results
show that random-LTD can save about 33.3% theoretical compute cost and 25.6%
wall-clock training time while achieving similar zero-shot evaluations on
GPT-3 1.3B as compared to the baseline. Comment: 22 pages
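The core idea of skipping a random subset of tokens at the middle layers can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' code: layers are modeled as plain per-token functions, dropped tokens are assumed to bypass the layer unchanged, and the names `random_ltd_forward` and `keep` are hypothetical.

```python
import random

def random_ltd_forward(tokens, layers, keep, rng):
    """Run `layers` over `tokens`; middle layers process only `keep` tokens."""
    hidden = list(tokens)
    last = len(layers) - 1
    for i, layer in enumerate(layers):
        if i == 0 or i == last or keep >= len(hidden):
            # First and last layers always see the full sequence.
            hidden = [layer(h) for h in hidden]
            continue
        # Draw a fresh random subset per layer; no importance scores needed.
        kept = set(rng.sample(range(len(hidden)), keep))
        # Dropped tokens skip this layer and pass through unchanged,
        # so sequence order and length are preserved.
        hidden = [layer(h) if j in kept else h for j, h in enumerate(hidden)]
    return hidden

# Toy usage: each "layer" adds 1, so a token's value counts the layers it saw.
layers = [lambda h: h + 1] * 4
out = random_ltd_forward([0] * 8, layers, keep=4, rng=random.Random(0))
print(out)
```

In this toy setting the four layers perform 24 per-token calls instead of 32, which shows where the compute saving comes from; the savings reported in the abstract depend on the actual models and schedules.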
Beneficial effects of baicalein on a model of allergic rhinitis
Allergic rhinitis (AR) is a common disease that causes severe inflammation and even disabilities. Previous studies have reported baicalein to have an anti-inflammatory effect. However, the pharmacological action of baicalein on anaphylaxis has not yet been clarified. This study assessed the in vivo protective effect of baicalein post-treatment in an ovalbumin (OVA)-sensitized AR rat model. Baicalein attenuated histological alterations, aberrant tissue repair and inflammation after OVA-induced AR. Baicalein reduced the frequency of nasal/ear rubs and sneezes in rats, and inhibited the generation of several inflammatory cytokines (TNF-α, IL-1β, and IL-6) in both the blood and nasal lavage of rats. Infiltrations of eosinophils, lymphocytes, and neutrophils were decreased in baicalein-administered rats. Furthermore, baicalein inhibited STAT3 phosphorylation in the nasal mucosa. In summary, baicalein attenuated OVA-induced AR and inflammation, which suggests that it is a promising therapeutic agent for the alleviation of AR-associated inflammation and pathology.
DeepSpeed-VisualChat: Multi-Round Multi-Image Interleave Chat via Multi-Modal Causal Attention
Most of the existing multi-modal models, hindered by their incapacity to
adeptly manage interleaved image-and-text inputs in multi-image, multi-round
dialogues, face substantial constraints in resource allocation for training and
data accessibility, impacting their adaptability and scalability across varied
interaction realms. To address this, we present the DeepSpeed-VisualChat
framework, designed to optimize Large Language Models (LLMs) by incorporating
multi-modal capabilities, with a focus on enhancing the proficiency of Large
Vision and Language Models in handling interleaved inputs. Our framework is
notable for (1) its open-source support for multi-round and multi-image
dialogues, (2) introducing an innovative multi-modal causal attention
mechanism, and (3) utilizing data blending techniques on existing datasets to
assure seamless interactions in multi-round, multi-image conversations.
Compared to existing frameworks, DeepSpeed-VisualChat shows superior
scalability up to 70B parameter language model size, representing a significant
advancement in multi-modal language models and setting a solid foundation for
future explorations.
RenAIssance: A Survey into AI Text-to-Image Generation in the Era of Large Model
Text-to-image generation (TTI) refers to the use of models that process text
input and generate high-fidelity images based on text descriptions.
Text-to-image generation using neural networks can be traced back to the
emergence of the Generative Adversarial Network (GAN), followed by the
autoregressive Transformer. Diffusion models are one prominent type of
generative model that generates images through the systematic introduction of
noise over repeated steps. Owing to their impressive results on image
synthesis, diffusion models have been cemented as the major image decoder used
by text-to-image models and have brought text-to-image generation to the
forefront of machine-learning (ML) research. In the era of large models,
scaling up model size and integrating large language models have further
improved the performance of TTI models, yielding generated results that are
nearly indistinguishable from real-world images and revolutionizing the way we
retrieve images. Our exploratory study leads us to believe that there are
further ways of scaling text-to-image models through the combination of
innovative model architectures and prediction enhancement techniques. We have
divided the work of this survey into five main
sections wherein we detail the frameworks of major literature in order to delve
into the different types of text-to-image generation methods. Following this we
provide a detailed comparison and critique of these methods and offer possible
pathways of improvement for future work. Looking ahead, we argue that TTI
development could yield impressive productivity improvements for creation,
particularly in the context of the AIGC era, and could be extended to more
complex tasks such as video generation and 3D generation.
Functional Connectivity Density, Local Brain Spontaneous Activity, and Their Coupling Strengths in Patients With Borderline Personality Disorder
In this study, combining degree centrality (DC) and fractional amplitude of low frequency fluctuation (fALFF) analyses of resting state (rs)-functional magnetic resonance imaging (fMRI) data, we aimed to explore functional connectivity density, local brain spontaneous activity, and their coupling strengths in borderline personality disorder (BPD). Forty-three BPD patients and 39 demographically matched controls underwent rs-fMRI after completing a series of psychological tests. Two-sample t-tests were performed to compare DC and fALFF between these two groups. Across-voxel correlation analysis was conducted to assess DC-fALFF coupling strengths in each group. Imaging parameters and psychological variables were correlated by Pearson correlation analysis in the BPD group. Altered DC and fALFF values in the BPD group, compared with the control group, were distributed mainly in the default mode network (DMN), and DC-fALFF coupling strengths were decreased in the left middle temporal gyrus (MTG) and right precuneus in the BPD group. Additionally, insecure attachment scores correlated positively with left precuneus DC and negatively with fALFF of the right posterior cingulate cortex (PCC) in the BPD group. These altered DC and fALFF findings indicate that the BPD patients had disturbed functional connectivity density and local spontaneous activity in the DMN compared with control subjects. Their decreased connectivity-amplitude coupling suggests that the left MTG and right precuneus may be functional impairment hubs in BPD. Disturbed rs function in the left precuneus and right PCC might underlie insecure attachment in BPD.
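An across-voxel coupling strength of the kind mentioned above can be illustrated as a Pearson correlation between two per-voxel maps. The sketch below assumes that reading; the function name and the voxel values are hypothetical toy numbers, not data from the study, and real analyses involve preprocessing and region masks that are omitted here.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equally long value lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical per-voxel values within one region for one subject.
dc_values = [0.2, 0.5, 0.9, 1.4, 1.1]
falff_values = [1.0, 1.2, 1.5, 2.1, 1.7]
print("DC-fALFF coupling:", pearson(dc_values, falff_values))
```

A value near 1 means voxels with high connectivity density also show high spontaneous activity amplitude; a reduced value in a region corresponds to weaker coupling there.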