172 research outputs found
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model
characterized by economical training and efficient inference. It comprises 236B
total parameters, of which 21B are activated for each token, and supports a
context length of 128K tokens. DeepSeek-V2 adopts innovative architectures
including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees
efficient inference through significantly compressing the Key-Value (KV) cache
into a latent vector, while DeepSeekMoE enables training strong models at an
economical cost through sparse computation. Compared with DeepSeek 67B,
DeepSeek-V2 achieves significantly stronger performance, and meanwhile saves
42.5% of training costs, reduces the KV cache by 93.3%, and boosts the maximum
generation throughput to 5.76 times. We pretrain DeepSeek-V2 on a high-quality
and multi-source corpus consisting of 8.1T tokens, and further perform
Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) to fully unlock
its potential. Evaluation results show that, even with only 21B activated
parameters, DeepSeek-V2 and its chat versions still achieve top-tier
performance among open-source models
Optimal Measurement Planning Using Fuzzy-set Theory
In precision measurement, it is known that a measurement process involves errors or factors of different kinds and types. Using the prior knowledge on the relationship between the measured variables and the factors, the best measurement plan may be obtainable if a target function on the errors is minimized. In many cases, however, the relationship may not be clear since non-quantitative factors are involved, making the finding of the best plan using methods such as the statistics method quite difficult. A new method based the fuzzy-set theory is proposed to solve this problem. In this method, the membership grade is maximized. The concept of quasi-perfect plan is presented. Mathematical modes are established and case studies are presented in order to demonstrate the feasibility of the proposed method
An Efficient Cross-lingual Model for Sentence Classification Using Convolutional Neural Network
Estimation of Non-statistical Uncertainty Using Fuzzy-set Theory
A novel method using a fuzzy practicable interval to characterize non-statistical uncertainty in dynamic measurement is proposed. The method permits the uncertainty being estimated under the conditions that the number of measurements is very small and the probability distribution unknown. The feasibility of the method is validated by computer-simulation experiments
Automatic control over roundness error of bearing ring grinding surface based on quasi-dynamical harmonic generating theory
Mineral chemistry and geochemistry of peridotites from the Zedang and Luobusa ophiolites, Tibet: Implications for the evolution of the Neo-Tethys
Nursing of Patients Critically Ill With Coronavirus Disease Treated With Extracorporeal Membrane Oxygenation
Fuzzy-set-based Optimal Selection of Measurement Plans
In many cases we may have several approaches to realise a precision measurement. It is very important to decide on an optimal measurement approach or plan in order to obtain effectively high-fidelity results. A measurement process may involve many kinds of error factors which vary in different measurement plans. This would make the selection of the most appropriate approach difficult. Currently, prior knowledge of the relationship between measured variables and error factors, and statistical methods are typically used for such selection. In many cases, however, the relationship may not be clear, due to non-quantitative factors being involved. To solve the problem, a new method that is based on fuzzy set theory is proposed. In this method, membership grades are established and grades to the quasi-perfect plan are maximised. Mathematical models are established and the concept of a quasi-perfect measurement plan is proposed. Experimental testing is presented in order to demonstrate the effectiveness of the proposed method
- …
