5 research outputs found

    DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

    Full text link
    The rapid development of open-source large language models (LLMs) has been truly remarkable. However, the scaling law described in previous literature presents varying conclusions, which casts a dark cloud over scaling LLMs. We delve into the study of scaling laws and present our distinctive findings that facilitate scaling of large scale models in two commonly used open-source configurations, 7B and 67B. Guided by the scaling laws, we introduce DeepSeek LLM, a project dedicated to advancing open-source language models with a long-term perspective. To support the pre-training phase, we have developed a dataset that currently consists of 2 trillion tokens and is continuously expanding. We further conduct supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) on DeepSeek LLM Base models, resulting in the creation of DeepSeek Chat models. Our evaluation results demonstrate that DeepSeek LLM 67B surpasses LLaMA-2 70B on various benchmarks, particularly in the domains of code, mathematics, and reasoning. Furthermore, open-ended evaluations reveal that DeepSeek LLM 67B Chat exhibits superior performance compared to GPT-3.5

    Recent Progress in Metal Oxide Based Materials as Anode Materials for LithiumIon Batteries

    Full text link
    Transition metal oxides are promising anode materials for Li-ion batteries due to their high theoretical capacity (> 600 mAhg-1), good rate capability, good safety, environmental friendliness and relatively low cost. In recent years, more and more investigators are motivating this advanced material rapidly towards their hybrid materials preparation to improve the cycling performance. Herein, we outlined the current research advances of transition metal oxides and their hybrid materials as anode materials for Li-ion batteries in the synthesis, lithium storage mechanism and electrochemical performance, with the aim of stimulating more research to achieve more useful application as soon as possible

    Database Resources of the National Genomics Data Center, China National Center for Bioinformation in 2022

    Full text link
    The National Genomics Data Center (NGDC), part of the China National Center for Bioinformation (CNCB), provides a family of database resources to support global research in both academia and industry. With the explosively accumulated multi-omics data at ever-faster rates, CNCB-NGDC is constantly scaling up and updating its core database resources through big data archive, curation, integration and analysis. In the past year, efforts have been made to synthesize the growing data and knowledge, particularly in single-cell omics and precision medicine research, and a series of resources have been newly developed, updated and enhanced. Moreover, CNCB-NGDC has continued to daily update SARS-CoV-2 genome sequences, variants, haplotypes and literature. Particularly, OpenLB, an open library of bioscience, has been established by providing easy and open access to a substantial number of abstract texts from PubMed, bioRxiv and medRxiv. In addition, Database Commons is significantly updated by cataloguing a full list of global databases, and BLAST tools are newly deployed to provide online sequence search services
    corecore