38 research outputs found

    ChineseWebText: Large-scale High-quality Chinese Web Text Extracted with Effective Evaluation Model

    During the development of large language models (LLMs), the scale and quality of the pre-training data play a crucial role in shaping LLMs' capabilities. To accelerate LLM research, several large-scale datasets, such as C4 [1], Pile [2], RefinedWeb [3] and WanJuan [4], have been released to the public. However, most of the released corpora focus mainly on English, and there is still a lack of a complete tool-chain for extracting clean texts from web data. Furthermore, fine-grained information about the corpus, e.g. the quality of each text, is missing. To address these challenges, we propose in this paper a new complete tool-chain, EvalWeb, to extract clean Chinese texts from noisy web data. First, similar to previous work, manually crafted rules are employed to discard explicitly noisy texts from the raw crawled web contents. Second, a well-designed evaluation model is leveraged to assess the remaining relatively clean data, and each text is assigned a specific quality score. Finally, an appropriate threshold can be used to select high-quality pre-training data for Chinese. Using our proposed approach, we release ChineseWebText, the largest and latest large-scale high-quality Chinese web text dataset, which consists of 1.42 TB of data, with each text associated with a quality score, so that LLM researchers can choose data according to their desired quality thresholds. We also release a much cleaner subset of 600 GB of Chinese data with quality exceeding 90%.

    A Closer Look at Few-Shot Crosslingual Transfer: The Choice of Shots Matters

    Few-shot crosslingual transfer has been shown to outperform its zero-shot counterpart with pretrained encoders like multilingual BERT. Despite its growing popularity, little to no attention has been paid to standardizing and analyzing the design of few-shot experiments. In this work, we highlight a fundamental risk posed by this shortcoming, illustrating that the model exhibits a high degree of sensitivity to the selection of few shots. We conduct a large-scale experimental study on 40 sets of sampled few shots for six diverse NLP tasks across up to 40 languages. We provide an analysis of success and failure cases of few-shot transfer, which highlights the role of lexical features. Additionally, we show that a straightforward full model finetuning approach is quite effective for few-shot transfer, outperforming several state-of-the-art few-shot approaches. As a step towards standardizing few-shot crosslingual experimental designs, we make our sampled few shots publicly available.

    Probiotic Lactobacillus rhamnosus GR-1 supplementation attenuates Pb-induced learning and memory deficits by reshaping the gut microbiota

    Lead (Pb) exposure during early life has been associated with an increased risk of neurodevelopmental disorders, including learning and memory deficits. The intestinal flora, via the microbiome–gut–brain axis, could play a significant role in the nervous system. However, the effects of probiotics on ameliorating Pb-induced learning and memory deficits are still unclear. In this study, we showed that adolescent Pb exposure (150 ppm) for 2 months impaired spatial learning and memory ability, accompanied by the decreasing diversity of gut microbiota, and the decreasing abundance of Lactobacillus at the genus level. Surprisingly, administration of the Lactobacillus rhamnosus GR-1 (10^10 organisms/rat/day), not L. rhamnosus LGG or Lactobacillus reuteri RC-14, reversed learning and memory deficits induced by Pb exposure. Meanwhile, administration of the L. rhamnosus GR-1 increased the diversity of the gut microbiota composition and partially normalized the genus level of Lactobacillus, Parabacteroides, Enterococcus, and Akkermansia in Pb-exposed rats. Notably, supplementation of L. rhamnosus GR-1 decreased the gut permeability of Pb-exposed rats, reduced proinflammatory cytokines [interleukin-1β (IL-1β) and IL-6] expression, and promoted anti-inflammatory cytokines [granulocyte colony-stimulating factor (G-CSF)] expression. Interestingly, neural cell treatment with G-CSF rescued Pb-induced neurotoxicity. In general, L. rhamnosus GR-1 supplementation recovered the Pb-induced loss of intestinal bacteria (Lactobacillus), which may have reversed the damage to learning and memory ability. Collectively, our findings demonstrate an unexpectedly pivotal role of L. rhamnosus GR-1 in Pb-induced cognitive deficits and identify a potential probiotic therapy for cognitive dysfunction during early life.

    Ring-Pins combined with cable cerclage for the fixation of displaced inferior patellar pole fractures

    Objective: The study aimed to present the clinical results and complication rates of ring-pins with cable cerclage for treating fractures of the inferior pole of the patella. Method: Consecutive patients with a displaced fracture of the inferior pole of the patella (AO/OTA 34-A1) who were operated on with a ring-pin tension band using cable cerclage between October 2015 and October 2017 were retrospectively reviewed. The duration of surgery, range of motion of the knee, functional outcomes, and complications were recorded. Results: The average follow-up of the 31 patients was 21 months. The mean operation time was 50 min. Fractures in all 31 patients healed at a mean duration of 8 weeks. There was no infection, no withdrawal of ring-pins, no implant breakage, and no loss of fracture reduction. The mean range of motion was 120°, and no patient complained of implant irritation at the final follow-up. The average Bostman score was 29.0 points; at the last follow-up, clinical outcomes were graded excellent in 28 patients and good in 3 patients. Conclusions: Ring-pins combined with cable cerclage for treating displaced fractures of the inferior pole of the patella is a simple technique, and the rate of postoperative internal fixation-related complications is low. It is a good choice for treating displaced fractures of the inferior pole of the patella.

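    Variation of Mixing Temperature and Its Effect on Marshall Parameters in Asphalt Concrete Layer Mixtures (Variasi Temperatur Pencampuran Terhadap Parameter Marshall Pada Campuran Lapis Aspal Beton)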

    This study was conducted to determine the effect of mixing temperature variation on the Marshall parameters of fine-graded AC-WC (Asphalt Concrete-Wearing Course) mixtures at the middle and lower gradation limits, with reference to the Bina Marga 2010 specifications. The experiments gave an optimum asphalt content of 5.7% for the middle limit and 6.8% for the lower limit; mixing was then carried out at temperatures of 120 °C, 130 °C, 140 °C, 150 °C, and 160 °C. For the fine-graded Laston AC-WC mixture at the middle limit with 5.7% asphalt content, mixing at 120 °C, 130 °C, 140 °C, 150 °C, and 160 °C still met all Marshall parameter standards, and the ideal mixing temperature for the middle limit was 150 °C to 160 °C. For the lower limit with 6.8% asphalt content, mixing temperatures between 120 °C and 160 °C did not meet the specifications, because the MQ value was below the minimum value of 250 kg/mm.

    Alginate/albumin in incubation solution mediates the adhesion and biofilm formation of typical marine bacteria and algae

    Adhesion of microorganisms in the marine environment is one of the initial events responsible for the occurrence of biofouling. A variety of factors play roles in regulating adhesion behaviors and subsequent biofilm formation. This study focused on the influence of the typical marine polysaccharide alginate and the protein albumin on the attachment and colonization of Bacillus sp., Chlorella pyrenoidosa and Phaeodactylum tricornutum to silicon wafers in sterile artificial seawater. The rapid formation of conditioning layers due to the adsorption of the molecules was revealed by atomic force microscopy, and porous layers with a thickness of 3-6 nm further altered the surface roughness and wettability of the substrata. The presence of alginate or albumin in the culture solution tailored the surface properties of C. pyrenoidosa and P. tricornutum. The thickness, structure heterogeneity, biomass, diffusion distance, and roughness coefficient of the biofilm formed by colonization of the microorganisms were examined, and their values showed that alginate/albumin had a significant influence on biofilm formation. The results are relevant to biofouling research exploring antifouling strategies at the molecular level.

    Morphological Analysis of Fractures of the Proximal Humerus by the Fracture Mapping Technique

    Objective: Fractures of different parts of the proximal humerus may lead to different postoperative functional deficits, but there are few studies on the morphology and related functions of the proximal humerus. The purpose of this study was to analyze the fracture pattern of the proximal humerus by the three-dimensional (3-D) fracture mapping technique and to further evaluate its clinical utility. Methods: Patients with proximal humeral fractures admitted to Pudong Hospital, Fudan University, from January 2018 to December 2020 were analyzed. Three surgeons divided the fractures into groups according to 3-D CT imaging and mapped each fracture on a 3-D template according to its fracture line. Finally, the humeral head inversion angle and the functional score were recorded for the different fracture types. Results: A total of 312 cases of humeral fractures were included. Among them, there were 90 patients (28.8%) in the simple greater tuberosity + lesser tuberosity + medial cortex group, with typical fracture features of surgical neck fractures of the humerus + greater tuberosity fractures. Eighty-seven patients (27.9%) in the greater tuberosity + isolated fragment lesser tuberosity + medial cortex group had typical “four-part fractures.” There were 45 patients (14.4%) in the greater tuberosity + lesser tuberosity + medial isolated fragment group; more patients in this group had medial comminution due to varus displacement of the humeral head. There were 66 patients (21.1%) in the isolated greater tuberosity group, 21 patients (6.7%) in the greater tuberosity + lesser tuberosity group, and three patients (1.0%) in the greater tuberosity + medial cortex group. In addition, statistical differences in the humeral head inversion angle and other measures were observed in the greater tuberosity + lesser tuberosity + medial isolated fragment group. Conclusions: This morphological study helps to further identify the characteristics of proximal humerus fracture patterns, which may be closely related to different clinical outcomes. Further relevant studies are needed to verify the reliability of their clinical application and their potential value in surgical planning and postoperative functional rehabilitation.

    Animal Mucus: Barrier Effects and Adhesion Mechanisms (动物黏液: 屏障效应及黏附机制)
