26 research outputs found

    TransNormerLLM: A Faster and Better Large Language Model with Improved TransNormer

    We present TransNormerLLM, the first linear attention-based Large Language Model (LLM) that outperforms conventional softmax attention-based models in terms of both accuracy and efficiency. TransNormerLLM evolves from the previous linear attention architecture TransNormer by making advanced modifications that include positional embedding, linear attention acceleration, gating mechanisms, tensor normalization, and inference acceleration and stabilization. Specifically, we use LRPE together with an exponential decay to avoid attention dilution issues while allowing the model to retain global interactions between tokens. Additionally, we propose Lightning Attention, a technique that makes linear attention more than twice as fast at runtime and reduces memory usage by four times. To further enhance the performance of TransNormer, we leverage a gating mechanism for smooth training and a new tensor normalization scheme to accelerate the model, resulting in an acceleration of over 20%. Furthermore, we develop a robust inference algorithm that ensures numerical stability and consistent inference speed regardless of the sequence length, showing superior efficiency during both training and inference. We also implement an efficient model parallel schema for TransNormerLLM, enabling deployment on large-scale clusters and facilitating expansion to even larger models, i.e., LLMs with 175B parameters. We validate our model design through a series of ablations and train models with sizes of 385M, 1B, and 7B on our self-collected corpus. Benchmark results demonstrate that our models not only match the performance of state-of-the-art Transformer-based LLMs but are also significantly faster. Code is released at: https://github.com/OpenNLPLab/TransnormerLLM.
    Comment: Technical Report. Yiran Zhong is the corresponding author. Zhen Qin, Dong Li, Weigao Sun, Weixuan Sun, and Xuyang Shen contributed equally to this paper.
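
The abstract's claim of consistent inference speed at any sequence length can be illustrated with generic linear attention under exponential decay. The following is a minimal sketch, not the paper's implementation: it omits LRPE, gating, normalization, and Lightning Attention, and the function names and the scalar `decay` parameter are assumptions made for illustration. It shows that a recurrent form with a fixed-size state gives the same outputs as the quadratic causal form.

```python
def decayed_linear_attention_recurrent(q, k, v, decay):
    """Causal linear attention with a fixed-size state (illustrative sketch).

    q, k: lists of length-d_k vectors; v: list of length-d_v vectors.
    The state is a d_k x d_v matrix, so per-token cost does not grow
    with sequence length.
    """
    d_k, d_v = len(k[0]), len(v[0])
    state = [[0.0] * d_v for _ in range(d_k)]
    outputs = []
    for q_t, k_t, v_t in zip(q, k, v):
        # state <- decay * state + outer(k_t, v_t)
        for i in range(d_k):
            for j in range(d_v):
                state[i][j] = decay * state[i][j] + k_t[i] * v_t[j]
        # output_t = q_t @ state
        outputs.append(
            [sum(q_t[i] * state[i][j] for i in range(d_k)) for j in range(d_v)]
        )
    return outputs


def decayed_linear_attention_quadratic(q, k, v, decay):
    """Reference form: output_t = sum over s <= t of decay**(t-s) * (q_t . k_s) * v_s."""
    d_v = len(v[0])
    outputs = []
    for t, q_t in enumerate(q):
        out = [0.0] * d_v
        for s in range(t + 1):
            weight = decay ** (t - s) * sum(a * b for a, b in zip(q_t, k[s]))
            for j in range(d_v):
                out[j] += weight * v[s][j]
        outputs.append(out)
    return outputs


# Toy inputs (hypothetical values chosen only for the demonstration).
q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [1.0, 1.0]]
v = [[2.0], [3.0]]
recurrent = decayed_linear_attention_recurrent(q, k, v, decay=0.5)
quadratic = decayed_linear_attention_quadratic(q, k, v, decay=0.5)
```

Both calls return `[[2.0], [3.0]]` for these inputs. The recurrent form only carries the `state` matrix between tokens, which is why its per-step cost stays constant as the sequence grows.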

    Preliminary exploration on the differential diagnosis between meningioma and schwannoma using contrast-enhanced T1WI flow-sensitive black-blood sequence

    Introduction: Contrast-enhanced T1WI flow-sensitive black-blood (CE-T1WI FSBB) is a newly developed sequence which has not been widely used for differential diagnosis of brain tumors.
    Methods: We aimed to quantify the pre-operative imaging features of intratumoral microbleeds and intratumoral vessels using CE-T1WI FSBB scans and to study the differences in biological behavior of meningiomas and schwannomas underlying the imaging features. Seventy-three cases of meningiomas and 24 cases of schwannomas confirmed by postoperative pathology were included. Two neuroradiologists independently counted intratumoral vessels and intratumoral microbleeds based on CE-T1WI FSBB images. The vessel density index (VDI) and microbleed density index (MDI) were the number of intratumoral vessels and the number of intratumoral microbleeds divided by the tumor volume, respectively. The consistency of the intratumoral vessel count and intratumoral microbleed count based on CE-T1WI FSBB was summarized using 2-way random intraclass correlation coefficients (ICC). The Mann–Whitney U-test and chi-square test were used to determine significant differences between meningiomas and schwannomas, and between fibrous meningiomas and epithelial meningiomas. P<0.05 was considered statistically significant.
    Results: The ICCs of the intratumoral vessel count and intratumoral microbleed count were 0.89 and 0.99, respectively. There were significant differences in the number of intratumoral microbleeds (P<0.01) and MDI values (P<0.01) between meningiomas and schwannomas. There were no differences in the number of intratumoral vessels (P=0.64), VDI (P=0.17), or tumor volume (P=0.33). There were also differences in the number of intratumoral microbleeds (P<0.01), the MDI value (P<0.01), and the sex of patients (P<0.05) between fibrous meningiomas and epithelial meningiomas.
    Discussion: CE-T1WI FSBB can be a new technique for differentiating schwannomas from meningiomas, and even different types of meningiomas. Schwannomas have a higher incidence of intratumoral hemorrhage, more intratumoral microbleeds, and higher MDI values than meningiomas, which provides a new basis for preoperative differential diagnosis and treatment decisions.
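
The density indices defined in the Methods (count divided by tumor volume) and the group comparison can be sketched as follows. This is an illustrative sketch only: the function names, the sample numbers, and the volume unit are assumptions, since the abstract gives no per-case data or units. The U statistic is computed by direct pair counting, and no p-value is derived here.

```python
def density_index(count, tumor_volume):
    """VDI or MDI as described in the abstract: count divided by tumor volume.

    The volume unit is an assumption (the abstract does not state one).
    """
    if tumor_volume <= 0:
        raise ValueError("tumor_volume must be positive")
    return count / tumor_volume


def mann_whitney_u(sample_a, sample_b):
    """Mann-Whitney U statistic for sample_a by pair counting.

    Counts pairs (a, b) with a > b, adding 0.5 for each tie.
    """
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical case: 12 microbleeds counted in a tumor of volume 8.0.
mdi_example = density_index(12, 8.0)

# Hypothetical MDI values for two small groups (not data from the study).
group_a = [1.5, 2.0, 2.5]
group_b = [0.2, 0.4]
u_a = mann_whitney_u(group_a, group_b)
```

Here `mdi_example` is 1.5 and `u_a` is 6.0, the maximum for groups of sizes 3 and 2, because every value in `group_a` exceeds every value in `group_b`.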

    Collaborative Learning in General Graphs with Limited Memorization: Learnability, Complexity and Reliability

    We consider a K-armed bandit problem in general graphs where agents are arbitrarily connected and each of them has limited memorization and communication bandwidth. The goal is to let each of the agents learn the best arm. Although recent studies show the power of collaboration among the agents in improving the efficacy of learning, these studies assume that the communication graphs are complete or well-structured, an assumption that is not always valid in practice. Furthermore, limited memorization and communication bandwidth also restrict the collaboration of the agents, since very little knowledge can be drawn by each agent from its own experience or from that shared by its peers. Additionally, the agents may be corrupted to share falsified experience, while the resource limit may considerably restrict the reliability of the learning process. To address the above issues, we propose a three-staged collaborative learning algorithm. In each step, the agents share their experience with each other through light-weight random walks in the general graphs, and then make decisions on which arms to pull according to the randomly memorized suggestions. The agents finally update their adoptions (i.e., preferences for the arms) based on the reward feedback of the arm pulling. Our theoretical analysis shows that, by exploiting the limited memorization and communication resources, all the agents eventually learn the best arm with high probability. We also reveal in our theoretical analysis the upper bound on the number of corrupted agents our algorithm can tolerate. The efficacy of our proposed three-staged collaborative learning algorithm is finally verified by extensive experiments on both synthetic and real datasets.
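
The three stages named in the abstract (share through random walks, pull a randomly memorized suggestion, update the adoption from the reward) can be pictured with a toy simulation. This is a loose sketch under strong assumptions and is not the paper's algorithm: each agent remembers a single adopted arm, each message takes one random-walk hop per step, rewards are Bernoulli, an agent adopts a suggestion only on a positive reward, and no agents are corrupted. All names are hypothetical.

```python
import random


def simulate_collaboration(neighbors, adopted, reward_prob, steps, rng):
    """Toy three-stage loop on an arbitrary graph (illustrative assumptions only).

    neighbors: dict mapping agent -> list of neighboring agents.
    adopted: dict mapping agent -> currently adopted arm index.
    reward_prob: list of Bernoulli reward probabilities, one per arm.
    Returns a list with one snapshot of the adoptions per step.
    """
    adopted = dict(adopted)
    history = []
    for _ in range(steps):
        # Stage 1: every agent sends its adopted arm one hop along a random edge.
        inbox = {agent: [] for agent in neighbors}
        for agent, arm in adopted.items():
            target = rng.choice(neighbors[agent])
            inbox[target].append(arm)
        for agent in neighbors:
            # Stage 2: keep one received suggestion at random (limited memory)
            # and pull it; fall back to the agent's own arm if nothing arrived.
            suggestion = rng.choice(inbox[agent]) if inbox[agent] else adopted[agent]
            reward = 1 if rng.random() < reward_prob[suggestion] else 0
            # Stage 3: adopt the pulled arm only when it paid off.
            if reward:
                adopted[agent] = suggestion
        history.append(dict(adopted))
    return history


# A ring of four agents; arm 0 always pays and arm 1 never does (toy setting).
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
start = {0: 0, 1: 1, 2: 1, 3: 1}
history = simulate_collaboration(ring, start, [1.0, 0.0], 50, random.Random(0))
best_arm_counts = [sum(1 for arm in snap.values() if arm == 0) for snap in history]
```

In this deterministic-reward setting an agent that holds arm 0 never abandons it, so `best_arm_counts` can only stay flat or rise as the suggestion spreads around the ring.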

    Microstructure and Mechanical Properties of Graphene Oxide-Reinforced Titanium Matrix Composites Synthesized by Hot-Pressed Sintering

    Ti matrix composites reinforced with 1–5 wt% graphene oxide (GO) were prepared by hot-pressed sintering in argon atmosphere. The effect of sintering temperature on the microstructures and mechanical properties of the composite was also evaluated. The results show that TiC nanoparticles were formed in situ as interfacial products via the reaction between Ti and GO during sintering. With increases in GO content and sintering temperature, the amount of TiC increased, improving the mechanical properties of the composites. GO was also partly retained with a lamellar structure after sintering. The composite reinforced with 5 wt% GO exhibited a hardness of 457 HV, 48.4% higher than that of pure Ti at 1473 K. The Ti-2.5 wt% GO composite sintered at 1473 K achieved a maximum yield stress of 1294 MPa, which was 62.7% higher than that of pure Ti. Further increasing the GO content to 5 wt% led to a slight decrease in yield stress owing to GO agglomeration. The fracture morphology of the composite reinforced with GO exhibited a quasi-cleavage fracture, whereas that of the pure Ti matrix showed a ductile fracture. The main strengthening mechanisms included grain refinement, solution strengthening, and dispersion strengthening of TiC and GO.

    Using Surface Immunogenic Protein as a Carrier Protein to Elicit Protective Antibody to Multiple Serotypes for Candidate Group B Streptococcal Glycan Conjugate Vaccines

    Group B Streptococcus (GBS) is a life-threatening opportunistic pathogen, particularly in pregnant women, infants, and the elderly. Currently, maternal vaccination is considered the most viable long-term option for preventing GBS mother-to-infant infection, and two polysaccharide conjugate vaccines utilizing CRM197 as a carrier protein have undergone phase II clinical trials. Surface immunogenic protein (Sip), present in all GBS serotypes identified so far, is a protective surface protein of GBS. In this study, the type Ia capsular polysaccharide (CPS) of GBS was utilized as a model to develop candidate antigens for a polysaccharide conjugate vaccine by coupling it with the Sip of GBS and the traditional carrier protein CRM197. Serum analysis from immunized New Zealand rabbits and CD1 mice revealed that there was no significant difference in antibody titers between the Ia-Sip group and the Ia-CRM197 group; however, both were significantly higher than those observed in the Ia polysaccharide group. Opsonophagocytosis and passive immune protection results using rabbit serum indicated no significant difference between the Ia-Sip and Ia-CRM197 groups, both outperforming the Ia polysaccharide group. Furthermore, serum from the Ia-Sip group had a cross-protective effect on multiple types of GBS strains. The challenge test results in CD1 mice demonstrated that the Ia-Sip group provided complete protection against lethal doses of bacteria and also showed cross-protection against a type III strain. Our study demonstrates for the first time that Ia-Sip is immunogenic and provides serotype-independent protection in glycan conjugate vaccines, which also indicates Sip may serve as an excellent carrier protein for GBS glycan conjugate vaccines and provide cross-protection against multiple GBS strains.

    Yangxue Jiedu Fang Ameliorates Psoriasis by Regulating Vascular Regression via Survivin/PI3K/Akt Pathway

    Background. Psoriasis (PA) is a chronic autoimmune disease of the skin that adversely affects patients’ quality of life. Yangxue Jiedu Fang (YXJD) has been used for decades to treat psoriasis in China. However, its antipsoriatic mechanisms are still poorly understood. In this study, we explored the effects of YXJD on angiogenesis and apoptosis of microvessels in PA, the underlying mechanisms in HUVEC cells transfected with a Survivin overexpression plasmid and in a mouse model of imiquimod-induced psoriasis, and the relationship between VEGF (vascular endothelial growth factor) and Survivin. Methods. A BALB/c mouse model of imiquimod- (IMQ-) induced PA was established, and the mice were treated with YXJD. Cell viability was assessed by CCK8 assay. Apoptosis was detected by annexin V–FITC/PI double-staining and caspase-3 assays. The PI3K/Akt/β-catenin pathway was analyzed by western blotting, ELISA, and immunochemical analysis. Results. YXJD ameliorated symptoms and psoriasis area and severity index (PASI) scores and also reduced the number of microvessels, as determined by the microvessel density (MVD). YXJD inhibited the expression of the apoptotic protein Survivin in endothelial cells, the autophagy-related protein p62, and the angiogenic protein VEGF, and it increased the repressed expression of LC3II/I. The proteins related to the PI3K/Akt pathway, β-catenin expression, and the nuclear entry of β-catenin were reduced in IMQ-induced PA mice treated with YXJD. In HUVEC cells transfected with the Survivin overexpression plasmid, we observed that YXJD regulated the expression of Survivin, LC3II/I, p62, VEGF, and PI3K/Akt pathway-related proteins and the nuclear entry of β-catenin. Conclusions. YXJD inhibited the expression of Survivin via the PI3K/Akt pathway to adjust apoptosis, autophagy, and angiogenesis of microvessels and thus improve the vascular sustainability in psoriasis. YXJD may represent a new direction of drug research and development for immunomodulatory therapy for psoriasis.

    Enhancing Self‐Powered Terahertz Photodetection with VSe2 and Van Der Waals Heterostructure Integration via Photothermoelectric Effect

    Possessing unique optical and electronic properties, two-dimensional (2D) materials have made remarkable strides in the field of photodetection applications. However, achieving highly sensitive and ultra‐broadband detection from the microwave to the terahertz (THz) range (0.02–0.54 THz) remains a significant challenge for photodetectors. This study presents a self‐powered THz photodetector based on VSe2 and its van der Waals heterostructure. The photoresponse of the photodetector is primarily attributed to the photothermoelectric effect. At room temperature, the device exhibits a low noise equivalent power of 21 pW Hz^(-1/2) at 0.28 THz. This work has achieved ultra‐broadband detection and demonstrated the potential for large‐area imaging, providing a new avenue for the application of THz technology in nondestructive testing and biometric identification fields.

    Enhancement of terahertz response in a microstructure-integrated-type-II Dirac semimetal

    Terahertz detection technology has been confronted with formidable impediments, notably the paucity of sensitivity and operating temperature for photodetectors based on traditional bulk materials. In an attempt to surmount the difficulties, we propose an innovative terahertz detector based on a PtSe2 (type-II Dirac semimetallic material) integrated asymmetric antenna structure that can enhance the terahertz photoresponse by capitalizing on meticulous fabrication procedures. Experimental outcomes demonstrate the remarkable characteristics of the photodetector in the terahertz band, encompassing fast response time (7 µs), large responsivity (3.267 A/W), and low noise equivalent power (3.96 pW/Hz^0.5). These accomplishments can be ascribed to the incorporation of the asymmetric metal contact of the four-leaf clover antenna structure and the excellent thermoelectric characteristics of PtSe2. This pioneering investigation consequently unveils a novel methodology for the creation of proficient PtSe2-based terahertz detectors and serves as a catalyst for the promotion of applications and further research within the terahertz sphere.
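
The quoted responsivity and noise equivalent power (NEP) can be related through the common definition NEP = noise current spectral density / current responsivity. The sketch below assumes that definition applies here, which the abstract does not state, and the function names are hypothetical. Under that assumption, the two reported figures imply a noise current density of roughly 12.9 pA/Hz^0.5.

```python
def noise_equivalent_power(noise_current_density, responsivity):
    """NEP in W/Hz^0.5 from noise current density (A/Hz^0.5) and responsivity (A/W).

    Assumes the common definition NEP = i_n / R; the abstract does not say
    how its NEP value was obtained.
    """
    if responsivity <= 0:
        raise ValueError("responsivity must be positive")
    return noise_current_density / responsivity


def implied_noise_current_density(nep, responsivity):
    """Invert the same definition: i_n = NEP * R."""
    return nep * responsivity


# Values quoted in the abstract: R = 3.267 A/W, NEP = 3.96 pW/Hz^0.5.
responsivity = 3.267
nep = 3.96e-12
noise = implied_noise_current_density(nep, responsivity)  # about 1.29e-11 A/Hz^0.5
round_trip = noise_equivalent_power(noise, responsivity)
```

The implied noise figure is a back-calculation under the stated assumption, not a value reported in the abstract.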