    Seq2Seq-SC: End-to-End Semantic Communication Systems with Pre-trained Language Model

    While semantic communication is expected to bring unprecedented communication efficiency compared with classical communication, many challenges must be resolved to realize its potential. In this work, we provide a realistic semantic network dubbed seq2seq-SC, which is compatible with 5G NR and can work with a generalized text dataset by utilizing a pre-trained language model. We also utilize a performance metric (SBERT) that can accurately measure semantic similarity, and show that seq2seq-SC achieves superior performance while extracting semantically meaningful information. Comment: 4 pages, 2 figures, 3 tables
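
The abstract above evaluates semantic similarity with an SBERT-based metric. As a minimal sketch, assuming the metric compares sentence embeddings by cosine similarity (a common choice, not confirmed by the abstract), the comparison step could look like the following. The embedding vectors here are hypothetical toy values; real SBERT embeddings would come from a sentence encoder and have hundreds of dimensions.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical embeddings of a transmitted and a received sentence
# (illustrative values only, not taken from the paper).
sent_tx = [0.2, 0.1, 0.7, 0.4]
sent_rx = [0.25, 0.05, 0.65, 0.45]

if __name__ == "__main__":
    print(cosine_similarity(sent_tx, sent_rx))
```

A score near 1 would indicate that the received sentence is semantically close to the transmitted one under this assumed metric.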

    Making Large Language Models Better Data Creators

    Although large language models (LLMs) have advanced the state of the art in NLP significantly, deploying them for downstream applications is still challenging due to cost, responsiveness, control, or concerns around privacy and security. As such, trainable models are still the preferred option in some cases. However, these models still require human-labeled data for optimal performance, which is expensive and time-consuming to obtain. To address this issue, several techniques reduce human effort by labeling or generating data using LLMs. Although these methods are effective for certain applications, in practice they encounter difficulties in real-world scenarios. Labeling data requires careful data selection, while generating data necessitates task-specific prompt engineering. In this paper, we propose a unified data creation pipeline that requires only a single formatting example, and which is applicable to a broad range of tasks, including traditionally problematic ones with semantically devoid label spaces. In our experiments we demonstrate that instruction-following LLMs are highly cost-effective data creators, and that models trained with these data exhibit performance better than those trained with human-labeled data (by up to 17.5%) on out-of-distribution evaluation, while maintaining comparable performance on in-distribution tasks. These results have important implications for the robustness of NLP systems deployed in the real world. Comment: Accepted to EMNLP 2023 main conference. 12 pages, 5 figures, 6 tables. Code is available at https://github.com/microsoft/llm-data-creatio

    Applying Cytogenetics in Phylogenetic Studies

    Cytogenetics, with its fundamental role in the field of genetic investigation, continues to be an indispensable tool for studying phylogenetics, even though molecular evolutionary analyses are now more commonly utilized. Studies of chromosomal evolution indicate that genomic evolution occurs at the level of chromosomal segments, namely, genomic blocks at the Mb scale. The recombination of homologous blocks, through the mechanisms of insertion, translocation, inversion, and breakage, has been proven to be a major mechanism of speciation and subspecies differentiation. Meanwhile, molecular cytogenetics (fluorescence in situ hybridization‐based methodologies) has already been widely applied in studying plant genetics, since polyploidy is common in plant evolution and speciation. It is now recognized that comparative cytogenetic studies can be used to explore the plausible phylogenetic relationships of the extant mammalian species by reconstructing the ancestral karyotypes of certain lineages. Therefore, cytogenetics remains a feasible tool in the study of comparative genomics, even in this era in which next-generation sequencing (NGS) is prevalent.

    Analyzing Norm Violations in Live-Stream Chat

    Toxic language, such as hate speech, can deter users from participating in online communities and enjoying popular platforms. Previous approaches to detecting toxic language and norm violations have been primarily concerned with conversations from online forums and social media, such as Reddit and Twitter. These approaches are less effective when applied to conversations on live-streaming platforms, such as Twitch and YouTube Live, as each comment is only visible for a limited time and lacks a thread structure that establishes its relationship with other comments. In this work, we share the first NLP study dedicated to detecting norm violations in conversations on live-streaming platforms. We define norm violation categories in live-stream chats and annotate 4,583 moderated comments from Twitch. We articulate several facets of live-stream data that differ from other forums, and demonstrate that existing models perform poorly in this setting. By conducting a user study, we identify the informational context humans use in live-stream moderation, and train models leveraging context to identify norm violations. Our results show that appropriate contextual information can boost moderation performance by 35%. Comment: 17 pages, 8 figures, 15 tables

    AutoTriggER: Named Entity Recognition with Auxiliary Trigger Extraction

    Deep neural models for low-resource named entity recognition (NER) have shown impressive results by leveraging distant supervision or other meta-level information (e.g., explanations). However, the costs of acquiring such additional information are generally prohibitive, especially in domains where existing resources (e.g., databases to be used for distant supervision) may not exist. In this paper, we present a novel two-stage framework (AutoTriggER) to improve NER performance by automatically generating and leveraging "entity triggers," which are essentially human-readable clues in the text that can help guide the model to make better decisions. Thus, the framework is able to both create and leverage auxiliary supervision by itself. Through experiments on three well-studied NER datasets, we show that our automatically extracted triggers are well-matched to human triggers, and that AutoTriggER improves performance over a RoBERTa-CRF architecture by nearly 0.5 F1 points on average and by much more in a low-resource setting. Comment: 10 pages, 12 figures, Best paper at TrustNLP@NAACL 2021 and presented at WeaSuL@ICLR 202

    Dose selection method for pharmacokinetic study in hemodialysis patients using a subpharmacological dose: oseltamivir as a model drug

    BACKGROUND: Dose selection is an important step in pharmacokinetic (PK) studies of hemodialysis patients. We propose a simulation-based dose-selection method for PK studies of hemodialysis patients using a subpharmacological dose of oseltamivir as a model drug. METHODS: The concentrations of oseltamivir and its active metabolite, oseltamivir carboxylate (OC), were measured by liquid chromatography-tandem mass spectrometry. To determine a low oseltamivir dose exhibiting PK linearity, a pilot low dose determination investigation (n = 4) was performed using a single administration dose-escalation study. After the dose was determined, a low dose study (n = 10) was performed, and the optimal dose required to reach the hypothetical target OC exposure (area under the concentration-time curve [AUC] of 60,000 ng · hr/mL) was simulated using a nonparametric superposition method. Finally, observed PKs at the optimal dose were compared to the simulated PKs to verify PK predictability. RESULTS: In the pilot low dose determination study, 2.5 mg of oseltamivir was determined to be the low dose. Subsequently, we performed a single-dose PK study with the low oseltamivir dose in an additional group of 10 hemodialysis patients. The predicted AUC(last) of OC following continuous oseltamivir doses was simulated, and 35 mg of oseltamivir corresponded to the hypothetical target AUC(last) of OC. The observed PK profiles of OC at a 35-mg oseltamivir dose and the simulated data based on the low dose study were in close alignment. CONCLUSION: The results indicate that the proposed method provides a rational approach to determine the proper PK dose in hemodialysis patients
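
The abstract above relies on PK linearity to project a target exposure from a low-dose study. The sketch below is a deliberately simplified illustration of that idea and is not the authors' nonparametric superposition method: it computes AUC with the linear trapezoidal rule and then scales the dose proportionally, which is valid only under the assumption that exposure is proportional to dose. The function names and all sample values are hypothetical and are not data from the study.

```python
def auc_trapezoid(times_hr, conc_ng_ml):
    """Area under the concentration-time curve by the linear trapezoidal rule.

    times_hr must be strictly increasing; the result is in ng*hr/mL.
    """
    if len(times_hr) != len(conc_ng_ml) or len(times_hr) < 2:
        raise ValueError("need at least two matching time/concentration points")
    area = 0.0
    for i in range(1, len(times_hr)):
        dt = times_hr[i] - times_hr[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        area += dt * (conc_ng_ml[i] + conc_ng_ml[i - 1]) / 2.0
    return area


def dose_for_target_auc(low_dose_mg, low_dose_auc, target_auc):
    """Scale a dose proportionally to reach target_auc (assumes PK linearity)."""
    if low_dose_auc <= 0:
        raise ValueError("low-dose AUC must be positive")
    return low_dose_mg * target_auc / low_dose_auc


# Hypothetical low-dose profile (illustrative values only).
times = [0.0, 1.0, 2.0, 4.0]
concs = [0.0, 10.0, 8.0, 4.0]

if __name__ == "__main__":
    low_auc = auc_trapezoid(times, concs)
    print(low_auc, dose_for_target_auc(2.0, low_auc, 10 * low_auc))
```

Proportional scaling is the simplest consequence of linearity; a superposition approach additionally sums shifted single-dose profiles to predict exposure under repeated dosing.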

    Astroblastoma: A Case Report

    Astroblastoma is a very unusual type of tumor whose histogenesis has not been clarified. It occurs mainly in children and young adults. Astroblastoma is grossly well-demarcated and histologically shows characteristic perivascular pseudorosettes with frequent vascular hyalinization. Perivascular pseudorosettes in astroblastoma have short and thick cytoplasmic processes and blunt-ended foot plates. A 15-yr-old girl presented with headache and diplopia of one and a half years' duration. Brain MRI revealed a well-demarcated, well-enhanced, inhomogeneous mass, 9.7 cm in diameter, in the right frontal lobe. Cystic changes of various sizes were observed inside the tumor mass as well as in the posterior part of the mass, but no peritumoral edema was found. Histologically, the mass was a typical astroblastoma, and no sign of anaplastic astrocytoma, gemistocytic astrocytoma or glioblastoma was found in any part of the tumor. Immunohistochemically, the tumor cells showed diffuse strong positivity for glial fibrillary acidic protein, S-100 protein, vimentin and neuron specific enolase, and focal positivity for epithelial membrane antigen and CAM 5.2, while showing negativity for synaptophysin, neurofilament protein, pan-cytokeratin and high molecular weight keratin.

    Dissociation Between the Growing Opioid Demands and Drug Policy Directions Among the U.S. Older Adults with Degenerative Joint Diseases

    We aim to examine temporal trends in orthopedic operations and opioid-related hospital stays among seniors nationwide and in the states of Oregon and Washington, where marijuana was legalized earlier than in any other states. As the United States (U.S.) population ages, orthopedic operations and opioid-related hospital stays among seniors are increasing nationwide. A serial cross-sectional cohort study using the Healthcare Cost and Utilization Project Fast Stats from 2006 through 2015 measured the annual rate per 100,000 population of orthopedic operations by age group (45–64 vs 65 and older), as well as the annual rate per 100,000 population of opioid-related hospital stays among those 65 and older in the nation, Oregon, and Washington from 2008 through 2017. Orthopedic operations (knee arthroplasty, total or partial hip replacement, spinal fusion or laminectomy) and opioid-related hospital stays were measured. The compound annual growth rate (CAGR) was used to quantify temporal trends in orthopedic operations by age group as well as in opioid-related hospital stays, and was tested by the Rao–Scott correction of χ2 for categorical variables. The CAGR (4.06%) of orthopedic operations among age 65 and older increased (P...) (See full abstract in article)
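
The abstract above quantifies trends with the compound annual growth rate. As a minimal sketch, assuming the standard definition CAGR = (end / start)^(1 / periods) - 1, the calculation could be written as follows. The function name and the sample rates are hypothetical and are not figures from the study.

```python
def cagr(start_value, end_value, periods):
    """Compound annual growth rate between two values over `periods` years.

    Returns a fraction, e.g. 0.10 for 10% per year.
    """
    if start_value <= 0 or end_value < 0:
        raise ValueError("start must be positive and end non-negative")
    if periods <= 0:
        raise ValueError("periods must be positive")
    return (end_value / start_value) ** (1.0 / periods) - 1.0


if __name__ == "__main__":
    # Hypothetical annual rates per 100,000 population, two years apart.
    print(f"{cagr(100.0, 121.0, 2):.2%}")
```

Note that `periods` counts the intervals between the first and last observation, so a series running from 2006 through 2015 spans nine periods.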