98 research outputs found

    Prompt-NER: Zero-shot Named Entity Recognition in Astronomy Literature via Large Language Models

    This study delves into the application of Large Language Models (LLMs) for Named Entity Recognition (NER) tasks in the field of astronomy literature. To enhance the zero-shot recognition capabilities of LLMs for astronomical named entities, we propose a strategy called Prompt-NER. Prompt-NER includes five prompt elements: Task Descriptions, Entity Definitions, Task Emphasis, Task Examples, and Second Conversation. To assess the effectiveness of the Prompt-NER strategy, we utilize three representative LLMs (Claude-2, GPT-3.5, and LLaMA-2-70b) to identify telescope and celestial object named entities in astronomical literature. Our experiments are conducted on two distinct datasets: the first comprises 30 original PDF documents, which we split into paragraphs in sequential order to produce a second dataset of 30 paragraph collections. Additionally, we incorporate 30 astronomical telegrams to diversify our experiments and assess the performance of Prompt-NER-based LLMs on concise, complete texts. Our experimental results indicate that the Prompt-NER strategy enables LLMs to effectively accomplish NER tasks in the field of astronomy, even without prior astronomical knowledge during training. We carefully analyze the experimental results, including the mechanism of different prompt elements and the influence of different features of long and short texts on their respective experimental results. This research provides practical experience with zero-shot NER tasks in astronomical literature and suggests directions for future work in this area.
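The five prompt elements named in the abstract can be pictured as a simple template assembly step. The sketch below is illustrative only: the element wording and the entity definitions are invented here, not taken from the paper, and the "Second Conversation" element (a follow-up turn sent after the model's first answer) is only noted in a comment.

```python
# Hypothetical assembly of the five Prompt-NER elements into one zero-shot
# prompt. All element texts below are placeholders, not the paper's wording.

def build_prompt_ner(text: str) -> str:
    task_description = (
        "You are an expert annotator. Extract named entities from the "
        "astronomy text below."
    )
    entity_definitions = (
        "Telescope: an instrument used for astronomical observation.\n"
        "CelestialObject: a naturally occurring object in space."
    )
    task_emphasis = "Return only entities that appear verbatim in the text."
    task_examples = (
        "Example: 'Hubble observed NGC 4993.' -> "
        "Telescope: Hubble; CelestialObject: NGC 4993"
    )
    # The fifth element, "Second Conversation", would be a follow-up message
    # (e.g. asking the model to re-check its answer) sent in a second turn.
    return "\n\n".join(
        [task_description, entity_definitions, task_emphasis, task_examples, text]
    )

prompt = build_prompt_ner("The VLT imaged the galaxy M87 last night.")
```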

    mmPlace: Robust Place Recognition with Intermediate Frequency Signal of Low-cost Single-chip Millimeter Wave Radar

    Place recognition is crucial for tasks like loop-closure detection and re-localization. Single-chip millimeter wave radar (single-chip radar in short) emerges as a low-cost sensor option for place recognition, with the advantage of insensitivity to degraded visual environments. However, it encounters two challenges. Firstly, the sparse point cloud from a single-chip radar leads to poor performance when using current place recognition methods, which assume much denser data. Secondly, its performance significantly declines in scenarios involving rotational and lateral variations, due to limited overlap in its field of view (FOV). We propose mmPlace, a robust place recognition system to address these challenges. Specifically, mmPlace transforms the intermediate frequency (IF) signal into a range-azimuth heatmap and employs a spatial encoder to extract features. Additionally, to improve the performance in scenarios involving rotational and lateral variations, mmPlace employs a rotating platform and concatenates heatmaps in a rotation cycle, effectively expanding the system's FOV. We evaluate mmPlace's performance on the milliSonic dataset, which is collected on the University of Science and Technology of China (USTC) campus, the city roads surrounding the campus, and an underground parking garage. The results demonstrate that mmPlace outperforms point cloud-based methods and achieves 87.37% recall@1 in scenarios involving rotational and lateral variations. (8 pages, 8 figures)
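The IF-signal-to-heatmap step can be sketched with standard FMCW processing: an FFT along fast time resolves range, and an FFT across the receive array resolves azimuth. This is a minimal illustration of that generic pipeline, not mmPlace's implementation; the array shape, bin counts, and function name are assumptions.

```python
import numpy as np

# Minimal sketch of converting one frame of IF samples from a single-chip
# radar into a range-azimuth heatmap (generic FMCW processing, assumed here;
# not the paper's code). Input shape: (num_rx_antennas, num_samples_per_chirp).

def range_azimuth_heatmap(if_samples: np.ndarray, n_angle_bins: int = 64) -> np.ndarray:
    # Range FFT along fast time: each IF beat frequency maps to a range bin.
    range_fft = np.fft.fft(if_samples, axis=1)
    # Angle FFT across the receive array, zero-padded to n_angle_bins,
    # shifted so azimuth zero sits in the middle.
    angle_fft = np.fft.fftshift(np.fft.fft(range_fft, n=n_angle_bins, axis=0), axes=0)
    return np.abs(angle_fft).T  # rows: range bins, cols: azimuth bins

rng = np.random.default_rng(0)
frame = rng.standard_normal((4, 256)) + 1j * rng.standard_normal((4, 256))
heatmap = range_azimuth_heatmap(frame)
```

Heatmaps from successive platform headings could then be concatenated along the azimuth axis to emulate the expanded FOV the abstract describes.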

    RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning

    Tool learning has generated widespread interest as a vital means of interaction between Large Language Models (LLMs) and the physical world. Current research predominantly emphasizes LLMs' capacity to utilize tools in well-structured environments while overlooking their stability when confronted with the inevitable noise of the real world. To bridge this gap, we introduce RoTBench, a multi-level benchmark for evaluating the robustness of LLMs in tool learning. Specifically, we establish five external environments, each featuring varying levels of noise (i.e., Clean, Slight, Medium, Heavy, and Union), providing an in-depth analysis of the model's resilience across three critical phases: tool selection, parameter identification, and content filling. Experiments involving six widely-used models underscore the urgent necessity for enhancing the robustness of LLMs in tool learning. For instance, the performance of GPT-4 even drops significantly from 80.00 to 58.10 when there is no substantial change in manual accuracy. More surprisingly, the noise correction capability inherent in the GPT family paradoxically impedes its adaptability in the face of mild noise. In light of these findings, we propose RoTTuning, a strategy that enriches the diversity of training environments to bolster the robustness of LLMs in tool learning. The code and data are available at https://github.com/Junjie-Ye/RoTBench
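The noise levels listed in the abstract (Clean through Union) suggest graded perturbations of the tool environment. Here is a toy sketch of one such perturbation, character swaps in a tool name, whose concrete noise operations and per-level intensities are my assumptions; the benchmark's actual noise definitions may differ.

```python
import random

# Illustrative perturbation of a tool name at graded noise levels. The level
# names follow the abstract (Clean, Slight, Medium, Heavy, Union); the swap
# counts and the swap operation itself are assumed for illustration.

def perturb_tool_name(name: str, level: str, seed: int = 0) -> str:
    rng = random.Random(seed)
    if level == "Clean":
        return name
    chars = list(name)
    # Heavier levels apply more adjacent-character swaps.
    n_swaps = {"Slight": 1, "Medium": 2, "Heavy": 3, "Union": 3}[level]
    for _ in range(n_swaps):
        i = rng.randrange(len(chars) - 1)
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

noisy = perturb_tool_name("search_web", "Heavy")
```

A robustness harness would then check whether the model still selects and calls the intended tool despite the corrupted name.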

    ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios

    Existing evaluations of tool learning primarily focus on validating the alignment of selected tools for large language models (LLMs) with expected outcomes. However, these approaches rely on a limited set of scenarios where answers can be pre-determined, diverging from genuine needs. Furthermore, a sole emphasis on outcomes disregards the intricate capabilities essential for LLMs to effectively utilize tools. To tackle this issue, we propose ToolEyes, a fine-grained system tailored for the evaluation of the LLMs' tool learning capabilities in authentic scenarios. The system meticulously examines seven real-world scenarios, analyzing five dimensions crucial to LLMs in tool learning: format alignment, intent comprehension, behavior planning, tool selection, and answer organization. Additionally, ToolEyes incorporates a tool library boasting approximately 600 tools, serving as an intermediary between LLMs and the physical world. Evaluations involving ten LLMs across three categories reveal a preference for specific scenarios and limited cognitive abilities in tool learning. Intriguingly, expanding the model size even exacerbates the hindrance to tool learning. These findings offer instructive insights aimed at advancing the field of tool learning. The data is available at https://github.com/Junjie-Ye/ToolEyes

    StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback

    The advancement of large language models (LLMs) has significantly propelled the field of code generation. Previous work integrated reinforcement learning (RL) with compiler feedback for exploring the output space of LLMs to enhance code generation quality. However, the lengthy code generated by LLMs in response to complex human requirements makes RL exploration a challenge. Also, since the unit tests may not cover the complicated code, optimizing LLMs by using these unexecuted code snippets is ineffective. To tackle these challenges, we introduce StepCoder, a novel RL framework for code generation, consisting of two main components: CCCS addresses the exploration challenge by breaking the long-sequence code generation task into a Curriculum of Code Completion Subtasks, while FGO only optimizes the model by masking the unexecuted code segments to provide Fine-Grained Optimization. In addition, we construct the APPS+ dataset for RL training, which is manually verified to ensure the correctness of unit tests. Experimental results show that our method improves the ability to explore the output space and outperforms state-of-the-art approaches in corresponding benchmarks. Our dataset APPS+ and StepCoder are available online. (13 pages, 5 figures)
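The FGO idea, optimizing only over code the unit tests actually executed, can be pictured as masking per-line losses with an execution trace. This is a deliberately simplified sketch under my own assumptions (per-line rather than per-token losses, a precomputed set of executed line indices), not the paper's implementation.

```python
# Simplified sketch of fine-grained optimization: only lines that the unit
# tests actually executed contribute to the training loss. The per-line loss
# representation and the executed-line set are assumptions for illustration.

def masked_line_loss(line_losses: list[float], executed_lines: set[int]) -> float:
    kept = [loss for i, loss in enumerate(line_losses) if i in executed_lines]
    # Average only over executed lines; guard against an empty trace.
    return sum(kept) / max(len(kept), 1)

# Lines 0 and 2 ran under the tests; line 1 (an uncovered branch) is masked
# out, so its large loss does not distort the update.
loss = masked_line_loss([0.5, 9.0, 1.5], {0, 2})
```

In a real setup the executed-line set would come from an execution tracer run alongside the unit tests.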

    Abnormal regional signal in the left cerebellum as a potential neuroimaging biomarker of sudden sensorineural hearing loss

    Objective: While prior reports have characterized visible changes in neuroimaging findings in individuals suffering from sudden sensorineural hearing loss (SSNHL), the utility of regional homogeneity (ReHo) as a means of diagnosing SSNHL has yet to be established. The present study was thus conducted to assess ReHo abnormalities in SSNHL patients and to establish whether these abnormalities offer value as a diagnostic neuroimaging biomarker of SSNHL through a support vector machine (SVM) analysis approach.
    Methods: Resting-state functional magnetic resonance imaging (rs-fMRI) analyses of 27 SSNHL patients and 27 normal controls were conducted, with the resultant imaging data then being analyzed based on a combination of ReHo and SVM approaches.
    Results: Relative to normal control individuals, patients diagnosed with SSNHL exhibited significant reductions in ReHo values in the left cerebellum, bilateral inferior temporal gyrus (ITG), left superior temporal pole (STP), right parahippocampal gyrus (PHG), left posterior cingulum cortex (PCC), and right superior frontal gyrus (SFG). SVM analyses suggested that reduced ReHo values in the left cerebellum were associated with high levels of diagnostic accuracy (96.30%, 52/54), sensitivity (92.59%, 25/27), and specificity (100.00%, 27/27) when distinguishing between SSNHL patients and control individuals.
    Conclusion: These data suggest that SSNHL patients exhibit abnormal resting-state neurological activity, with changes in the ReHo of the left cerebellum offering value as a diagnostic neuroimaging biomarker associated with this condition.
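The accuracy, sensitivity, and specificity figures reported above follow directly from a confusion matrix over the classifier's predictions. As a small worked sketch (a generic metric computation, not the study's analysis code, with made-up labels), these metrics can be computed as:

```python
# Generic diagnostic-metric computation from true vs. predicted labels
# (1 = patient, 0 = control). Example labels below are invented.

def diagnostic_metrics(y_true: list[int], y_pred: list[int]) -> dict:
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # recall on patients
        "specificity": tn / (tn + fp),  # recall on controls
    }

m = diagnostic_metrics([1, 1, 0, 0], [1, 0, 0, 0])
```

Applied to the study's counts (52/54, 25/27, 27/27), the same formulas yield the reported 96.30%, 92.59%, and 100.00%.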

    Towards Flexible Wireless Charging for Medical Implants Using Distributed Antenna System

    This paper presents the design, implementation and evaluation of In-N-Out, a software-hardware solution for far-field wireless power transfer. In-N-Out can continuously charge a medical implant residing in deep tissues at near-optimal beamforming power, even when the implant moves around inside the human body. To accomplish this, we exploit the unique energy ball pattern of a distributed antenna array and devise a backscatter-assisted beamforming algorithm that can concentrate RF energy on a tiny spot surrounding the medical implant. Meanwhile, the power levels on other body parts stay at a low level, reducing the risk of overheating. We prototype In-N-Out on 21 software-defined radios and a printed circuit board (PCB). Extensive experiments demonstrate that In-N-Out achieves 0.37 mW average charging power inside a 10 cm-thick pork belly, which is sufficient to wirelessly power a range of commercial medical devices. Our head-to-head comparison with the state-of-the-art approach shows that In-N-Out achieves 5.4×–18.1× power gain when the implant is stationary, and 5.3×–7.4× power gain when the implant is in motion. (MobiCom 2020: The 26th Annual International Conference on Mobile Computing and Networking, London; 15 pages)
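The core beamforming intuition is that if each transmitter knows its channel phase to the implant (here simply given; in the paper it is estimated with backscatter assistance), applying the conjugate phase makes all contributions add coherently, scaling received power as N² in the antenna count. This toy model is my own illustration, not In-N-Out's algorithm:

```python
import cmath

# Toy model of conjugate-phase beamforming from N distributed transmitters.
# The channel phases below are invented; a real system would estimate them.

def received_power(phases, weights):
    # Superpose each transmitter's field after its channel phase rotation.
    field = sum(cmath.exp(1j * p) * w for p, w in zip(phases, weights))
    return abs(field) ** 2

channel_phases = [0.3, 1.7, -2.1, 0.9]  # unknown propagation phases (radians)
# Conjugate (negated) channel phases as weights -> coherent combining.
weights = [cmath.exp(-1j * p) for p in channel_phases]
coherent = received_power(channel_phases, weights)  # ~N^2 = 16 for N = 4
naive = received_power(channel_phases, [1] * 4)     # phases add incoherently
```

The gap between `coherent` and `naive` is the beamforming gain; continuously re-estimating the phases is what keeps the focus on a moving implant.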

    PERK-Mediated Cholesterol Excretion from IDH Mutant Glioma Determines Anti-Tumoral Polarization of Microglia

    Isocitrate dehydrogenase (IDH) mutation, a known pathologic classifier, initiates metabolic reprogramming in glioma cells and has been linked to the reaction status of glioma-associated microglia/macrophages (GAMs). However, it remains unclear how IDH genotypes contribute to GAM phenotypes. Here, it is demonstrated that gliomas expressing mutant IDH determine M1-like polarization of GAMs, while wild-type IDH induces M2-like polarization. Intriguingly, IDH-mutant gliomas secrete excess cholesterol, resulting in cholesterol-rich, pro-inflammatory GAMs without altering their cholesterol biosynthesis, and simultaneously exhibiting low levels of tumoral cholesterol due to expression remodeling of cholesterol transport molecules, particularly upregulation of ABCA1 and downregulation of LDLR. Mechanistically, a miR-19a/LDLR axis-mediated novel post-transcriptional regulation of cholesterol uptake is identified, modulated by IDH mutation, and influencing tumor cell proliferation and invasion. IDH mutation-induced PERK activation enhances cholesterol export from glioma cells via the miR-19a/LDLR axis and ABCA1/APOE upregulation. Further, a synthetic PERK activator, CCT020312, is introduced, which markedly stimulates cholesterol efflux from IDH wild-type glioma cells, induces M1-like polarization of GAMs, and consequently suppresses glioma cell invasion. The findings reveal an essential role of the PERK/miR-19a/LDLR signaling pathway in orchestrating glioma cholesterol transport and the subsequent phenotypes of GAMs, thereby highlighting a novel potential target pathway for glioma therapy.

    Secrets of RLHF in Large Language Models Part II: Reward Modeling

    Reinforcement Learning from Human Feedback (RLHF) has become a crucial technology for aligning language models with human values and intentions, enabling models to produce more helpful and harmless responses. Reward models are trained as proxies for human preferences to drive reinforcement learning optimization. While reward models are often considered central to achieving high performance, they face the following challenges in practical applications: (1) Incorrect and ambiguous preference pairs in the dataset may hinder the reward model from accurately capturing human intent. (2) Reward models trained on data from a specific distribution often struggle to generalize to examples outside that distribution and are not suitable for iterative RLHF training. In this report, we attempt to address these two issues. (1) From a data perspective, we propose a method to measure the strength of preferences within the data, based on a voting mechanism of multiple reward models. Experimental results confirm that data with varying preference strengths have different impacts on reward model performance. We introduce a series of novel methods to mitigate the influence of incorrect and ambiguous preferences in the dataset and fully leverage high-quality preference data. (2) From an algorithmic standpoint, we introduce contrastive learning to enhance the ability of reward models to distinguish between chosen and rejected responses, thereby improving model generalization. Furthermore, we employ meta-learning to enable the reward model to maintain the ability to differentiate subtle differences in out-of-distribution samples, and this approach can be utilized for iterative RLHF optimization.
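One simple way to realize the voting-based preference-strength idea is to score each (chosen, rejected) pair with several reward models and aggregate the signed margins; this sketch is my own simplified reading of that mechanism (the margin aggregation, the threshold, and the three-way labeling are assumptions, not the report's method).

```python
# Simplified sketch of preference-strength voting across reward models.
# margins[i] = reward_model_i(chosen) - reward_model_i(rejected).
# The epsilon threshold and label names are assumptions for illustration.

def preference_strength(margins: list[float]) -> float:
    return sum(margins) / len(margins)

def label_pair(margins: list[float], eps: float = 0.1) -> str:
    s = preference_strength(margins)
    if s < -eps:
        return "incorrect"  # the ensemble prefers the 'rejected' response
    if abs(s) <= eps:
        return "ambiguous"  # weak or conflicting signal
    return "clean"

tag = label_pair([0.8, 0.6, 0.9])
```

Pairs tagged `incorrect` or `ambiguous` could then be down-weighted or filtered before reward-model training, keeping the high-strength pairs intact.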