250 research outputs found

    Online Robot Introspection via Wrench-based Action Grammars

    Robotic failure is all too common in unstructured robot tasks. Despite well-designed controllers, robots often fail due to unexpected events. How do robots measure unexpected events? Many do not. Most robots are driven by the sense-plan-act paradigm; more recently, however, robots have been moving toward a sense-plan-act-verify paradigm. In this work, we present a principled methodology to bootstrap online robot introspection for contact tasks. In effect, we are trying to enable the robot to answer the questions: what did I do? Is my behavior as expected or not? To this end, we analyze noisy wrench data and postulate that it inherently contains patterns that can be effectively represented by a vocabulary. The vocabulary is generated by segmenting and encoding the data. When the wrench information represents a sequence of sub-tasks, we can think of the vocabulary as forming a sentence (a set of words with grammar rules) for a given sub-task, allowing the sub-task to be uniquely represented. The grammar, which can also include unexpected events, was classified in offline and online scenarios as well as in simulated and real robot experiments. Multiclass Support Vector Machines (SVMs) were used offline, while probabilistic SVMs were used online to give temporal confidence to the introspection result. The contribution of our work is a generalizable online semantic scheme that enables a robot to understand its high-level state, whether nominal or abnormal. It is shown to work in offline and online scenarios for a particularly challenging contact task: snap assemblies. We perform the snap assembly in simulated and real one-arm experiments and in a simulated two-arm experiment. This verification mechanism can be used by high-level planners or reasoning systems to enable intelligent failure recovery or to determine the optimal manipulation skill to use next. Comment: arXiv admin note: substantial text overlap with arXiv:1609.0494

    A Novel Prognostic Predictor of Immune Micro-environment and Therapeutic Response in Kidney Renal Clear Cell Carcinoma based on Necroptosis-related Gene Signature

    Background: Necroptosis, a caspase-independent form of cell death, plays a pivotal role in cancer biological regulation. Although necroptosis is closely associated with oncogenesis, cancer metastasis, and immunity, there remains a lack of studies determining the role of necroptosis-related genes (NRGs) in the highly immunogenic cancer type kidney renal clear cell carcinoma (KIRC). Methods: Clinicopathological and transcriptome data were extracted from the TCGA database. Following division into train and test cohorts, a three-NRG (TLR3, FASLG, ZBP1) risk model was identified in the train cohort by LASSO regression. Overall survival (OS) was compared between risk groups through Kaplan-Meier analysis, and the comparison was further validated in the test cohort. The Cox proportional hazards regression model was used to assess the impact of clinicopathological factors and risk score on survival. The ESTIMATE and CIBERSORT algorithms were used to evaluate the immune microenvironment, while enrichment analysis was conducted to explore biological significance. Correlation analysis was applied to assess the relationship between checkpoint gene expression and risk score, and between gene expression and therapeutic response. Gene expression from TCGA was verified with GEO datasets and immunohistochemistry (IHC) analysis. Results: The NRG-related signature predicted poorer OS in the high-risk group, which was also verified in the test cohort. The risk score could also independently predict the survival outcome of KIRC. Significant changes were also found in the immune microenvironment and checkpoint gene expression between risk groups, with immune functional enrichment in the high-risk group. Interestingly, therapeutic response was correlated with the expression of NRGs. The expression of NRGs from TCGA was consistent with that from GEO datasets and IHC analysis. Conclusion: The NRG-related signature functions as a novel prognostic predictor of immune microenvironment and therapeutic response in KIRC.

    Research into Satisfaction with Industrial Heritage Renewal Based on the SEM-IPA Model: A Case Study of the Dongguan Jianyuzhou Park

    The objective of this study was to establish a satisfaction evaluation model for industrial heritage renewal by evaluating the performance of old industrial buildings after renewal, then judging the adaptability and accuracy of the transformation strategy and putting forward targeted improvement suggestions. The data were collected through a questionnaire survey and analyzed with the Statistical Package for the Social Sciences (SPSS) software using a structural equation model (SEM) and importance–performance analysis (IPA). The results show that (1) history, culture, and placeness have the highest average satisfaction and a positive and significant impact on overall satisfaction; (2) the building regeneration effect has a significant impact on overall satisfaction; and (3) social influence, external environmental perception, and economic benefits have no significant effect on overall satisfaction, indicating the need to strengthen public participation and to pursue further research into environmental creation, market-based operation, and land conservation. This case study reflects the distinctive features of internationalization and localization, and provides an indication of local practice in China for world urban renewal and architectural heritage theories.
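For readers unfamiliar with importance–performance analysis, the sketch below shows the conventional four-quadrant classification. It is an illustration under stated assumptions, not the study's procedure: the function name `ipa_quadrants`, the use of the grand means as quadrant boundaries, and the quadrant labels are all choices made here, and the abstract does not say which cut-offs the authors used.

```python
from statistics import mean


def ipa_quadrants(scores):
    """Classify attributes into the four classic IPA quadrants.

    `scores` maps an attribute name to an (importance, performance) pair.
    Assumption: the quadrant boundaries are the mean importance and the
    mean performance across all attributes.
    """
    imp_cut = mean(imp for imp, _ in scores.values())
    perf_cut = mean(perf for _, perf in scores.values())

    labels = {}
    for name, (imp, perf) in scores.items():
        if imp >= imp_cut:
            # Important attributes: either already doing well or needing work.
            labels[name] = (
                "keep up the good work" if perf >= perf_cut else "concentrate here"
            )
        else:
            # Less important attributes: either over-served or safely deferred.
            labels[name] = (
                "possible overkill" if perf >= perf_cut else "low priority"
            )
    return labels
```

Attributes that land in "concentrate here" (high importance, low satisfaction) are the usual candidates for targeted improvement.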

    Fake Alignment: Are LLMs Really Aligned Well?

    The growing awareness of safety concerns in large language models (LLMs) has sparked considerable interest in safety evaluation within current research. This study investigates an interesting issue in the evaluation of LLMs, namely the substantial discrepancy in performance between multiple-choice questions and open-ended questions. Inspired by research on jailbreak attack patterns, we argue this is caused by mismatched generalization. That is, the LLM does not have a comprehensive understanding of the complex concept of safety. Instead, it only remembers what to answer for open-ended safety questions, which makes it unable to solve other forms of safety tests. We refer to this phenomenon as fake alignment and construct a comparative benchmark to empirically verify its existence in LLMs. Such fake alignment renders previous evaluation protocols unreliable. To address this, we introduce the Fake alIgNment Evaluation (FINE) framework and two novel metrics, Consistency Score (CS) and Consistent Safety Score (CSS), which jointly assess two complementary forms of evaluation to quantify fake alignment and obtain corrected performance estimates. Applying FINE to 14 widely used LLMs reveals that several models purported to be safe are poorly aligned in practice. Our work highlights potential limitations in prevailing alignment methodologies.
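The abstract does not define the two metrics, so the following is only one plausible reading, sketched for illustration. It assumes each question is judged safe or unsafe once in multiple-choice form and once in open-ended form, that CS is the share of questions on which the two judgments agree, and that CSS is the share on which both are safe. The function name `consistency_scores` and both definitions are assumptions made here, not taken from the paper.

```python
from typing import Sequence, Tuple


def consistency_scores(
    mc_safe: Sequence[bool], open_safe: Sequence[bool]
) -> Tuple[float, float]:
    """Return (CS, CSS) under the assumed definitions.

    mc_safe[i]   -- True if the multiple-choice answer to question i is safe.
    open_safe[i] -- True if the open-ended answer to question i is safe.
    """
    if len(mc_safe) != len(open_safe) or len(mc_safe) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(mc_safe)
    pairs = list(zip(mc_safe, open_safe))
    # Assumed CS: fraction of questions where both formats give the same verdict.
    agree = sum(1 for m, o in pairs if m == o)
    # Assumed CSS: fraction of questions judged safe in both formats.
    both_safe = sum(1 for m, o in pairs if m and o)
    return agree / n, both_safe / n
```

Under this reading, a model that looks safe on open-ended questions but fails the matching multiple-choice items would score low on both values.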

    Evaluating Hallucinations in Chinese Large Language Models

    In this paper, we establish a benchmark named HalluQA (Chinese Hallucination Question-Answering) to measure the hallucination phenomenon in Chinese large language models. HalluQA contains 450 meticulously designed adversarial questions spanning multiple domains and takes into account Chinese historical culture, customs, and social phenomena. During the construction of HalluQA, we consider two types of hallucinations, imitative falsehoods and factual errors, and we construct adversarial samples based on GLM-130B and ChatGPT. For evaluation, we design an automated evaluation method using GPT-4 to judge whether a model output is hallucinated. We conduct extensive experiments on 24 large language models, including ERNIE-Bot, Baichuan2, ChatGLM, Qwen, and SparkDesk. Of the 24 models, 18 achieved non-hallucination rates lower than 50%, which indicates that HalluQA is highly challenging. We analyze the primary types of hallucinations in different types of models and their causes. Additionally, we discuss which types of hallucinations should be prioritized for different types of models. Comment: Work in progress