19 research outputs found

    Interactive Visual Reasoning under Uncertainty

    One of the fundamental cognitive abilities of humans is to quickly resolve uncertainty by generating hypotheses and testing them via active trials. Encountering a novel phenomenon accompanied by ambiguous cause-effect relationships, humans make hypotheses against data, conduct inferences from observation, test their theory via experimentation, and correct the proposition if inconsistency arises. These iterative processes persist until the underlying mechanism becomes clear. In this work, we devise the IVRE (pronounced as "ivory") environment for evaluating artificial agents' reasoning ability under uncertainty. IVRE is an interactive environment featuring rich scenarios centered around Blicket detection. Agents in IVRE are placed into environments with various ambiguous action-effect pairs and asked to determine each object's role. They are encouraged to propose effective and efficient experiments to validate their hypotheses based on observations and actively gather new information. The game ends when all uncertainties are resolved or the maximum number of trials is consumed. By evaluating modern artificial agents in IVRE, we notice a clear failure of today's learning methods compared to humans. Such inefficacy in interactive reasoning ability under uncertainty calls for future research in building human-like intelligence. Comment: Accepted at NeurIPS 2023 (Datasets and Benchmarks)

    Evaluating and Inducing Personality in Pre-trained Language Models

    Standardized and quantified evaluation of machine behaviors is a crux of understanding LLMs. In this study, we draw inspiration from psychometric studies by leveraging human personality theory as a tool for studying machine behaviors. Originating as a philosophical quest for human behaviors, the study of personality delves into how individuals differ in thinking, feeling, and behaving. Toward building and understanding human-like social machines, we are motivated to ask: Can we assess machine behaviors by leveraging human psychometric tests in a principled and quantitative manner? If so, can we induce a specific personality in LLMs? To answer these questions, we introduce the Machine Personality Inventory (MPI) tool for studying machine behaviors; MPI follows standardized personality tests, built upon the Big Five Personality Factors (Big Five) theory and personality assessment inventories. By systematically evaluating LLMs with MPI, we provide the first piece of evidence demonstrating the efficacy of MPI in studying LLM behaviors. We further devise a Personality Prompting (P^2) method to induce LLMs with specific personalities in a controllable way, capable of producing diverse and verifiable behaviors. We hope this work sheds light on future studies by adopting personality as the essential indicator for various downstream tasks, and could further motivate research into equally intriguing human-like machine behaviors. Comment: Accepted at NeurIPS 2023 (Spotlight)

    Hepatic epithelioid hemangioendothelioma—a single-institution experience with 51 cases

    Objectives: The aim of the present study was to describe the experience at a single institution in the management of hepatic epithelioid hemangioendothelioma (HEHE). Methods: We included 51 patients with histologically confirmed HEHE. We performed log-rank (Cox–Mantel) survival analyses using Kaplan–Meier methods to test differences in survival between patients in different groups. Univariate Cox regression analyses and a multivariate proportional hazards regression model were used to identify independent prognostic factors. Results: Different imaging modalities were used to diagnose HEHE with various presentations. Liver resection (LR), liver transplantation (LT), systemic treatment (ST), and surveillance were used in our study. A significant difference in mean survival was noted between the LR group and the surveillance group (p = 0.006), between the LR group and the ST group (p = 0.036), and between the surgical approach (LR and LT) and the nonsurgical approach (ST and surveillance) (p = 0.008). The mean survival of the ST group and the surveillance group was not significantly different (p = 0.851). LR (p = 0.010) and surgical approach (p = 0.014) were favorable predictors of outcome, while macrovascular invasion (MaVI) (p = 0.037), lung metastasis (p = 0.040), and surveillance (p = 0.033) were poor prognostic factors in univariate analysis. Multivariate analysis showed that LR (p = 0.010) and surgical approach (p = 0.014) were independently associated with good overall survival (OS), while surveillance (p = 0.033) was independently associated with poor OS. After adjusting for confounding factors, patients in the LR group had much better OS than those in the surveillance group (p = 0.013). However, there was no significant difference in OS between the LR group and the ST group (p = 0.254), nor between the ST group and the surveillance group (p = 0.857). Conclusions: The definitive diagnosis of HEHE was dependent on histopathology, and it was not possible to make a specific diagnosis without biopsy because the radiological findings were similar to those in some hepatic malignancies. ST was not recommended for patients who were not candidates for surgical approaches, and surgical approaches should be warranted regardless of disease stage. The retrospective nature and the small size of the data limited the generalizability of the study; designing a worldwide database that contains all data about patients with HEHE, independent of their therapy, was highly recommended.
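
    The Kaplan–Meier method named in the abstract above can be illustrated with a minimal sketch. It is not the study's code: the function name `kaplan_meier` and the sample follow-up times in the usage note are hypothetical and chosen only for illustration. The sketch computes the product-limit survival estimate at each observed event time, treating patients without an observed event as censored.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time for each patient
    events:    1 if the event (death) was observed, 0 if censored
    Names and inputs are illustrative assumptions, not from the study.
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")

    # Observed events per time point, and everyone leaving the risk set
    # (event or censoring) per time point.
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Patients who died or were censored at t leave the risk set.
        at_risk -= exits[t]
    return curve
```

    For example, `kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])` (made-up data) returns one point per event time, with the survival estimate dropping only when an event is observed. Comparing two such curves with a log-rank test, as the study did, would be an additional step not shown here.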

    MEWL: Few-shot multimodal word learning with referential uncertainty

    Without explicit feedback, humans can rapidly learn the meaning of words. Children can acquire a new word after just a few passive exposures, a process known as fast mapping. This word learning capability is believed to be the most fundamental building block of multimodal understanding and reasoning. Despite recent advancements in multimodal learning, a systematic and rigorous evaluation is still missing for human-like word learning in machines. To fill in this gap, we introduce the MachinE Word Learning (MEWL) benchmark to assess how machines learn word meaning in grounded visual scenes. MEWL covers humans' core cognitive toolkits in word learning: cross-situational reasoning, bootstrapping, and pragmatic learning. Specifically, MEWL is a few-shot benchmark suite consisting of nine tasks for probing various word learning capabilities. These tasks are carefully designed to be aligned with children's core abilities in word learning and echo the theories in the developmental literature. By evaluating multimodal and unimodal agents' performance with a comparative analysis of human performance, we notice a sharp divergence in human and machine word learning. We further discuss these differences between humans and machines and call for human-like few-shot word learning in machines. Comment: Accepted at ICML 202

    On the Complexity of Bayesian Generalization

    We consider concept generalization at a large scale in the diverse and natural visual spectrum. Established computational modes (i.e., rule-based or similarity-based) are primarily studied in isolation and focus on confined and abstract problem spaces. In this work, we study these two modes when the problem space scales up and the complexity of concepts becomes diverse. Specifically, at the representational level, we seek to answer how the complexity varies when a visual concept is mapped to the representation space. Prior psychology literature has shown that two types of complexities (i.e., subjective complexity and visual complexity) (Griffiths and Tenenbaum, 2003) build an inverted-U relation (Donderi, 2006; Sun and Firestone, 2021). Leveraging Representativeness of Attribute (RoA), we computationally confirm the following observation: Models use attributes with high RoA to describe visual concepts, and the description length falls in an inverted-U relation with the increment in visual complexity. At the computational level, we aim to answer how the complexity of representation affects the shift between the rule- and similarity-based generalization. We hypothesize that category-conditioned visual modeling estimates the co-occurrence frequency between visual and categorical attributes, thus potentially serving as the prior for the natural visual world. Experimental results show that representations with relatively high subjective complexity outperform those with relatively low subjective complexity in the rule-based generalization, while the trend is the opposite in the similarity-based generalization.

    Menin–MLL1 Interaction Small Molecule Inhibitors: A Potential Therapeutic Strategy for Leukemia and Cancers

    Encoded by the MEN1 gene, menin protein is a fusion protein that is essential for the oncogenic transformation of mixed-lineage leukemia (MLL) and leads to acute leukemia (AL). Therefore, accumulating evidence has demonstrated that inhibition of the high-affinity relationship between menin and mixed-lineage leukemia 1 (MLL1 and KMT2A) is an effective treatment for MLL-rearranged (MLL-r) leukemia in vitro and in vivo. Meanwhile, recent studies found that menin–MLL1 interaction inhibitors exhibited a firm tumor suppressive ability in specific cancer cells, such as prostate cancer, breast cancer, liver cancer, and lung cancer. Overall, it seems to serve as a novel therapeutic means for cancers. Herein, we review the recent progress in exploring small-molecule inhibitors of the menin–MLL1 interaction. The molecular mechanisms of these inhibitors' functions and their application prospects in the treatment of AL and cancers are explored.

    Artificial Social Intelligence: A Comparative and Holistic View

    In addition to a physical comprehension of the world, humans possess a high social intelligence: the intelligence that senses social events, infers the goals and intents of others, and facilitates social interaction. Notably, humans are distinguished from their closest primate cousins by their social cognitive skills as opposed to their physical counterparts. We believe that artificial social intelligence (ASI) will play a crucial role in shaping the future of artificial intelligence (AI). This article begins with a review of ASI from a cognitive science standpoint, including social perception, theory of mind (ToM), and social interaction. Next, we examine the recently emerged computational counterpart in the AI community. Finally, we provide an in-depth discussion on topics related to ASI.