19 research outputs found
Interactive Visual Reasoning under Uncertainty
One of the fundamental cognitive abilities of humans is to quickly resolve
uncertainty by generating hypotheses and testing them via active trials.
Encountering a novel phenomenon accompanied by ambiguous cause-effect
relationships, humans make hypotheses against data, conduct inferences from
observation, test their theory via experimentation, and correct the proposition
if inconsistency arises. These iterative processes persist until the underlying
mechanism becomes clear. In this work, we devise the IVRE (pronounced as
"ivory") environment for evaluating artificial agents' reasoning ability under
uncertainty. IVRE is an interactive environment featuring rich scenarios
centered around Blicket detection. Agents in IVRE are placed into environments
with various ambiguous action-effect pairs and asked to determine each object's
role. They are encouraged to propose effective and efficient experiments to
validate their hypotheses based on observations and actively gather new
information. The game ends when all uncertainties are resolved or the maximum
number of trials is consumed. By evaluating modern artificial agents in IVRE,
we notice a clear failure of today's learning methods compared to humans. Such
inefficacy in interactive reasoning ability under uncertainty calls for future
research in building human-like intelligence.
Comment: Accepted at NeurIPS 2023 (Datasets and Benchmarks)
Evaluating and Inducing Personality in Pre-trained Language Models
Standardized and quantified evaluation of machine behaviors is a crux of
understanding LLMs. In this study, we draw inspiration from psychometric
studies by leveraging human personality theory as a tool for studying machine
behaviors. Originating as a philosophical quest for human behaviors, the study
of personality delves into how individuals differ in thinking, feeling, and
behaving. Toward building and understanding human-like social machines, we are
motivated to ask: Can we assess machine behaviors by leveraging human
psychometric tests in a principled and quantitative manner? If so, can we
induce a specific personality in LLMs? To answer these questions, we introduce
the Machine Personality Inventory (MPI) tool for studying machine behaviors;
MPI follows standardized personality tests, built upon the Big Five Personality
Factors (Big Five) theory and personality assessment inventories. By
systematically evaluating LLMs with MPI, we provide the first piece of evidence
demonstrating the efficacy of MPI in studying LLMs' behaviors. We further devise
a Personality Prompting (P^2) method to induce LLMs with specific personalities
in a controllable way, capable of producing diverse and verifiable behaviors.
We hope this work sheds light on future studies by adopting personality as the
essential indicator for various downstream tasks, and could further motivate
research into equally intriguing human-like machine behaviors.
Comment: Accepted at NeurIPS 2023 (Spotlight)
Hepatic epithelioid hemangioendothelioma—a single-institution experience with 51 cases
Objectives: The aim of the present study was to describe the experience at a single institution in the management of hepatic epithelioid hemangioendothelioma (HEHE).
Methods: We included 51 patients with histologically confirmed HEHE. We estimated survival with the Kaplan–Meier method and used the log-rank (Cox–Mantel) test to compare survival between patient groups. Univariate Cox regression analyses and a multivariate proportional hazards regression model were carried out to identify independent prognostic factors.
Results: Different imaging modalities were used to diagnose HEHE with various presentations. Liver resection (LR), liver transplantation (LT), systemic treatment (ST), and surveillance were used in our study. Mean survival differed significantly between the LR group and the surveillance group (p = 0.006), between the LR group and the ST group (p = 0.036), and between the surgical approach (LR and LT) and the nonsurgical approach (ST and surveillance) (p = 0.008). Mean survival did not differ significantly between the ST group and the surveillance group (p = 0.851). In univariate analysis, LR (p = 0.010) and surgical approach (p = 0.014) were favorable predictors of outcome, while macrovascular invasion (MaVI) (p = 0.037), lung metastasis (p = 0.040), and surveillance (p = 0.033) were poor prognostic factors. Multivariate analysis showed that LR (p = 0.010) and surgical approach (p = 0.014) were independently associated with good overall survival (OS), while surveillance (p = 0.033) was independently associated with poor OS. After adjusting for confounding factors, patients in the LR group had much better OS than those in the surveillance group (p = 0.013). However, there was no significant difference in OS between the LR group and the ST group (p = 0.254), or between the ST group and the surveillance group (p = 0.857).
Conclusions: The definitive diagnosis of HEHE depended on histopathology, and it was not possible to make a specific diagnosis without biopsy because the radiological findings were similar to those of some hepatic malignancies. ST was not recommended for patients who were not candidates for surgical approaches, and surgical approaches were warranted regardless of disease stage. The retrospective nature and the small sample size limited the generalizability of the study; designing a worldwide database that contains all data about patients with HEHE, independent of their therapy, is highly recommended.
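The Kaplan–Meier estimate named in the methods above can be illustrated with a minimal sketch. The function name `kaplan_meier` and the follow-up times below are hypothetical and chosen only for illustration; they are not data or code from the study. The sketch covers the survival curve only, not the log-rank test or Cox regression.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time for each patient
    events: 1 if death was observed at that time, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # each patient leaves the risk set at their own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical follow-up times in months (NOT data from the study).
months = [6, 12, 12, 20, 30]
died = [1, 1, 0, 1, 0]
curve = kaplan_meier(months, died)
for t, s in curve:
    print(f"month {t}: S(t) = {s:.2f}")
```

Censored patients reduce the number at risk without lowering the curve, which is why the step at month 20 is computed over two remaining patients rather than five.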
MEWL: Few-shot multimodal word learning with referential uncertainty
Without explicit feedback, humans can rapidly learn the meaning of words.
Children can acquire a new word after just a few passive exposures, a process
known as fast mapping. This word learning capability is believed to be the most
fundamental building block of multimodal understanding and reasoning. Despite
recent advancements in multimodal learning, a systematic and rigorous
evaluation is still missing for human-like word learning in machines. To fill
in this gap, we introduce the MachinE Word Learning (MEWL) benchmark to assess
how machines learn word meaning in grounded visual scenes. MEWL covers humans'
core cognitive toolkits in word learning: cross-situational reasoning,
bootstrapping, and pragmatic learning. Specifically, MEWL is a few-shot
benchmark suite consisting of nine tasks for probing various word learning
capabilities. These tasks are carefully designed to be aligned with
children's core abilities in word learning and echo the theories in the
developmental literature. By evaluating multimodal and unimodal agents'
performance with a comparative analysis of human performance, we notice a sharp
divergence in human and machine word learning. We further discuss these
differences between humans and machines and call for human-like few-shot word
learning in machines.
Comment: Accepted at ICML 202
On the Complexity of Bayesian Generalization
We consider concept generalization at a large scale in the diverse and
natural visual spectrum. Established computational modes (i.e., rule-based or
similarity-based) are primarily studied in isolation and focus on confined and
abstract problem spaces. In this work, we study these two modes when the
problem space scales up and the complexity of concepts becomes diverse.
Specifically, at the representational level, we seek to answer how the
complexity varies when a visual concept is mapped to the representation space.
Prior psychology literature has shown that two types of complexities (i.e.,
subjective complexity and visual complexity) (Griffiths and Tenenbaum, 2003)
build an inverted-U relation (Donderi, 2006; Sun and Firestone, 2021).
Leveraging Representativeness of Attribute (RoA), we computationally confirm
the following observation: Models use attributes with high RoA to describe
visual concepts, and the description length falls in an inverted-U relation
with the increment in visual complexity. At the computational level, we aim
to answer how the complexity of representation affects the shift between the
rule- and similarity-based generalization. We hypothesize that
category-conditioned visual modeling estimates the co-occurrence frequency
between visual and categorical attributes, thus potentially serving as the
prior for the natural visual world. Experimental results show that
representations with relatively high subjective complexity outperform those
with relatively low subjective complexity in the rule-based generalization,
while the trend is the opposite in the similarity-based generalization.
Menin–MLL1 Interaction Small Molecule Inhibitors: A Potential Therapeutic Strategy for Leukemia and Cancers
Encoded by the MEN1 gene, menin protein is a fusion protein that is essential for the oncogenic transformation of mixed-lineage leukemia (MLL) and leads to acute leukemia (AL). Accordingly, accumulating evidence has demonstrated that inhibiting the high-affinity interaction between menin and mixed-lineage leukemia 1 (MLL1, also known as KMT2A) is an effective treatment for MLL-rearranged (MLL-r) leukemia in vitro and in vivo. Meanwhile, recent studies found that menin–MLL1 interaction inhibitors exhibited strong tumor-suppressive activity in specific cancers, such as prostate cancer, breast cancer, liver cancer, and lung cancer. Overall, inhibiting this interaction appears to be a novel therapeutic approach for cancers. Herein, we review recent progress in exploring small-molecule inhibitors of the menin–MLL1 interaction. The molecular mechanisms of these inhibitors' functions and their application prospects in the treatment of AL and cancers are explored.
Artificial Social Intelligence: A Comparative and Holistic View
In addition to a physical comprehension of the world, humans possess a high social intelligence: the intelligence that senses social events, infers the goals and intents of others, and facilitates social interaction. Notably, humans are distinguished from their closest primate cousins by their social cognitive skills as opposed to their physical counterparts. We believe that artificial social intelligence (ASI) will play a crucial role in shaping the future of artificial intelligence (AI). This article begins with a review of ASI from a cognitive science standpoint, including social perception, theory of mind (ToM), and social interaction. Next, we examine the recently emerged computational counterpart in the AI community. Finally, we provide an in-depth discussion on topics related to ASI.