13 research outputs found

    Think Twice: Perspective-Taking Improves Large Language Models' Theory-of-Mind Capabilities

    Human interactions are deeply rooted in the interplay of thoughts, beliefs, and desires made possible by Theory of Mind (ToM): our cognitive ability to understand the mental states of ourselves and others. Although ToM may come naturally to us, emulating it presents a challenge to even the most advanced Large Language Models (LLMs). Recent improvements to LLMs' reasoning capabilities from simple yet effective prompting techniques such as Chain-of-Thought have seen limited applicability to ToM. In this paper, we turn to the prominent cognitive science theory "Simulation Theory" to bridge this gap. We introduce SimToM, a novel two-stage prompting framework inspired by Simulation Theory's notion of perspective-taking. To implement this idea on current ToM benchmarks, SimToM first filters context based on what the character in question knows before answering a question about their mental state. Our approach, which requires no additional training and minimal prompt-tuning, shows substantial improvement over existing methods, and our analysis reveals the importance of perspective-taking to Theory-of-Mind capabilities. Our findings suggest perspective-taking as a promising direction for future research into improving LLMs' ToM capabilities.
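
    The two-stage idea described in this abstract (filter the context to what the character knows, then answer the question) might be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `simtom_answer`, the `llm` callable, and the prompt wording are all hypothetical and are not taken from the paper.

```python
from typing import Callable


def simtom_answer(
    llm: Callable[[str], str],  # hypothetical: any function mapping a prompt to a completion
    story: str,
    character: str,
    question: str,
) -> str:
    """Two-stage perspective-taking prompt (illustrative wording, not the paper's)."""
    # Stage 1 (perspective-taking): ask the model to keep only the events
    # that the character in question knows about.
    filter_prompt = (
        f"The following is a sequence of events:\n{story}\n\n"
        f"Which of these events does {character} know about? "
        "List only those events."
    )
    perspective = llm(filter_prompt)

    # Stage 2 (question answering): answer using the filtered context only,
    # so events outside the character's knowledge are not in the prompt.
    answer_prompt = (
        f"{perspective}\n\n"
        f"Answer the following question from {character}'s point of view:\n"
        f"{question}"
    )
    return llm(answer_prompt)
```

    Any prompt-to-text function can be passed as `llm`; the point of the sketch is only that the second call sees the filtered story rather than the full one.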

    Face-to-Face Contrastive Learning for Social Intelligence Question-Answering

    Creating artificial social intelligence - algorithms that can understand the nuances of multi-person interactions - is an exciting and emerging challenge in processing facial expressions and gestures from multimodal videos. Recent multimodal methods have set the state of the art on many tasks, but have difficulty modeling the complex face-to-face conversational dynamics across speaking turns in social interaction, particularly in a self-supervised setup. In this paper, we propose Face-to-Face Contrastive Learning (F2F-CL), a graph neural network designed to model social interactions using factorization nodes to contextualize the multimodal face-to-face interaction along the boundaries of the speaking turn. With the F2F-CL model, we propose to perform contrastive learning between the factorization nodes of different speaking turns within the same video. We experimentally evaluate on the challenging Social-IQ dataset and show state-of-the-art results.

    Difference-Masking: Choosing What to Mask in Continued Pretraining

    The self-supervised objective of masking-and-predicting has led to promising performance gains on a variety of downstream tasks. However, while most approaches randomly mask tokens, there is strong intuition that deciding what to mask can substantially improve learning outcomes. We investigate this in the continued pretraining setting, in which pretrained models continue to pretrain on domain-specific data before performing some downstream task. We introduce Difference-Masking, a masking strategy that automatically chooses what to mask during continued pretraining by considering what makes a task domain different from the pretraining domain. Empirically, we find that Difference-Masking outperforms baselines on continued pretraining settings across four diverse language-only and multimodal video tasks.

    Political Cleavages within Industry: Firm level lobbying for Trade Liberalization

    [Work in Progress] Existing political economy models rely on inter-industry differences such as factor endowment or factor specificity to explain the politics of trade policy-making. However, this paper finds that a large proportion of variation in U.S. applied tariff rates in fact arises within industry. I offer a theory of trade liberalization that explains how product differentiation in economic markets leads to firm-level lobbying in political markets. I argue that while high product differentiation eliminates the collective action problem exporting firms confront, political objections to product-specific liberalization will decline due to less substitutability and the possibility of serving foreign markets based on the norms of reciprocity. To test this argument, I construct a new dataset on lobbying by all publicly traded manufacturing firms after parsing all 838,588 lobbying reports filed under the Lobbying Disclosure Act of 1995. I find that productive exporting firms are more likely to lobby to reduce tariffs, especially when their products are sufficiently differentiated. I also find that highly differentiated products have lower tariff rates. The results challenge the common focus on industry-level lobbying for protection.

    A global sampling approach to designing and reengineering RNA secondary structures

    The development of algorithms for designing artificial RNA sequences that fold into specific secondary structures has many potential biomedical and synthetic biology applications. To date, this problem remains computationally difficult, and current strategies to address it resort to heuristics and stochastic search techniques. The most popular methods consist of two steps: first, a random seed sequence is generated; next, this seed is progressively modified (i.e. mutated) to adopt the desired folding properties. Although computationally inexpensive, this approach raises several questions such as (i) the influence of the seed; and (ii) the efficiency of single-path directed searches that may be affected by energy barriers in the mutational landscape. In this article, we present RNA-ensign, a novel paradigm for RNA design. Instead of taking a progressive adaptive walk driven by local search criteria, we use an efficient global sampling algorithm to examine large regions of the mutational landscape under structural and thermodynamical constraints until a solution is found. When considering the influence of the seeds and the target secondary structures, our results show that, compared to single-path directed searches, our approach is more robust, succeeds more often and generates more thermodynamically stable sequences. An ensemble approach to RNA design is thus well worth pursuing as a complement to existing approaches. RNA-ensign is available at http://csb.cs.mcgill.ca/RNAensign.
    Funding: National Science Foundation (U.S.), Graduate Research Fellowship Program; Natural Sciences and Engineering Research Council of Canada (NSERC) (RGPIN) (386596-10); Fonds québécois de la recherche sur la nature et les technologies (PR-146375); National Institutes of Health (U.S.) (Grant GM081871)

    Pursuing Darwin’s curious parallel: Prospects for a science of cultural evolution

    This is the final published version. Also available from NAS via the DOI in this record.
    In the past few decades, scholars from several disciplines have pursued the curious parallel noted by Darwin between the genetic evolution of species and the cultural evolution of beliefs, skills, knowledge, languages, institutions, and other forms of socially transmitted information. Here, I review current progress in the pursuit of an evolutionary science of culture that is grounded in both biological and evolutionary theory, but also treats culture as more than a proximate mechanism that is directly controlled by genes. Both genetic and cultural evolution can be described as systems of inherited variation that change over time in response to processes such as selection, migration, and drift. Appropriate differences between genetic and cultural change are taken seriously, such as the possibility in the latter of nonrandomly guided variation or transformation, blending inheritance, and one-to-many transmission. The foundation of cultural evolution was laid in the late 20th century with population-genetic style models of cultural microevolution, and the use of phylogenetic methods to reconstruct cultural macroevolution. Since then, there have been major efforts to understand the sociocognitive mechanisms underlying cumulative cultural evolution, the consequences of demography on cultural evolution, the empirical validity of assumed social learning biases, the relative role of transformative and selective processes, and the use of quantitative phylogenetic and multilevel selection models to understand past and present dynamics of society-level change. I conclude by highlighting the interdisciplinary challenges of studying cultural evolution, including its relation to the traditional social sciences and humanities.

    Credibility and Distributional Effects of International Banking Regulations: Evidence from US Bank Stock Returns

    Financial regulatory networks are a pervasive, new type of global governance heralded by some as a flexible answer to globalization dilemmas and dismissed by others as ineffective due to weak enforcement mechanisms. Whether regulatory network agreements provide global public goods or private goods for certain states' firms is a second debated issue. This paper adjudicates among competing perspectives by examining whether Basel III, an international agreement negotiated by the bank regulatory network about bank capital minimums in 2009 and 2010, was viewed as credible and affecting regulated US firms. I use stock returns to measure investors' perceptions, and an event study methodology to test whether regulated banks' observed stock returns significantly differ from expected stock returns on days when new information about Basel III becomes available. If the agreement is viewed as credible and affecting firm value, banks' stock returns will deviate from expectations. The direction of any deviation indicates whether regulations benefit or hurt banks. While the direction of effects is not uniform across events, I find that the initial stock return reaction and the net effect across all five events are negative, indicating that US banks were not helped by new international regulations. Further, US banks experienced stock returns that differed from expectations, providing evidence that international regulatory network agreements are viewed as credible and tangibly affect firms independent of domestic implementation.
    * Author is a Ph.D. candidate in the Department of Politics, Princeton University (email: [email protected], website: meredithwilf.webs.com). Many thanks to Christina Davis and Kosuke Imai for numerous iterations of comments, to Helen Milner for comments on earlier drafts and for the suggestion to look at the winners and losers of Basel III, and to Marc Ratkovic for assistance learning Lasso and for comments.
For helpful comments and suggestions, thanks t