75 research outputs found

    SODAPOP: Open-Ended Discovery of Social Biases in Social Commonsense Reasoning Models

    Full text link
    A common limitation of diagnostic tests for detecting social biases in NLP models is that they may only detect stereotypic associations that are pre-specified by the designer of the test. Since enumerating all possible problematic associations is infeasible, it is likely these tests fail to detect biases that are present in a model but not pre-specified by the designer. To address this limitation, we propose SODAPOP (SOcial bias Discovery from Answers about PeOPle) in social commonsense question-answering. Our pipeline generates modified instances from the Social IQa dataset (Sap et al., 2019) by (1) substituting names associated with different demographic groups, and (2) generating many distractor answers from a masked language model. By using a social commonsense model to score the generated distractors, we are able to uncover the model's stereotypic associations between demographic groups and an open set of words. We also test SODAPOP on debiased models and show the limitations of multiple state-of-the-art debiasing algorithms.Comment: EACL 202

    HOISDF: Constraining 3D Hand-Object Pose Estimation with Global Signed Distance Fields

    Full text link
    Human hands are highly articulated and versatile at handling objects. Jointly estimating the 3D poses of a hand and the object it manipulates from a monocular camera is challenging due to frequent occlusions. Thus, existing methods often rely on intermediate 3D shape representations to increase performance. These representations are typically explicit, such as 3D point clouds or meshes, and thus provide information in the direct surroundings of the intermediate hand pose estimate. To address this, we introduce HOISDF, a Signed Distance Field (SDF) guided hand-object pose estimation network, which jointly exploits hand and object SDFs to provide a global, implicit representation over the complete reconstruction volume. Specifically, the role of the SDFs is threefold: equip the visual encoder with implicit shape information, help to encode hand-object interactions, and guide the hand and object pose regression via SDF-based sampling and by augmenting the feature representations. We show that HOISDF achieves state-of-the-art results on hand-object pose estimation benchmarks (DexYCB and HO3Dv2). Code is available at https://github.com/amathislab/HOISDFComment: Accepted at CVPR 2024. 9 figures, many table

    Invariant-Feature Subspace Recovery: A New Class of Provable Domain Generalization Algorithms

    Full text link
    Domain generalization asks for models trained over a set of training environments to generalize well in unseen test environments. Recently, a series of algorithms such as Invariant Risk Minimization (IRM) have been proposed for domain generalization. However, Rosenfeld et al. (2021) shows that in a simple linear data model, even if non-convexity issues are ignored, IRM and its extensions cannot generalize to unseen environments with less than ds+1d_s+1 training environments, where dsd_s is the dimension of the spurious-feature subspace. In this work, we propose Invariant-feature Subspace Recovery (ISR): a new class of algorithms to achieve provable domain generalization across the settings of classification and regression problems. First, in the binary classification setup of Rosenfeld et al. (2021), we show that our first algorithm, ISR-Mean, can identify the subspace spanned by invariant features from the first-order moments of the class-conditional distributions, and achieve provable domain generalization with ds+1d_s+1 training environments. Our second algorithm, ISR-Cov, further reduces the required number of training environments to O(1)O(1) using the information of second-order moments. Notably, unlike IRM, our algorithms bypass non-convexity issues and enjoy global convergence guarantees. Next, we extend ISR-Mean to the more general setting of multi-class classification and propose ISR-Multiclass, which leverages class information and provably recovers the invariant-feature subspace with ⌈ds/k⌉+1\lceil d_s/k\rceil+1 training environments for kk-class classification. Finally, for regression problems, we propose ISR-Regression that can identify the invariant-feature subspace with ds+1d_s+1 training environments. Empirically, we demonstrate the superior performance of our ISRs on synthetic benchmarks. Further, ISR can be used as post-processing methods for feature extractors such as neural nets.Comment: Submitted to JMLR. This journal version significantly extends our ICML 2022 paper, arXiv:2201.1291

    Distantly-Supervised Named Entity Recognition with Uncertainty-aware Teacher Learning and Student-student Collaborative Learning

    Full text link
    Distantly-Supervised Named Entity Recognition (DS-NER) effectively alleviates the burden of annotation, but meanwhile suffers from the label noise. Recent works attempt to adopt the teacher-student framework to gradually refine the training labels and improve the overall robustness. However, we argue that these teacher-student methods achieve limited performance because poor network calibration produces incorrectly pseudo-labeled samples, leading to error propagation. Therefore, we attempt to mitigate this issue by proposing: (1) Uncertainty-aware Teacher Learning that leverages the prediction uncertainty to guide the selection of pseudo-labels, avoiding the number of incorrect pseudo-labels in the self-training stage. (2) Student-student Collaborative Learning that allows the transfer of reliable labels between two student networks instead of completely relying on all pseudo-labels from its teacher. Meanwhile, this approach allows a full exploration of mislabeled samples rather than simply filtering unreliable pseudo-labeled samples. Extensive experimental results on five DS-NER datasets demonstrate that our method is superior to state-of-the-art teacher-student methods

    Multi-view 3D Face Reconstruction Based on Flame

    Full text link
    At present, face 3D reconstruction has broad application prospects in various fields, but the research on it is still in the development stage. In this paper, we hope to achieve better face 3D reconstruction quality by combining multi-view training framework with face parametric model Flame, propose a multi-view training and testing model MFNet (Multi-view Flame Network). We build a self-supervised training framework and implement constraints such as multi-view optical flow loss function and face landmark loss, and finally obtain a complete MFNet. We propose innovative implementations of multi-view optical flow loss and the covisible mask. We test our model on AFLW and facescape datasets and also take pictures of our faces to reconstruct 3D faces while simulating actual scenarios as much as possible, which achieves good results. Our work mainly addresses the problem of combining parametric models of faces with multi-view face 3D reconstruction and explores the implementation of a Flame based multi-view training and testing framework for contributing to the field of face 3D reconstruction

    Biodynamic features Syuantszy Chzhuanti 720°.

    Get PDF
    Presents the internal parameters and image Syuantszy Chzhuanti 720 ° is shown that in the implementation of the element Syuantszy Chzhuanti 720 °, the center of gravity shifts to 2.94 pm, 1.71 m. and 1.22 m. on the X, Y and Z; rate varies according to X - with 4,22 m/s to 0, Y - to 2,42 m/s to 0, and Z - from 3.68 m/s to 3.86 m/s. Run-time item 1.4 seconds: the first turnover - 0.41 sec., The second turnover-0, 33 sec. At the end of the takeoff run strike force left and right foot of 1147.2 N and 1005 N. Pressing the second, third, fourth, fifth finger and part of the metatarsal of right foot maximum intensity of pressure - 146.1 N; when pressing the first finger and part of the metatarsal maximum intensity of pressure - 280.8 N. The dependence of convergence or remove body parts with a vertical axis of the torque to increase or decrease its speed

    Towards End-to-End Embodied Decision Making via Multi-modal Large Language Model: Explorations with GPT4-Vision and Beyond

    Full text link
    In this study, we explore the potential of Multimodal Large Language Models (MLLMs) in improving embodied decision-making processes for agents. While Large Language Models (LLMs) have been widely used due to their advanced reasoning skills and vast world knowledge, MLLMs like GPT4-Vision offer enhanced visual understanding and reasoning capabilities. We investigate whether state-of-the-art MLLMs can handle embodied decision-making in an end-to-end manner and whether collaborations between LLMs and MLLMs can enhance decision-making. To address these questions, we introduce a new benchmark called PCA-EVAL, which evaluates embodied decision-making from the perspectives of Perception, Cognition, and Action. Additionally, we propose HOLMES, a multi-agent cooperation framework that allows LLMs to leverage MLLMs and APIs to gather multimodal information for informed decision-making. We compare end-to-end embodied decision-making and HOLMES on our benchmark and find that the GPT4-Vision model demonstrates strong end-to-end embodied decision-making abilities, outperforming GPT4-HOLMES in terms of average decision accuracy (+3%). However, this performance is exclusive to the latest GPT4-Vision model, surpassing the open-source state-of-the-art MLLM by 26%. Our results indicate that powerful MLLMs like GPT4-Vision hold promise for decision-making in embodied agents, offering new avenues for MLLM research. Code and data are open at https://github.com/pkunlp-icler/PCA-EVAL/.Comment: FMDM@NeurIPS2023, Code and data: https://github.com/pkunlp-icler/PCA-EVAL

    Constant real-space fractal dimensionality and structure evolution in Ti62Cu38 metallic glass under high pressure

    Get PDF
    The structure of binary Ti62Cu38 metallic glass is investigated under pressures up to 33.8 GPa using the pair distribution function analysis based on high-energy x-ray scattering and reverse Monte Carlo (RMC) simulations. At a global scale, its relative volume shows a continuously smooth curve as a function of pressure. The isothermal bulk modulus of Ti62Cu38 metallic glass is estimated as B0=132(3)GPa with B0′=5.8(0.4). At a local scale, the atomic packing structure under compression conditions, which is extracted from RMC simulations, shows that the topological short-range order is dominated by the deformed icosahedron polyhedra and basically maintains stable. From the relationship between the relative volume and changing ratio of the atomic separation distances, the real-space fractal dimensionality of this metallic glass is determined as about 2.5 for all of the first four peaks. This experimental result reveals the consistent nature of the fractal feature on the degree of self-similarity in this sample within the entire experimental pressure range

    Why does Prediction Accuracy Decrease over Time? Uncertain Positive Learning for Cloud Failure Prediction

    Full text link
    With the rapid growth of cloud computing, a variety of software services have been deployed in the cloud. To ensure the reliability of cloud services, prior studies focus on failure instance (disk, node, and switch, etc.) prediction. Once the output of prediction is positive, mitigation actions are taken to rapidly resolve the underlying failure. According to our real-world practice in Microsoft Azure, we find that the prediction accuracy may decrease by about 9% after retraining the models. Considering that the mitigation actions may result in uncertain positive instances since they cannot be verified after mitigation, which may introduce more noise while updating the prediction model. To the best of our knowledge, we are the first to identify this Uncertain Positive Learning (UPLearning) issue in the real-world cloud failure prediction scenario. To tackle this problem, we design an Uncertain Positive Learning Risk Estimator (Uptake) approach. Using two real-world datasets of disk failure prediction and conducting node prediction experiments in Microsoft Azure, which is a top-tier cloud provider that serves millions of users, we demonstrate Uptake can significantly improve the failure prediction accuracy by 5% on average
    • …
    corecore