106 research outputs found

    DragAPart: learning a part-level motion prior for articulated objects

    We introduce DragAPart, a method that, given an image and a set of drags as input, generates a new image of the same object that responds to the action of the drags. Unlike prior works that focused on repositioning objects, DragAPart predicts part-level interactions, such as opening and closing a drawer. We study this problem as a proxy for learning a generalist motion model, not restricted to a specific kinematic structure or object category. We start from a pre-trained image generator and fine-tune it on a new synthetic dataset, Drag-a-Move, which we introduce. Combined with a new encoding for the drags and dataset randomization, the model generalizes well to real images and different categories. Compared to prior motion-controlled generators, we demonstrate much better part-level motion understanding.

    Farm3D: learning articulated 3D animals by distilling 2D diffusion

    We present Farm3D, a method to learn category-specific 3D reconstructors for articulated objects entirely from “free” virtual supervision from a pre-trained 2D diffusion-based image generator. Recent approaches can learn, given a collection of single-view images of an object category, a monocular network to predict the 3D shape, albedo, illumination and viewpoint of any object occurrence. We propose a framework using an image generator like Stable Diffusion to generate virtual training data for learning such a reconstruction network from scratch. Furthermore, we include the diffusion model as a score to further improve learning. The idea is to randomise some aspects of the reconstruction, such as viewpoint and illumination, generating synthetic views of the reconstructed 3D object, and have the 2D network assess the quality of the resulting image, providing feedback to the reconstructor. Different from work based on distillation which produces a single 3D asset for each textual prompt in hours, our approach produces a monocular reconstruction network that can output a controllable 3D asset from a given image, real or generated, in only seconds. Our network can be used for analysis, including monocular reconstruction, or for synthesis, generating articulated assets for real-time applications such as video games.

    MagicPony: Learning Articulated 3D Animals in the Wild

    We consider the problem of learning a function that can estimate the 3D shape, articulation, viewpoint, texture, and lighting of an articulated animal like a horse, given a single test image. We present a new method, dubbed MagicPony, that learns this function purely from in-the-wild single-view images of the object category, with minimal assumptions about the topology of deformation. At its core is an implicit-explicit representation of articulated shape and appearance, combining the strengths of neural fields and meshes. In order to help the model understand an object's shape and pose, we distil the knowledge captured by an off-the-shelf self-supervised vision transformer and fuse it into the 3D model. To overcome common local optima in viewpoint estimation, we further introduce a new viewpoint sampling scheme that comes at no added training cost. Compared to prior works, we show significant quantitative and qualitative improvements on this challenging task. The model also demonstrates excellent generalisation in reconstructing abstract drawings and artefacts, despite the fact that it is only trained on real images. Comment: Project Page: https://3dmagicpony.github.io

    Reconstructing the First Metatarsophalangeal Joint of Homo naledi

    The aim of the present study was to develop a new method to reconstruct the damaged metatarsophalangeal joint (MTPJ) of a Homo naledi fossil and to deepen the understanding of the morphological adaptation of the first metatarsal head (FMH) in different gait patterns. To this end, three methods were introduced. The first compared the anthropometric linear and volumetric measurements of Homo naledi's MTPJ with those of 10 different athletes. The second measured the curvature diameter of the FMH's medial and lateral grooves for the sesamoid bones. The third determined the parallelism between the medial and lateral FMH grooves. The anthropometric measurements of the middle-distance runner most closely matched those of Homo naledi, so they were used to successfully reconstruct the damaged Homo naledi MTPJ. The highest curvature diameter of the medial FMH groove was found in Homo naledi, while that of the lateral FMH groove was highest in the volleyball player, suggesting increased load bearing. Parallelism of the medial and lateral FMH grooves was observed only in Homo naledi; in the investigated athletes the grooves were not parallel. The athletes' non-parallel structures turn the simple flexion movement of the first MTPJ into a complicated one, rotating about many axes rather than one, which may negatively affect running. In conclusion, the presented method for reconstructing the damaged foot bone paves the way for morphological and structural analysis of gait patterns in modern populations and fossil hominins.

    Instruct and Extract: Instruction Tuning for On-Demand Information Extraction

    Large language models with instruction-following capabilities open the door to a wider group of users. However, when it comes to information extraction - a classic task in natural language processing - most task-specific systems cannot align well with long-tail ad hoc extraction use cases for non-expert users. To address this, we propose a novel paradigm, termed On-Demand Information Extraction, to fulfill the personalized demands of real-world users. Our task aims to follow the instructions to extract the desired content from the associated text and present it in a structured tabular format. The table headers can either be user-specified or inferred contextually by the model. To facilitate research in this emerging area, we present a benchmark named InstructIE, inclusive of both automatically generated training data, as well as the human-annotated test set. Building on InstructIE, we further develop an On-Demand Information Extractor, ODIE. Comprehensive evaluations on our benchmark reveal that ODIE substantially outperforms the existing open-source models of similar size. Our code and dataset are released on https://github.com/yzjiao/On-Demand-IE. Comment: EMNLP 202

    Learning the 3D Fauna of the Web

    Learning 3D models of all animals on the Earth requires massively scaling up existing solutions. With this ultimate goal in mind, we develop 3D-Fauna, an approach that learns a pan-category deformable 3D animal model for more than 100 animal species jointly. One crucial bottleneck of modeling animals is the limited availability of training data, which we overcome by simply learning from 2D Internet images. We show that prior category-specific attempts fail to generalize to rare species with limited training images. We address this challenge by introducing the Semantic Bank of Skinned Models (SBSM), which automatically discovers a small set of base animal shapes by combining geometric inductive priors with semantic knowledge implicitly captured by an off-the-shelf self-supervised feature extractor. To train such a model, we also contribute a new large-scale dataset of diverse animal species. At inference time, given a single image of any quadruped animal, our model reconstructs an articulated 3D mesh in a feed-forward fashion within seconds. Comment: The first two authors contributed equally to this work. The last three authors contributed equally. Project page: https://kyleleey.github.io/3DFauna

    Learning the 3D fauna of the web

    Learning 3D models of all animals in nature requires massively scaling up existing solutions. With this ultimate goal in mind, we develop 3D-Fauna, an approach that learns a pan-category deformable 3D animal model for more than 100 animal species jointly. One crucial bottleneck of modeling animals is the limited availability of training data, which we overcome by learning our model from 2D Internet images. We show that prior approaches, which are category-specific, fail to generalize to rare species with limited training images. We address this challenge by introducing the Semantic Bank of Skinned Models (SBSM), which automatically discovers a small set of base animal shapes by combining geometric inductive priors with semantic knowledge implicitly captured by an off-the-shelf self-supervised feature extractor. To train such a model, we also contribute a new large-scale dataset of diverse animal species. At inference time, given a single image of any quadruped animal, our model reconstructs an articulated 3D mesh in a feed-forward manner in seconds.

    Impact of pesticide outsourcing services on farmers’ low-carbon production behavior

    Introduction: Promoting low-carbon development in agriculture is crucial for achieving agricultural modernization. One practical issue worth studying is whether outsourcing services can encourage farmers to adopt low-carbon production practices. This study analyzes the impact of pesticide outsourcing services on the low-carbon production behavior of farmers to provide China with practical recommendations. Methods: This empirical study investigates the impact of pesticide outsourcing services on farmers’ low-carbon production behavior using survey data from 450 rice growers in the Ningxia and Shaanxi provinces and an endogenous switching regression (ESR) model. Results and Discussion: Results showed that 1) outsourcing services have a significant negative impact on farmers’ manual weeding behavior, leading to a reduction in the frequency of manual weeding; 2) outsourcing services have a significant positive impact on farmers’ herbicide application behavior. In other words, participation in outsourcing leads to excessive pesticide application; 3) outsourcing services do not support a green and low-carbon production model where manual weeding replaces herbicide application. Due to the imperfect development of the outsourcing market in China, especially in the northwest region, the construction of the outsourcing service system is lagging, and it is difficult for non-professional outsourcing services to play a driving role in green and low-carbon production for farmers, who will often choose the lower-cost mechanical application for maximum profit. The policy implication of this study is the need for a comprehensive and objective understanding of the impact and role of pesticide outsourcing services on farmers’ low-carbon production behavior. This understanding can help improve the market, policy, and other external environments for farmers to participate in outsourcing, ultimately promoting the sustainable development of green and low-carbon agriculture. This paper adds to the discussion of pesticide outsourcing services and farmers’ low-carbon production by drawing different conclusions from previous studies, providing a fresh foundation for policy-making.