106 research outputs found
DragAPart: learning a part-level motion prior for articulated objects
We introduce DragAPart, a method that, given an image
and a set of drags as input, generates a new image of the same object
that responds to the action of the drags. Differently from prior works
that focused on repositioning objects, DragAPart predicts part-level interactions, such as opening and closing a drawer. We study this problem
as a proxy for learning a generalist motion model, not restricted to a specific kinematic structure or object category. We start from a pre-trained
image generator and fine-tune it on a new synthetic dataset, Drag-aMove, which we introduce. Combined with a new encoding for the drags
and dataset randomization, the model generalizes well to real images and
different categories. Compared to prior motion-controlled generators, we
demonstrate much better part-level motion understanding.
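The drag conditioning described above can be sketched as a rasterization of drags into image-aligned channels that a diffusion model could be conditioned on. This is a minimal, hypothetical encoding (the name `encode_drags` and the exact channel layout are assumptions, not the paper's actual scheme):

```python
import numpy as np

def encode_drags(drags, height, width):
    """Rasterize a set of drags into conditioning channels.

    Each drag is ((x0, y0), (x1, y1)): a source pixel and its target.
    Channel 0 marks source pixels; channels 1 and 2 stamp the
    normalized (dx, dy) displacement at each source location.
    """
    cond = np.zeros((3, height, width), dtype=np.float32)
    for (x0, y0), (x1, y1) in drags:
        cond[0, y0, x0] = 1.0                       # source mask
        cond[1, y0, x0] = (x1 - x0) / (width - 1)   # normalized dx
        cond[2, y0, x0] = (y1 - y0) / (height - 1)  # normalized dy
    return cond

# One horizontal drag on a 16x16 image, e.g. pulling a drawer open.
cond = encode_drags([((4, 4), (12, 4))], height=16, width=16)
```

In a fine-tuning setup like the one the abstract describes, such channels would typically be concatenated to the generator's input so the model can associate each drag with the part it should move.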
Farm3D: learning articulated 3D animals by distilling 2D diffusion
We present Farm3D, a method to learn category-specific 3D reconstructors for articulated objects entirely from "free" virtual supervision from a pre-trained 2D diffusion-based image generator. Recent approaches can learn, given a collection of single-view images of an object category, a monocular network to predict the 3D shape, albedo, illumination and viewpoint of any object occurrence. We propose a framework using an image generator like Stable Diffusion to generate virtual training data for learning such a reconstruction network from scratch. Furthermore, we include the diffusion model as a scorer to further improve learning. The idea is to randomise some aspects of the reconstruction, such as viewpoint and illumination, generate synthetic views of the reconstructed 3D object, and have the 2D network assess the quality of the resulting image, providing feedback to the reconstructor. Differently from distillation-based work, which produces a single 3D asset for each textual prompt in hours, our approach produces a monocular reconstruction network that can output a controllable 3D asset from a given image, real or generated, in only seconds. Our network can be used for analysis, including monocular reconstruction, or for synthesis, generating articulated assets for real-time applications such as video games.
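The feedback loop the abstract describes — randomize viewpoint and illumination, re-render, and let the frozen 2D diffusion model judge the result — can be sketched as follows. All three components here are toy stand-ins with hypothetical names; the real system uses neural networks and differentiable rendering:

```python
import random

# Toy stand-ins for the real neural components (hypothetical names).
reconstructor = lambda img: ("shape", "albedo")           # image -> 3D attributes
renderer = lambda s, a, view, light: (s, a, view, light)  # 3D -> synthetic view
diffusion_score = lambda view: 1.0                        # frozen 2D critic

def train_step(image):
    """One Farm3D-style step: reconstruct, re-render under randomized
    viewpoint and illumination, and let the frozen diffusion model
    score the synthetic view as feedback for the reconstructor."""
    shape, albedo = reconstructor(image)
    viewpoint = random.uniform(0.0, 360.0)   # randomized nuisance factor
    lighting = random.uniform(0.5, 1.5)      # randomized nuisance factor
    synthetic = renderer(shape, albedo, viewpoint, lighting)
    # In the real system this score would be backpropagated to update
    # the reconstructor; here it is just returned.
    return diffusion_score(synthetic)

score = train_step("input_image")
```

Randomizing the nuisance factors before re-rendering is what forces the reconstructor to produce a genuinely 3D-consistent object rather than one that only looks right from the input viewpoint.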
MagicPony: Learning Articulated 3D Animals in the Wild
We consider the problem of learning a function that can estimate the 3D
shape, articulation, viewpoint, texture, and lighting of an articulated animal
like a horse, given a single test image. We present a new method, dubbed
MagicPony, that learns this function purely from in-the-wild single-view images
of the object category, with minimal assumptions about the topology of
deformation. At its core is an implicit-explicit representation of articulated
shape and appearance, combining the strengths of neural fields and meshes. In
order to help the model understand an object's shape and pose, we distil the
knowledge captured by an off-the-shelf self-supervised vision transformer and
fuse it into the 3D model. To overcome common local optima in viewpoint
estimation, we further introduce a new viewpoint sampling scheme that comes at
no added training cost. Compared to prior works, we show significant
quantitative and qualitative improvements on this challenging task. The model
also demonstrates excellent generalisation in reconstructing abstract drawings
and artefacts, despite the fact that it is only trained on real images.
Comment: Project Page: https://3dmagicpony.github.io
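The viewpoint sampling idea — keeping several viewpoint hypotheses alive so that optimization can escape a bad local optimum — could be sketched like this. The softmax-over-hypotheses scheme and the name `sample_viewpoint` are assumptions for illustration, not the paper's exact mechanism:

```python
import math
import random

def sample_viewpoint(losses, temperature=1.0):
    """Pick one of several viewpoint hypotheses with probability
    proportional to softmax(-loss): low-loss hypotheses are favored,
    but the others keep a non-zero chance of being explored, which
    helps escape local optima in viewpoint estimation."""
    weights = [math.exp(-l / temperature) for l in losses]
    total = sum(weights)
    r = random.uniform(0.0, total)
    for i, w in enumerate(weights):
        r -= w
        if r <= 0:
            return i
    return len(weights) - 1

# Four canonical hypotheses (e.g. front/back/left/right) with their
# current reconstruction losses; sample one to train against.
idx = sample_viewpoint([0.2, 1.5, 3.0, 0.9])
```

Because the hypotheses are scored with losses the model already computes, a scheme of this shape adds essentially no training cost, matching the abstract's claim.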
Reconstructing the First Metatarsophalangeal Joint of Homo naledi
The aim of the present study was to develop a new method for reconstructing the damaged metatarsophalangeal joint (MTPJ) of the Homo naledi fossil and to deepen understanding of how first metatarsal head (FMH) morphology adapts to different gait patterns. To this end, three methods were introduced. The first compared the anthropometric linear and volumetric measurements of the Homo naledi MTPJ with those of 10 athletes from various sports. The second measured the curvature diameter of the medial and lateral FMH grooves for the sesamoid bones. The third determined the parallelism between the medial and lateral FMH grooves. The anthropometric measurements of a middle-distance runner most closely matched those of Homo naledi and were therefore used to successfully reconstruct the damaged Homo naledi MTPJ. The curvature diameter of the medial FMH groove was highest in Homo naledi, while that of the lateral FMH groove was highest in a volleyball player, suggesting increased load bearing. The medial and lateral FMH grooves were parallel only in Homo naledi, whereas in the investigated athletes they were not. The athletes' non-parallel grooves turn simple flexion of the first MTPJ into a complicated movement, rotating about many axes rather than one, which may negatively affect running. In conclusion, the presented method for reconstructing a damaged foot bone paves the way for morphological and structural analysis of the gait patterns of modern populations and fossil hominins.
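The third method, checking parallelism between the medial and lateral FMH grooves, amounts to comparing the principal axes of the two grooves. A minimal sketch, assuming each groove is available as a set of 3D surface points; `groove_axis` and `angle_between` are hypothetical helpers, not the study's actual code:

```python
import numpy as np

def groove_axis(points):
    """Principal axis of a set of 3D groove points (first PCA component)."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]

def angle_between(u, v):
    """Angle (degrees) between two axes, ignoring direction sign."""
    c = abs(np.dot(u, v)) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(np.clip(c, -1.0, 1.0))))

# Toy example: two grooves sampled along parallel lines.
medial = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
lateral = [(0, 1, 0), (1, 1, 0), (2, 1, 0)]
angle = angle_between(groove_axis(medial), groove_axis(lateral))
```

An angle near zero would indicate the parallel grooves reported for Homo naledi, while a larger angle would correspond to the non-parallel configuration seen in the athletes.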
Instruct and Extract: Instruction Tuning for On-Demand Information Extraction
Large language models with instruction-following capabilities open the door
to a wider group of users. However, when it comes to information extraction - a
classic task in natural language processing - most task-specific systems cannot
align well with long-tail ad hoc extraction use cases for non-expert users. To
address this, we propose a novel paradigm, termed On-Demand Information
Extraction, to fulfill the personalized demands of real-world users. Our task
aims to follow the instructions to extract the desired content from the
associated text and present it in a structured tabular format. The table
headers can either be user-specified or inferred contextually by the model. To
facilitate research in this emerging area, we present a benchmark named
InstructIE, inclusive of both automatically generated training data, as well as
the human-annotated test set. Building on InstructIE, we further develop an
On-Demand Information Extractor, ODIE. Comprehensive evaluations on our
benchmark reveal that ODIE substantially outperforms the existing open-source
models of similar size. Our code and dataset are released on
https://github.com/yzjiao/On-Demand-IE.
Comment: EMNLP 202
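The on-demand extraction task described above — an instruction plus text in, a structured table out — can be sketched as a prompt builder and a table parser wrapped around any instruction-following model. This is an illustrative harness, not ODIE's actual code; the helper names are assumptions:

```python
def build_prompt(instruction, text, headers=None):
    """Assemble an on-demand extraction prompt. Headers may be
    user-specified or left for the model to infer, as in the task."""
    header_clause = (
        f"Use these table headers: {', '.join(headers)}."
        if headers else
        "Infer appropriate table headers from the instruction and text."
    )
    return (
        f"Instruction: {instruction}\n"
        f"{header_clause}\n"
        f"Text: {text}\n"
        "Answer with a Markdown table."
    )

def parse_table(markdown):
    """Parse a Markdown table reply into (headers, rows)."""
    lines = [l.strip() for l in markdown.strip().splitlines() if l.strip()]
    cells = [[c.strip() for c in l.strip("|").split("|")] for l in lines]
    headers, rows = cells[0], cells[2:]   # cells[1] is the |---| separator
    return headers, rows

# A hypothetical model reply for "list each model and its size":
reply = "| model | size |\n|---|---|\n| ODIE | 7B |"
headers, rows = parse_table(reply)
```

The two-sided design mirrors the task definition: the instruction and optional headers shape the request, and the tabular reply is parsed back into structured data.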
Learning the 3D Fauna of the Web
Learning 3D models of all animals on the Earth requires massively scaling up
existing solutions. With this ultimate goal in mind, we develop 3D-Fauna, an
approach that learns a pan-category deformable 3D animal model for more than
100 animal species jointly. One crucial bottleneck of modeling animals is the
limited availability of training data, which we overcome by simply learning
from 2D Internet images. We show that prior category-specific attempts fail to
generalize to rare species with limited training images. We address this
challenge by introducing the Semantic Bank of Skinned Models (SBSM), which
automatically discovers a small set of base animal shapes by combining
geometric inductive priors with semantic knowledge implicitly captured by an
off-the-shelf self-supervised feature extractor. To train such a model, we also
contribute a new large-scale dataset of diverse animal species. At inference
time, given a single image of any quadruped animal, our model reconstructs an
articulated 3D mesh in a feed-forward fashion within seconds.
Comment: The first two authors contributed equally to this work. The last three authors contributed equally. Project page: https://kyleleey.github.io/3DFauna
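The Semantic Bank of Skinned Models can be read as a soft lookup: an image feature attends over a small bank of learned keys, and the resulting weights blend a matching bank of base shapes. A minimal numpy sketch under that reading (the function and parameter names are assumptions, not the paper's implementation):

```python
import numpy as np

def sbsm_shape(feature, keys, base_shapes, temperature=0.1):
    """Blend a small bank of base shapes by attending an image
    feature over the bank's keys (softmax-weighted lookup)."""
    keys = np.asarray(keys, dtype=float)         # (K, D) bank keys
    base = np.asarray(base_shapes, dtype=float)  # (K, V, 3) base shapes
    logits = keys @ np.asarray(feature, dtype=float) / temperature
    w = np.exp(logits - logits.max())
    w /= w.sum()                                 # softmax over the bank
    return np.tensordot(w, base, axes=1)         # (V, 3) blended shape

# Toy bank: two keys, two one-vertex base shapes. A feature aligned
# with the first key should select (mostly) the first base shape.
feature = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]
base_shapes = [[[0.0, 0.0, 0.0]], [[1.0, 1.0, 1.0]]]
shape = sbsm_shape(feature, keys, base_shapes)
```

Keeping the bank small is what lets rare species with few training images borrow shape structure from better-represented ones, which is the generalization behavior the abstract emphasizes.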
Impact of pesticide outsourcing services on farmers' low-carbon production behavior
Introduction: Promoting low-carbon development in agriculture is crucial for achieving agricultural modernization. One practical question worth studying is whether outsourcing services can encourage farmers to adopt low-carbon production practices. This study analyzes the impact of pesticide outsourcing services on farmers' low-carbon production behavior to provide China with practical recommendations.
Methods: This empirical study investigates the impact of pesticide outsourcing services on farmers' low-carbon production behavior using an endogenous switching regression (ESR) model on survey data from 450 rice growers in the Ningxia and Shaanxi provinces.
Results and Discussion: The results showed that 1) outsourcing services have a significant negative impact on farmers' manual weeding behavior, reducing the frequency of manual weeding; 2) outsourcing services have a significant positive impact on farmers' herbicide application behavior; in other words, participation in outsourcing leads to excessive pesticide application; 3) outsourcing services do not support a green, low-carbon production model in which manual weeding replaces herbicide application. Because the outsourcing market in China is underdeveloped, especially in the northwest, the construction of the outsourcing service system lags behind, and it is difficult for non-professional outsourcing services to drive green, low-carbon production among farmers, who often choose lower-cost mechanical application to maximize profit. The policy implication of this study is the need for a comprehensive and objective understanding of the impact and role of pesticide outsourcing services on farmers' low-carbon production behavior. This understanding can help improve the market, policy, and other external environments for farmer participation in outsourcing, ultimately promoting the sustainable development of green, low-carbon agriculture.
This paper adds to the discussion of pesticide outsourcing services and farmers' low-carbon production by drawing conclusions that differ from previous studies, providing a fresh foundation for policy-making.
- …