A Study of Creative Treason in Red Sorghum: From the Perspective of Rewriting Theory
In literary translation, translated works always deviate from and distort the original texts to a certain extent because of the linguistic and cultural differences between nations. For this reason, Robert Escarpit, a French sociologist of literature, put forward the notion of "creative treason", claiming that "translation is always a kind of creative treason". Red Sorghum, Goldblatt's English translation of Hong Gao Liang Jia Zu, written by Mo Yan and a representative work of contemporary Chinese fiction, is no exception. However, the traditional translation approach cannot fully explain the "creative treason" in Goldblatt's translation, let alone judge the translator and his translated work objectively and fairly. Since translation studies took the "cultural turn" in the 1970s, some translation theorists have adopted a descriptive method to analyze translation from a socio-cultural perspective. In this process, Lefevere, the leading figure of the Manipulation School, deserves special attention: the Rewriting Theory he proposed expanded the horizon of translation studies. Taking this theory as its theoretical foundation, this paper aims to analyze the underlying causes of Goldblatt's "creative treason" in Red Sorghum and to explore a new route for research on Goldblatt and his translations. Meanwhile, the author hopes that this paper can offer some suggestions for the transmission of contemporary Chinese literature abroad.
Learning Attentive Pairwise Interaction for Fine-Grained Classification
Fine-grained classification is a challenging problem, due to subtle
differences among highly confused categories. Most approaches address this
difficulty by learning a discriminative representation of each individual input image.
On the other hand, humans can effectively identify contrastive clues by
comparing image pairs. Inspired by this fact, this paper proposes a simple but
effective Attentive Pairwise Interaction Network (API-Net), which can
progressively recognize a pair of fine-grained images by interaction.
Specifically, API-Net first learns a mutual feature vector to capture semantic
differences in the input pair. It then compares this mutual vector with
individual vectors to generate gates for each input image. These distinct gate
vectors inherit mutual context on semantic differences, which allow API-Net to
attentively capture contrastive clues by pairwise interaction between two
images. Additionally, we train API-Net in an end-to-end manner with a score
ranking regularization, which can further generalize API-Net by taking feature
priorities into account. We conduct extensive experiments on five popular
benchmarks in fine-grained classification. API-Net outperforms the recent SOTA
methods on all five: CUB-200-2011 (90.0%), Aircraft (93.9%), Stanford Cars (95.3%),
Stanford Dogs (90.3%), and NABirds (88.1%).
Comment: Accepted at AAAI-202
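The gated pairwise interaction described above can be sketched roughly as follows. This is a simplified NumPy illustration, not the authors' exact architecture: the paper builds the mutual vector with a small MLP on concatenated features, whereas a plain average stands in for it here, and the feature dimension is arbitrary.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pairwise_interaction(x1, x2):
    """Simplified API-Net-style interaction between two feature vectors.

    x1, x2: (d,) feature vectors of the two images in a pair.
    Returns four attentive features: each image activated by its own
    gate (the "self" view) and by the other image's gate (the "other" view).
    """
    # Mutual vector capturing the pair's joint semantics (the paper uses
    # an MLP on the concatenation; an average stands in for it here).
    x_m = 0.5 * (x1 + x2)
    # Channel-wise gates from comparing the mutual vector with each
    # individual vector.
    g1 = sigmoid(x_m * x1)
    g2 = sigmoid(x_m * x2)
    # Residual attentive features: self- and cross-activated views.
    x1_self, x1_other = x1 + x1 * g1, x1 + x1 * g2
    x2_self, x2_other = x2 + x2 * g2, x2 + x2 * g1
    return x1_self, x1_other, x2_self, x2_other

a, b = np.ones(4), np.zeros(4)
outs = pairwise_interaction(a, b)
print([o.shape for o in outs])
```

In the full model, the four gated views are each fed to a shared classifier, and the score-ranking regularizer encourages the self-activated view of an image to score higher than its cross-activated view.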
CP3: Unifying Point Cloud Completion by Pretrain-Prompt-Predict Paradigm
Point cloud completion aims to predict complete shape from its partial
observation. Current approaches mainly consist of generation and refinement
stages in a coarse-to-fine style. However, the generation stage often lacks
robustness to different incompletion variations, while the refinement
stage blindly recovers point clouds without semantic awareness. To tackle
these challenges, we unify point cloud Completion by a generic
Pretrain-Prompt-Predict paradigm, namely CP3. Inspired by prompting approaches
from NLP, we creatively reinterpret point cloud generation and refinement as
the prompting and predicting stages, respectively. Then, we introduce a concise
self-supervised pretraining stage before prompting. It can effectively increase
the robustness of point cloud generation via an Incompletion-Of-Incompletion (IOI)
pretext task. Moreover, we develop a novel Semantic Conditional Refinement
(SCR) network at the predicting stage. It can discriminatively modulate
multi-scale refinement with the guidance of semantics. Finally, extensive
experiments demonstrate that our CP3 outperforms the state-of-the-art methods
by a large margin.
Impact of agricultural activities on pesticide residues in soil of edible bamboo shoot plantations
Edible bamboo shoot is one of the most important vegetables in Asian countries. Intensive agricultural management can cause many negative effects, such as soil acidification and excessive pesticide residues. In the present study, more than 300 soil samples were collected from edible bamboo shoot plantations in six areas throughout Zhejiang province, China, to investigate soil pesticide pollution and its change after different agricultural activities. Thirteen organic chemicals were detected, nine fewer than in a similar study conducted in 2003–2004. All detected residues were far below the Chinese national environmental standards for agricultural soils. Pesticide residues in bamboo plantations have declined over the past decade. Organic materials used for mulching and a plantation's history as a former paddy field are two important factors that increase pesticide residues. Conversely, lime application to acidified soil and mulching with uncontaminated new mountain soil could decrease the residues significantly. Our results indicate that the current agricultural activities are effective in reducing pesticide residues in the soil of bamboo shoot plantations and should be further promoted.
Context-Transformer: Tackling Object Confusion for Few-Shot Detection
Few-shot object detection is a challenging but realistic scenario, where only
a few annotated training images are available for training detectors. A popular
approach to handle this problem is transfer learning, i.e., fine-tuning a
detector pretrained on a source-domain benchmark. However, such a transferred
detector often fails to recognize new objects in the target domain, due to the low
data diversity of the training samples. To tackle this problem, we propose a novel
Context-Transformer within a concise deep transfer framework. Specifically,
Context-Transformer can effectively leverage source-domain object knowledge as
guidance, and automatically exploit contexts from only a few training images in
the target domain. Subsequently, it can adaptively integrate these relational
clues to enhance the discriminative power of detector, in order to reduce
object confusion in few-shot scenarios. Moreover, Context-Transformer is
flexibly embedded in the popular SSD-style detectors, which makes it a
plug-and-play module for end-to-end few-shot learning. Finally, we evaluate
Context-Transformer on the challenging settings of few-shot detection and
incremental few-shot detection. The experimental results show that our
framework outperforms the recent state-of-the-art approaches.
Comment: Accepted by AAAI-202
PC-HMR: Pose Calibration for 3D Human Mesh Recovery from 2D Images/Videos
The end-to-end Human Mesh Recovery (HMR) approach has been successfully used
for 3D body reconstruction. However, most HMR-based frameworks reconstruct
the human body by directly learning mesh parameters from images or videos,
lacking explicit guidance from the 3D human pose in visual data. As a result, the
generated mesh often exhibits incorrect pose for complex activities. To tackle
this problem, we propose to exploit 3D pose to calibrate human mesh.
Specifically, we develop two novel Pose Calibration frameworks, i.e., Serial
PC-HMR and Parallel PC-HMR. By coupling advanced 3D pose estimators and HMR in
a serial or parallel manner, these two frameworks can effectively correct human
mesh with guidance of a concise pose calibration module. Furthermore, since the
calibration module is designed via non-rigid pose transformation, our PC-HMR
frameworks can flexibly tackle bone length variations to alleviate misplacement
in the calibrated mesh. Finally, our frameworks are based on generic and
complementary integration of data-driven learning and geometrical modeling. Via
plug-and-play modules, they can be efficiently adapted for both
image/video-based human mesh recovery. Additionally, they require no extra
3D pose annotations at test time, which eases inference in practice.
We perform extensive experiments on the popular benchmarks, i.e., Human3.6M,
3DPW and SURREAL, where our PC-HMR frameworks achieve the SOTA results.
Comment: 9 pages, 7 figures. AAAI202
New environmental dependent modelling with Gaussian particle filtering based implementation for ground vehicle tracking
This paper proposes a new domain-knowledge-aided Gaussian particle filtering approach for the ground vehicle tracking application. First, a new form of modelling is proposed to reflect the influences of different types of environmental domain knowledge on the vehicle dynamics: i) a non-Markov jump model is applied with multiple models, where the transition probabilities between models are environment-dependent; ii) for a particular model, both the constraints and the potential forces obtained from the surrounding environment are applied to refine the vehicle state distribution. Based on the proposed modelling approach, a Gaussian particle filtering based method is developed to implement the related Bayesian inference for target state estimation. Multiple Monte Carlo simulation studies confirm the advantages of the proposed method over traditional ones, in both the modelling and implementation aspects.
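The Gaussian particle filtering backbone underlying the inference above can be sketched as follows. This is only the generic single-model filter (sample, propagate, weight, moment-match), not the paper's environment-dependent jump model; the constant-position dynamics and noise variances are placeholder assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def gpf_step(mean, cov, z, n_particles=2000, q=0.1, r=0.5):
    """One step of a generic Gaussian particle filter for a scalar state.

    mean, cov: current Gaussian posterior; z: new measurement;
    q, r: process and measurement noise variances (assumed values).
    """
    # 1. Sample particles from the current Gaussian posterior.
    particles = rng.normal(mean, np.sqrt(cov), n_particles)
    # 2. Propagate through the dynamics (identity here) plus process noise.
    particles = particles + rng.normal(0.0, np.sqrt(q), n_particles)
    # 3. Weight each particle by the measurement likelihood.
    w = np.exp(-0.5 * (z - particles) ** 2 / r)
    w /= w.sum()
    # 4. Moment-match a single Gaussian to the weighted particle set.
    new_mean = np.sum(w * particles)
    new_cov = np.sum(w * (particles - new_mean) ** 2)
    return new_mean, new_cov

m, c = 0.0, 1.0
for z in [1.0, 1.1, 0.9]:
    m, c = gpf_step(m, c, z)
```

In the paper's setting, step 2 would additionally switch between dynamic models with environment-dependent transition probabilities and reweight particles by environmental constraints and potential forces.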
Unmasked Teacher: Towards Training-Efficient Video Foundation Models
Video Foundation Models (VFMs) have received limited exploration due to high
computational costs and data scarcity. Previous VFMs rely on Image Foundation
Models (IFMs), which face challenges in transferring to the video domain.
Although VideoMAE has trained a robust ViT from limited data, its low-level
reconstruction poses convergence difficulties and conflicts with high-level
cross-modal alignment. This paper proposes a training-efficient method for
temporal-sensitive VFMs that integrates the benefits of existing methods. To
increase data efficiency, we mask out most of the low-semantics video tokens,
but selectively align the unmasked tokens with IFM, which serves as the
UnMasked Teacher (UMT). By providing semantic guidance, our method enables
faster convergence and multimodal friendliness. With a progressive pre-training
framework, our model can handle various tasks including scene-related,
temporal-related, and complex video-language understanding. Using only public
sources for pre-training in 6 days on 32 A100 GPUs, our scratch-built ViT-L/16
achieves state-of-the-art performances on various video tasks. The code and
models will be released at https://github.com/OpenGVLab/unmasked_teacher.
Comment: 16 pages, 5 figures, 28 tables
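The masked-alignment objective described above can be sketched as follows. This is a hedged simplification: the paper selects which tokens to keep using semantic importance from the teacher and normalizes features before alignment, whereas this sketch uses a random keep-set and a plain mean-squared error; token counts and dimensions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def unmasked_teacher_loss(student_tokens, teacher_tokens, mask_ratio=0.8):
    """Sketch of the UMT objective: mask out most video tokens and align
    only the surviving (unmasked) ones with the frozen image-teacher
    features.

    student_tokens, teacher_tokens: (n_tokens, d) arrays.
    Returns the alignment loss over unmasked tokens and the kept indices.
    """
    n = student_tokens.shape[0]
    n_keep = max(1, int(n * (1.0 - mask_ratio)))
    # UMT keeps tokens by semantic importance scored by the teacher;
    # a random subset stands in for that scoring here.
    keep = rng.choice(n, size=n_keep, replace=False)
    # Align student features with teacher features on kept tokens only
    # (the paper normalizes features first; omitted for brevity).
    diff = student_tokens[keep] - teacher_tokens[keep]
    loss = np.mean(diff ** 2)
    return loss, keep

student = rng.normal(size=(196, 8))
loss, keep = unmasked_teacher_loss(student, student.copy())
```

Because only ~20% of tokens survive masking, the student processes far fewer tokens per clip than full reconstruction would require, which is where the training efficiency comes from.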