Justifying a privacy guardian in discourse and behaviour: the People's Republic of China's strategic framing in data governance
The People's Republic of China's (PRC) approach to data governance, centred on data sovereignty, is much debated in academic literature. However, it remains unclear how the PRC's different state actors justify this approach. Based on an analysis of the discourse and behaviour of the PRC's state actors through strategic framing theory, their role as a privacy guardian can arguably be described as strategically constructed. The Chinese government and legislative bodies have tailored their communications to present themselves as champions of individual privacy, aiming to secure support for state policies. This strategic framing encompasses four mechanisms: the reframing of privacy threats through political narratives; legal ambiguities; selective framing; and the implementation of censorship to influence public discourse. An examination of how the Chinese government responded differently to data breaches in the cases of Didi and the Shanghai National Police Database leak highlights the Chinese government's efforts in maintaining framing consistency to construct itself as a guardian, rather than a violator, of individual privacy.
DiverseMotion: Towards Diverse Human Motion Generation via Discrete Diffusion
We present DiverseMotion, a new approach for synthesizing high-quality human motions conditioned on textual descriptions while preserving motion diversity. Despite recent significant progress in text-based human motion generation, existing methods often prioritize fitting training motions at the expense of action diversity. Consequently, striking a balance between motion quality and diversity remains an unresolved challenge. This problem is compounded by two key factors: 1) the lack of diversity in motion-caption pairs in existing benchmarks, and 2) a unilateral and biased semantic understanding of the text prompt, which focuses primarily on the verb component while neglecting the nuanced distinctions indicated by other words. In response to the first issue, we construct a large-scale Wild Motion-Caption dataset (WMC) to extend the restricted action boundary of existing well-annotated datasets, enabling the learning of diverse motions through a more extensive range of actions. To this end, a motion BLIP is trained upon a pretrained vision-language model, and we then automatically generate diverse motion captions for the collected motion sequences. As a result, we build a dataset comprising 8,888 motions coupled with 141k texts. To comprehensively understand the text command, we propose a Hierarchical Semantic Aggregation (HSA) module to capture fine-grained semantics. Finally, we incorporate the above two designs into an effective Motion Discrete Diffusion (MDD) framework to strike a balance between motion quality and diversity. Extensive experiments on HumanML3D and KIT-ML show that DiverseMotion achieves state-of-the-art motion quality and competitive motion diversity. The dataset, code, and pretrained models will be released to reproduce all of our results.
Comment: 12 pages, 7 figures
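The abstract above does not specify the MDD sampling procedure, but discrete diffusion over token sequences is commonly realized as iterative unmasking from a fully masked sequence. The sketch below is an illustrative, hypothetical version of that generic scheme (the `MASK` token, `predict_fn` interface, and confidence-based unmasking order are all assumptions, not details from the paper):

```python
MASK = -1  # absorbing "mask" token (assumption: absorbing-state discrete diffusion)

def denoise_step(tokens, predict_fn, n_unmask):
    """Unmask the n_unmask most confident masked positions.
    predict_fn(tokens, i) returns a (token, confidence) pair for position i."""
    masked = [i for i, t in enumerate(tokens) if t == MASK]
    preds = {i: predict_fn(tokens, i) for i in masked}
    # highest-confidence positions are committed first
    keep = sorted(masked, key=lambda i: -preds[i][1])[:n_unmask]
    out = list(tokens)
    for i in keep:
        out[i] = preds[i][0]
    return out

def sample(length, predict_fn, steps):
    """Start fully masked and iteratively unmask until no MASK remains."""
    tokens = [MASK] * length
    per_step = max(1, length // steps)
    while any(t == MASK for t in tokens):
        tokens = denoise_step(tokens, predict_fn, per_step)
    return tokens
```

In a real system, `predict_fn` would be a transformer conditioned on the HSA text embedding; here any callable with the same shape will do.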
Dual Relation Alignment for Composed Image Retrieval
Composed image retrieval, the task of searching for a target image using a reference image and a complementary text as the query, has witnessed significant advancements owing to progress in cross-modal modeling. Unlike the general image-text retrieval problem, which involves only one alignment relation (image-text), we argue that two types of relations exist in composed image retrieval. The explicit relation pertains to reference image & complementary text-target image, and is commonly exploited by existing methods. Beyond this intuitive relation, our practice has uncovered another implicit yet crucial relation, reference image & target image-complementary text, since we found that the complementary text can be inferred by studying the relation between the target image and the reference image. Regrettably, existing methods largely focus on leveraging the explicit relation to learn their networks while overlooking the implicit relation. In response to this weakness, we propose a new framework for composed image retrieval, termed dual relation alignment, which integrates both explicit and implicit relations to fully exploit the correlations among the triplets. Specifically, we design a vision compositor to first fuse the reference and target images; the resulting representation then serves two roles: (1) a counterpart for semantic alignment with the complementary text, and (2) compensation for the complementary text to boost the explicit relation modeling, thereby implanting the implicit relation into the alignment learning. Our method is evaluated on two popular datasets, CIRR and FashionIQ, through extensive experiments. The results confirm the effectiveness of our dual-relation learning in substantially enhancing composed image retrieval performance.
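To make the two relations concrete: the explicit relation scores how well reference-plus-text retrieves the target, while the implicit relation scores how well a fusion of reference and target recovers the text. The toy sketch below illustrates this symmetry on embedding vectors; the additive query composition, the difference-based "vision compositor," and the function names are all hypothetical simplifications of the framework described above:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def fuse(ref_vec, tgt_vec):
    """Stand-in "vision compositor": element-wise difference, so the
    fused vector captures what changed from reference to target."""
    return [t - r for r, t in zip(ref_vec, tgt_vec)]

def dual_alignment_scores(ref_vec, tgt_vec, text_vec):
    """Explicit relation: (reference + text) query vs. target image.
    Implicit relation: fuse(reference, target) vs. complementary text."""
    explicit = cosine([r + x for r, x in zip(ref_vec, text_vec)], tgt_vec)
    implicit = cosine(fuse(ref_vec, tgt_vec), text_vec)
    return explicit, implicit
```

A training objective would push both scores up for matching triplets and down for mismatched ones; real systems use learned fusion networks rather than fixed arithmetic.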
Towards Unified Text-based Person Retrieval: A Large-scale Multi-Attribute and Language Search Benchmark
In this paper, we introduce a large Multi-Attribute and Language Search dataset for text-based person retrieval, called MALS, and explore the feasibility of performing pre-training on both attribute recognition and image-text matching tasks at once. In particular, MALS contains 1,510,330 image-text pairs, about 37.5 times larger than the prevailing CUHK-PEDES, and all images are annotated with 27 attributes. Considering privacy concerns and annotation costs, we leverage off-the-shelf diffusion models to generate the dataset. To verify the feasibility of learning from the generated data, we develop a new joint Attribute Prompt Learning and Text Matching Learning (APTM) framework that exploits the shared knowledge between attributes and text. As the name implies, APTM contains an attribute prompt learning stream and a text matching learning stream. (1) The attribute prompt learning stream leverages attribute prompts for image-attribute alignment, which enhances the text matching learning. (2) The text matching learning stream facilitates representation learning on fine-grained details and, in turn, boosts the attribute prompt learning. Extensive experiments validate the effectiveness of pre-training on MALS, with APTM achieving state-of-the-art retrieval performance on three challenging real-world benchmarks. In particular, APTM achieves consistent improvements of +6.96%, +7.68%, and +16.95% Recall@1 accuracy on the CUHK-PEDES, ICFG-PEDES, and RSTPReid datasets, respectively.
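The two APTM streams are trained jointly, so their losses must be combined into one objective. The sketch below shows one plausible way to do that; the averaging over the 27 per-attribute losses, the `alpha` trade-off weight, and the function name are assumptions for illustration, not the paper's actual formulation:

```python
def joint_aptm_loss(attr_losses, text_match_loss, alpha=0.5):
    """Combine the two APTM streams into a single training objective.

    attr_losses: per-attribute alignment losses (one per annotated attribute,
                 e.g. 27 values for MALS), averaged into one attribute term.
    text_match_loss: the image-text matching loss.
    alpha: assumed trade-off weight between the two streams.
    """
    attr_loss = sum(attr_losses) / len(attr_losses)
    return alpha * attr_loss + (1 - alpha) * text_match_loss
```

The mutual-boosting effect described in the abstract comes from backpropagating this single scalar through a shared image encoder, so gradients from each stream shape the features used by the other.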
The influence of 1-MCP on the fruit quality and flesh browning of ‘Red Fuji’ apple after long-term cold storage
This study assessed the influence of 1-MCP treatment on the fruit quality and flesh browning of 'Red Fuji' apple during shelf life after long-term cold storage. 'Red Fuji' fruit were stored at 0±0.5 °C for 270 days after treatment with 1.0 μL L⁻¹ 1-methylcyclopropene (1-MCP). Fruit quality, the browning rate of stem-end flesh, chlorogenic acid content, and polyphenol oxidase (PPO) activity were analyzed during shelf life at 20±0.5 °C, and the expression profiles of an ethylene receptor gene (MdERS1), phenylalanine ammonia-lyase genes (MdPAL1, MdPAL2), a hydroxycinnamoyl-CoA shikimate/quinate hydroxycinnamoyl transferase gene (MdHCT3), polyphenol oxidase genes (MdPPO1, MdPPO5), and a lipoxygenase gene (MdLOX) were measured by real-time quantitative PCR.
1-MCP treatment improved fruit storage quality and decreased stem-end flesh browning and fruit decay. In addition, the fruit respiration rate and ethylene production rate increased during shelf life, but this increase was inhibited by 1-MCP. The same pattern was observed in the changes of chlorogenic acid content and PPO activity, and the expression of MdPAL1, MdERS1, MdPPO1, and MdLOX in the stem-end flesh was likewise inhibited by 1-MCP. Thus, 1-MCP treatment improves the fruit quality of 'Red Fuji' apple during shelf life after long-term cold storage and inhibits browning of the stem-end flesh by decreasing chlorogenic acid content and PPO activity. MdPAL1, MdHCT3, MdPPO1, and MdLOX participate in the flesh browning process.
AdaEvo: Edge-Assisted Continuous and Timely DNN Model Evolution for Mobile Devices
Mobile video applications have attracted significant attention. Deep learning model (e.g., deep neural network, DNN) compression is widely used to enable on-device inference, facilitating robust and private mobile video applications. The compressed DNN, however, is vulnerable to the agnostic data drift of live video captured in dynamically changing mobile scenarios. To combat this data drift, mobile ends rely on edge servers to continuously evolve and re-compress the DNN with freshly collected data. We design a framework, AdaEvo, that efficiently supports a resource-limited edge server handling mobile DNN evolution tasks from multiple mobile ends. The key goal of AdaEvo is to maximize the average quality of experience (QoE), e.g., the proportion of high-quality DNN service time within the entire life cycle, across all mobile ends. Specifically, it estimates DNN accuracy drops at the mobile end without labels and performs a dedicated video frame sampling strategy to control the size of the retraining data. In addition, it balances the limited computing and memory resources on the edge server against the competition between asynchronous tasks initiated by different mobile users. In an extensive evaluation on real-world videos from mobile scenarios across four diverse mobile tasks, AdaEvo delivers up to 34% accuracy improvement and 32% average QoE improvement.
Comment: Accepted by IEEE Transactions on Mobile Computing 202
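The abstract defines QoE as the proportion of high-quality DNN service time within a mobile end's life cycle, averaged across all ends. The sketch below computes exactly that metric; the interval-based representation, the accuracy threshold for "high quality," and the function name are illustrative assumptions rather than AdaEvo's actual formulation:

```python
def average_qoe(windows, threshold):
    """Average QoE over mobile ends.

    windows: maps each mobile end to a list of (duration, accuracy)
             service intervals covering its life cycle.
    threshold: assumed accuracy level above which service counts as
               "high quality".
    Per end, QoE = high-quality service time / total life-cycle time.
    """
    qoes = []
    for intervals in windows.values():
        total = sum(d for d, _ in intervals)
        good = sum(d for d, a in intervals if a >= threshold)
        qoes.append(good / total if total else 0.0)
    return sum(qoes) / len(qoes)
```

An edge-side scheduler maximizing this quantity would prioritize evolution tasks for the ends whose accuracy has drifted furthest below the threshold, since those contribute the largest QoE deficit.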