155 research outputs found
DDColor: Towards Photo-Realistic Image Colorization via Dual Decoders
Image colorization is a challenging problem due to multi-modal uncertainty
and high ill-posedness. Directly training a deep neural network usually leads
to incorrect semantic colors and low color richness. While transformer-based
methods can deliver better results, they often rely on manually designed
priors, suffer from poor generalization ability, and introduce color bleeding
effects. To address these issues, we propose DDColor, an end-to-end method with
dual decoders for image colorization. Our approach includes a pixel decoder and
a query-based color decoder. The former restores the spatial resolution of the
image, while the latter utilizes rich visual features to refine color queries,
thus avoiding hand-crafted priors. Our two decoders work together to establish
correlations between color and multi-scale semantic representations via
cross-attention, significantly alleviating the color bleeding effect.
Additionally, a simple yet effective colorfulness loss is introduced to enhance
the color richness. Extensive experiments demonstrate that DDColor achieves
superior performance to existing state-of-the-art works both quantitatively and
qualitatively. The codes and models are publicly available at
https://github.com/piddnad/DDColor.
Comment: ICCV 2023; Code: https://github.com/piddnad/DDColo
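The abstract does not give the form of the colorfulness loss. As a rough sketch, assuming it is built on the widely used Hasler-Süsstrunk colorfulness metric, the idea could look like the following. The function names and the 1/100 scaling are illustrative assumptions, not taken from the DDColor code.

```python
import math
from statistics import fmean, pstdev


def colorfulness(pixels):
    """Hasler-Suesstrunk colorfulness of a list of (r, g, b) tuples in [0, 255]."""
    # Opponent color channels: red-green and yellow-blue.
    rg = [r - g for r, g, b in pixels]
    yb = [0.5 * (r + g) - b for r, g, b in pixels]
    # Combine the spread and the mean offset of both opponent channels.
    sigma = math.hypot(pstdev(rg), pstdev(yb))
    mu = math.hypot(fmean(rg), fmean(yb))
    return sigma + 0.3 * mu


def colorfulness_loss(pixels):
    """Assumed loss form: a more colorful output yields a lower loss."""
    return 1.0 - colorfulness(pixels) / 100.0


gray = [(128, 128, 128)] * 4
vivid = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0)]
print(colorfulness_loss(gray), colorfulness_loss(vivid))
```

Under this assumption a pure gray image has zero colorfulness, so a loss of this shape pushes the network away from desaturated outputs.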
RSFNet: A White-Box Image Retouching Approach using Region-Specific Color Filters
Retouching images is an essential aspect of enhancing the visual appeal of
photos. Although users often share common aesthetic preferences, their
retouching methods may vary based on their individual preferences. Therefore,
there is a need for white-box approaches that produce satisfying results and
enable users to conveniently edit their images simultaneously. Recent white-box
retouching methods rely on cascaded global filters that provide image-level
filter arguments but cannot perform fine-grained retouching. In contrast,
colorists typically employ a divide-and-conquer approach, performing a series
of region-specific fine-grained enhancements when using traditional tools like
Davinci Resolve. We draw on this insight to develop a white-box framework for
photo retouching using parallel region-specific filters, called RSFNet. Our
model generates filter arguments (e.g., saturation, contrast, hue) and
attention maps of regions for each filter simultaneously. Instead of cascading
filters, RSFNet employs linear summations of filters, allowing for a more
diverse range of filter classes that can be trained more easily. Our
experiments demonstrate that RSFNet achieves state-of-the-art results, offering
satisfying aesthetic appeal and increased user convenience for editable
white-box retouching.
Comment: Accepted by ICCV 202
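The parallel, region-specific filtering described above can be illustrated with a toy sketch. It assumes each filter contributes a residual weighted by its attention map and that the residuals are summed. The filter definitions, the residual form, and all names here are hypothetical and are not taken from the RSFNet implementation.

```python
def brightness(x, arg):
    """Shift intensity by a constant amount."""
    return x + arg


def contrast(x, arg):
    """Scale intensity around mid-gray (0.5)."""
    return (x - 0.5) * (1.0 + arg) + 0.5


def retouch(pixels, filters):
    """Apply filters in parallel and sum their mask-weighted residuals.

    pixels:  list of intensities in [0, 1]
    filters: list of (filter_fn, argument, mask), one mask value per pixel
    """
    out = []
    for i, x in enumerate(pixels):
        # Every filter sees the ORIGINAL pixel, so filter order does not matter.
        delta = sum(mask[i] * (fn(x, arg) - x) for fn, arg, mask in filters)
        out.append(min(1.0, max(0.0, x + delta)))
    return out


pixels = [0.2, 0.8]
filters = [
    (brightness, 0.1, [1.0, 0.0]),  # brighten only the first region
    (contrast, 0.5, [0.0, 1.0]),    # boost contrast only in the second region
]
print(retouch(pixels, filters))
```

Because each filter acts on the original input rather than on the output of a previous filter, a user could adjust one filter's argument or mask without affecting the others, which is one plausible reading of the editability claim.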
Graphical Consensus-based Sharding for Efficient and Secure Sharings in Blockchain-enabled Internet of Vehicles
Acoustic Holographic Rendering with Two-dimensional Metamaterial-based Passive Phased Array.
Acoustic holographic rendering, in complete analogy with optical holography, is useful for various applications, ranging from multi-focal lensing and multiplexed sensing to synthesizing three-dimensional complex sound fields. Conventional approaches rely on a large number of active transducers and phase-shifting circuits. In this paper we show that by using passive metamaterials as subwavelength pixels, holographic rendering can be achieved without cumbersome circuitry and with only a single transducer, thus significantly reducing system complexity. Such metamaterial-based holograms can serve as versatile platforms for various advanced acoustic wave manipulation and signal modulation, leading to new possibilities in acoustic sensing, energy deposition and medical diagnostic imaging.
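The abstract does not describe how the per-pixel phases are designed. As a minimal sketch of the underlying phased-array idea, the textbook phase profile for a single focal point can be computed as below. The function names, the planar pixel layout, and the simplified field model (unit amplitudes, no spreading loss) are assumptions for illustration only.

```python
import cmath
import math


def focusing_phases(positions, focus, frequency_hz, sound_speed=343.0):
    """Phase delay for each pixel at (x, y, 0) so waves arrive in phase at focus."""
    k = 2 * math.pi * frequency_hz / sound_speed  # wavenumber
    phases = []
    for x, y in positions:
        d = math.dist((x, y, 0.0), focus)
        # Cancel the propagation phase k*d, wrapped into one cycle.
        phases.append((-k * d) % (2 * math.pi))
    return phases


def field_at(positions, phases, point, frequency_hz, sound_speed=343.0):
    """Sum of unit-amplitude contributions at a point (spreading loss ignored)."""
    k = 2 * math.pi * frequency_hz / sound_speed
    total = 0j
    for (x, y), phi in zip(positions, phases):
        d = math.dist((x, y, 0.0), point)
        total += cmath.exp(1j * (phi + k * d))
    return total


positions = [(-0.01, 0.0), (0.0, 0.0), (0.01, 0.0), (0.0, 0.01)]
focus = (0.0, 0.0, 0.05)
phases = focusing_phases(positions, focus, 40000.0)
print(abs(field_at(positions, phases, focus, 40000.0)))
```

In a passive array each phase would be fixed by the geometry of a metamaterial pixel rather than set by a circuit, so a single transducer behind the array would suffice in this simplified picture.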
AgentCF: Collaborative Learning with Autonomous Language Agents for Recommender Systems
Recently, there has been an emergence of employing LLM-powered agents as
believable human proxies, based on their remarkable decision-making capability.
However, existing studies mainly focus on simulating human dialogue. Human
non-verbal behaviors, such as item clicking in recommender systems, implicitly
reveal user preferences and could enhance the modeling of users, yet they have
not been deeply explored. The main reasons are the gap between language
modeling and behavior modeling and the limited understanding LLMs have of
user-item relations.
To address this issue, we propose AgentCF for simulating user-item
interactions in recommender systems through agent-based collaborative
filtering. We creatively consider not only users but also items as agents, and
develop a collaborative learning approach that optimizes both kinds of agents
together. Specifically, at each time step, we first prompt the user and item
agents to interact autonomously. Then, based on the disparities between the
agents' decisions and real-world interaction records, user and item agents are
prompted to reflect on and adjust the misleading simulations collaboratively,
thereby modeling their two-sided relations. The optimized agents can also
propagate their preferences to other agents in subsequent interactions,
implicitly capturing the collaborative filtering idea. Overall, the optimized
agents exhibit diverse interaction behaviors within our framework, including
user-item, user-user, item-item, and collective interactions. The results show
that these agents can demonstrate personalized behaviors akin to those of
real-world individuals, sparking the development of next-generation user
behavior simulation.
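The interact-then-reflect loop described above can be sketched with deterministic stand-ins for the LLM calls. Everything here is hypothetical: the `Agent` class, the keyword-set memories, and the `choose` and `reflect` rules only mimic the control flow and are not the AgentCF prompts or implementation.

```python
class Agent:
    """A user or item agent; the keyword set stands in for a text memory."""

    def __init__(self, name, memory):
        self.name = name
        self.memory = set(memory)


def choose(user, candidates):
    # Stand-in for prompting the user agent to pick an item:
    # prefer the candidate whose memory overlaps most with the user's.
    return max(candidates, key=lambda item: len(user.memory & item.memory))


def reflect(user, positive, negative):
    # Stand-in for LLM reflection after a wrong decision.
    # The user adopts traits of the item it really interacted with
    # and drops traits found only in the wrongly chosen item.
    user.memory |= positive.memory
    user.memory -= negative.memory - positive.memory
    # The item in turn absorbs the preferences of the user who liked it.
    positive.memory |= user.memory


def train(user, interactions):
    """interactions: list of (interacted_item, non_interacted_item) pairs."""
    for positive, negative in interactions:
        if choose(user, [positive, negative]) is not positive:
            reflect(user, positive, negative)


user = Agent("user", {"action"})
jazz_album = Agent("jazz_album", {"jazz"})
action_film = Agent("action_film", {"action"})
train(user, [(jazz_album, action_film)])
print(sorted(user.memory), choose(user, [jazz_album, action_film]).name)
```

Because item memories are updated as well, a later user who interacts with the same item sees traits inherited from earlier users, which loosely mirrors the preference propagation the abstract describes.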