FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling
With the availability of large-scale video datasets and the advances of
diffusion models, text-driven video generation has achieved substantial
progress. However, existing video generation models are typically trained on a
limited number of frames, resulting in the inability to generate high-fidelity
long videos during inference. Furthermore, these models only support
single-text conditions, whereas real-life scenarios often require multi-text
conditions as the video content changes over time. To tackle these challenges,
this study explores the potential of extending the text-driven capability to
generate longer videos conditioned on multiple texts. 1) We first analyze the
impact of initial noise in video diffusion models. Building on this
observation, we propose FreeNoise, a tuning-free and time-efficient paradigm
that enhances the generative capabilities of pretrained video diffusion models
while preserving content consistency. Specifically, instead of initializing
noise for all frames independently, we reschedule a sequence of noises to
achieve long-range correlation and perform temporal attention over them via a
window-based function. 2) Additionally, we design a novel motion injection
method to support
the generation of videos conditioned on multiple text prompts. Extensive
experiments validate the superiority of our paradigm in extending the
generative capabilities of video diffusion models. It is noteworthy that
compared with the previous best-performing method, which incurred about 255%
extra time cost, our method incurs a negligible time cost of approximately
17%. Generated video samples are available at our website:
http://haonanqiu.com/projects/FreeNoise.html
Comment: Project Page: http://haonanqiu.com/projects/FreeNoise.html Code Repo:
https://github.com/arthur-qiu/LongerCrafte
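The noise-rescheduling idea can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the paper's implementation: string labels stand in for per-frame Gaussian noise tensors, and the function name and parameters are invented. The point it shows is that, instead of drawing fresh noise for every frame, a short base sequence is repeated and locally shuffled, so distant frames share noise content (long-range correlation) while neighboring frames still vary.

```python
import random

def reschedule_noise(base_noise, total_frames, seed=0):
    """Extend a short list of noise frames to `total_frames` by repeating
    the base sequence and shuffling each repetition locally: every chunk
    contains the same noise set, but in a fresh order."""
    rng = random.Random(seed)
    frames = []
    while len(frames) < total_frames:
        chunk = list(base_noise)
        rng.shuffle(chunk)  # reuse the same noise frames, new local order
        frames.extend(chunk)
    return frames[:total_frames]

base = ["n0", "n1", "n2", "n3"]  # stand-ins for per-frame noise tensors
long_seq = reschedule_noise(base, 10)
```

In the actual method the rescheduled noise sequence is then processed with window-based temporal attention; here the list by itself only illustrates the correlation structure.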
ReliTalk: Relightable Talking Portrait Generation from a Single Video
Recent years have witnessed great progress in creating vivid audio-driven
portraits from monocular videos. However, how to seamlessly adapt the created
video avatars to other scenarios with different backgrounds and lighting
conditions remains unsolved. On the other hand, existing relighting studies
mostly rely on dynamically lighted or multi-view data, which are too expensive
for creating video portraits. To bridge this gap, we propose ReliTalk, a novel
framework for relightable audio-driven talking portrait generation from
monocular videos. Our key insight is to decompose the portrait's reflectance
from implicitly learned audio-driven facial normals and images. Specifically,
we involve 3D facial priors derived from audio features to predict delicate
normal maps through implicit functions. These initially predicted normals then
take a crucial part in reflectance decomposition by dynamically estimating the
lighting condition of the given video. Moreover, the stereoscopic face
representation is refined using the identity-consistent loss under simulated
multiple lighting conditions, addressing the ill-posed problem caused by
limited views available from a single monocular video. Extensive experiments
validate the superiority of our proposed framework on both real and synthetic
datasets. Our code is released at https://github.com/arthur-qiu/ReliTalk
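The reflectance decomposition at the heart of this pipeline can be illustrated with a single-pixel Lambertian model. This is a deliberate simplification, not the paper's method, and the function names are invented: the observed intensity is modeled as albedo times the clamped dot product of the surface normal and the light direction, so once normals and lighting are estimated, the albedo can be recovered by division.

```python
def shade(normal, light):
    # Lambertian shading: dot product of unit normal and light direction,
    # clamped at zero for surfaces facing away from the light
    d = sum(n * l for n, l in zip(normal, light))
    return max(d, 0.0)

def recover_albedo(pixel, normal, light, eps=1e-6):
    # invert I = albedo * max(n.l, 0) for one pixel; eps avoids division
    # by zero in shadowed regions
    return pixel / (shade(normal, light) + eps)

# a pixel lit head-on: shading is 1, so albedo equals the observed intensity
albedo = recover_albedo(0.8, (0.0, 0.0, 1.0), (0.0, 0.0, 1.0))
```

Relighting then amounts to re-multiplying the recovered albedo by the shading under a new light direction.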
Design of knowledge-based systems for automated deployment of building management services
Despite its high potential, the building sector lags behind in reducing its energy demand. Tremendous savings can be achieved by deploying building management services during operation; however, the manual deployment of these services must be undertaken by experts and is a tedious, time- and cost-consuming task. It requires detailed expert knowledge to match the diverse requirements of services with the present constellation of envelope, equipment and automation system in a target building. To enable the widespread deployment of these services, this knowledge-intensive task needs to be automated. Knowledge-based methods solve this task; however, their widespread adoption is hampered, and solutions proposed in the past do not adhere to basic principles of state-of-the-art knowledge engineering methods. To fill this gap, we present a novel methodological approach for the design of knowledge-based systems for the automated deployment of building management services. The approach covers the essential steps and best practices: (1) representation of terminological knowledge of a building and its systems based on well-established knowledge engineering methods; (2) representation and capturing of assertional knowledge on a real building portfolio based on open standards; and (3) use of the acquired knowledge for the automated deployment of building management services to increase the energy efficiency of buildings during operation. We validate the methodological approach by deploying it in a real-world large-scale European pilot on a diverse portfolio of buildings and a novel set of building management services. In addition, a novel ontology, which reuses and extends existing ontologies, is presented. The authors would like to gratefully acknowledge the generous funding provided by the European Union’s Horizon 2020 research and innovation programme through the MOEEBIUS project under grant agreement No. 680517
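The core matching task the abstract describes — checking a service's requirements against what a building's equipment and automation system actually expose — can be sketched with plain Python sets. The real system uses an ontology and semantic reasoning rather than sets, and the service and datapoint names below are invented for illustration:

```python
# Each service lists the datapoints it requires; a building lists the
# datapoints its automation system exposes. A service is deployable in a
# building when all of its required datapoints are present.
SERVICES = {
    "night_setback": {"room_temp", "occupancy", "heating_setpoint"},
    "demand_control_ventilation": {"co2", "vent_flow_setpoint"},
}

def deployable(building_datapoints, services=SERVICES):
    """Return the names of services whose requirements are fully covered."""
    return sorted(name for name, required in services.items()
                  if required <= building_datapoints)  # subset test

building = {"room_temp", "occupancy", "heating_setpoint", "co2"}
ok = deployable(building)
```

An ontology-based formulation replaces the set-subset test with class subsumption and instance queries, which is what lets the approach scale across heterogeneous buildings.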
Horizontal structure of convergent wind shear associated with sporadic E layers over East Asia
At present, the main instruments for observing sporadic E (Es) layers are ground-based radars, dense networks of ground-based global navigation satellite system (GNSS) receivers, and GNSS radio occultation, but they cannot capture the whole picture of the horizontal structure of Es layers. This study employs the Whole Atmosphere Community Climate Model with thermosphere and ionosphere eXtension (WACCM-X 2.1) to derive the horizontal structure of the ion convergence region (HSICR) and explore the shapes of large-scale Es layers over East Asia for the period from June 1 to August 31, 2008. The simulations produced HSICRs of various shapes: elongated in the northwest−southeast or northeast−southwest direction, or composed of individual small patches. The close connection between the Es layer critical frequency (foEs) and vertical ion convergence indicates that the HSICR is a good candidate for revealing and explaining the horizontal structure of large-scale Es layers.
PyPose: A Library for Robot Learning with Physics-based Optimization
Deep learning has had remarkable success in robotic perception, but its
data-centric nature suffers when it comes to generalizing to ever-changing
environments. By contrast, physics-based optimization generalizes better, but
it does not perform as well in complicated tasks due to the lack of high-level
semantic information and the reliance on manual parametric tuning. To take
advantage of these two complementary worlds, we present PyPose: a
robotics-oriented, PyTorch-based library that combines deep perceptual models
with physics-based optimization techniques. Our design goal for PyPose is to
make it user-friendly, efficient, and interpretable with a tidy and
well-organized architecture. Using an imperative style interface, it can be
easily integrated into real-world robotic applications. In addition, it
supports parallel computing of any-order gradients on Lie groups and Lie
algebras, as well as second-order optimizers, such as trust region methods.
Experiments show that PyPose achieves a 3-20× speedup in computation compared to
state-of-the-art libraries. To boost future research, we provide concrete
examples across several fields of robotics, including SLAM, inertial
navigation, planning, and control.
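The Lie-group support mentioned above can be illustrated at its smallest scale with the exponential and logarithm maps of SO(2) in pure Python. This standalone sketch is not PyPose code — PyPose provides batched, differentiable versions of such operations for SO(3), SE(3), and related groups — but it shows the pair of maps involved:

```python
import math

def so2_exp(theta):
    # exponential map from the Lie algebra so(2), here just an angle,
    # to the rotation group SO(2) as a 2x2 matrix
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))

def so2_log(R):
    # logarithm map back to the algebra: recover the rotation angle
    # from the matrix entries
    return math.atan2(R[1][0], R[0][0])

theta = 0.3
R = so2_exp(theta)
recovered = so2_log(R)
```

Differentiating through such maps is what allows gradients to flow between deep perceptual models and geometric state estimates.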
PyPose v0.6: The Imperative Programming Interface for Robotics
PyPose is an open-source library for robot learning. It combines a
learning-based approach with physics-based optimization, which enables seamless
end-to-end robot learning. It has been used in many tasks due to its
meticulously designed application programming interface (API) and efficient
implementation. From its initial launch in early 2022, PyPose has experienced
significant enhancements, incorporating a wide variety of new features into its
platform. To satisfy the growing demand for understanding and utilizing the
library and reduce the learning curve of new users, we present the fundamental
design principle of the imperative programming interface, and showcase the
flexible usage of diverse functionalities and modules using an extremely simple
Dubins car example. We also demonstrate that PyPose can easily be used to
navigate a real quadruped robot with only a few lines of code.
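A minimal, library-free sketch of the kinematics behind a Dubins-car-style example — a unicycle model with constant speed and a commanded turn rate, integrated with Euler steps. This is an illustrative toy, not PyPose's actual API:

```python
import math

def dubins_step(x, y, heading, v, omega, dt):
    """One Euler step of the unicycle model used for Dubins-style cars:
    move forward at speed v along the current heading, then turn at
    angular rate omega."""
    x += v * math.cos(heading) * dt
    y += v * math.sin(heading) * dt
    heading += omega * dt
    return x, y, heading

# drive straight for 1 s at 1 m/s (ten 0.1 s steps, zero turn rate)
state = (0.0, 0.0, 0.0)
for _ in range(10):
    state = dubins_step(*state, v=1.0, omega=0.0, dt=0.1)
```

In a library setting the same dynamics would be expressed as a differentiable module so that planners and controllers can backpropagate through the rollout.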
Potential of Core-Collapse Supernova Neutrino Detection at JUNO
JUNO is an underground neutrino observatory under construction in Jiangmen, China. It uses 20 kton of liquid scintillator as the target, which enables it to detect supernova burst neutrinos with large statistics from the next galactic core-collapse supernova (CCSN), as well as pre-supernova neutrinos from nearby CCSN progenitors. All flavors of supernova burst neutrinos can be detected by JUNO via several interaction channels, including inverse beta decay, elastic scattering on electrons and protons, interactions on 12C nuclei, etc. This retains the possibility for JUNO to reconstruct the energy spectra of supernova burst neutrinos of all flavors. Real-time monitoring systems based on FPGA and DAQ are under development in JUNO, allowing prompt alerts and trigger-less data acquisition of CCSN events. The alert performance of both monitoring systems has been thoroughly studied using simulations. Moreover, once a CCSN is tagged, the system can provide fast characterizations, such as directionality and the light curve.
Detection of the Diffuse Supernova Neutrino Background with JUNO
As an underground multi-purpose neutrino detector with 20 kton of liquid scintillator, the Jiangmen Underground Neutrino Observatory (JUNO) is competitive with, and complementary to, water-Cherenkov detectors in the search for the diffuse supernova neutrino background (DSNB). Typical supernova models predict 2-4 events per year within the optimal observation window in the JUNO detector. The dominant background comes from the neutral-current (NC) interaction of atmospheric neutrinos with 12C nuclei, which surpasses the DSNB by more than one order of magnitude. We evaluated the systematic uncertainty of the NC background from the spread of a variety of data-driven models and further developed a method to determine the NC background to within 15% with in situ measurements after ten years of running. In addition, NC-like backgrounds can be effectively suppressed by the intrinsic pulse-shape discrimination (PSD) capabilities of liquid scintillators. In this talk, I will present in detail the improvements in NC background uncertainty evaluation, the PSD discriminator development, and finally the potential DSNB sensitivity of JUNO.
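The pulse-shape discrimination idea can be sketched with a tail-to-total charge ratio, a common PSD discriminator for liquid scintillators: NC-like events with heavier recoiling particles produce pulses with a larger slow (tail) component than electron-like events. The waveform samples and the tail-window index below are invented toy numbers, and the abstract's actual discriminator may differ:

```python
def tail_to_total(waveform, tail_start):
    """PSD discriminator: fraction of the total pulse charge that arrives
    in the tail window starting at sample index `tail_start`."""
    total = sum(waveform)
    tail = sum(waveform[tail_start:])
    return tail / total

fast_pulse = [10, 50, 20, 5, 2, 1]    # electron-like: little slow component
slow_pulse = [10, 40, 20, 12, 10, 8]  # NC-like: enhanced slow component

r_fast = tail_to_total(fast_pulse, 3)
r_slow = tail_to_total(slow_pulse, 3)
```

A cut on this ratio separates the two populations; in practice the cut value is tuned on calibration or simulation samples.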
Real-time Monitoring for the Next Core-Collapse Supernova in JUNO
Core-collapse supernova (CCSN) is one of the most energetic astrophysical
events in the Universe. The early and prompt detection of neutrinos before
(pre-SN) and during the SN burst is a unique opportunity to realize the
multi-messenger observation of the CCSN events. In this work, we describe the
monitoring concept and present the sensitivity of the system to the pre-SN and
SN neutrinos at the Jiangmen Underground Neutrino Observatory (JUNO), which is
a 20 kton liquid scintillator detector under construction in South China. The
real-time monitoring system is designed with both the prompt monitors on the
electronic board and online monitors at the data acquisition stage, in order to
ensure both the alert speed and alert coverage of progenitor stars. By assuming
a false alert rate of 1 per year, this monitoring system can be sensitive to
the pre-SN neutrinos up to a distance of about 1.6 (0.9) kpc and SN neutrinos
up to about 370 (360) kpc for a progenitor mass of 30 solar masses in the case
of normal (inverted) mass ordering. The pointing ability for the CCSN is
evaluated by using the accumulated event anisotropy of the inverse beta decay
interactions from pre-SN or SN neutrinos, which, along with the early alert,
can play important roles for the followup multi-messenger observations of the
next Galactic or nearby extragalactic CCSN.
Comment: 24 pages, 9 figures
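The false-alert-rate requirement can be sketched as a Poisson threshold calculation: pick the smallest event count in a sliding time window such that background fluctuations alone reach it less than once per year. The window length, background rate, and resulting threshold below are illustrative numbers, not JUNO's design values:

```python
import math

def poisson_sf(k, mu):
    # P(X >= k) for a Poisson variable with mean mu
    return 1.0 - sum(math.exp(-mu) * mu**i / math.factorial(i)
                     for i in range(k))

def alert_threshold(bkg_per_window, windows_per_year, max_false_per_year=1.0):
    """Smallest count N such that the expected number of background-only
    windows reaching N stays below the false-alert budget."""
    n = 1
    while poisson_sf(n, bkg_per_window) * windows_per_year > max_false_per_year:
        n += 1
    return n

# e.g. 0.01 expected background events per 10 s window, ~3.15e6 windows/year
N = alert_threshold(0.01, 3.15e6)
```

The sensitivity figures quoted in the abstract come from comparing the expected pre-SN/SN signal counts against a threshold of this kind as a function of progenitor distance.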