Score Matching-based Pseudolikelihood Estimation of Neural Marked Spatio-Temporal Point Process with Uncertainty Quantification
Spatio-temporal point processes (STPPs) are potent mathematical tools for
modeling and predicting events with both temporal and spatial features. Despite
their versatility, most existing methods for learning STPPs either assume a
restricted form of the spatio-temporal distribution, or suffer from inaccurate
approximations of the intractable integral in the likelihood training
objective. These issues typically arise from the normalization term of the
probability density function. Moreover, current techniques fail to provide
uncertainty quantification for model predictions, such as confidence intervals
for the predicted event's arrival time and confidence regions for the event's
location, which is crucial given the considerable randomness of the data. To
tackle these challenges, we introduce SMASH: a Score MAtching-based
pSeudolikeliHood estimator for learning marked STPPs with uncertainty
quantification. Specifically, our framework adopts a normalization-free
objective by estimating the pseudolikelihood of marked STPPs through
score-matching and offers uncertainty quantification for the predicted event
time, location and mark by computing confidence regions over the generated
samples. The superior performance of our proposed framework is demonstrated
through extensive experiments in both event prediction and uncertainty
quantification.
Multi-view 3D Face Reconstruction Based on Flame
Face 3D reconstruction has broad application prospects in various
fields, but research on it is still at the development stage. In this
paper, we aim to achieve better face 3D reconstruction quality by combining
a multi-view training framework with the face parametric model Flame, and
propose a multi-view training and testing model, MFNet (Multi-view Flame
Network). We build a self-supervised training framework and implement
constraints such as a multi-view optical flow loss and a face landmark loss,
and finally obtain a complete MFNet. We propose innovative implementations of
the multi-view optical flow loss and the covisible mask. We test our model on
the AFLW and FaceScape datasets and also take pictures of our own faces to
reconstruct 3D faces while simulating actual scenarios as much as possible,
which achieves good results. Our work mainly addresses the problem of
combining parametric face models with multi-view face 3D reconstruction and
explores the implementation of a Flame-based multi-view training and testing
framework, contributing to the field of face 3D reconstruction.
A Novel Hierarchically Porous Polypyrrole Sphere Modified Separator for Lithium-Sulfur Batteries
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6723804/
The commercialization of lithium-sulfur batteries has been limited by the polysulfide shuttle effect, and modifying the routine separator is an effective method to solve this problem. In this work, a novel hierarchically porous polypyrrole sphere (PPS) was successfully prepared by using silica as hard templates. The as-prepared PPS was slurry-coated on the separator, which could reduce the polarization phenomenon of the sulfur cathode and efficiently immobilize polysulfides. As expected, high sulfur utilization was achieved by suppressing the shuttle effect. When tested in the lithium-sulfur battery, it exhibited a high capacity of 855 mAh·g−1 after 100 cycles at 0.2 C, and delivered a reversible capacity of 507 mAh·g−1 at 3 C, showing excellent electrochemical performance.
Developing RNN-T Models Surpassing High-Performance Hybrid Models with Customization Capability
Because of its streaming nature, the recurrent neural network transducer (RNN-T)
is a very promising end-to-end (E2E) model that may replace the popular hybrid
model for automatic speech recognition. In this paper, we describe our recent
development of RNN-T models with reduced GPU memory consumption during
training, better initialization strategy, and advanced encoder modeling with
future lookahead. When trained with Microsoft's 65 thousand hours of anonymized
training data, the developed RNN-T model surpasses a very well-trained hybrid
model with both better recognition accuracy and lower latency. We further study
how to customize RNN-T models to a new domain, which is important for deploying
E2E models to practical scenarios. By comparing several methods leveraging
text-only data in the new domain, we found that updating RNN-T's prediction and
joint networks using text-to-speech generated from domain-specific text is the
most effective.
Comment: Accepted by Interspeech 202