286 research outputs found
Facial Motion Prior Networks for Facial Expression Recognition
Deep learning based facial expression recognition (FER) has received a lot of
attention in the past few years. Most of the existing deep learning based FER
methods do not consider domain knowledge well and therefore fail to extract
representative features. In this work, we propose a novel FER framework, named
Facial Motion Prior Networks (FMPN). In particular, we introduce an additional
branch to generate a facial mask so as to focus on facial muscle moving
regions. To guide the facial mask learning, we propose to incorporate prior
domain knowledge by using the average differences between neutral faces and the
corresponding expressive faces as the training guidance. Extensive experiments
on three facial expression benchmark datasets demonstrate the effectiveness of
the proposed method compared with state-of-the-art approaches. Comment: VCIP 2019, Oral. Code is available at
https://github.com/donydchen/FMPN-FE
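The guidance signal described in this abstract, the average difference between neutral faces and their expressive counterparts, can be illustrated with a minimal sketch. It assumes pre-aligned grayscale images stored as nested lists; the function name `average_difference_mask` and the peak normalization are illustrative assumptions, not the authors' implementation.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def average_difference_mask(neutral: List[Image], expressive: List[Image]) -> Image:
    """Average |expressive - neutral| per pixel over all pairs, scaled to [0, 1].

    Hypothetical helper: assumes the i-th neutral image is paired with the
    i-th expressive image and that all images share one shape.
    """
    if not neutral or len(neutral) != len(expressive):
        raise ValueError("need equally many neutral and expressive images")
    rows, cols = len(neutral[0]), len(neutral[0][0])
    acc = [[0.0] * cols for _ in range(rows)]
    for n_img, e_img in zip(neutral, expressive):
        for r in range(rows):
            for c in range(cols):
                acc[r][c] += abs(e_img[r][c] - n_img[r][c])
    count = len(neutral)
    mean = [[v / count for v in row] for row in acc]
    peak = max(max(row) for row in mean)
    if peak == 0.0:
        return mean  # no motion anywhere; keep the all-zero mask
    # Normalization by the peak value is an assumption made for this sketch.
    return [[v / peak for v in row] for row in mean]
```

Pixels that change most between the neutral and expressive faces receive values near 1, which is the kind of region a learned facial mask would be encouraged to attend to.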
Evaluation of the POSSUM, p-POSSUM, o-POSSUM, and APACHE II scoring systems in predicting postoperative mortality and morbidity in gastric cancer patients
Background/Objective: Gastric cancer is the fourth most prevalent cancer worldwide. The ability to accurately predict surgery-related morbidity and mortality is critical in deciding both the timing of surgery and choice of surgical procedure. The aim of this study is to compare the POSSUM, p-POSSUM, o-POSSUM, and APACHE II scoring systems for predicting surgical morbidity and mortality in Chinese gastric cancer patients, as well as to create new scoring systems to achieve better prediction. Methods: Data from 612 gastric cancer patients undergoing gastrectomy between January 2007 and December 2011 were included in this study. The predictive abilities of the four scoring systems were compared by examining observed-to-expected (O/E) ratios, the receiver operating characteristic curve, Student t test, and χ2 test results. Results: The observed complication rate of 34% (n = 208) did not differ significantly from the rate of 36.6% (n = 208) predicted by the POSSUM scoring system (O/E ratio = 0.93). The observed mortality rate was 2.9% (n = 18). For predicting mortality, POSSUM had an O/E ratio of 0.34 as compared with p-POSSUM (O/E ratio = 0.91), o-POSSUM (O/E ratio = 1.26), and APACHE II (O/E ratio = 0.28). Conclusion: The POSSUM scoring system performed well with respect to predicting morbidity risk following gastric cancer resection. For predicting postoperative mortality, p-POSSUM and o-POSSUM exhibited superior performance relative to POSSUM and APACHE II.
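The observed-to-expected (O/E) ratio used in this study is the observed event rate divided by the rate a scoring system predicts. The short sketch below reproduces the reported morbidity ratio from the two rates quoted in the abstract; the function name `observed_to_expected` is an illustrative choice.

```python
def observed_to_expected(observed_rate: float, expected_rate: float) -> float:
    """Return the O/E ratio: observed event rate over predicted event rate."""
    if expected_rate <= 0:
        raise ValueError("expected_rate must be positive")
    return observed_rate / expected_rate


# Rates taken from the abstract: 34% observed vs. 36.6% predicted by POSSUM.
morbidity_oe = observed_to_expected(0.34, 0.366)
print(round(morbidity_oe, 2))  # 0.93
```

A ratio below 1 means the scoring system predicted more events than occurred, and a ratio above 1 means it predicted fewer. Read this way, the mortality ratios of 0.34 (POSSUM) and 0.28 (APACHE II) indicate overprediction, while 0.91 (p-POSSUM) is close to calibrated.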
Dynamic Tensor Decomposition via Neural Diffusion-Reaction Processes
Tensor decomposition is an important tool for multiway data analysis. In
practice, the data is often sparse yet associated with rich temporal
information. Existing methods, however, often under-use the time information
and ignore the structural knowledge within the sparsely observed tensor
entries. To overcome these limitations and to better capture the underlying
temporal structure, we propose Dynamic EMbedIngs fOr dynamic Tensor
dEcomposition (DEMOTE). We develop a neural diffusion-reaction process to
estimate dynamic embeddings for the entities in each tensor mode. Specifically,
based on the observed tensor entries, we build a multi-partite graph to encode
the correlation between the entities. We construct a graph diffusion process to
co-evolve the embedding trajectories of the correlated entities and use a
neural network to construct a reaction process for each individual entity. In
this way, our model can capture both the commonalities and personalities during
the evolution of the embeddings for different entities. We then use a neural
network to model the entry value as a nonlinear function of the embedding
trajectories. For model estimation, we combine ODE solvers to develop a
stochastic mini-batch learning algorithm. We propose a stratified sampling
method to balance the cost of processing each mini-batch so as to improve the
overall efficiency. We show the advantage of our approach in both a simulation
study and real-world applications. The code is available at
https://github.com/wzhut/Dynamic-Tensor-Decomposition-via-Neural-Diffusion-Reaction-Processes
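The diffusion-reaction idea in this abstract can be sketched in a few lines: a graph diffusion term pulls the embeddings of connected entities together, and a per-entity reaction term lets each entity evolve on its own. The sketch below is a toy under stated assumptions, not the DEMOTE implementation: it uses explicit Euler steps instead of a general ODE solver, an unweighted edge list instead of the multi-partite graph built from tensor entries, and `toy_reaction` is a hand-written placeholder for the neural network.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def euler_step(u: List[Vector], edges: Sequence[Tuple[int, int]],
               reaction: Callable[[Vector], Vector], dt: float) -> List[Vector]:
    """One explicit Euler step of du_i/dt = sum_j (u_j - u_i) + reaction(u_i)."""
    dim = len(u[0])
    du = [list(reaction(row)) for row in u]  # reaction term, one per entity
    for i, j in edges:                       # diffusion term along each edge
        for k in range(dim):
            flow = u[j][k] - u[i][k]
            du[i][k] += flow
            du[j][k] -= flow
    return [[u[i][k] + dt * du[i][k] for k in range(dim)] for i in range(len(u))]


def integrate(u0: List[Vector], edges: Sequence[Tuple[int, int]],
              reaction: Callable[[Vector], Vector],
              t_end: float, dt: float = 0.01) -> List[Vector]:
    """Integrate the embedding trajectories from time 0 to t_end."""
    u = [list(row) for row in u0]
    for _ in range(int(round(t_end / dt))):
        u = euler_step(u, edges, reaction, dt)
    return u


def no_reaction(row: Vector) -> Vector:
    """Pure diffusion: no individual dynamics."""
    return [0.0] * len(row)


def toy_reaction(row: Vector) -> Vector:
    """Placeholder for a learned network (assumption for this sketch)."""
    return [math.tanh(-x) for x in row]
```

With `no_reaction`, the embeddings on a connected graph converge to their common mean, which shows the shared ("commonality") part of the dynamics; swapping in a reaction function adds the entity-specific part.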
Analysis of Multivariate Scoring Functions for Automatic Unbiased Learning to Rank
Leveraging biased click data for optimizing learning to rank systems has been
a popular approach in information retrieval. Because click data is often noisy
and biased, a variety of methods have been proposed to construct unbiased
learning to rank (ULTR) algorithms for the learning of unbiased ranking models.
Among them, automatic unbiased learning to rank (AutoULTR) algorithms that
jointly learn user bias models (i.e., propensity models) with unbiased rankers
have received a lot of attention due to their superior performance and low
deployment cost in practice. Despite their differences in theories and
algorithm design, existing studies on ULTR usually use uni-variate ranking
functions to score each document or result independently. On the other hand,
recent advances in context-aware learning-to-rank models have shown that
multivariate scoring functions, which read multiple documents together and
predict their ranking scores jointly, are more powerful than uni-variate
ranking functions in ranking tasks with human-annotated relevance labels.
Whether such superior performance would hold in ULTR with noisy data, however,
is mostly unknown. In this paper, we investigate existing multivariate scoring
functions and AutoULTR algorithms in theory and prove that permutation
invariance is a crucial factor that determines whether a context-aware
learning-to-rank model can be applied to the existing AutoULTR framework. Our
experiments with synthetic clicks on two large-scale benchmark datasets show
that AutoULTR models with permutation-invariant multivariate scoring functions
significantly outperform those with uni-variate scoring functions and
permutation-variant multivariate scoring functions. Comment: 4 pages, 2 figures. Accepted; to appear in the
Proceedings of the 29th ACM International Conference on Information and
Knowledge Management (CIKM '20), October 19-23, 202
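The permutation-invariance property discussed in this abstract can be checked mechanically: a multivariate scorer has the property if each document receives the same score no matter where it appears in the input list. The sketch below uses two toy scorers invented for illustration (neither comes from the paper): one that conditions on the mean feature vector of the list, and one that leaks the input position into the score.

```python
import math
from typing import Callable, List, Sequence

Docs = Sequence[Sequence[float]]


def context_aware_scores(docs: Docs, w: Sequence[float]) -> List[float]:
    """Score each document using its features and the list-wide mean features."""
    n, dim = len(docs), len(docs[0])
    context = [sum(d[j] for d in docs) / n for j in range(dim)]
    return [sum(w[j] * d[j] + d[j] * context[j] for j in range(dim)) for d in docs]


def position_dependent_scores(docs: Docs, w: Sequence[float]) -> List[float]:
    """Toy scorer that depends on input position, so it is permutation-variant."""
    return [sum(w[j] * d[j] for j in range(len(w))) + 0.1 * pos
            for pos, d in enumerate(docs)]


def is_permutation_invariant(score_fn: Callable[[Docs, Sequence[float]], List[float]],
                             docs: Docs, w: Sequence[float],
                             perm: Sequence[int]) -> bool:
    """Check that shuffling the input only shuffles the scores the same way."""
    base = score_fn(docs, w)
    shuffled = score_fn([docs[i] for i in perm], w)
    return all(math.isclose(shuffled[k], base[perm[k]], abs_tol=1e-12)
               for k in range(len(perm)))
```

Checking a single permutation can only refute the property, not prove it; a fuller test would sample many permutations or argue from the structure of the scorer.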
Parton Labeling without Matching: Unveiling Emergent Labelling Capabilities in Regression Models
Parton labeling methods are widely used when reconstructing collider events
with top quarks or other massive particles. State-of-the-art techniques are
based on machine learning and require training data with events that have been
matched using simulations with truth information. In nature, there is no unique
matching between partons and final state objects due to the properties of the
strong force and due to acceptance effects. We propose a new approach to parton
labeling that circumvents these challenges by recycling regression models. The
final state objects that are most relevant for a regression model to predict
the properties of a particular top quark are assigned to said parent particle
without having any parton-matched training data. This approach is demonstrated
using simulated events with top quarks and outperforms the widely-used
method. Comment: 6 pages, 4 figures
A Reinforcement Learning Framework for Time-Dependent Causal Effects Evaluation in A/B Testing
A/B testing, or online experimentation, is a standard business strategy to compare
a new product with an old one in pharmaceutical, technological, and traditional
industries. Major challenges arise in online experiments where there is only
one unit that receives a sequence of treatments over time. In those
experiments, the treatment at a given time impacts the current outcome as well as
future outcomes. The aim of this paper is to introduce a reinforcement learning
framework for carrying out A/B testing while characterizing the long-term
treatment effects. Our proposed testing procedure allows for sequential
monitoring and online updating, so it is generally applicable to a variety of
treatment designs in different industries. In addition, we systematically
investigate the theoretical properties (e.g., asymptotic distribution and
power) of our testing procedure. Finally, we apply our framework to both
synthetic datasets and a real-world data example obtained from a ride-sharing
company to illustrate its usefulness.
Fine-tuning Large Language Models for Domain-specific Machine Translation
Large language models (LLMs) have made significant progress in machine
translation (MT). However, their potential in domain-specific MT remains
under-explored. Current LLM-based MT systems still face several challenges.
First, for LLMs with in-context learning, their effectiveness is highly
sensitive to input translation examples, and processing them can increase
inference costs. They often require extra post-processing due to
over-generation. Second, LLMs with fine-tuning on domain-specific data often
require high training costs for domain adaptation, and may weaken the zero-shot
MT capabilities of LLMs due to over-specialization. The aforementioned methods
can struggle to translate rare words in domain transfer scenarios. To address
these challenges, this paper proposes a prompt-oriented fine-tuning method,
denoted as LlamaIT, to effectively and efficiently fine-tune a general-purpose
LLM for domain-specific MT tasks. First, we construct a task-specific
mix-domain dataset, which is then used to fine-tune the LLM with LoRA. This can
eliminate the need for input translation examples, post-processing, or
over-specialization. By zero-shot prompting with instructions, we adapt the MT
tasks to the target domain at inference time. To further elicit the MT
capability for rare words, we construct new prompts by incorporating
domain-specific bilingual vocabulary. We also conduct extensive experiments on
both publicly available and self-constructed datasets. The results show that
our LlamaIT can significantly enhance the domain-specific MT capabilities of
the LLM while preserving its zero-shot MT capabilities. Comment: 9 pages, 6 figures, 6 tables
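The idea of adding domain-specific bilingual vocabulary to a zero-shot instruction prompt can be sketched as follows. The prompt wording, the function name `build_translation_prompt`, and the substring-based term matching are all hypothetical choices for illustration; the abstract does not specify the template LlamaIT uses.

```python
from typing import Dict


def build_translation_prompt(source: str, src_lang: str, tgt_lang: str,
                             glossary: Dict[str, str]) -> str:
    """Build an instruction prompt that lists only glossary terms found in the source.

    Hypothetical template: the real prompt format is not given in the abstract.
    """
    lowered = source.lower()
    hits = {term: trans for term, trans in glossary.items() if term.lower() in lowered}
    lines = [f"Translate the following {src_lang} text into {tgt_lang}."]
    if hits:
        lines.append("Use these term translations:")
        lines.extend(f"- {term} => {trans}" for term, trans in hits.items())
    lines.append(f"{src_lang}: {source}")
    lines.append(f"{tgt_lang}:")
    return "\n".join(lines)
```

Filtering the glossary to terms that occur in the input keeps the prompt short, which matters because longer prompts raise inference cost.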
Dynamic causal effects evaluation in A/B testing with a reinforcement learning framework
A/B testing, or online experimentation, is a standard business strategy to compare a new product with an old one in pharmaceutical, technological, and traditional industries. Major challenges arise in online experiments of two-sided marketplace platforms (e.g., Uber) where there is only one unit that receives a sequence of treatments over time. In those experiments, the treatment at a given time impacts the current outcome as well as future outcomes. The aim of this article is to introduce a reinforcement learning framework for carrying out A/B testing in these experiments while characterizing the long-term treatment effects. Our proposed testing procedure allows for sequential monitoring and online updating. It is generally applicable to a variety of treatment designs in different industries. In addition, we systematically investigate the theoretical properties (e.g., size and power) of our testing procedure. Finally, we apply our framework to both simulated data and a real-world data example obtained from a technology company to illustrate its advantage over the current practice. A Python implementation of our test is available at https://github.com/callmespring/CausalRL. Supplementary materials for this article are available online.
Manipulating refractive index, homogeneity and spectroscopy of Yb-doped silica-core glass towards high-power large mode area photonic crystal fiber lasers
Output power scaling of single-mode large mode area (LMA) photonic crystal fiber (PCF) amplifiers urgently requires Yb³⁺-doped silica glasses that combine a low refractive index with high optical homogeneity. In this paper, we report on a promising alternative, Yb³⁺/Al³⁺/F⁻/P⁵⁺-co-doped silica core glass (YAFP), which is prepared by a modified sol-gel method developed by our group and is highly suitable for fabricating high-power LMA PCF amplifiers. Controlling the doping combination of Al³⁺/F⁻/P⁵⁺ in Yb³⁺-doped silica glass not only ensures a low refractive index (RI) but also maintains the excellent optical homogeneity and spectroscopic properties of Yb³⁺. The spectroscopic properties of the Yb³⁺ ions are not degraded by co-doping with F⁻ and P⁵⁺ in YAFP glass compared with those of Yb³⁺/Al³⁺ co-doped silica glass. A large (⌀5 mm × 90 mm) YAFP silica-core glass rod with a low average RI difference of 2.6 × 10⁻⁴ (with respect to pure silica glass) and low radial and axial RI fluctuations of ~2 × 10⁻⁴ was prepared. An LMA PCF with a 50 μm core diameter was obtained by the stack-capillary-draw technique using YAFP core glass. Its core NA is 0.027. An average amplified power of 97 W peaking at 1030 nm and a light-to-light efficiency of 54% are achieved from a 6.5 m long PCF in the pulse amplification laser experiment. Meanwhile, quasi-single-mode transmission is obtained with a laser beam quality factor M² of 1.4.
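The reported core NA of 0.027 is consistent with the reported average RI difference of 2.6 × 10⁻⁴ under the step-index relation NA = sqrt(n_core² − n_clad²). The sketch below checks this arithmetic. It assumes a pure-silica cladding index of about 1.45 near 1030 nm, a value that is not given in the abstract, and it ignores the air-hole microstructure of a real PCF cladding.

```python
import math


def numerical_aperture(n_core: float, n_clad: float) -> float:
    """Step-index numerical aperture: sqrt(n_core^2 - n_clad^2)."""
    return math.sqrt(n_core ** 2 - n_clad ** 2)


N_SILICA = 1.45   # assumed index of pure silica near 1030 nm (not from the abstract)
DELTA_N = 2.6e-4  # average RI difference reported in the abstract

na = numerical_aperture(N_SILICA + DELTA_N, N_SILICA)
print(f"{na:.4f}")
```

Because the index difference is small, NA is approximately sqrt(2 · n · Δn), so the result depends only weakly on the assumed silica index.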