286 research outputs found

    Facial Motion Prior Networks for Facial Expression Recognition

    Deep learning based facial expression recognition (FER) has received a lot of attention in the past few years. Most existing deep learning based FER methods do not make good use of domain knowledge and therefore fail to extract representative features. In this work, we propose a novel FER framework, named Facial Motion Prior Networks (FMPN). In particular, we introduce an additional branch that generates a facial mask so as to focus on the regions where facial muscles move. To guide the facial mask learning, we propose to incorporate prior domain knowledge by using the average differences between neutral faces and the corresponding expressive faces as the training guidance. Extensive experiments on three facial expression benchmark datasets demonstrate the effectiveness of the proposed method compared with state-of-the-art approaches. Comment: VCIP 2019, Oral. Code is available at https://github.com/donydchen/FMPN-FE
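The "average differences between neutral faces and the corresponding expressive faces" can be illustrated with a small sketch. This is not the FMPN implementation: the function name `motion_prior`, the use of absolute differences, and the scaling to [0, 1] are all assumptions made for illustration, and the images are tiny grayscale grids stored as nested lists.

```python
def motion_prior(neutral_faces, expressive_faces):
    """Average absolute pixel difference over paired images, scaled to [0, 1].

    Hypothetical helper: absolute differences and peak scaling are
    illustrative assumptions, not details taken from the abstract.
    """
    if not neutral_faces or len(neutral_faces) != len(expressive_faces):
        raise ValueError("need the same non-zero number of neutral and expressive faces")

    height = len(neutral_faces[0])
    width = len(neutral_faces[0][0])
    total = [[0.0] * width for _ in range(height)]

    # Accumulate |expressive - neutral| for every pixel across all pairs.
    for neutral, expressive in zip(neutral_faces, expressive_faces):
        for r in range(height):
            for c in range(width):
                total[r][c] += abs(expressive[r][c] - neutral[r][c])

    pairs = len(neutral_faces)
    mean = [[value / pairs for value in row] for row in total]

    # Scale so the most strongly moving pixel has weight 1.
    peak = max(max(row) for row in mean)
    if peak == 0:
        return mean
    return [[value / peak for value in row] for row in mean]


# Two toy 2x2 image pairs; the top-right pixel changes the most.
neutral = [[[10, 10], [10, 10]], [[20, 20], [20, 20]]]
expressive = [[[10, 30], [10, 10]], [[20, 60], [20, 30]]]
prior = motion_prior(neutral, expressive)
print(prior)
```

In this toy input the top-right pixel receives the largest weight, which is the kind of signal a mask branch could be trained to reproduce.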

    Evaluation of the POSSUM, p-POSSUM, o-POSSUM, and APACHE II scoring systems in predicting postoperative mortality and morbidity in gastric cancer patients

    Summary. Background/Objective: Gastric cancer is the fourth most prevalent cancer worldwide. The ability to accurately predict surgery-related morbidity and mortality is critical in deciding both the timing of surgery and the choice of surgical procedure. The aim of this study is to compare the POSSUM, p-POSSUM, o-POSSUM, and APACHE II scoring systems for predicting surgical morbidity and mortality in Chinese gastric cancer patients, as well as to create new scoring systems to achieve better prediction. Methods: Data from 612 gastric cancer patients undergoing gastrectomy between January 2007 and December 2011 were included in this study. The predictive abilities of the four scoring systems were compared by examining observed-to-expected (O/E) ratios, the receiver operating characteristic curve, Student t test, and χ² test results. Results: The observed complication rate of 34% (n = 208) did not differ significantly from the rate of 36.6% (n = 224) predicted by the POSSUM scoring system (O/E ratio = 0.93). The observed mortality rate was 2.9% (n = 18). For predicting mortality, POSSUM had an O/E ratio of 0.34 as compared with p-POSSUM (O/E ratio = 0.91), o-POSSUM (O/E ratio = 1.26), and APACHE II (O/E ratio = 0.28). Conclusion: The POSSUM scoring system performed well with respect to predicting morbidity risk following gastric cancer resection. For predicting postoperative mortality, p-POSSUM and o-POSSUM exhibited superior performance relative to POSSUM and APACHE II.
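The morbidity O/E ratio quoted above can be checked from the figures in the abstract: 612 patients, 208 observed complications, and a POSSUM-predicted rate of 36.6%. The sketch below is only that arithmetic; the helper name `observed_to_expected` is hypothetical.

```python
def observed_to_expected(observed_events, expected_events):
    """Ratio of observed to model-predicted event counts."""
    if expected_events <= 0:
        raise ValueError("expected_events must be positive")
    return observed_events / expected_events


patients = 612          # cohort size from the abstract
observed = 208          # observed complications (34%)
predicted_rate = 0.366  # complication rate predicted by POSSUM

expected = predicted_rate * patients  # about 224 predicted complications
ratio = observed_to_expected(observed, expected)
print(round(expected), round(ratio, 2))
```

A ratio below 1 means the scoring system predicted more events than were observed, and a ratio above 1 means it predicted fewer.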

    Dynamic Tensor Decomposition via Neural Diffusion-Reaction Processes

    Tensor decomposition is an important tool for multiway data analysis. In practice, the data is often sparse yet associated with rich temporal information. Existing methods, however, often under-use the time information and ignore the structural knowledge within the sparsely observed tensor entries. To overcome these limitations and to better capture the underlying temporal structure, we propose Dynamic EMbedIngs fOr dynamic Tensor dEcomposition (DEMOTE). We develop a neural diffusion-reaction process to estimate dynamic embeddings for the entities in each tensor mode. Specifically, based on the observed tensor entries, we build a multi-partite graph to encode the correlation between the entities. We construct a graph diffusion process to co-evolve the embedding trajectories of the correlated entities and use a neural network to construct a reaction process for each individual entity. In this way, our model can capture both the commonalities and personalities during the evolution of the embeddings for different entities. We then use a neural network to model the entry value as a nonlinear function of the embedding trajectories. For model estimation, we combine ODE solvers with a stochastic mini-batch learning algorithm. We propose a stratified sampling method to balance the cost of processing each mini-batch so as to improve the overall efficiency. We show the advantage of our approach in both a simulation study and real-world applications. The code is available at https://github.com/wzhut/Dynamic-Tensor-Decomposition-via-Neural-Diffusion-Reaction-Processes
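A generic diffusion-reaction update on a graph can be sketched as follows. This is not the DEMOTE model: embeddings are scalars instead of vectors, the reaction term is a fixed toy function standing in for a neural network, and a plain explicit Euler step replaces the ODE solvers mentioned in the abstract. The names `diffusion_reaction_step` and `toy_reaction` are hypothetical.

```python
import math


def diffusion_reaction_step(embeddings, edges, dt, reaction):
    """One explicit Euler step of du_i/dt = sum_j w_ij (u_j - u_i) + reaction(u_i).

    embeddings: list of scalar embeddings, one per entity
    edges: list of (i, j, weight) tuples describing an undirected graph
    """
    # Reaction term: acts on each entity independently.
    drift = [reaction(u) for u in embeddings]

    # Diffusion term: pulls connected entities toward each other.
    for i, j, weight in edges:
        drift[i] += weight * (embeddings[j] - embeddings[i])
        drift[j] += weight * (embeddings[i] - embeddings[j])

    return [u + dt * d for u, d in zip(embeddings, drift)]


def toy_reaction(u):
    """Placeholder for a learned per-entity reaction network."""
    return -0.1 * math.tanh(u)


# Pure diffusion between two connected entities: values move toward each other.
diffused = diffusion_reaction_step([1.0, 0.0], [(0, 1, 1.0)], 0.1, lambda u: 0.0)
print(diffused)

# Diffusion plus the toy reaction on a three-entity chain.
state = [1.0, 0.0, -1.0]
for _ in range(10):
    state = diffusion_reaction_step(state, [(0, 1, 0.5), (1, 2, 0.5)], 0.1, toy_reaction)
print(state)
```

The diffusion term captures what connected entities share, while the reaction term lets each entity evolve on its own, mirroring the "commonalities and personalities" split described above.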

    Analysis of Multivariate Scoring Functions for Automatic Unbiased Learning to Rank

    Leveraging biased click data for optimizing learning to rank systems has been a popular approach in information retrieval. Because click data is often noisy and biased, a variety of methods have been proposed to construct unbiased learning to rank (ULTR) algorithms for the learning of unbiased ranking models. Among them, automatic unbiased learning to rank (AutoULTR) algorithms that jointly learn user bias models (i.e., propensity models) with unbiased rankers have received a lot of attention due to their superior performance and low deployment cost in practice. Despite their differences in theories and algorithm design, existing studies on ULTR usually use uni-variate ranking functions to score each document or result independently. On the other hand, recent advances in context-aware learning-to-rank models have shown that multivariate scoring functions, which read multiple documents together and predict their ranking scores jointly, are more powerful than uni-variate ranking functions in ranking tasks with human-annotated relevance labels. Whether such superior performance would hold in ULTR with noisy data, however, is mostly unknown. In this paper, we investigate existing multivariate scoring functions and AutoULTR algorithms in theory and prove that permutation invariance is a crucial factor that determines whether a context-aware learning-to-rank model can be applied to the existing AutoULTR framework. Our experiments with synthetic clicks on two large-scale benchmark datasets show that AutoULTR models with permutation-invariant multivariate scoring functions significantly outperform those with uni-variate scoring functions and permutation-variant multivariate scoring functions. Comment: 4 pages, 2 figures. It has already been accepted and will appear in Proceedings of the 29th ACM International Conference on Information and Knowledge Management (CIKM '20), October 19--23, 202
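One common reading of permutation invariance for a multivariate scoring function is that each document's score does not depend on the order in which the list is presented. The sketch below contrasts two toy scorers under that reading; both scorers and the checker `is_order_independent` are hypothetical illustrations, not the models studied in the paper.

```python
def set_aware_scores(features):
    """Context-aware scorer: each score depends on the list only through its mean."""
    mean = sum(features) / len(features)
    return [x - mean for x in features]


def order_aware_scores(features):
    """Context-aware scorer that also depends on the preceding document."""
    scores = []
    previous = 0.0
    for x in features:
        scores.append(x + 0.5 * previous)
        previous = x
    return scores


def is_order_independent(score_fn, features, permutation):
    """Check that every document keeps its score when the input list is permuted."""
    original = score_fn(features)
    shuffled = score_fn([features[i] for i in permutation])
    # shuffled[k] is the score of the document originally at index permutation[k].
    return all(
        abs(shuffled[k] - original[i]) < 1e-12
        for k, i in enumerate(permutation)
    )


features = [3.0, 1.0, 2.0]
permutation = [2, 0, 1]
set_result = is_order_independent(set_aware_scores, features, permutation)
order_result = is_order_independent(order_aware_scores, features, permutation)
print(set_result, order_result)
```

The first scorer still reads the whole list, so it is multivariate, yet its output is unaffected by the presentation order; the second leaks the input order into the scores.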

    Parton Labeling without Matching: Unveiling Emergent Labelling Capabilities in Regression Models

    Parton labeling methods are widely used when reconstructing collider events with top quarks or other massive particles. State-of-the-art techniques are based on machine learning and require training data with events that have been matched using simulations with truth information. In nature, there is no unique matching between partons and final state objects due to the properties of the strong force and due to acceptance effects. We propose a new approach to parton labeling that circumvents these challenges by recycling regression models. The final state objects that are most relevant for a regression model to predict the properties of a particular top quark are assigned to said parent particle without having any parton-matched training data. This approach is demonstrated using simulated events with top quarks and outperforms the widely-used $\chi^2$ method. Comment: 6 pages, 4 figures

    A Reinforcement Learning Framework for Time-Dependent Causal Effects Evaluation in A/B Testing

    A/B testing, or online experimentation, is a standard business strategy to compare a new product with an old one in pharmaceutical, technological, and traditional industries. Major challenges arise in online experiments where there is only one unit that receives a sequence of treatments over time. In those experiments, the treatment at a given time impacts the current outcome as well as future outcomes. The aim of this paper is to introduce a reinforcement learning framework for carrying out A/B testing while characterizing the long-term treatment effects. Our proposed testing procedure allows for sequential monitoring and online updating, so it is generally applicable to a variety of treatment designs in different industries. In addition, we systematically investigate the theoretical properties (e.g., asymptotic distribution and power) of our testing procedure. Finally, we apply our framework to both synthetic datasets and a real-world data example obtained from a ride-sharing company to illustrate its usefulness.

    Fine-tuning Large Language Models for Domain-specific Machine Translation

    Large language models (LLMs) have made significant progress in machine translation (MT). However, their potential in domain-specific MT remains under-explored. Current LLM-based MT systems still face several challenges. First, for LLMs with in-context learning, their effectiveness is highly sensitive to input translation examples, and processing them can increase inference costs. They often require extra post-processing due to over-generation. Second, LLMs with fine-tuning on domain-specific data often require high training costs for domain adaptation, and may weaken the zero-shot MT capabilities of LLMs due to over-specialization. The aforementioned methods can struggle to translate rare words in domain transfer scenarios. To address these challenges, this paper proposes a prompt-oriented fine-tuning method, denoted as LlamaIT, to effectively and efficiently fine-tune a general-purpose LLM for domain-specific MT tasks. First, we construct a task-specific mix-domain dataset, which is then used to fine-tune the LLM with LoRA. This can eliminate the need for input translation examples and post-processing, and avoids over-specialization. By zero-shot prompting with instructions, we adapt the MT tasks to the target domain at inference time. To further elicit the MT capability for rare words, we construct new prompts by incorporating domain-specific bilingual vocabulary. We also conduct extensive experiments on both publicly available and self-constructed datasets. The results show that our LlamaIT can significantly enhance the domain-specific MT capabilities of the LLM while preserving its zero-shot MT capabilities. Comment: 9 pages, 6 figures, 6 tables

    Dynamic causal effects evaluation in A/B testing with a reinforcement learning framework

    A/B testing, or online experimentation, is a standard business strategy to compare a new product with an old one in pharmaceutical, technological, and traditional industries. Major challenges arise in online experiments of two-sided marketplace platforms (e.g., Uber) where there is only one unit that receives a sequence of treatments over time. In those experiments, the treatment at a given time impacts the current outcome as well as future outcomes. The aim of this article is to introduce a reinforcement learning framework for carrying out A/B testing in these experiments while characterizing the long-term treatment effects. Our proposed testing procedure allows for sequential monitoring and online updating. It is generally applicable to a variety of treatment designs in different industries. In addition, we systematically investigate the theoretical properties (e.g., size and power) of our testing procedure. Finally, we apply our framework to both simulated data and a real-world data example obtained from a technological company to illustrate its advantage over the current practice. A Python implementation of our test is available at https://github.com/callmespring/CausalRL. Supplementary materials for this article are available online.

    Manipulating refractive index, homogeneity and spectroscopy of Yb³⁺-doped silica-core glass towards high-power large mode area photonic crystal fiber lasers

    Output power scaling of single-mode large mode area (LMA) photonic crystal fiber (PCF) amplifiers requires Yb³⁺-doped silica glasses with a low refractive index and high optical homogeneity. In this paper, we report on a promising alternative, Yb³⁺/Al³⁺/F⁻/P⁵⁺-co-doped silica core glass (YAFP), which is prepared by a modified sol-gel method developed by our group and is highly suitable for fabricating high-power LMA PCF amplifiers. Controlling the doping combination of Al³⁺/F⁻/P⁵⁺ in Yb³⁺-doped silica glass not only ensures a low refractive index (RI) but also maintains the excellent optical homogeneity and spectroscopic properties of Yb³⁺. The spectroscopic properties of the Yb³⁺ ions are not degraded by the co-doping of F⁻ and P⁵⁺ in YAFP glass compared with those in Yb³⁺/Al³⁺ co-doped silica glass. A large-size (⌀5 mm × 90 mm) YAFP silica-core glass rod with a low average RI difference of 2.6 × 10⁻⁴ (with respect to pure silica glass), and low radial and axial RI fluctuations of ~2 × 10⁻⁴, was prepared. An LMA PCF with a 50 μm core diameter was obtained by stack-capillary-draw techniques using YAFP core glass. Its core NA is 0.027. An average amplified power of 97 W peaking at 1030 nm and a light-to-light efficiency of 54% are achieved from a 6.5 m long PCF in the pulse amplification laser experiment. Meanwhile, quasi-single-mode transmission is obtained with a laser beam quality factor M² of 1.4.
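As a rough consistency check, the quoted core NA can be compared with the step-index formula NA = sqrt(n_core² − n_clad²), using the reported RI difference of 2.6 × 10⁻⁴. The silica index of 1.45 near 1030 nm is an assumed round value, and a real PCF cladding has an effective index set by its air-hole structure, so this is only an approximation and not the authors' calculation. The helper name `step_index_na` is hypothetical.

```python
import math


def step_index_na(n_cladding, delta_n):
    """Numerical aperture of a step-index core raised by delta_n above the cladding."""
    n_core = n_cladding + delta_n
    return math.sqrt(n_core ** 2 - n_cladding ** 2)


n_silica = 1.45   # assumed index of pure silica near 1030 nm
delta_n = 2.6e-4  # average RI difference reported in the abstract

na = step_index_na(n_silica, delta_n)
print(round(na, 3))
```

Under these assumptions the estimate rounds to the same value as the NA stated in the abstract.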