Feedback-Based Self-Learning in Large-Scale Conversational AI Agents
Today, most large-scale conversational AI agents (e.g. Alexa, Siri, or Google
Assistant) are built using manually annotated data to train the different
components of the system. Typically, the accuracy of the ML models in these
components is improved by manually transcribing and annotating data. As the
scope of these systems increases to cover more scenarios and domains, manual
annotation to improve the accuracy of these components becomes prohibitively
costly and time consuming. In this paper, we propose a system that leverages
user-system interaction feedback signals to automate learning without any
manual annotation. Users here tend to modify a previous query in hopes of
fixing an error in the previous turn to get the right results. These
reformulations are often preceded by defective experiences caused by
errors in ASR, NLU, ER or the application. In some cases, users may not
properly formulate their requests (e.g. providing a partial title of a song), but
gleaning across a wider pool of users and sessions reveals the underlying
recurrent patterns. Our proposed self-learning system automatically detects the
errors, generates reformulations and deploys fixes to the runtime system to
correct different types of errors occurring in different components of the
system. In particular, we propose leveraging an absorbing Markov Chain model as
a collaborative filtering mechanism in a novel attempt to mine these patterns.
We show that our approach is highly scalable, and able to learn reformulations
that reduce Alexa-user errors by pooling anonymized data across millions of
customers. The proposed self-learning system achieves a win/loss ratio of 11.8
and effectively reduces the defect rate by more than 30% on utterance level
reformulations in our production A/B tests. To the best of our knowledge, this
is the first self-learning large-scale conversational AI system in production.
Comment: 8 pages, 2 figures
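The absorbing Markov chain idea mentioned in this abstract can be illustrated with a small sketch. This is not the authors' system: the states, the transition values, and the function name `absorption_probabilities` are hypothetical, chosen only to show how absorption probabilities rank candidate reformulations. Transient states stand for phrasings of a request, and the two absorbing states stand for a successful and a failed outcome.

```python
def absorption_probabilities(Q, R, iterations=200):
    """Approximate B = (I - Q)^-1 R by iterating B <- R + Q B.

    Q[i][j]: probability of moving from transient state i to transient state j.
    R[i][k]: probability of moving from transient state i to absorbing state k.
    B[i][k]: probability that a walk started in i ends in absorbing state k.
    """
    n = len(Q)
    m = len(R[0])
    B = [[0.0] * m for _ in range(n)]
    for _ in range(iterations):
        B = [
            [R[i][k] + sum(Q[i][j] * B[j][k] for j in range(n)) for k in range(m)]
            for i in range(n)
        ]
    return B


# Hypothetical toy chain (assumed values, not data from the paper).
# Transient states: three phrasings of the same request.
# Absorbing states: column 0 = "success", column 1 = "failure".
Q = [
    [0.0, 0.6, 0.0],  # phrasing 0 is often rephrased as phrasing 1
    [0.0, 0.0, 0.2],  # phrasing 1 is sometimes rephrased as phrasing 2
    [0.0, 0.0, 0.0],  # phrasing 2 is never rephrased
]
R = [
    [0.1, 0.3],
    [0.7, 0.1],
    [0.5, 0.5],
]

B = absorption_probabilities(Q, R)

# Pick the phrasing most likely to end in success as the rewrite target.
best_phrasing = max(range(len(B)), key=lambda i: B[i][0])
```

In this toy setup each row of `B` sums to one, and the phrasing with the highest success probability is the natural candidate to substitute for the others.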
Knowledge Distillation from Internal Representations
Knowledge distillation is typically conducted by training a small model (the
student) to mimic a large and cumbersome model (the teacher). The idea is to
compress the knowledge from the teacher by using its output probabilities as
soft-labels to optimize the student. However, when the teacher is considerably
large, there is no guarantee that the internal knowledge of the teacher will be
transferred into the student; even if the student closely matches the
soft-labels, its internal representations may be considerably different. This
internal mismatch can undermine the generalization capabilities originally
intended to be transferred from the teacher to the student. In this paper, we
propose to distill the internal representations of a large model such as BERT
into a simplified version of it. We formulate two ways to distill such
representations and various algorithms to conduct the distillation. We
experiment with datasets from the GLUE benchmark and consistently show that
adding knowledge distillation from internal representations is a more powerful
method than only using soft-label distillation.
Comment: To appear in AAAI-202
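The contrast this abstract draws between soft-label distillation and distillation of internal representations can be sketched as two loss terms. The abstract does not specify the formulations, so everything here is an assumption for illustration: the internal term is taken to be a cosine distance between hidden vectors, and the names `soft_label_loss`, `internal_loss`, `combined_loss`, and the `weight` parameter are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_label_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy of the student's distribution against the teacher's soft labels."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher, student))


def internal_loss(student_hidden, teacher_hidden):
    """Cosine distance between hidden vectors (an assumed choice of internal loss)."""
    dot = sum(s * t for s, t in zip(student_hidden, teacher_hidden))
    norm_s = math.sqrt(sum(s * s for s in student_hidden))
    norm_t = math.sqrt(sum(t * t for t in teacher_hidden))
    return 1.0 - dot / (norm_s * norm_t)


def combined_loss(student_logits, teacher_logits,
                  student_hidden, teacher_hidden, weight=1.0):
    """Soft-label term plus a weighted internal-representation term."""
    return (soft_label_loss(student_logits, teacher_logits)
            + weight * internal_loss(student_hidden, teacher_hidden))


# Hypothetical example: the student matches the teacher's outputs exactly,
# yet its hidden vector points in a different direction.
teacher_logits = [2.0, 0.5, -1.0]
student_logits = [2.0, 0.5, -1.0]
aligned = combined_loss(student_logits, teacher_logits, [1.0, 0.0], [1.0, 0.0])
misaligned = combined_loss(student_logits, teacher_logits, [0.0, 1.0], [1.0, 0.0])
```

With identical logits the soft-label term cannot tell the two students apart, and only the internal term penalizes the mismatched hidden vector. That is the situation the abstract describes as an internal mismatch.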
Impact of geographic diversity on citation of collaborative research
Diversity in human capital is widely seen as critical to creating holistic and high-quality research, especially in areas that engage with diverse cultures, environments, and challenges. Quantification of diverse academic collaborations and their effect on research quality is lacking, especially at international scale and across different domains. Here, we present the first effort to measure the impact of geographic diversity in coauthorships on the citation of their papers across different academic domains. Our results unequivocally show that geographic coauthor diversity improves paper citation, but very long-distance collaborations have variable impact. We also discover “well-trodden” collaboration circles that yield much less impact than other collaborations over similar travel distances. These relationships are observed to exist across different subject areas, but with varying strengths. These findings can help academics identify new opportunities from a diversity perspective, as well as inform funders on areas that require additional mobility support.
The Function of MoGlk1 in Integration of Glucose and Ammonium Utilization in Magnaporthe oryzae
Hexokinases are conserved proteins functioning in glucose sensing and signaling. The rice blast fungus Magnaporthe oryzae contains several hexokinases, including MoHxk1 (hexokinase) and MoGlk1 (glucokinase), encoded by the MoHXK1 and MoGLK1 genes, respectively. The heterologous expression of MoGlk1 and MoHxk1 in Saccharomyces cerevisiae confirmed their conserved functions. Disruption of MoHXK1 resulted in growth reduction in medium containing fructose as the sole carbon source, whereas disruption of MoGLK1 did not cause a similar defect. However, the ΔMoglk1 mutant displayed decreased proton extrusion and a lower biomass in the presence of ammonium, suggesting a decline in the utilization of ammonium. Additionally, the MoGLK1 allele lacking catalytic activity restored growth to the ΔMoglk1 mutant. Moreover, the expression of MoPMA1, encoding a plasma membrane H+-ATPase, decreased in the ΔMoglk1 mutant that can be suppressed by glucose and G-6-P. Thus, MoGlk1, but not MoHxk1, regulates ammonium utilization through a mechanism that is independent of its catalytic activity.
Clinicopathological Features and Prognosis of Papillary Thyroid Microcarcinoma for Surgery and Relationships with the BRAF V600E Mutational Status and Expression of Angiogenic Factors
Objective: To investigate the clinicopathological characteristics of papillary thyroid microcarcinoma (PTMC) for surgery by comparing the difference between PTMC and larger papillary thyroid carcinoma (LPTC).
Methods: We analyzed the differences in the clinicopathological characteristics, prognosis, B-type RAF kinase (BRAF) V600E mutational status and expression of angiogenic factors, including pigment epithelium-derived factor (PEDF), vascular endothelial growth factor (VEGF), and hypoxia-inducible factor 1 alpha subunit (HIF-1α), between PTMC and LPTC by retrospectively reviewing the records of 251 patients with papillary thyroid carcinoma: 169 with PTMC and 82 with LPTC (diameter >1 cm).
Results: There were no significant differences in gender, age, multifocality, Hashimoto’s thyroiditis, TNM stage, PEDF protein expression, rate of recurrence, or mean follow-up duration between patients with PTMC and patients with LPTC. The prevalence of extrathyroidal invasion (EI), lymph node metastasis (LNM), and BRAF mutation in patients with PTMC was significantly lower than in patients with LPTC. In addition, in PTMC patients with EI and/or LNM and/or positive BRAF (high-risk PTMC patients), the prevalence of extrathyroidal invasion, Hashimoto's disease, lymph node metastasis, tumor TNM stage, PEDF positive protein expression, the rate of recurrent disease, and the mRNA expression of anti-angiogenic factors was almost as high as in patients with LPTC, with no significant difference between the two groups.
Conclusions: Extrathyroidal invasion, lymph node metastasis, and BRAF V600E mutation were high-risk factors for PTMC. PTMC should be considered for the same treatment strategy as LPTC when any of these factors is found. In particular, PTMC with BRAF V600E gene mutations needed earlier surgical treatment. In addition, the high cell subtype of PTMC with BRAF V600E gene mutation is recommended for total thyroidectomy in primary surgery to reduce the risk of recurrence.