COTA: Improving the Speed and Accuracy of Customer Support through Ranking and Deep Networks
For a company looking to provide delightful user experiences, it is of
paramount importance to resolve customer issues quickly. This paper proposes
COTA, a system that improves the speed and reliability of customer support for
end users through automated ticket classification and answer selection for
support representatives. Two machine learning and natural language processing
representatives. Two machine learning and natural language processing
techniques are demonstrated: one relying on feature engineering (COTA v1) and
the other exploiting raw signals through deep learning architectures (COTA v2).
COTA v1 employs a new approach that converts the multi-class classification
task into a ranking problem, demonstrating significantly better performance
when there are thousands of classes. For COTA v2, we propose the
Encoder-Combiner-Decoder, a
novel deep learning architecture that allows for heterogeneous input and output
feature types and injection of prior knowledge through network architecture
choices. This paper compares these models and their variants on the tasks of
ticket classification and answer selection, showing that COTA v2 outperforms
COTA v1, and analyzes their inner workings and shortcomings. Finally, an A/B
test conducted in a production setting validates the real-world impact of
COTA, which reduces issue resolution time by 10 percent without reducing
customer satisfaction.
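To make the ranking reformulation concrete, here is a minimal Python sketch: instead of training one classifier with a softmax over thousands of ticket types, a pointwise scorer judges each (ticket, type) pair, and candidate types are ranked by that score. The synthetic data and the make_pair_features helper are hypothetical stand-ins, not the paper's engineered features.

# A minimal sketch of recasting multi-class classification over many
# ticket types as pointwise ranking. All names here are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

def make_pair_features(ticket_vec, type_vec):
    # Hypothetical pairwise features: both vectors plus their elementwise
    # product, so the scorer can model ticket-type interactions.
    return np.concatenate([ticket_vec, type_vec, ticket_vec * type_vec])

rng = np.random.default_rng(0)
n_types, dim = 1000, 32
type_vecs = rng.normal(size=(n_types, dim))  # one vector per ticket type

# Training data: (ticket, correct type) pairs become positives;
# randomly sampled wrong types become negatives.
X, y = [], []
for _ in range(500):
    t = rng.integers(n_types)
    ticket = type_vecs[t] + rng.normal(scale=0.5, size=dim)  # synthetic ticket
    X.append(make_pair_features(ticket, type_vecs[t])); y.append(1)
    neg = rng.integers(n_types)
    X.append(make_pair_features(ticket, type_vecs[neg])); y.append(0)

scorer = LogisticRegression(max_iter=1000).fit(np.array(X), np.array(y))

def rank_types(ticket_vec, top_k=5):
    # Score every (ticket, type) pair and return the best-scoring types.
    pairs = np.array([make_pair_features(ticket_vec, tv) for tv in type_vecs])
    scores = scorer.predict_proba(pairs)[:, 1]
    return np.argsort(scores)[::-1][:top_k]

One appeal of this formulation is that the scorer's capacity does not have to grow with the number of classes, which is what makes it attractive when there are thousands of ticket types.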
A Comparative Study on Regularization Strategies for Embedding-based Neural Networks
This paper aims to compare different regularization strategies to address a
common phenomenon, severe overfitting, in embedding-based neural networks for
NLP. We chose two widely studied neural models and tasks as our testbed. We
tried several frequently applied or newly proposed regularization strategies,
including penalizing weights (embeddings excluded), penalizing embeddings,
re-embedding words, and dropout. We also emphasized incremental
hyperparameter tuning and the combination of different regularization
strategies. The results
provide a picture of how to tune hyperparameters for neural NLP models.
Comment: EMNLP '15
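A minimal PyTorch sketch of how the compared strategies can be wired up: dropout inside the model, and separate optimizer parameter groups so an L2 penalty (weight decay) is applied to non-embedding weights and to the embedding table with independent coefficients. The model and coefficients are illustrative, not the paper's actual testbed.

# Illustrative model with the regularization hooks the paper compares.
import torch
import torch.nn as nn

class SentenceClassifier(nn.Module):
    def __init__(self, vocab=10000, dim=100, hidden=64, classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.rnn = nn.LSTM(dim, hidden, batch_first=True)
        self.drop = nn.Dropout(p=0.5)           # dropout strategy
        self.out = nn.Linear(hidden, classes)

    def forward(self, tokens):
        h, _ = self.rnn(self.emb(tokens))
        return self.out(self.drop(h[:, -1]))    # classify from last state

model = SentenceClassifier()

# Separate parameter groups so weight decay (an L2 penalty) hits
# non-embedding weights and the embedding table independently.
emb_params = list(model.emb.parameters())
other_params = [p for n, p in model.named_parameters()
                if not n.startswith("emb")]
optimizer = torch.optim.Adam([
    {"params": other_params, "weight_decay": 1e-4},  # penalize weights, embeddings excluded
    {"params": emb_params, "weight_decay": 1e-5},    # penalize embeddings (0 disables)
])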
Polyglot: Distributed Word Representations for Multilingual NLP
Distributed word representations (word embeddings) have recently contributed
to competitive performance in language modeling and several NLP tasks. In this
work, we train word embeddings for more than 100 languages using their
corresponding Wikipedias. We quantitatively demonstrate the utility of our word
embeddings by using them as the sole features for training a part of speech
tagger for a subset of these languages. We find their performance to be
competitive with near-state-of-the-art methods in English, Danish, and Swedish.
Moreover, we investigate the semantic features captured by these embeddings
through the proximity of word groupings. We will release these embeddings
publicly to help researchers in the development and enhancement of multilingual
applications.
Comment: 10 pages, 2 figures, Proceedings of the Conference on Computational
Natural Language Learning (CoNLL) 2013
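The evaluation setup, using the embeddings as a tagger's sole features, can be sketched as follows: each token is represented by the concatenated embeddings of a small window around it, and a simple classifier predicts its tag. The random embedding matrix and toy tags below are stand-ins for the released Polyglot vectors and real treebank annotations.

# A toy POS tagger whose only features are word embeddings of a
# window around each token; the data here is random filler.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
vocab, dim, window = 5000, 64, 2
embeddings = rng.normal(size=(vocab, dim))   # stand-in for Polyglot vectors

def window_features(sentence_ids, i):
    # Concatenate embeddings of tokens in [i - window, i + window],
    # padding with zero vectors at sentence boundaries.
    feats = []
    for j in range(i - window, i + window + 1):
        if 0 <= j < len(sentence_ids):
            feats.append(embeddings[sentence_ids[j]])
        else:
            feats.append(np.zeros(dim))
    return np.concatenate(feats)

# Toy training data: random "sentences" with random tags.
X, y = [], []
for _ in range(200):
    sent = rng.integers(vocab, size=10)
    tags = rng.integers(12, size=10)         # e.g. 12 coarse POS tags
    for i in range(len(sent)):
        X.append(window_features(sent, i)); y.append(tags[i])

tagger = LogisticRegression(max_iter=1000).fit(np.array(X), np.array(y))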
Finding Answers from the Word of God: Domain Adaptation for Neural Networks in Biblical Question Answering
Question answering (QA) has significantly benefitted from deep learning
techniques in recent years. However, domain-specific QA remains a challenge due
to the significant amount of data required to train a neural network. This
paper studies the answer sentence selection task in the Bible domain,
answering questions by selecting relevant verses from the Bible. For this
purpose, we create a new dataset, BibleQA, based on Bible trivia questions
and propose three
neural network models for our task. We pre-train our models on a large-scale QA
dataset, SQuAD, and investigate the effect of transferring weights on model
accuracy. Furthermore, we measure model accuracy with different answer
context lengths and different Bible translations. We find that transfer
learning yields a noticeable improvement in model accuracy. We achieve
relatively good results with shorter context lengths, whereas longer
context lengths decrease model accuracy. We also find that using a more modern
Bible translation in the dataset has a positive effect on the task.
Comment: The paper has been accepted at IJCNN 2018
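A minimal PyTorch sketch of the weight-transfer setup described above: an answer scorer is pre-trained on a large source QA dataset, its weights are copied into a fresh model, and that model is fine-tuned on the small in-domain set. The architecture and names are hypothetical, not the paper's three models.

# Pre-train on a large QA dataset, then transfer weights and
# fine-tune on the small in-domain dataset.
import torch
import torch.nn as nn

class AnswerScorer(nn.Module):
    def __init__(self, vocab=20000, dim=100, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.enc = nn.LSTM(dim, hidden, batch_first=True)

    def encode(self, ids):
        _, (h, _) = self.enc(self.emb(ids))
        return h[-1]                          # final hidden state

    def forward(self, question_ids, answer_ids):
        # Relevance score: dot product of question and answer encodings.
        q, a = self.encode(question_ids), self.encode(answer_ids)
        return (q * a).sum(dim=-1)

source = AnswerScorer()
# ... pre-train `source` on SQuAD-style (question, sentence, label) data ...

target = AnswerScorer()
target.load_state_dict(source.state_dict())   # transfer the weights
# ... fine-tune `target` on the small BibleQA dataset ...

Copying the full state dict and then fine-tuning end-to-end is only one option; freezing the transferred encoder and training a new output layer is a common alternative when the target dataset is very small.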