9,933 research outputs found
Structural Embedding of Syntactic Trees for Machine Comprehension
Deep neural networks for machine comprehension typically utilize only word
or character embeddings without explicitly taking advantage of structured
linguistic information such as constituency trees and dependency trees. In this
paper, we propose structural embedding of syntactic trees (SEST), an algorithm
framework to utilize structured information and encode it into vector
representations that can boost the performance of algorithms for machine
comprehension. We evaluate our approach using a state-of-the-art neural
attention model on the SQuAD dataset. Experimental results demonstrate that our
model can accurately identify the syntactic boundaries of the sentences and
extract answers that are more syntactically coherent than those of the baseline methods.
Leveraging Crowdsourcing Data For Deep Active Learning - An Application: Learning Intents in Alexa
This paper presents a generic Bayesian framework that enables any deep
learning model to actively learn from targeted crowds. Our framework inherits
from recent advances in Bayesian deep learning, and extends existing work by
considering the targeted crowdsourcing approach, where multiple annotators with
unknown expertise contribute an uncontrolled (and often limited) number of
annotations. Our framework leverages the low-rank structure in annotations to
learn individual annotator expertise, which then helps to infer the true labels
from noisy and sparse annotations. It provides a unified Bayesian model to
simultaneously infer the true labels and train the deep learning model in order
to reach an optimal learning efficacy. Finally, our framework exploits the
uncertainty of the deep learning model during prediction as well as the
annotators' estimated expertise to minimize the number of required annotations
and annotators for optimally training the deep learning model.
We evaluate the effectiveness of our framework for intent classification in
Alexa (Amazon's personal assistant), using both synthetic and real-world
datasets. Experiments show that our framework can accurately learn annotator
expertise, infer true labels, and effectively reduce the number of annotations
needed for model training compared to state-of-the-art approaches. We further
discuss the potential of our proposed framework in bridging machine learning
and crowdsourcing towards improved human-in-the-loop systems.
Interpretation of Natural Language Rules in Conversational Machine Reading
Most work in machine reading focuses on question answering problems where the
answer is directly expressed in the text to read. However, many real-world
question answering problems require the reading of text not because it contains
the literal answer, but because it contains a recipe to derive an answer
together with the reader's background knowledge. One example is the task of
interpreting regulations to answer "Can I...?" or "Do I have to...?" questions
such as "I am working in Canada. Do I have to carry on paying UK National
Insurance?" after reading a UK government website about this topic. This task
requires both the interpretation of rules and the application of background
knowledge. It is further complicated by the fact that, in practice, most
questions are underspecified, and a human assistant will regularly have to ask
clarification questions such as "How long have you been working abroad?" when
the answer cannot be directly derived from the question and text. In this
paper, we formalise this task and develop a crowd-sourcing strategy to collect
32k task instances based on real-world rules and crowd-generated questions and
scenarios. We analyse the challenges of this task and assess its difficulty by
evaluating the performance of rule-based and machine-learning baselines. We
observe promising results when no background knowledge is necessary, and
substantial room for improvement whenever background knowledge is needed. Comment: EMNLP 201
QUESTION ANSWERING SYSTEM WITH PREDICTIVE ANNOTATION AT UNIVERSITAS TELKOM
Rapid advances in technology have affected many work processes that used to be carried out conventionally. Technology can undeniably replace humans in some tasks, including communicating to obtain information about a topic, commonly referred to as question answering. A question answering system can make it easier to find information, especially in public facilities; this study takes a campus as its example. A problem arises in information seeking when a question is asked with keywords that do not match, so that the intent of the question or of the keywords is sometimes slightly off.
To address this problem, the author designs a question answering system with predictive annotation. The system is based on natural language processing, which allows it to process natural language and thereby ease communication between humans and computers. This study applies natural language processing to a question answering system with predictive annotation, deployed in a speech-to-speech system, so users only need to speak a question about the campus. Predictive annotation can also predict the intent of a question using answer selection and hit list ranking when keywords are incomplete or slightly mismatched, predict similar questions, and suggest questions when the keywords do not match.
The results show that predictive annotation, which searches for answers based on the question word, subject, and object keywords, can produce suitable answers. It can also suggest similar questions and, when a question is considered unclear, suggest a corrected question along with its answer.
Keywords: Natural Language Processing (NLP), Predictive Annotation, Question Answering System, Speech-to-speech System