Deep learning for spoken dialogue systems : application to nutrition
Thesis: Ph.D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019. Cataloged from PDF version of thesis. Includes bibliographical references (pages 207-221).
Personal digital assistants such as Siri, Cortana, and Alexa must translate a user's natural language query into a semantic representation that the back-end can then use to retrieve information from relevant data sources. For example, answering a user's question about the number of calories in a food requires querying a database with nutrition facts for various foods. In this thesis, we demonstrate deep learning techniques for performing a semantic mapping from raw, unstructured, human natural language directly to a structured, relational database, without any intermediate pre-processing steps or string matching heuristics. Specifically, we show that a novel, weakly supervised convolutional neural architecture learns a shared latent space, where vector representations of natural language queries lie close to embeddings of database entries that have semantically similar meanings. The first instantiation of this technology is in the nutrition domain, with the goal of reducing the burden on individuals monitoring their food intake to support healthy eating or manage their weight. To train the models, we collected 31,712 written and 2,962 spoken meal descriptions that were weakly annotated with only information about which database foods were described in the meal, but not explicitly where they were mentioned. Our best deep learning models achieve 95.8% average semantic tagging F1 score on a held-out test set of spoken meal descriptions, and 97.1% top-5 food database recall in a fully deployed iOS application. We also observed a significant correlation between data logged by our system and that recorded during a 24-hour dietary recall conducted by expert nutritionists in a pilot study with 14 participants. Finally, we show that our approach generalizes beyond nutrition and database mapping to other tasks such as dialogue state tracking.
by Mandy Barrett Korpusik. Ph.D.
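The shared-latent-space idea in the abstract above can be illustrated with a minimal sketch of the retrieval step only. It assumes that query and database embeddings already exist and that closeness is measured by cosine similarity; the abstract says only that the vectors "lie close", so the similarity measure, the toy three-dimensional vectors, and the names `cosine`, `top_k`, `recall_at_k`, and `FOODS` are all illustrative assumptions and not the thesis's implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_k(query_vec, food_vecs, k=5):
    """Return the k database entries whose embeddings are most similar to the query."""
    ranked = sorted(food_vecs,
                    key=lambda name: cosine(query_vec, food_vecs[name]),
                    reverse=True)
    return ranked[:k]

def recall_at_k(examples, food_vecs, k=5):
    """Fraction of (query_vec, gold_food) pairs whose gold food is in the top k."""
    hits = sum(1 for query_vec, gold in examples
               if gold in top_k(query_vec, food_vecs, k))
    return hits / len(examples)

# Hand-made toy embeddings (assumption: a trained model would produce these).
FOODS = {
    "oatmeal": [0.9, 0.1, 0.0],
    "banana": [0.1, 0.9, 0.0],
    "chicken breast": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.1]  # toy embedding of a meal description mentioning oatmeal

print(top_k(query, FOODS, k=2))
print(recall_at_k([(query, "oatmeal")], FOODS, k=1))
```

Top-5 recall as reported in the abstract corresponds to calling `recall_at_k` with `k=5` over a test set; here the toy database has only three entries, so smaller values of `k` are used.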
Metrics and similarities in modeling dependencies between continuous and nominal data
The analytical paradigm of classification theory deals with continuous data only. Difficulties
emerge when data records mix continuous and nominal attributes. Usually, the analytical
paradigm treats nominal attributes as continuous ones through a numerical coding of nominal
values, often an ad hoc one. We propose a way of keeping nominal values within the analytical
paradigm without pretending that they are continuous. The core idea is that the information
hidden in nominal values influences the metric (or similarity function) between records of
continuous and nominal data. An adaptation step finds the relevant parameters that influence
the metric between data records. Our approach works well for classifier induction algorithms
in which the metric or similarity is generic, for instance the k-nearest-neighbor algorithm or
the similarity-based support for decision tree induction proposed here. The k-NN algorithm
working with continuous and nominal data behaves considerably better when nominal
values are processed by our approach. Algorithms of the analytical paradigm that use linear and
probabilistic machinery, such as discriminant adaptive nearest-neighbor or Fisher’s linear
discriminant analysis, cause some difficulties. We propose some possible ways to overcome
these obstacles for the adaptive nearest-neighbor algorithm.
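To make the idea of a metric over mixed records concrete, here is a minimal sketch of k-NN with a distance that combines a Euclidean term for continuous attributes with a weighted mismatch penalty for nominal ones. This is a simple stand-in and not the authors' method: the abstract does not give their metric or adaptation procedure, so the overlap-style penalty, the hand-set weights in `nominal_weights`, and the names `mixed_distance`, `knn_predict`, and `TRAIN` are all assumptions.

```python
import math
from collections import Counter

def mixed_distance(a, b, nominal_weights):
    """Distance between two records of the form (continuous_values, nominal_values).

    Continuous attributes contribute squared differences; each nominal attribute
    contributes its weight when the two values differ (no numeric coding needed).
    """
    cont_a, nom_a = a
    cont_b, nom_b = b
    squared = sum((x - y) ** 2 for x, y in zip(cont_a, cont_b))
    squared += sum(w for w, p, q in zip(nominal_weights, nom_a, nom_b) if p != q)
    return math.sqrt(squared)

def knn_predict(train, query, nominal_weights, k=3):
    """Majority vote among the k training records closest to the query."""
    nearest = sorted(train,
                     key=lambda rec: mixed_distance(rec[0], query, nominal_weights))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy data: ((continuous attributes), (nominal attributes)), class label.
TRAIN = [
    (((1.0, 1.0), ("red",)), "A"),
    (((1.2, 0.9), ("red",)), "A"),
    (((1.1, 1.0), ("blue",)), "B"),
    (((3.0, 3.1), ("blue",)), "B"),
    (((2.9, 3.0), ("blue",)), "B"),
]
query = ((1.1, 1.0), ("red",))

# The nominal weight is the tunable parameter: with weight 0 the colour is
# ignored, with weight 1 a colour mismatch pushes a record further away.
print(knn_predict(TRAIN, query, nominal_weights=(0.0,), k=1))
print(knn_predict(TRAIN, query, nominal_weights=(1.0,), k=1))
```

In this sketch the weights are fixed by hand; the adaptation described in the abstract would instead select such parameters from the data.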
Spoken language understanding in a nutrition dialogue system
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015. Cataloged from PDF version of thesis. Includes bibliographical references (pages 105-111).
Existing approaches for the prevention and treatment of obesity are hampered by the lack of accurate, low-burden methods for self-assessment of food intake, especially for hard-to-reach, low-literate populations. For this reason, we propose a novel approach to diet tracking that utilizes speech understanding and dialogue technology in order to enable efficient self-assessment of energy and nutrient consumption. We are interested in studying whether speech can lower user workload compared to existing self-assessment methods, whether spoken language descriptions of meals can accurately quantify caloric and nutrient absorption, and whether dialogue can efficiently and effectively be used to ascertain and clarify food properties, perhaps in conjunction with other modalities. In this thesis, we explore the core innovation of our nutrition system: the language understanding component which relies on machine learning methods to automatically detect food concepts in a user's spoken meal description. In particular, we investigate the performance of conditional random field (CRF) models for semantic labeling and segmentation of spoken meal descriptions. On a corpus of 10,000 meal descriptions, we achieve an average F1 test score of 90.7 for semantic tagging and 86.3 for associating foods with properties. In a study of users interacting with an initial prototype of the system, semantic tagging achieved an accuracy of 83%, which was sufficiently high to satisfy users.
by Mandy B. Korpusik. S.M.
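For readers unfamiliar with the F1 figures quoted for semantic tagging, the following sketch shows how a per-label F1 score can be computed from gold and predicted tag sequences. It assumes token-level scoring; the abstract does not state how the thesis scores or averages F1, and the label names and the function `f1_per_label` are hypothetical.

```python
def f1_per_label(gold, pred):
    """Token-level precision/recall F1 for every label seen in gold or pred."""
    scores = {}
    for label in set(gold) | set(pred):
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores

# Toy tagging of "I had a bowl of oatmeal" (labels are illustrative only).
gold = ["Other", "Other", "Quantity", "Quantity", "Other", "Food"]
pred = ["Other", "Other", "Quantity", "Other", "Other", "Food"]

scores = f1_per_label(gold, pred)
average_f1 = sum(scores.values()) / len(scores)  # unweighted mean over labels
print(scores, average_f1)
```

Here the tagger misses one `Quantity` token, which lowers recall for `Quantity` and precision for `Other` while leaving `Food` unaffected.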