COTA: Improving the Speed and Accuracy of Customer Support through Ranking and Deep Networks
For a company looking to provide delightful user experiences, it is of
paramount importance to take care of any customer issues. This paper proposes
COTA, a system to improve the speed and reliability of customer support for end
users through automated ticket classification and answer selection for support
representatives. Two machine learning and natural language processing
techniques are demonstrated: one relying on feature engineering (COTA v1) and
the other exploiting raw signals through deep learning architectures (COTA v2).
COTA v1 employs a new approach that converts the multi-class classification task into
a ranking problem, demonstrating significantly better performance in the case
of thousands of classes. For COTA v2, we propose an Encoder-Combiner-Decoder, a
novel deep learning architecture that allows for heterogeneous input and output
feature types and injection of prior knowledge through network architecture
choices. This paper compares these models and their variants on the task of
ticket classification and answer selection, showing model COTA v2 outperforms
COTA v1, and analyzes their inner workings and shortcomings. Finally, an A/B
test is conducted in a production setting validating the real-world impact of
COTA in reducing issue resolution time by 10 percent without reducing customer
satisfaction.
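The classification-as-ranking idea behind COTA v1 can be illustrated with a minimal sketch: instead of a single softmax over thousands of classes, a shared scoring function scores each (ticket, candidate-class) pair and the candidates are ranked by score. The cosine-similarity scorer and toy feature vectors below are illustrative assumptions, not the paper's engineered features or learned model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def rank_classes(ticket_vec, class_vecs):
    """Score every (ticket, class) pair and return class labels best-first."""
    scored = [(cosine(ticket_vec, vec), label) for label, vec in class_vecs.items()]
    return [label for _, label in sorted(scored, reverse=True)]

# Toy vectors standing in for engineered ticket/class features (hypothetical).
classes = {
    "payment_issue": [0.9, 0.1, 0.0],
    "lost_item":     [0.0, 0.8, 0.2],
    "app_crash":     [0.1, 0.0, 0.9],
}
ticket = [0.85, 0.15, 0.05]          # a ticket resembling "payment_issue"
ranking = rank_classes(ticket, classes)
print(ranking[0])                    # top-ranked class for this ticket
```

Ranking sidesteps the need for a fixed output layer over thousands of labels: new classes only require a new class vector, not retraining the whole classifier.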
Confusion Modelling - An Estimation by Semantic Embeddings
Few research works have approached the task of assessing the coherence of a conversation from its negative side, 'confusion', rather than from coherence itself. Training embeddings on similarity/dissimilarity measures, such as the distance or cosine similarity between two utterances, equips them with the semantics to distinguish a coherent conversation from an incoherent one by detecting the negative entity, 'confusion'. This research measures the coherence of a conversation between a human and a conversational agent by means of such semantic embeddings, trained from scratch by an architecture that centres learning on the distance between embeddings. The state-of-the-art general-purpose embeddings of BERT, the state-of-the-art conversation-specific embeddings of ConveRT, and GLOVE embeddings are also evaluated on the same architecture. Because confusion is a subjective quality, real human labelling performance is set as the baseline for evaluating the models. The base design alone did not score well against the human baseline, but plugging pre-trained embeddings into it produced performance boosts, in increasing order, for BERT, GLOVE and ConveRT. The soundness of the base conceptual design is supported by the ConveRT variant outperforming ConveRT's own state-of-the-art performance at generating similarity scores. Although none of the models matched human performance, the ConveRT variant's scores overlapped considerably with the human scores, a promising result given that human-level performance is the ultimate target.
Also, from these results, this research joins the group of works finding BERT unsuitable for conversation-specific modelling and embedding tasks.
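The distance-based intuition above can be sketched simply: embed each utterance, then treat low similarity between consecutive utterances as a signal of confusion. The bag-of-words `embed` function below is a deliberately crude stand-in for the trained BERT/GLOVE/ConveRT embeddings, and the two toy conversations are invented for illustration.

```python
from collections import Counter
import math

def embed(utterance, vocab):
    """Toy bag-of-words embedding; a stand-in for trained semantic embeddings."""
    counts = Counter(utterance.lower().split())
    return [counts[w] for w in vocab]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def coherence(conversation, vocab):
    """Mean similarity of consecutive utterances; low values flag confusion."""
    vecs = [embed(u, vocab) for u in conversation]
    sims = [cosine(a, b) for a, b in zip(vecs, vecs[1:])]
    return sum(sims) / len(sims)

coherent = ["where is my order", "your order ships tomorrow", "the order update helps"]
confused = ["where is my order", "olives are a pizza topping", "penguins live in antarctica"]
vocab = sorted({w for conv in (coherent, confused) for u in conv for w in u.lower().split()})
print(coherence(coherent, vocab), coherence(confused, vocab))
```

With learned embeddings, semantically related utterances score as similar even without shared words, which is precisely what the distance-centred training is meant to achieve.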
Meta learning with language models: Challenges and opportunities in the classification of imbalanced text
Detecting out-of-policy speech (OOPS) content is important but difficult.
While machine learning is a powerful tool to tackle this challenging task, it
is hard to break the performance ceiling due to factors like quantity and
quality limitations on training data and inconsistencies in OOPS definition and
data labeling. To realize the full potential of available limited resources, we
propose a meta learning technique (MLT) that combines individual models built
with different text representations. We analytically show that the resulting
technique is numerically stable and produces reasonable combining weights. We
combine the MLT with a threshold-moving (TM) technique to further improve the
performance of the combined predictor on highly-imbalanced in-distribution and
out-of-distribution datasets. We also provide computational results to show the
statistically significant advantages of the proposed MLT approach.
All authors contributed equally to this work. Comment: 22 pages, including 5 figures, 12 tables, 1 appendix.
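The combine-then-threshold idea can be illustrated with a minimal sketch (not the paper's exact MLT): average the base models' positive-class probabilities with combining weights, then apply threshold moving, shifting the decision threshold below 0.5 so the rare positive class is not swamped. The weights and probabilities below are invented for illustration.

```python
def combine(probs_per_model, weights):
    """Weighted average of per-model positive-class probabilities."""
    total = sum(weights)
    n = len(probs_per_model[0])
    return [sum(w * p[i] for w, p in zip(weights, probs_per_model)) / total
            for i in range(n)]

def classify(probs, threshold):
    """Threshold moving: predict positive iff probability >= threshold."""
    return [int(p >= threshold) for p in probs]

# Three base models' positive-class probabilities for four examples
# (e.g., models built on different text representations).
model_probs = [
    [0.40, 0.10, 0.70, 0.05],
    [0.35, 0.20, 0.60, 0.10],
    [0.45, 0.05, 0.80, 0.15],
]
weights = [0.3, 0.3, 0.4]   # hypothetical, e.g. proportional to validation skill
combined = combine(model_probs, weights)
print(classify(combined, 0.5))   # default threshold misses example 0
print(classify(combined, 0.3))   # moved threshold recovers it
```

On highly imbalanced data the default 0.5 cut-off optimizes accuracy at the expense of recall on the minority class; moving the threshold trades a little precision for substantially better minority-class detection.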
Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities
One of the main challenges of multimodal learning is the need to combine
heterogeneous modalities (e.g., video, audio, text). For example, video and
audio are obtained at much higher rates than text and are roughly aligned in
time. They are often not synchronized with text, which comes as a global
context, e.g., a title, or a description. Furthermore, video and audio inputs
are of much larger volumes, and grow as the video length increases, which
naturally requires more compute dedicated to these modalities and makes
modeling of long-range dependencies harder.
We here decouple the multimodal modeling, dividing it into separate, focused
autoregressive models, processing the inputs according to the characteristics
of the modalities. We propose a multimodal model, called Mirasol3B, consisting
of an autoregressive component for the time-synchronized modalities (audio and
video), and an autoregressive component for the context modalities which are
not necessarily aligned in time but are still sequential. To address the
long sequences of the video-audio inputs, we propose to further partition the
video and audio sequences into consecutive snippets and autoregressively process
their representations. To that end, we propose a Combiner mechanism, which
models the audio-video information jointly within a timeframe. The Combiner
learns to extract audio and video features from raw spatio-temporal signals,
and then learns to fuse these features producing compact but expressive
representations per snippet.
Our approach achieves state-of-the-art results on well-established multimodal
benchmarks, outperforming much larger models. It effectively addresses the high
computational demand of media inputs by learning compact representations,
controlling the sequence length of the audio-video feature representations, and
modeling their dependencies in time.
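The snippet-partitioning idea can be sketched under simplifying assumptions: split the time-aligned feature stream into consecutive fixed-length snippets and let a combiner compress each snippet into one compact token, so the sequence length seen by the autoregressive model is controlled. Mirasol3B learns this fusion from raw spatio-temporal signals; the mean pooling below is only a stand-in to show the sequence-length mechanics.

```python
def partition(frames, snippet_len):
    """Split a frame-feature sequence into consecutive snippets."""
    return [frames[i:i + snippet_len] for i in range(0, len(frames), snippet_len)]

def combiner(snippet):
    """Fuse a snippet's frame features into one compact vector (mean pooling
    here; the real Combiner is a learned joint audio-video module)."""
    dim = len(snippet[0])
    return [sum(f[d] for f in snippet) / len(snippet) for d in range(dim)]

# 8 frames of 2-D features -> 4 snippets of length 2 -> 4 compact tokens.
frames = [[float(t), float(t % 2)] for t in range(8)]
tokens = [combiner(s) for s in partition(frames, 2)]
print(len(tokens))   # the autoregressive model now sees 4 tokens, not 8 frames
```

Because each snippet collapses to a fixed-size token, the token count grows with video length divided by the snippet length, rather than with the raw frame count, which is what keeps long-range modeling tractable.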
Signal processing using spectrally phase encoded optical frequency combs
Methods, apparatus and systems for an optical system for data harvesting and pattern recognition. The system includes a mode-locked laser for producing a comb of optical frequencies that is split into two identical combs, and a wavelength division demultiplexer that separates the individual optical frequency components of one comb and modulates each optical frequency component with a different one of plural target objects. A second modulator modulates an input signal with the second comb, and an optical splitter splits the modulated signal into plural optical frequency components, each containing the input signal. An optical combiner simultaneously combines the components containing the real-time signal with one of the components containing a target object to produce a temporally modulated interferogram, and a comparator simultaneously compares the two on a comb-by-comb basis using balanced differential detection to determine whether any of the plural target objects are present in the input signal.
Machine Learning to Predict Advertisement Targeting Solutions
Generally, the present disclosure is directed to using machine learning to predict advertisement targeting solutions. In particular, in some implementations, the systems and methods of the present disclosure can include or otherwise leverage one or more machine-learned models to predict optimal advertisement targeting solutions such as, for example, keyword sets, negative-keyword sets, location restrictions, bid adjustments, and/or schedules based on product data such as, for example, advertisement content (e.g., ad creative text), seed keywords, images of the product, and/or advertiser metadata.