595 research outputs found
Hi, how can I help you?: Automating enterprise IT support help desks
Question answering is one of the primary challenges of natural language
understanding. In realizing such a system, providing complex, long answers to
questions is more challenging than factoid answering, because the former
needs context disambiguation. The different methods explored in the literature
can be broadly classified into three categories: 1) classification based,
2) knowledge graph based and 3) retrieval based. Individually, none of them
addresses the needs of an enterprise-wide assistance system for the IT support
and maintenance domain. In this domain the variance of answers is large, ranging
from factoids to structured operating procedures; the knowledge is spread
across heterogeneous data sources such as application-specific documentation
and ticket management systems; and no single general-purpose assistance
technique scales to such a landscape. To address this, we have
built a cognitive platform with capabilities adapted for this domain. Further,
we have built a general-purpose question answering system leveraging the
platform that can be instantiated for multiple products and technologies in the
support domain. The system uses a novel hybrid answering model that
orchestrates across a deep learning classifier, a knowledge graph based context
disambiguation module and a sophisticated bag-of-words search system. This
orchestration performs context switching for a provided question and also does
a smooth hand-off of the question to a human expert if none of the automated
techniques can provide a confident answer. This system has been deployed across
675 internal enterprise IT support and maintenance projects.
Comment: To appear in IAAI 201
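The orchestration described in this abstract (several automated answerers, with a hand-off to a human expert when none is confident) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Answer` type, the `orchestrate` function, the toy answerers and the 0.7 threshold are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Answer:
    text: str
    confidence: float  # assumed to lie in [0, 1]
    source: str        # which component produced the answer


# Hypothetical interface: an answerer returns an Answer, or None if it has nothing.
Answerer = Callable[[str], Optional[Answer]]


def orchestrate(question: str, answerers: List[Answerer], threshold: float = 0.7) -> Answer:
    """Return the most confident automated answer, or hand off to a human."""
    best: Optional[Answer] = None
    for answerer in answerers:
        candidate = answerer(question)
        if candidate is None:
            continue
        if best is None or candidate.confidence > best.confidence:
            best = candidate
    if best is not None and best.confidence >= threshold:
        return best
    # No component was confident enough: route the question to a human expert.
    return Answer("Routing to a human expert.", 0.0, "human")


# Toy stand-ins for a classifier and a search component (illustrative only).
def toy_classifier(question: str) -> Optional[Answer]:
    if "password" in question.lower():
        return Answer("Reset it from the self-service portal.", 0.9, "classifier")
    return None


def toy_search(question: str) -> Optional[Answer]:
    return Answer("See the closest matching ticket.", 0.4, "search")


confident = orchestrate("How do I reset my password?", [toy_classifier, toy_search])
handed_off = orchestrate("Why is the build server slow?", [toy_classifier, toy_search])
```

Picking the single most confident candidate is only one possible policy; the abstract does not say how the deployed system compares or combines its components.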
Audio-visual child-adult speaker classification in dyadic interactions
Interactions involving children span a wide range of important domains from
learning to clinical diagnostic and therapeutic contexts. Automated analyses of
such interactions are motivated by the need to seek accurate insights and offer
scale and robustness across diverse and wide-ranging conditions. Identifying
the speech segments belonging to the child is a critical step in such modeling.
Conventional child-adult speaker classification typically relies on audio
modeling approaches, overlooking visual signals that convey speech articulation
information, such as lip motion. Building on the foundation of an audio-only
child-adult speaker classification pipeline, we propose incorporating visual
cues through active speaker detection and visual processing models. Our
framework involves video pre-processing, utterance-level child-adult speaker
detection, and late fusion of modality-specific predictions. We demonstrate
from extensive experiments that a visually aided classification pipeline
enhances the accuracy and robustness of the classification. We show relative
improvements of 2.38% and 3.97% in F1 macro score when one face and two faces
are visible, respectively.
Comment: In review for ICASSP 2024, 5 pages
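Late fusion of modality-specific predictions, as mentioned in this abstract, can be sketched as a weighted average of per-class probabilities. This is a minimal illustration under stated assumptions: the function name `late_fusion`, the equal default weighting and the fallback to audio when no visual prediction is available are hypothetical choices, not the authors' method.

```python
from typing import Dict, Optional, Tuple


def late_fusion(
    audio_probs: Dict[str, float],
    visual_probs: Optional[Dict[str, float]] = None,
    audio_weight: float = 0.5,
) -> Tuple[str, Dict[str, float]]:
    """Combine per-class probabilities from two modalities after each has predicted."""
    if visual_probs is None:
        # Assumed fallback: with no usable visual prediction, keep the audio one.
        fused = dict(audio_probs)
    else:
        fused = {
            label: audio_weight * audio_probs[label]
            + (1.0 - audio_weight) * visual_probs[label]
            for label in audio_probs
        }
    predicted = max(fused, key=fused.get)
    return predicted, fused


# Illustrative numbers only: audio leans "adult", the visual cue leans "child".
audio = {"child": 0.45, "adult": 0.55}
visual = {"child": 0.80, "adult": 0.20}

fused_label, fused_scores = late_fusion(audio, visual)
audio_only_label, _ = late_fusion(audio)
```

In this toy case the visual prediction flips the decision from "adult" to "child"; the weight would normally be tuned on held-out data.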
Inducing Language Networks from Continuous Space Word Representations
Recent advancements in unsupervised feature learning have developed powerful
latent representations of words. However, it is still not clear what makes one
representation better than another and how we can learn the ideal
representation. Understanding the structure of latent spaces attained is key to
any future advancement in unsupervised learning. In this work, we introduce a
new view of continuous space word representations as language networks. We
explore two techniques to create language networks from learned features by
inducing them for two popular word representation methods and examining the
properties of their resulting networks. We find that the induced networks
differ from other methods of creating language networks, and that they contain
meaningful community structure.
Comment: 14 pages
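One common way to turn word vectors into a network is a k-nearest-neighbour graph under cosine similarity. The abstract does not name its two techniques, so the sketch below is an assumption chosen for illustration, and the function names and toy two-dimensional vectors are hypothetical.

```python
import math
from typing import Dict, List, Set, Tuple


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def knn_network(vectors: Dict[str, List[float]], k: int = 1) -> Set[Tuple[str, str]]:
    """Link each word to its k most similar words; edges are undirected."""
    edges: Set[Tuple[str, str]] = set()
    for word, vec in vectors.items():
        others = [w for w in vectors if w != word]
        nearest = sorted(others, key=lambda w: cosine(vec, vectors[w]), reverse=True)[:k]
        for neighbour in nearest:
            # Store each edge with its endpoints sorted so (a, b) and (b, a) coincide.
            edges.add(tuple(sorted((word, neighbour))))
    return edges


# Toy embeddings (illustrative, not learned): two animals and two vehicles.
toy_vectors = {
    "cat": [1.0, 0.1],
    "dog": [0.9, 0.2],
    "car": [0.1, 1.0],
    "bus": [0.2, 0.9],
}

network = knn_network(toy_vectors, k=1)
```

With k=1 the toy network splits into two components, one per word group, which is the kind of community structure a graph analysis could then examine.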
Information extraction from multimedia web documents: an open-source platform and testbed
The LivingKnowledge project aimed to enhance the current state of the art in search, retrieval and knowledge management on the web by advancing the use of sentiment and opinion analysis within multimedia applications. To achieve this aim, a diverse set of novel and complementary analysis techniques has been integrated into a single but extensible software platform on which such applications can be built. The platform combines state-of-the-art techniques for extracting facts, opinions and sentiment from multimedia documents, and unlike earlier platforms, it exploits both visual and textual techniques to support multimedia information retrieval. Foreseeing the usefulness of this software in the wider community, the platform has been made generally available as an open-source project. This paper describes the platform design, gives an overview of the analysis algorithms integrated into the system and describes two applications that utilise the system for multimedia information retrieval.