Analysing the Masked predictive coding training criterion for pre-training a Speech Representation Model
Recent developments in pre-trained speech representation utilizing
self-supervised learning (SSL) have yielded exceptional results on a variety of
downstream tasks. One such technique, known as masked predictive coding (MPC),
has been employed by some of the most high-performing models. In this study, we
investigate the impact of MPC loss on the type of information learnt at various
layers in the HuBERT model, using nine probing tasks. Our findings indicate
that the amount of content information learned at various layers of the HuBERT
model is positively correlated with the MPC loss. We also observe that any
speaker-related information learned at intermediate layers of the model is an
indirect consequence of the learning process and therefore cannot be
controlled using the MPC loss. These findings may serve as
inspiration for further research in the speech community, specifically in the
development of new pre-training tasks or the exploration of new pre-training
criteria that directly preserve both speaker and content information at
various layers of a learnt model.
Mind Your Language: Abuse and Offense Detection for Code-Switched Languages
In multilingual societies like those of the Indian subcontinent, code-switched
languages are popular and convenient for users. In this paper, we study
offense and abuse detection in the code-switched pair of Hindi and English
(i.e. Hinglish), the most widely spoken such pair. The task is difficult
because Hinglish has no fixed grammar, vocabulary, semantics, or spelling.
We apply transfer learning and build an LSTM-based model for hate
speech classification. This model surpasses the performance of the
current best models, establishing itself as the state of the art in the
unexplored domain of Hinglish offensive text classification. We also release
our model and the trained embeddings for research purposes.
IceBreaker: Solving Cold Start Problem for Video Recommendation Engines
The Internet has brought about a tremendous increase in content of all forms,
and video constitutes the backbone of the content being published and
watched. It is therefore imperative for video recommendation engines such as
Hulu to find novel ways to recommend newly added videos to their users.
However, new videos lack the metadata and user interaction needed to rate
them for consumers. To this end, this paper introduces several techniques we
developed for the Content Based Video Relevance Prediction (CBVRP) Challenge
hosted by Hulu for the ACM Multimedia Conference 2018. We employ different
architectures on the CBVRP dataset to make use of the provided frame-level
and video-level features and predict which videos are similar to other
videos. We also implement several ensemble strategies to explore the
complementarity between the two types of provided features. The obtained
results are encouraging and should push the boundaries of research for
multimedia-based video recommendation systems.
GANTouch: An Attack-Resilient Framework for Touch-based Continuous Authentication System
Previous studies have shown that commonly studied (vanilla) implementations
of touch-based continuous authentication systems (V-TCAS) are susceptible to
active adversarial attempts. This study presents a novel Generative Adversarial
Network assisted TCAS (G-TCAS) framework and compares it to the V-TCAS under
three active adversarial environments viz. Zero-effort, Population, and
Random-vector. The Zero-effort environment was implemented in two variations
viz. Zero-effort (same-dataset) and Zero-effort (cross-dataset). The first
involved a Zero-effort attack from the same dataset, while the second used
three different datasets. G-TCAS showed more resilience than V-TCAS under the
Population and Random-vector environments, which are more damaging adversarial
scenarios than Zero-effort. On average, the increase in the false accept rates (FARs) for
V-TCAS was much higher (27.5% and 21.5%) than for G-TCAS (14% and 12.5%) for
Population and Random-vector attacks, respectively. Moreover, we performed a
fairness analysis of TCAS for different genders and found TCAS to be fair
across genders. The findings suggest that we should evaluate TCAS under active
adversarial environments and affirm the usefulness of GANs in the TCAS
pipeline.
Comment: 11 pages, 7 figures, 2 tables, 3 algorithms, in IEEE TBIOM 202