Disturbance Grassmann Kernels for Subspace-Based Learning
In this paper, we focus on subspace-based learning problems, where data
elements are linear subspaces instead of vectors. To handle this kind of data,
Grassmann kernels were proposed to measure the space structure and used with
classifiers, e.g., Support Vector Machines (SVMs). However, existing
discriminative algorithms mostly ignore the instability of subspaces, which
can mislead classifiers with disturbed instances. We therefore propose
accounting for all potential disturbances of subspaces during learning to
obtain more robust classifiers. Firstly, we derive the dual optimization of
linear classifiers with disturbance subject to a known distribution, resulting
in a new kernel, the Disturbance Grassmann (DG) kernel. Secondly, we investigate
two kinds of disturbance, related to the subspace matrix and the singular values
of bases, with which we extend the Projection kernel on Grassmann manifolds to
two new kernels. Experiments on action data indicate that the proposed kernels
outperform state-of-the-art subspace-based methods, even under degraded
conditions.
Comment: This paper includes 3 figures, 10 pages, and has been accepted to
SIGKDD'1
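The Projection kernel that the paper extends has a simple closed form on the Grassmann manifold: for matrices X and Y with orthonormal columns spanning each subspace, k(X, Y) = ||XᵀY||²_F. A minimal sketch of that baseline kernel (illustrative only, not the proposed DG extension):

```python
import numpy as np

def projection_kernel(X, Y):
    # Projection kernel on the Grassmann manifold: k(X, Y) = ||X^T Y||_F^2.
    # X and Y are (d, p) matrices whose orthonormal columns span each
    # subspace; the value is invariant to the choice of orthonormal basis.
    return np.linalg.norm(X.T @ Y, "fro") ** 2

# Two 1-D subspaces of R^3
e1 = np.array([[1.0], [0.0], [0.0]])
e2 = np.array([[0.0], [1.0], [0.0]])
print(projection_kernel(e1, e1))  # identical subspaces -> 1.0
print(projection_kernel(e1, e2))  # orthogonal subspaces -> 0.0
```

The DG kernels in the paper modify this construction to integrate over a disturbance distribution; the closed form above is only the undisturbed starting point.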
Embedding Uncertain Knowledge Graphs
Embedding models for deterministic Knowledge Graphs (KG) have been
extensively studied, with the purpose of capturing latent semantic relations
between entities and incorporating the structured knowledge into machine
learning. However, many KGs model uncertain knowledge, typically representing
the inherent uncertainty of relation facts with a confidence score, and
embedding such uncertain knowledge remains an unresolved challenge. Capturing
uncertain knowledge will benefit many knowledge-driven applications such as
question answering and semantic search by providing a more natural
characterization of the knowledge. In this paper, we
propose a novel uncertain KG embedding model UKGE, which aims to preserve both
structural and uncertainty information of relation facts in the embedding
space. Unlike previous models that characterize relation facts with binary
classification techniques, UKGE learns embeddings according to the confidence
scores of uncertain relation facts. To further enhance the precision of UKGE,
we also introduce probabilistic soft logic to infer confidence scores for
unseen relation facts during training. We propose and evaluate two variants of
UKGE based on different learning objectives. Experiments are conducted on three
real-world uncertain KGs via three tasks, i.e., confidence prediction, relation
fact ranking, and relation fact classification. UKGE shows effectiveness in
capturing uncertain knowledge by achieving promising results on all three tasks
and consistently outperforming the baselines.
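As a rough illustration of the idea (not the authors' exact formulation), a plausibility score for a relation fact can be mapped into [0, 1] and regressed against the observed confidence score. The DistMult-style plausibility and both mapping variants below are assumptions made for this sketch:

```python
import numpy as np

def plausibility(h, r, t):
    # DistMult-style score: element-wise product of head and relation
    # embeddings, dotted with the tail embedding (an illustrative choice).
    return float(np.dot(h * r, t))

def confidence_logistic(h, r, t):
    # Squash the unbounded plausibility into (0, 1) so it can be
    # regressed directly against a triple's confidence score.
    return 1.0 / (1.0 + np.exp(-plausibility(h, r, t)))

def confidence_rectified(h, r, t, w=1.0, b=0.0):
    # Bounded-rectifier variant: clip an affine transform of the score.
    return float(np.clip(w * plausibility(h, r, t) + b, 0.0, 1.0))

h = np.array([0.5, -0.2])
r = np.array([1.0, 0.3])
t = np.array([0.4, 0.1])
print(confidence_logistic(h, r, t))   # a value in (0, 1)
print(confidence_rectified(h, r, t))  # a value in [0, 1]
```

Training would then minimize the squared error between these predicted confidences and the scores attached to the KG's relation facts.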
Explainable Misinformation Detection Across Multiple Social Media Platforms
In this work, the integration of two machine learning approaches, namely
domain adaptation and explainable AI, is proposed to address the twin issues
of generalized detection and explainability. Firstly, a Domain Adversarial
Neural Network (DANN) develops a generalized misinformation detector across
multiple social media platforms; DANN is employed to generate the
classification results for test domains with relevant but unseen data. The
DANN-based model, a traditional black-box model, cannot justify its outcome,
i.e., the labels for the target domain. Hence the Local Interpretable
Model-Agnostic Explanations (LIME) explainable AI method is applied to explain
the outcome of the DANN model.
To demonstrate these two approaches and their integration for effective
explainable generalized detection, COVID-19 misinformation is considered a case
study. We experimented with two datasets, namely CoAID and MiSoVac, and
compared results with and without DANN implementation. DANN significantly
improves the F1 classification score and increases accuracy and AUC. The
results obtained show that the proposed
framework performs well in the case of domain shift and can learn
domain-invariant features, while the LIME implementation explains the target
labels, enabling trustworthy information processing and extraction to combat
misinformation effectively.
Comment: 28 pages, 4 figures
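At the core of DANN is a gradient reversal layer: features pass through unchanged in the forward direction, while the gradient flowing back from the domain classifier is negated, which pushes the feature extractor toward domain-invariant representations. A minimal sketch (the class name and the `lam` scaling factor are illustrative):

```python
import numpy as np

class GradientReversal:
    """Identity in the forward pass; flips (and scales) the gradient in
    the backward pass, so the feature extractor learns to *confuse* the
    domain classifier while still serving the label classifier."""

    def __init__(self, lam=1.0):
        self.lam = lam  # trade-off between label loss and domain confusion

    def forward(self, x):
        return x  # features are passed through unchanged

    def backward(self, grad_from_domain_head):
        return -self.lam * grad_from_domain_head  # reversed gradient

grl = GradientReversal(lam=0.5)
x = np.array([1.0, -2.0])
print(grl.forward(x))                    # unchanged: [ 1. -2.]
print(grl.backward(np.array([0.2, 0.4])))  # reversed: [-0.1 -0.2]
```

In a full DANN, this layer sits between the shared feature extractor and the domain classifier; the label classifier receives the features directly.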
An enhanced gated recurrent unit with auto-encoder for solving text classification problems
Classification has become an important task for categorizing documents
automatically based on their respective groups. The Gated Recurrent Unit (GRU) is a
type of Recurrent Neural Network (RNN), a deep learning architecture that
contains an update gate and a reset gate. It is considered one of the most
efficient text classification techniques, particularly on sequential datasets.
However, GRU suffers from three major issues when applied to text
classification problems. The first drawback is its failure in data
dimensionality reduction, which leads to low-quality solutions for
classification problems. Secondly, GRU still has difficulty in the training
procedure due to redundancy between the update and reset gates; the reset gate
adds complexity and requires high processing time. Thirdly, GRU also suffers
from the loss of informative features in each recurrence during the training
phase, and from high computational cost. This failure stems from a random
selection of features from the datasets (or previous outputs) when GRU is
applied in its standard form. Therefore, in this research, a new model, namely Encoder Simplified
GRU (ES-GRU) is proposed to reduce the dimensionality of the data using an Auto-Encoder
(AE). Accordingly, the reset gate is replaced with an update gate in order to reduce
the redundancy and complexity of the standard GRU. Finally, a Batch Normalization
method is incorporated into the GRU and AE to improve the performance of the
proposed ES-GRU model. The proposed model has been evaluated on seven
benchmark text datasets and compared with six well-known baseline multiclass text
classification approaches, including standard GRU, AE, Long Short-Term Memory,
Convolutional Neural Network, Support Vector Machine, and Naïve Bayes. Across
various performance evaluation metrics, the proposed model shows considerable
improvement over the other standard classification techniques, demonstrating
the effectiveness and efficiency of the developed model.
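For reference, one step of the standard GRU that ES-GRU simplifies can be written with the update gate z and reset gate r as follows (a minimal NumPy sketch of the textbook cell, not the proposed ES-GRU):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h, Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh):
    """One step of a standard GRU: the update gate z interpolates between
    the previous state and a candidate state; the reset gate r controls
    how much of the previous state feeds the candidate."""
    z = sigmoid(Wz @ x + Uz @ h + bz)               # update gate
    r = sigmoid(Wr @ x + Ur @ h + br)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h) + bh)   # candidate state
    return (1.0 - z) * h + z * h_tilde              # new hidden state

# Tiny smoke test with a 2-d input and a 3-d hidden state
rng = np.random.default_rng(0)
d, n = 2, 3
params = [rng.standard_normal(s) for s in [(n, d), (n, n), (n,)] * 3]
h = gru_cell(rng.standard_normal(d), np.zeros(n), *params)
print(h.shape)  # (3,)
```

The redundancy the abstract refers to is visible here: z and r have identical parameterizations, and ES-GRU removes the separate reset-gate computation.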
Domain adaptation in Natural Language Processing
Domain adaptation has received much attention in the past decade. It has been shown that domain knowledge is paramount for building successful Natural Language Processing (NLP) applications.
To investigate the domain adaptation problem, we conduct several experiments from different perspectives. First, we automatically adapt sentiment dictionaries for predicting the financial outcomes “excess return” and “volatility”. In these experiments, we compare manual adaptation of the domain-general dictionary with automatic adaptation, and manual adaptation with a combination consisting of first manual, then automatic adaptation. We demonstrate that automatic adaptation performs better than manual adaptation, namely the automatically adapted sentiment dictionary outperforms the previous state of the art in predicting excess return and volatility. Furthermore, we perform qualitative and quantitative analyses finding that annotation based on an expert’s a priori belief about a word’s meaning is error-prone – the meaning of a word can only be recognized in the context that it appears in.
Second, we develop the temporal transfer learning approach to account for the language change in social media. The language of social media is changing rapidly – new words appear in the vocabulary, and new trends are constantly emerging. Temporal transfer-learning allows us to model these temporal dynamics in the document collection. We show that this method significantly improves the prediction of movie sales from discussions on social media forums. In particular, we illustrate the success of parameter transfer, the importance of textual information for financial prediction, and show that temporal transfer learning can capture temporal trends in the data by focusing on those features that are relevant in a particular time step, i.e., we obtain more robust models preventing overfitting.
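The parameter-transfer idea can be sketched as warm-starting each time step's model from the previous step's weights. Everything below (the logistic-regression learner and the synthetic drifting data) is an illustrative assumption, not the thesis's actual setup:

```python
import numpy as np

def train_logreg(X, y, w0, steps=200, lr=0.1):
    """Plain gradient-descent logistic regression, warm-started at w0."""
    w = w0.copy()
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * X.T @ (p - y) / len(y)
    return w

# Parameter transfer across time steps: the model for step t starts from
# the weights learned at step t-1, so stable features carry over while
# drifted ones are re-estimated on the newer data.
rng = np.random.default_rng(1)
w = np.zeros(4)
for t in range(3):                        # three document "time slices"
    X = rng.standard_normal((50, 4))
    y = (X[:, 0] + 0.1 * t * X[:, 1] > 0).astype(float)
    w = train_logreg(X, y, w0=w)          # warm start from previous step
print(w.shape)  # (4,)
```

Focusing each step's update on the freshest slice while inheriting earlier parameters is what lets the model track temporal trends without refitting from scratch.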
Third, we compare the performance of various domain adaptation models in low-resource settings, i.e., when there is a lack of large amounts of high-quality training data. This is an important issue in computational linguistics since the success of NLP applications primarily depends on the availability of training data. In real-world scenarios, the data is often too restricted and specialized. In our experiments, we evaluate different domain adaptation methods under these assumptions and find the most appropriate techniques for such a low-data problem. Furthermore, we discuss the conditions under which one approach substantially outperforms the other.
Finally, we summarize our work on domain adaptation in NLP and discuss possible future work topics.