Representation Learning for Texts and Graphs: A Unified Perspective on Efficiency, Multimodality, and Adaptability
[...] This thesis is situated between natural language processing and graph representation learning and investigates selected connections. First, we introduce matrix embeddings as an efficient text representation sensitive to word order. [...] Experiments with ten linguistic probing tasks, 11 supervised, and five unsupervised downstream tasks reveal that vector and matrix embeddings have complementary strengths and that a jointly trained hybrid model outperforms both. Second, a popular pretrained language model, BERT, is distilled into matrix embeddings. [...] The results on the GLUE benchmark show that these models are competitive with other recent contextualized language models while being more efficient in time and space. Third, we compare three model types for text classification: bag-of-words, sequence-, and graph-based models. Experiments on five datasets show that, surprisingly, a wide multilayer perceptron on top of a bag-of-words representation is competitive with recent graph-based approaches, questioning the necessity of graphs synthesized from the text. [...] Fourth, we investigate the connection between text and graph data in document-based recommender systems for citations and subject labels. Experiments on six datasets show that the title as side information improves the performance of autoencoder models. [...] We find that the meaning of item co-occurrence is crucial for the choice of input modalities and an appropriate model. Fifth, we introduce a generic framework for lifelong learning on evolving graphs in which new nodes, edges, and classes appear over time. [...] The results show that by reusing previous parameters in incremental training, it is possible to employ smaller history sizes with only a slight decrease in accuracy compared to training with complete history. Moreover, weighting the binary cross-entropy loss function is crucial to mitigate the problem of class imbalance when detecting newly emerging classes. [...
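The abstract above notes that weighting the binary cross-entropy loss helps with class imbalance when new classes emerge. The following is a minimal sketch of one common way to do this, assuming a per-class weight on the positive term; the function name `weighted_bce` and the `pos_weight` scheme are illustrative assumptions, not the thesis's actual formulation.

```python
import numpy as np

def weighted_bce(probs, targets, pos_weight):
    """Mean binary cross-entropy with a per-class weight on the positive term.

    probs, targets: arrays of shape (n_samples, n_classes)
    pos_weight: array of shape (n_classes,); a hypothetical weighting scheme
    """
    probs = np.clip(probs, 1e-7, 1 - 1e-7)  # avoid log(0)
    loss = -(pos_weight * targets * np.log(probs)
             + (1 - targets) * np.log(1 - probs))
    return loss.mean()
```

With `pos_weight` set to all ones this reduces to the ordinary binary cross-entropy; larger weights make missed positives of rare classes more costly.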
Geographic information extraction from texts
A large volume of unstructured texts containing valuable geographic information is available online. This information, provided implicitly or explicitly, is useful not only for scientific studies (e.g., spatial humanities) but also for many practical applications (e.g., geographic information retrieval). Although considerable progress has been achieved in geographic information extraction from texts, there are still unsolved challenges and issues, ranging from methods, systems, and data to applications and privacy. Therefore, this workshop will provide a timely opportunity to discuss recent advances, new ideas, and concepts, and also to identify research gaps in geographic information extraction.
A Recipe for Well-behaved Graph Neural Approximations of Complex Dynamics
Data-driven approximations of ordinary differential equations offer a
promising alternative to classical methods in discovering a dynamical system
model, particularly in complex systems lacking explicit first principles. This
paper focuses on a complex system whose dynamics are described by a system of
ordinary differential equations, coupled via a network adjacency matrix.
Numerous real-world systems, including financial, social, and neural systems,
belong to this class of dynamical models. We propose essential elements for
approximating such dynamical systems using neural networks, including necessary
biases and an appropriate neural architecture. Emphasizing the differences from
static supervised learning, we advocate for evaluating generalization beyond
classical assumptions of statistical learning theory. To estimate confidence in
prediction during inference time, we introduce a dedicated null model. By
studying various complex network dynamics, we demonstrate the neural network's
ability to approximate diverse dynamics and to generalize across complex network
structures, sizes, and statistical properties of inputs. Our comprehensive
framework enables deep learning approximations of high-dimensional,
non-linearly coupled complex dynamical systems.
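To make the class of systems described above concrete, here is a minimal sketch of ordinary differential equations coupled via an adjacency matrix, written in the commonly used form dx_i/dt = f(x_i) + sum_j A[i, j] * g(x_i, x_j). The specific form, the diffusion example, and the forward-Euler integrator are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def coupled_rhs(x, A, f, g):
    """Right-hand side dx_i/dt = f(x_i) + sum_j A[i, j] * g(x_i, x_j)."""
    xi = x[:, None]  # shape (n, 1): state of node i along rows
    xj = x[None, :]  # shape (1, n): state of node j along columns
    return f(x) + (A * g(xi, xj)).sum(axis=1)

def euler_integrate(x0, A, f, g, dt=0.01, steps=1000):
    """Forward-Euler integration (chosen only for simplicity)."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        x = x + dt * coupled_rhs(x, A, f, g)
    return x

# Illustrative example: heat diffusion on a 3-node path graph.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
f = lambda x: np.zeros_like(x)   # no self-dynamics
g = lambda xi, xj: xj - xi       # diffusive coupling between neighbours
x_final = euler_integrate([1.0, 0.0, 0.0], A, f, g)
```

In this example the total amount of "heat" is conserved and the node states approach their common mean. A learned model would replace `f` and `g` with neural networks trained on observed trajectories.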
Towards Subject Agnostic Affective Emotion Recognition
This paper addresses affective emotion recognition from EEG signals in the
subject-agnostic paradigm. EEG signals are unstable across subjects in
subject-agnostic affective brain-computer interfaces (aBCIs), which leads to
the problem of distributional shift. This problem can be alleviated by
approaches such as domain generalisation and domain adaptation. Typically,
methods based on domain adaptation yield comparatively
better results than the domain generalisation methods but demand more
computational resources given new subjects. We propose a novel framework,
meta-learning based augmented domain adaptation for subject-agnostic aBCIs. Our
domain adaptation approach is augmented through meta-learning, which consists
of a recurrent neural network, a classifier, and a distributional shift
controller based on a sum-decomposable function. We also show that a neural
network implementing a sum-decomposable function can effectively estimate the
divergence between different domains. The network setting for augmented domain
adaptation follows meta-learning and adversarial learning, where the controller
promptly adapts to new domains employing the target data via a few
self-adaptation steps in the test phase. Our proposed approach is shown to be
effective in experiments on a public aBCIs dataset and achieves similar
performance to state-of-the-art domain adaptation methods while avoiding the
use of additional computational resources.
Comment: To Appear in MUWS workshop at the 32nd ACM International Conference
on Information and Knowledge Management (CIKM) 202
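For readers unfamiliar with the term, a function on a set is sum-decomposable if it can be written as rho(sum of phi(x) over the elements x). The sketch below shows only this general form with random, untrained weights; the names `phi`, `rho`, `W_phi`, and `W_rho` and all dimensions are assumptions for illustration and do not reproduce the paper's distributional shift controller.

```python
import numpy as np

rng = np.random.default_rng(0)
W_phi = rng.normal(size=(4, 8))  # hypothetical per-element encoder weights
W_rho = rng.normal(size=(8, 1))  # hypothetical readout weights

def phi(x):
    """Encode each element (row) independently."""
    return np.tanh(x @ W_phi)

def rho(z):
    """Map the pooled representation to a scalar."""
    return (z @ W_rho).item()

def sum_decomposable(X):
    """f(X) = rho(sum_x phi(x)) for a set X of shape (n_elements, 4)."""
    return rho(phi(X).sum(axis=0))
```

Because the elements are pooled by summation, the output does not depend on the order of the rows and the function accepts sets of any size, which is what makes this form suitable for summarising a batch of samples from a domain.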
Domain Generalization in Machine Learning Models for Wireless Communications: Concepts, State-of-the-Art, and Open Issues
Data-driven machine learning (ML) is promoted as one potential technology to
be used in next-generation wireless systems. This has led to a large body of
research work that applies ML techniques to solve problems in different layers
of the wireless transmission link. However, most of these applications rely on
supervised learning which assumes that the source (training) and target (test)
data are independent and identically distributed (i.i.d.). This assumption is
often violated in the real world due to domain or distribution shifts between
the source and the target data. Thus, it is important to ensure that these
algorithms generalize to out-of-distribution (OOD) data. In this context,
domain generalization (DG) tackles the OOD-related issues by learning models on
different and distinct source domains/datasets with generalization capabilities
to unseen new domains without additional finetuning. Motivated by the
importance of DG requirements for wireless applications, we present a
comprehensive overview of the recent developments in DG and the different
sources of domain shift. We also summarize the existing DG methods and review
their applications in selected wireless communication problems, and conclude
with insights and open questions.
Spectral-DP: Differentially Private Deep Learning through Spectral Perturbation and Filtering
Differential privacy is a widely accepted measure of privacy in the context
of deep learning algorithms, and achieving it relies on a noisy training
approach known as differentially private stochastic gradient descent (DP-SGD).
Because DP-SGD adds noise directly to every gradient in a dense neural
network, privacy is achieved at a significant utility cost. In this work,
we present Spectral-DP, a new differentially private learning approach which
combines gradient perturbation in the spectral domain with spectral filtering
to achieve a desired privacy guarantee with a lower noise scale and thus better
utility. We develop differentially private deep learning methods based on
Spectral-DP for architectures that contain both convolution and fully connected
layers. In particular, for fully connected layers, we combine a block-circulant
based spatial restructuring with Spectral-DP to achieve better utility. Through
comprehensive experiments, we study and provide guidelines to implement
Spectral-DP deep learning on benchmark datasets. In comparison with
state-of-the-art DP-SGD based approaches, Spectral-DP is shown to have
uniformly better utility performance in both training from scratch and transfer
learning settings.
Comment: Accepted in 2023 IEEE Symposium on Security and Privacy (SP
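As background for the DP-SGD baseline the abstract compares against, the following minimal sketch shows the standard noisy gradient computation: clip each per-example gradient, sum, add Gaussian noise scaled to the clipping norm, and average. It illustrates the baseline only, not Spectral-DP itself; the function and parameter names are assumptions chosen for illustration.

```python
import numpy as np

def dp_sgd_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """Clip each example's gradient, sum, add Gaussian noise, then average."""
    g = np.asarray(per_example_grads, dtype=float)  # shape (batch, dim)
    norms = np.linalg.norm(g, axis=1, keepdims=True)
    # Scale down any gradient whose L2 norm exceeds clip_norm.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = g * scale
    # Noise standard deviation is proportional to the clipping norm.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=g.shape[1])
    return (clipped.sum(axis=0) + noise) / g.shape[0]
```

The noise is added in every coordinate of the gradient, which is the source of the utility cost that the abstract says Spectral-DP aims to reduce.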
Benchmarking Robustness of Adaptation Methods on Pre-trained Vision-Language Models
Various adaptation methods, such as LoRA, prompts, and adapters, have been
proposed to enhance the performance of pre-trained vision-language models in
specific domains. The robustness of these adaptation methods against
distribution shifts has not been studied. In this study, we assess the
robustness of 11 widely-used adaptation methods across 4 vision-language
datasets under multimodal corruptions. Concretely, we introduce 7 benchmark
datasets, including 96 visual and 87 textual corruptions, to investigate the
robustness of different adaptation methods, the impact of available adaptation
examples, and the influence of trainable parameter size during adaptation. Our
analysis reveals that: 1) Adaptation methods are more sensitive to text
corruptions than visual corruptions. 2) Full fine-tuning does not consistently
provide the highest robustness; instead, adapters can achieve better robustness
with comparable clean performance. 3) Contrary to expectations, our findings
indicate that increasing the amount of adaptation data and the number of
parameters does not guarantee enhanced robustness; instead, it results in even
lower robustness. We
hope this study could benefit future research in the development of robust
multimodal adaptation methods. The benchmark, code, and dataset used in this
study can be accessed at https://adarobustness.github.io
RESEARCH ON IIOT SECURITY: NOVEL MACHINE LEARNING-BASED INTRUSION DETECTION USING TCP/IP PACKETS
The explosive expansion of the Industrial Internet of Things (IIoT) has raised questions regarding the safety of industrial systems. Intrusion detection systems (IDSs) play a crucial role in protecting such networks from a variety of cyber threats. To detect intrusions in the IIoT environment using TCP/IP packets, this work introduces a novel Hybrid Deep Convolutional Autoencoder and Splinted Decision Tree (HDCA-SDT) technique. The DCA extracts high-level features from the raw TCP/IP packet data. The extracted features are then classified by the SDT algorithm into various intrusion categories. The SDT technique partitions the feature space efficiently, enabling quicker decision-making while preserving accuracy. The model is trained and evaluated on the NSL-KDD dataset. Experimental findings demonstrate the effectiveness of the proposed hybrid strategy: compared with conventional intrusion detection methods, it achieved higher detection accuracy. The model is also robust to fluctuations in network traffic and can identify both known and unknown intrusions with high recall rates.