874 research outputs found
MEGAN: Multi-Explanation Graph Attention Network
Explainable artificial intelligence (XAI) methods are expected to improve
trust during human-AI interactions, provide tools for model analysis and extend
human understanding of complex problems. Explanation-supervised training can
improve explanation quality by training self-explaining XAI models on ground
truth or human-generated explanations. However, existing explanation methods
have limited expressiveness and interoperability because they generate only
single explanations in the form of node and edge importance. To that
end we propose the novel multi-explanation graph attention network (MEGAN). Our
fully differentiable, attention-based model features multiple explanation
channels, which can be chosen independently of the task specifications. We
first validate our model on a synthetic graph regression dataset. We show that
for the special single explanation case, our model significantly outperforms
existing post-hoc and explanation-supervised baseline methods. Furthermore, we
demonstrate significant advantages when using two explanations, both in
quantitative explanation measures as well as in human interpretability.
Finally, we demonstrate our model's capabilities on multiple real-world
datasets. We find that our model produces sparse high-fidelity explanations
consistent with human intuition about those tasks and at the same time matches
state-of-the-art graph neural networks in predictive performance, indicating
that explanations and accuracy are not necessarily a trade-off.
Comment: 9 pages main text, 29 pages total, 19 figures
New Development of Neutrosophic Probability, Neutrosophic Statistics, Neutrosophic Algebraic Structures, and Neutrosophic & Plithogenic Optimizations
This Special Issue puts forward for discussion state-of-the-art papers on new topics related to neutrosophic theories, such as neutrosophic algebraic structures, neutrosophic triplet algebraic structures, neutrosophic extended triplet algebraic structures, neutrosophic algebraic hyperstructures, neutrosophic triplet algebraic hyperstructures, neutrosophic n-ary algebraic structures, neutrosophic n-ary algebraic hyperstructures, refined neutrosophic algebraic structures, refined neutrosophic algebraic hyperstructures, quadruple neutrosophic algebraic structures, refined quadruple neutrosophic algebraic structures, neutrosophic image processing, neutrosophic image classification, neutrosophic computer vision, neutrosophic machine learning, neutrosophic artificial intelligence, neutrosophic data analytics, neutrosophic deep learning, neutrosophic symmetry, and their real-world applications. This book advances the neutrosophic and plithogenic theories of NeutroAlgebra and AntiAlgebra, NeutroGeometry and AntiGeometry, Neutrosophic n-SuperHyperGraph (the most general form of graph to date), Neutrosophic Statistics, Plithogenic Logic as a generalization of MultiVariate Logic, and Plithogenic Probability and Plithogenic Statistics as generalizations of MultiVariate Probability and Statistics, respectively, and presents their many applications in our everyday world.
Apprentissage des réseaux de neurones profonds et applications en traitement automatique de la langue naturelle [Training deep neural networks and applications in natural language processing]
Machine learning aims to leverage data in order for computers to solve problems of interest. Despite being invented close to sixty years ago, Artificial Neural Networks (ANNs) remain an area of active research and a powerful tool. Their resurgence in the context of deep learning has led to dramatic improvements in various domains, from computer vision and speech processing to natural language processing.
The ever-growing quantity of available data and improvements in computing hardware have made it easier to train high-capacity models such as deep ANNs. However, some intrinsic learning difficulties, such as local minima, remain problematic. Deep learning aims to find solutions to these problems, either by adding regularisation or by improving optimisation. Unsupervised pre-training and Dropout are examples of such solutions.
The first two articles presented in this thesis follow this line of research. The first analyzes the problem of vanishing/exploding gradients in deep architectures. It shows that simple choices, such as the activation function or the weight initialization, can have a large impact. We propose the normalized initialization scheme to improve learning. The second focuses on the activation function, where we propose the rectifier, or rectified linear unit. This work was the first to emphasise the use of piecewise linear activation functions for deep supervised neural networks; such functions are now an essential component of these models.
The last two papers show some applications of ANNs to Natural Language Processing. The first focuses on domain adaptation in the context of sentiment analysis, using Stacked Denoising Auto-encoders. It remains state of the art to this day. The second tackles learning with multi-relational data using an energy-based model, which can also be applied to the task of word-sense disambiguation.
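As a rough illustration of the two ideas named in this abstract, the sketch below initializes one dense layer and applies the rectifier to its outputs. The abstract does not state the initialization formula; the uniform bound sqrt(6 / (fan_in + fan_out)) is the commonly cited form of normalized initialization and is assumed here. The function names and the toy layer sizes are hypothetical.

```python
import math
import random


def normalized_init(fan_in, fan_out, rng):
    """Sample a fan_in x fan_out weight matrix from U(-limit, limit).

    Assumption: limit = sqrt(6 / (fan_in + fan_out)), the commonly
    cited bound for normalized initialization.
    """
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


def rectifier(x):
    """Rectified linear unit: piecewise linear, zero for negative inputs."""
    return max(0.0, x)


def dense_layer(inputs, weights):
    """Compute rectifier(inputs @ weights) for a single input vector."""
    fan_out = len(weights[0])
    return [rectifier(sum(x * row[j] for x, row in zip(inputs, weights)))
            for j in range(fan_out)]


# Toy example: 4 inputs, 3 hidden units, fixed seed for repeatability.
rng = random.Random(0)
weights = normalized_init(4, 3, rng)
outputs = dense_layer([1.0, -2.0, 0.5, 3.0], weights)
```

Scaling the bound by both fan-in and fan-out is intended to keep activation and gradient magnitudes roughly comparable from layer to layer, which is the concern the first article raises.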
Domain knowledge, uncertainty, and parameter constraints
Ph.D.
Committee Chair: Guy Lebanon; Committee Member: Alex Shapiro; Committee Member: Alexander Gray; Committee Member: Chin-Hui Lee; Committee Member: Hongyuan Zh
Will Sentiment Analysis Need Subculture? A New Data Augmentation Approach
The renowned proverb that "The pen is mightier than the sword" underscores
the formidable influence wielded by text expressions in shaping sentiments.
Indeed, well-crafted writing can deeply resonate within cultures, conveying
profound sentiments. Nowadays, the omnipresence of the Internet has fostered a
subculture that congregates around the contemporary milieu. The subculture
artfully articulates the intricacies of human feelings by ardently pursuing the
allure of novelty, a fact that cannot be disregarded in the sentiment analysis.
This paper strives to enrich data through the lens of subculture, to address
the insufficient training data faced by sentiment analysis. To this end, a new
approach of subculture-based data augmentation (SCDA) is proposed, which
engenders six enhanced texts for each training text by leveraging the creation
of six diverse subculture expression generators. The extensive experiments
attest to the effectiveness and potential of SCDA. The results also shed light
on the phenomenon that disparate subculture expressions elicit varying degrees
of sentiment stimulation. Moreover, an intriguing conjecture arises, suggesting
the linear reversibility of certain subculture expressions. It is our fervent
aspiration that this study serves as a catalyst in fostering heightened
perceptiveness towards the tapestry of information, sentiment and culture,
thereby enriching our collective understanding.
Comment: JASIS
IEEE Access Special Section Editorial: Big Data Technology and Applications in Intelligent Transportation
During the last few years, the information technology and transportation industries, along with automotive manufacturers and academia, have focused on leveraging intelligent transportation systems (ITS) to improve services related to driver experience, connected cars, Internet data plans for vehicles, traffic infrastructure, urban transportation systems, collaborative traffic management, road traffic accident analysis, road traffic flow prediction, public transportation service plans, personal travel route plans, and the development of an effective ecosystem for vehicles, drivers, traffic controllers, city planners, and transportation applications. Moreover, the emerging technologies of the Internet of Things (IoT) and cloud computing have provided unprecedented opportunities for the development and realization of innovative intelligent transportation systems, in which sensors and mobile devices gather information and cloud computing enables knowledge discovery, information sharing, and decision support. However, the development of such data-driven ITS requires the integration, processing, and analysis of plentiful information obtained from millions of vehicles, traffic infrastructures, smartphones, and other collaborative systems such as weather stations and road safety and early warning systems. The huge amount of data generated by ITS devices is only of value if it is used in data analytics for decision-making, such as accident prevention and detection, controlling road risks, reducing traffic carbon emissions, and other applications that bring big data analytics into the picture.
Cognitive satellite communications and representation learning for streaming and complex graphs.
This dissertation covers two topics. The first studies a promising dynamic spectrum access (DSA) algorithm that improves the throughput of satellite communication (SATCOM) under uncertainty. The second investigates distributed representation learning for streaming and complex networks. DSA allows a secondary user to access spectrum that is not occupied by primary users. However, uncertainty in SATCOM causes more spectrum sensing errors. In this dissertation, the uncertainty is addressed by formulating the DSA decision-making process as a Partially Observable Markov Decision Process (POMDP) model to optimally determine which channels to sense and access. Large-scale networks have attracted much attention as a means of discovering hidden information in big data. In particular, representation learning embeds the network into a lower-dimensional vector space while maximally preserving the similarity among nodes. I propose a real-time distributed graph embedding algorithm (RTDGE) that embeds a streaming graph in a distributed manner by combining a novel edge partition approach with an incremental negative sampling approach. Furthermore, a platform is prototyped based on Kafka and Storm. Real-time Twitter network data can be retrieved, partitioned, and processed for state-of-the-art tasks. For knowledge graphs, existing works cannot capture complex connection patterns and do not consider the impact of complicated relations, because the relationships are unquantifiable. A novel embedding algorithm is proposed to hierarchically measure the structural similarity and the impact of relations by constructing a multi-layer graph. Then, an advanced representation learning model is designed based on an entity's context generated by random walks on the multi-layer content graph.
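The idea of generating a node's context by random walks can be sketched as follows. This is only a minimal illustration on a toy single-layer graph with uniform neighbor sampling; the abstract does not specify how the multi-layer graph is built or how walks are biased, so the graph, function names, and parameters here are all hypothetical.

```python
import random


def random_walk(graph, start, length, rng):
    """Walk up to `length` nodes from `start`, picking neighbors uniformly."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph[walk[-1]]
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))
    return walk


def walk_contexts(graph, walks_per_node, length, seed=0):
    """Return, for every node, a list of random walks starting at it."""
    rng = random.Random(seed)
    return {node: [random_walk(graph, node, length, rng)
                   for _ in range(walks_per_node)]
            for node in graph}


# Toy undirected graph as an adjacency list (an assumption for illustration).
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
contexts = walk_contexts(toy_graph, walks_per_node=2, length=5)
```

In embedding methods of this kind, the walks are typically treated like sentences and passed to a skip-gram-style model, so that nodes appearing in similar walks receive similar vectors.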