376 research outputs found
A generative adversarial network for single and multi-hop distributional knowledge base completion
Knowledge bases (KBs) inherently lack reasoning ability, limiting their effectiveness for tasks such as question-answering and query expansion. Machine learning is hence commonly employed for representation learning in order to learn semantic features useful for generalization. Most existing methods utilize discriminative models that require both positive and negative samples to learn a decision boundary. KBs, by contrast, contain only positive samples, necessitating that negative samples are generated by replacing the head/tail of predicates with randomly chosen entities. Such negative samples are frequently easy to discriminate from positive samples, which can prevent learning of sufficiently robust classifiers. Generative models, however, do not require negative samples to learn the distribution of positive samples; stimulated by recent developments in Generative Adversarial Networks (GANs), we propose a novel framework, Knowledge Completion GANs (KCGANs), for competitively training generative link prediction models against discriminative belief prediction models. KCGAN thus invokes a game between generator-network G and discriminator-network D in which G aims to understand underlying KB structure by learning to perform link prediction while D tries to gain knowledge about the KB by learning predicate/triplet classification. Two key challenges are addressed: 1) classical GAN architectures’ inability to easily generate samples over discrete entities; 2) the inefficiency of softmax for learning distributions over large sets of entities. As a step toward full first-order logical reasoning we further extend KCGAN to learn multi-hop logical entailment relations between entities by enabling G to compose a multi-hop relational path between entities and D to discriminate between real and fake paths.
KCGAN is tested on the benchmark WordNet and FreeBase datasets and evaluated on link prediction and belief prediction tasks using MRR and HIT@10, achieving best-in-class performance.
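The two metrics named in this abstract, MRR and HIT@10, can be illustrated with a minimal sketch. It assumes the usual definitions (mean reciprocal rank, and the fraction of test queries whose correct entity is ranked in the top k); the function names and the optimistic tie-breaking rule are illustrative assumptions, not taken from the KCGAN paper.

```python
from typing import Sequence


def rank_of_target(scores: Sequence[float], target: int) -> int:
    """1-based rank of the entity at index `target`; higher score is better.

    Assumption: ties are broken optimistically (only strictly higher
    scores push the target down).
    """
    return 1 + sum(1 for s in scores if s > scores[target])


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank over all test queries (ranks are 1-based)."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks: Sequence[int], k: int = 10) -> float:
    """Fraction of queries whose correct entity is ranked within the top k."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


if __name__ == "__main__":
    # Hypothetical ranks of the correct entity for four test triples.
    example_ranks = [1, 2, 4, 20]
    print(mean_reciprocal_rank(example_ranks))
    print(hits_at_k(example_ranks, k=10))
```

Both metrics depend only on the rank of the correct entity among all candidates, so they can be computed for any link prediction model that scores candidate entities.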
A Survey of Document-Level Information Extraction
Document-level information extraction (IE) is a crucial task in natural
language processing (NLP). This paper conducts a systematic review of recent
document-level IE literature. In addition, we conduct a thorough error analysis
with current state-of-the-art algorithms and identify their limitations as well
as the remaining challenges for the task of document-level IE. According to our
findings, labeling noise, entity coreference resolution, and a lack of
reasoning severely affect the performance of document-level IE. The objective
of this survey paper is to provide more insights and help NLP researchers
further enhance document-level IE performance.
Foraging tactics and social networks in wild jackdaws
Individual variation in asocial and social behavioural traits can affect patterns of social association. Resultant individual-level variation in sociality can be quantified using social network analysis. Social network analysis has recently been applied to the study of the evolution and development of social behaviour. Though captive systems have provided useful contributions to this endeavour, investigating the factors shaping social structure in wild populations affords superior ecological relevance. The characterisation of the social structure of wild animals has been greatly aided by improvements in automated data collection methods, particularly the miniaturisation of Radio-Frequency Identification (RFID) technology for the purposes of studying the social foraging behaviour of wild birds. In this thesis, I use RFID methods to examine the factors influencing between-individual variation in foraging routines (Chapter Two) and social network position (Chapter Three) in wild populations of a colonial corvid species, the jackdaw (Corvus monedula). I then relate social network position to reproductive success (Chapter Three) and investigate the developmental plasticity of jackdaw social behaviour by determining the effect of early life conditions on social network position (Chapter Four). Finally, I describe the fine-scale temporal dynamics of social foraging, the nature of accompaniment during paired foraging and the foraging benefits of social support (Chapter Five).
Open-source resources and standards for Arabic word structure analysis: Fine grained morphological analysis of Arabic text corpora
Morphological analyzers are preprocessors for text analysis. Many Text Analytics applications need them to perform their tasks. The aim of this thesis is to develop
standards, tools and resources that widen the scope of Arabic word structure analysis - particularly morphological analysis, to process Arabic text corpora of different domains, formats and genres, of both vowelized and non-vowelized text.
We want to morphologically tag our Arabic Corpus, but evaluation of existing morphological analyzers has highlighted shortcomings and shown that more research is
required. Tag-assignment is significantly more complex for Arabic than for many languages. The morphological analyzer should add the appropriate linguistic information
to each part or morpheme of the word (proclitic, prefix, stem, suffix and enclitic); in effect, instead of a tag for a word, we need a subtag for each part.
Very fine-grained distinctions may cause problems for automatic morphosyntactic analysis, particularly for probabilistic taggers which require training data, if some words can change grammatical tag depending on function and context; on the other hand, fine-grained distinctions may actually help to disambiguate other words in the local context. The SALMA – Tagger is a fine-grained morphological analyzer which mainly depends on linguistic information extracted from traditional Arabic grammar books and on prior-knowledge broad-coverage lexical resources: the SALMA – ABCLexicon.
More fine-grained tag sets may be more appropriate for some tasks. The SALMA – Tag Set is a theory standard for encoding, which captures long-established traditional
fine-grained morphological features of Arabic, in a notation format intended to be compact yet transparent.
The SALMA – Tagger has been used to lemmatize the 176-million-word Arabic Internet Corpus. It has been proposed as a language-engineering toolkit for Arabic lexicography and for phonetically annotating the Qur’an with syllable and primary stress information, as well as for fine-grained morphological tagging.
G-CREWE: Graph CompREssion With Embedding for Network Alignment
Network alignment is useful for multiple applications that require
increasingly large graphs to be processed. Existing research approaches this as
an optimization problem or computes the similarity based on node
representations. However, the process of aligning every pair of nodes between
relatively large networks is time-consuming and resource-intensive. In this
paper, we propose a framework, called G-CREWE (Graph CompREssion With
Embedding) to solve the network alignment problem. G-CREWE uses node embeddings
to align the networks on two levels of resolution, a fine resolution given by
the original network and a coarse resolution given by a compressed version, to
achieve an efficient and effective network alignment. The framework first
extracts node features and learns the node embedding via a Graph Convolutional
Network (GCN). Then, node embedding helps to guide the process of graph
compression and finally improve the alignment performance. As part of G-CREWE,
we also propose a new compression mechanism called MERGE (Minimum dEgRee
neiGhbors comprEssion) to reduce the size of the input networks while
preserving the consistency in their topological structure. Experiments on all
real networks show that our method is more than twice as fast as the most
competitive existing methods while maintaining high accuracy.
Comment: 10 pages, accepted at the 29th ACM International Conference
on Information and Knowledge Management (CIKM 20
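The embedding-based alignment mentioned in this abstract ("computes the similarity based on node representations") can be sketched as follows. This is only the brute-force all-pairs baseline that the abstract describes as time-consuming, not G-CREWE or its MERGE compression step; the embeddings, node names, and function names are hypothetical, and in practice the vectors would come from a model such as a GCN.

```python
import math
from typing import Dict, Hashable, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def align_by_embedding(
    source: Dict[Hashable, Sequence[float]],
    target: Dict[Hashable, Sequence[float]],
) -> Dict[Hashable, Hashable]:
    """Map each source node to its most similar target node.

    Every source node is compared against every target node, so the cost
    grows with len(source) * len(target).
    """
    return {
        s: max(target, key=lambda t: cosine(vec, target[t]))
        for s, vec in source.items()
    }


if __name__ == "__main__":
    # Hypothetical 2-dimensional embeddings for two tiny networks.
    src = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
    tgt = {"x": [0.1, 0.9], "y": [0.8, 0.2]}
    print(align_by_embedding(src, tgt))
```

The quadratic number of comparisons in this baseline is the cost that motivates aligning a compressed, coarse-resolution version of the networks first.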
Insight into social physics: uncovering the structure and dynamics of social relationships
This thesis investigates the emerging interdisciplinary field of social physics, which applies
concepts and methods from physics, mathematics and anthropology to understand human
behaviour in social systems. Our research seeks to elucidate how humans organise their
social relationships and how they evolve over time by examining the universal principles
underpinning these phenomena. The basis of our investigation is the concept of the “social atom”,
which serves as a foundation for studying ego-networks at the micro-level and exploring the
collective behaviour of social systems at the macro-level.
We embark on two complementary research approaches to address this complex problem.
Our first approach involves conducting field research by surveying high school students about
their friendships and enmities over two academic years. This empirical data enables us to
analyse the organisation and evolution of social relationships, providing valuable insights that
can be shared with school principals to foster a more positive social atmosphere and prevent
important issues such as bullying.
Our second approach aligns with the conventional scientific method. It involves the
formulation of hypotheses, the development of network models and their testing. To do
that, we employ exponential random graph models and density functional theory, a technique
originating from statistical mechanics for analysing lattice gases. This approach demonstrates
that social networks can exhibit phenomena comparable to those observed in fluids or gases,
such as phase transitions. These findings contribute to a more profound understanding of
the behaviour exhibited by social systems.
Moreover, we expand the applicability of these models to include other species, such as
primates, demonstrating their relevance beyond human social relationships. We establish
a formalism that can be employed to address social physics problems more effectively by
synthesising the insights derived from both research approaches. This integrative method
advances our understanding of the discipline and paves the way for more accurate and effective
solutions.
Through the combination of field research, network modelling and the extension of these
models to other species, this thesis makes a substantial contribution to the field of social
physics. Our research provides a solid foundation for future studies and applications aimed
at improving the understanding and management of complex social systems by uncovering
the fundamental mechanisms governing human social behaviour.
Doctoral Programme in Mathematical Engineering, Universidad Carlos III de Madrid. President: Luis Mario Floria Peralta. Secretary: Alberto Antonioni. Member: María Pereda Garcí
Graph Learning and Its Applications: A Holistic Survey
Graph learning is a prevalent domain that endeavors to learn the intricate
relationships among nodes and the topological structure of graphs. These
relationships endow graphs with uniqueness compared to conventional tabular
data, as nodes rely on non-Euclidean space and encompass rich information to
exploit. Over the years, graph learning has transcended from graph theory to
graph data mining. With the advent of representation learning, it has attained
remarkable performance in diverse scenarios, including text, image, chemistry,
and biology. Owing to its extensive application prospects, graph learning
attracts copious attention from the academic community. Despite numerous works
proposed to tackle different problems in graph learning, there is a demand to
survey previous valuable works. While some researchers have perceived this
phenomenon and accomplished impressive surveys on graph learning, they failed
to connect related objectives, methods, and applications in a more coherent
way. As a result, they did not encompass current ample scenarios and
challenging problems due to the rapid expansion of graph learning. Different
from previous surveys on graph learning, we provide a holistic review that
analyzes current works from the perspective of graph structure, and discusses
the latest applications, trends, and challenges in graph learning.
Specifically, we commence by proposing a taxonomy from the perspective of the
composition of graph data and then summarize the methods employed in graph
learning. We then provide a detailed elucidation of mainstream applications.
Finally, based on the current trend of techniques, we propose future
directions.
Comment: 20 pages, 7 figures, 3 tables
Multi-Criteria Decision Making in Complex Decision Environments
In the future, many decisions will either be fully automated or supported by autonomous systems. Consequently, it is of high importance that we understand how to integrate human preferences correctly. This dissertation dives into the research field of multi-criteria decision making and investigates the satellite image acquisition scheduling problem and the unmanned aerial vehicle routing problem to further the research on a priori preference integration frameworks. The work will aid in the transition towards autonomous decision making in complex decision environments. A discussion on the future of pairwise and setwise preference articulation methods is also undertaken. "Simply put, a direct consequence of the improved decision-making methods is, that bad decisions more clearly will stand out as what they are - bad decisions."
Models, services and security in modern online social networks
Modern online social networks have revolutionized the world in the same way the radio and the plane did, crossing geographical and time boundaries, though not without problems. More can be learned about them, they can still change our world, and their true worth is still a question for the future.