Transforming Graph Representations for Statistical Relational Learning
Relational data representations have become an increasingly important topic
due to the recent proliferation of network datasets (e.g., social, biological,
information networks) and a corresponding increase in the application of
statistical relational learning (SRL) algorithms to these domains. In this
article, we examine a range of representation issues for graph-based relational
data. Since the choice of relational data representation for the nodes, links,
and features can dramatically affect the capabilities of SRL algorithms, we
survey approaches and opportunities for relational representation
transformation designed to improve the performance of these algorithms. This
leads us to introduce an intuitive taxonomy for data representation
transformations in relational domains that incorporates link transformation and
node transformation as symmetric representation tasks. In particular, the
transformation tasks for both nodes and links include (i) predicting their
existence, (ii) predicting their label or type, (iii) estimating their weight
or importance, and (iv) systematically constructing their relevant features. We
motivate our taxonomy through detailed examples and use it to survey and
compare competing approaches for each of these tasks. We also discuss general
conditions for transforming links, nodes, and features. Finally, we highlight
challenges that remain to be addressed.
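Task (i) for links, predicting their existence, can be illustrated with a minimal common-neighbours scorer over an adjacency structure; the toy graph and the unweighted count are illustrative assumptions, not a method from the survey.

```python
from itertools import combinations

def common_neighbor_scores(adj):
    """Score each non-adjacent node pair by the number of shared neighbours."""
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v not in adj[u]:  # only score candidate (missing) links
            scores[(u, v)] = len(adj[u] & adj[v])
    return scores

# Hypothetical undirected graph as node -> set-of-neighbours
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b"},
}
scores = common_neighbor_scores(adj)
# Pairs ("a", "d") and ("c", "d") each share neighbour "b"
```

Pairs with higher scores are more plausible candidate links; a threshold or top-k cutoff then turns scores into predicted edges.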
Approximation contexts in addressing graph data structures
While the application of machine learning algorithms to practical problems has expanded from fixed-size input data to sequences, trees, and graphs, the composition of learning systems has developed from single models to integrated ones. Recent advances in graph-based learning algorithms include the SOMSD (Self-Organizing Map for Structured Data), the PMGraphSOM (Probability Measure Graph Self-Organizing Map), the GNN (Graph Neural Network), and the GLSVM (Graph Laplacian Support Vector Machine). A main motivation of this thesis is to investigate whether such algorithms, individually, modified, or in various combinations, would provide better performance than the more traditional artificial neural networks or kernel machine methods on some challenging practical problems. More succinctly, this thesis seeks to answer the main research question: when, or under what conditions and contexts, can graph-based models be adjusted and tailored to be most efficacious in terms of predictive or classification performance on challenging practical problems? A range of sub-questions emerges: how do we craft an effective neural learning system that integrates several graph-based and non-graph-based models; how do we integrate various graph-based and non-graph-based kernel machine algorithms; how do we enhance the capability of the integrated model on challenging problems; and how do we tackle the long-term dependency issues that degrade the performance of layer-wise graph-based neural systems? This thesis answers these questions.
Recent research on multi-stage learning models has demonstrated the efficacy of multiple layers of alternating unsupervised and supervised learning approaches. This underlies the very successful front-end feature extraction techniques in deep neural networks. However, much exploration remains possible concerning the number of layers required and the types of unsupervised or supervised learning models to use. Such issues have not been considered so far when the underlying input data structure is in the form of a graph. We explore empirically the capabilities of models of increasing complexity: the combination of the unsupervised learning algorithms SOM or PMGraphSOM, with or without a cascade connection to a multilayer perceptron, and with or without multiple subsequent layers of GNN. Such studies explore the effects of including or ignoring context. A parallel empirical study involving kernel machines with or without graph inputs has also been conducted.
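As a rough sketch of what the GNN-style models above compute, one round of neighbourhood aggregation can be written as follows; the mean aggregator, the dictionary representation, and the toy features are illustrative assumptions, not the specific models studied in the thesis.

```python
def gnn_layer(features, adj):
    """One message-passing step: each node averages its own and its neighbours' features."""
    new_features = {}
    for node, feat in features.items():
        # Gather messages from neighbours plus the node's own feature vector
        neigh = [features[n] for n in adj[node]] + [feat]
        dim = len(feat)
        new_features[node] = [sum(f[i] for f in neigh) / len(neigh) for i in range(dim)]
    return new_features

# Hypothetical 3-node path graph a - b - c with 2-dimensional features
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
out = gnn_layer(features, adj)
```

Stacking several such layers, each followed by a learned transformation, is what lets context from progressively larger neighbourhoods influence each node's representation.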
Controlling the Fairness / Accuracy Tradeoff in Recommender Systems
Recommender systems are among the most pervasive applications of machine learning. They play a pivotal role in helping users find items tailored to their taste. Although these systems intend to assist people in their information needs, they can cause implicit or explicit discrimination against individuals or groups. There are several ways that different biases can creep into recommender systems: reflection of societal and historical prejudices in datasets and during the data collection process, lack of sufficient data on minority groups, and lack of suitable evaluation methods and model designs to detect these biases and lessen the unfairness they cause are among the many reasons for unfairness in these systems. A system needs to defend against biases in its recommendation output to prevent harm and unfairness. However, integrating the goal of fairness with accuracy in recommender systems is challenging, primarily because of this goal's significant trade-offs with accuracy. Accuracy in recommender systems is the ability of the system to predict users' needs and interests accurately. Fairness, on the other hand, is a complicated concept with a variety of definitions. To use fairness as an objective, we need to define it based on the application area and the context of a problem. Additionally, we need to specify the fairness concerns of the different stakeholders involved in the recommender system and the fairness priorities of the system. Any of these aspects might conflict with the goal of accuracy. For example, if fairness for content providers means more exposure to users, increasing it might reduce accuracy. Therefore, controlling the trade-off between accuracy and fairness becomes essential. Throughout this dissertation, several recommendation models and re-ranking approaches are presented that aim to address this problem using in-processing and post-processing methods.
These approaches show promising results, but they have intrinsic limitations and therefore should not be considered ultimate solutions.
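A post-processing re-ranker of the kind described can be sketched as a greedy selection that trades predicted score against provider exposure; the `lambda_` knob and the linear penalty form are assumptions for illustration, not a specific model from the dissertation.

```python
def rerank(candidates, k, lambda_):
    """candidates: list of (item, provider, score); returns top-k items re-ranked.

    Each pick maximizes predicted score minus lambda_ times how often the
    item's provider already appears in the list, nudging exposure toward
    under-represented providers.
    """
    remaining = list(candidates)
    selected, provider_counts = [], {}
    for _ in range(min(k, len(remaining))):
        best = max(remaining,
                   key=lambda c: c[2] - lambda_ * provider_counts.get(c[1], 0))
        remaining.remove(best)
        selected.append(best[0])
        provider_counts[best[1]] = provider_counts.get(best[1], 0) + 1
    return selected

# Hypothetical candidates: provider p1 dominates the accuracy-only ranking
candidates = [("i1", "p1", 0.9), ("i2", "p1", 0.8), ("i3", "p2", 0.7)]
accuracy_only = rerank(candidates, 3, 0.0)   # pure score order
fair_adjusted = rerank(candidates, 3, 0.2)   # p2's item surfaces earlier
```

Sweeping `lambda_` from 0 upward traces out the fairness/accuracy trade-off curve that such systems aim to control.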
Statistical physics approaches to large-scale socio-economic networks
In the past decade a variety of fields has been explored by statistical physicists, leading to an increase of our quantitative understanding of various systems composed of many interacting elements, such as social systems. However, an empirical quantification of human behavior on a societal level has so far proved to be tremendously difficult due to problems in data availability, quality, and ways of acquisition. In this doctoral thesis we compile for the first time a large-scale data set consisting of practically all actions and properties of the 350,000-odd participants of an entire human society interacting in a self-developed Massive Multiplayer Online Game, over a period of five years. We describe this social system composed of strongly interacting players in three consecutive levels. In a first step, we examine the individuals and their behavioral properties over time. A scaling and fluctuation analysis of action-reaction time series reveals persistence of the possible actions and qualitative differences between "good" and "bad" players. We then study and model the diffusion process of human mobility occurring within the "game universe".
We find subdiffusion and a power-law distributed preference to return to more recently visited locations. Second, on a higher level, we use network theory to quantify the topological structure of interactions between the individuals. We focus on six network types defined by direct interactions, three of them with a positive connotation (trade, friendship, communication) and three with a negative one (enmity, attack, punishment). These networks exhibit non-trivial statistical properties, e.g. scale-free topology, and evolve over time, allowing us to test a series of long-standing social-dynamics hypotheses. We find qualitative differences in evolution and topological structure between positive and negative tie networks. Finally, on a yet higher level, we consider the multiplex network of the player society, constituted by the coupling of the single network layers. We quantify interactions between different networks and detect the non-trivial organizational principles which lead to the observed structure of the system and which have been observed in real human societies as well. Our findings with the multiplex framework provide evidence for the half-century-old hypothesis of structural balance, whereby certain frustrated states on a microscopic level tend to be avoided. Within this setup we demonstrate the feasibility of generating novel scientific insights on the nature of collective human behavior in large-scale social systems.
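The structural-balance test can be illustrated by counting balanced triangles in a small signed network: a triangle is balanced when the product of its three edge signs is positive. The toy edge list below is an assumption for illustration, not the thesis data.

```python
from itertools import combinations

def balanced_fraction(nodes, sign):
    """sign maps frozenset({u, v}) -> +1 or -1; returns the share of balanced triangles."""
    balanced = total = 0
    for tri in combinations(nodes, 3):
        edges = [frozenset(pair) for pair in combinations(tri, 2)]
        if all(e in sign for e in edges):  # only fully connected triples count
            total += 1
            product = 1
            for e in edges:
                product *= sign[e]
            if product > 0:
                balanced += 1
    return balanced / total if total else 0.0

# Hypothetical signed ties: + for friendship/trade, - for enmity/attack
sign = {
    frozenset({"a", "b"}): +1,
    frozenset({"b", "c"}): +1,
    frozenset({"a", "c"}): +1,  # +++ triangle (a, b, c): balanced
    frozenset({"a", "d"}): -1,
    frozenset({"b", "d"}): -1,  # +-- triangle (a, b, d): balanced
    frozenset({"c", "d"}): +1,  # makes (a, c, d) and (b, c, d) frustrated (++-)
}
frac = balanced_fraction(["a", "b", "c", "d"], sign)
```

Comparing this fraction against a sign-shuffled null model is the usual way to decide whether frustrated triangles are avoided more often than chance would predict.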
Enhancing trustability in MMOGs environments
Massively Multiplayer Online Games (MMOGs; e.g., World of Warcraft), virtual worlds
(VWs; e.g., Second Life), and social networks (e.g., Facebook) strongly demand more
autonomic security and trust mechanisms, akin to those humans use in real life.
As is known, this is a difficult matter, because trusting humans and organizations
depends on the perception and experience of each individual, which is difficult to
quantify or measure. In fact, these societal environments lack trust mechanisms similar
to those involved in human-to-human interactions. Besides, interactions mediated
by computing devices are constantly evolving, requiring trust mechanisms that keep
pace with these developments and assess risk situations.
In VWs/MMOGs, it is widely recognized that users develop trust relationships from their
in-world interactions with others. However, these trust relationships end up not being
represented in the data structures (or databases) of such virtual worlds, though they
sometimes appear associated with reputation and recommendation systems. In addition,
as far as we know, the user is not provided with a personal trust tool to sustain his/her
decision making while he/she interacts with other users in the virtual or game world.
To solve this problem, as well as those mentioned above, we propose herein a
formal representation of these personal trust relationships, which are based on
avatar-avatar interactions. The leading idea is to provide each avatar-impersonated player
with a personal trust tool that follows a distributed trust model, i.e., the trust data is
distributed over the societal network of a given VW/MMOG.
Representing, manipulating, and inferring trust from the user/player point of view is
certainly a grand challenge. When someone meets an unknown individual, the question
is "Can I trust him/her or not?". Clearly, this requires the user to have access to
a representation of trust about others but, unless we are using an open-source VW/MMOG,
it is difficult (not to say unfeasible) to get access to such data. Even in an open-source
system, a number of users may refuse to share information about their friends, acquaintances,
or others. By putting together their own data and data gathered from
others, the avatar-impersonated player should be able to arrive at a trust result
about the current trustee. For the trust assessment method used in this thesis, we use
subjective logic operators and graph search algorithms to undertake such trust inference
about the trustee. The proposed trust inference system has been validated using
a number of OpenSimulator (opensimulator.org) scenarios, which showed an increase
in accuracy when evaluating the trustability of avatars.
Summing up, our proposal thus aims to introduce a trust theory for virtual worlds, with its
trust assessment metrics (e.g., subjective logic) and trust discovery methods (e.g.,
graph search methods), on an individual basis, rather than relying on the usual centralized
reputation systems. In particular, and unlike other trust discovery methods, our methods
run at interactive rates.
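The path-based trust inference described above can be sketched with the subjective-logic discounting operator, which propagates an opinion (belief, disbelief, uncertainty) along a chain of avatars; the sample opinion values are assumptions for illustration, not results from the thesis.

```python
def discount(trust_in_advisor, advisor_opinion):
    """Josang's discounting operator: weaken an advisor's opinion about a
    trustee by our own opinion of the advisor. Opinions are (belief,
    disbelief, uncertainty) triples summing to 1."""
    b1, d1, u1 = trust_in_advisor
    b2, d2, u2 = advisor_opinion
    return (b1 * b2,            # belief survives only via belief in the advisor
            b1 * d2,            # likewise for disbelief
            d1 + u1 + b1 * u2)  # everything else becomes uncertainty

# Hypothetical scenario: avatar A trusts B, and B holds an opinion about
# the unknown avatar C that A is trying to assess.
a_trusts_b = (0.8, 0.1, 0.1)
b_about_c = (0.6, 0.2, 0.2)
a_about_c = discount(a_trusts_b, b_about_c)
# belief ~ 0.48, disbelief ~ 0.16, uncertainty ~ 0.36; still sums to 1
```

Chaining this operator along paths found by a graph search, and fusing the results of independent paths, yields the kind of per-avatar trust estimate the proposal describes.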
Integrating Distributional, Compositional, and Relational Approaches to Neural Word Representations
When the field of natural language processing (NLP) entered the era of deep neural networks, the task of representing basic units of language, an inherently sparse and symbolic medium, using low-dimensional dense real-valued vectors, or embeddings, became crucial.
The dominant technique to perform this task has for years been to segment input text sequences into space-delimited words, for which embeddings are trained over a large corpus by means of leveraging distributional information: a word is reducible to the set of contexts it appears in.
This approach is powerful but imperfect; words not seen during the embedding learning phase, known as out-of-vocabulary words (OOVs), emerge in any plausible application where embeddings are used.
One approach to combating this and other shortcomings is the incorporation of compositional information obtained from the surface form of words, enabling the representation of morphological regularities and increasing robustness to typographical errors.
Another approach leverages word-sense information and relations curated in large semantic graph resources, offering a supervised signal for embedding space structure and improving representations for domain-specific rare words.
In this dissertation, I offer several analyses and remedies for the OOV problem based on the utilization of character-level compositional information in multiple languages and the structure of semantic knowledge in English.
In addition, I provide two novel datasets for the continued exploration of vocabulary expansion in English: one with a taxonomic emphasis on novel word formation, and the other generated by a real-world data-driven use case in the entity graph domain.
Finally, recognizing the recent shift in NLP towards contextualized representations of subword tokens, I describe the form in which the OOV problem still appears in these methods, and apply an integrative compositional model to address it.
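The compositional remedy above can be sketched fastText-style: an OOV word's vector is assembled from the vectors of its character n-grams. The tiny n-gram table, the padding markers, and the dimensionality are illustrative assumptions, not the dissertation's actual models.

```python
def char_ngrams(word, n=3):
    """Character n-grams of a word padded with boundary markers < and >."""
    padded = f"<{word}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def oov_vector(word, ngram_vectors, dim=4):
    """Average the vectors of the word's known character n-grams."""
    grams = [g for g in char_ngrams(word) if g in ngram_vectors]
    if not grams:
        return [0.0] * dim  # nothing known about this word's surface form
    summed = [sum(ngram_vectors[g][i] for g in grams) for i in range(dim)]
    return [x / len(grams) for x in summed]

# Hypothetical trained n-gram table with 4-dimensional vectors
ngram_vectors = {"<un": [1, 0, 0, 0], "unk": [0, 1, 0, 0], "nk>": [0, 0, 1, 0]}
vec = oov_vector("unk", ngram_vectors)
```

Because shared surface fragments map to shared n-grams, morphologically related words and misspellings land near each other even when the full word was never seen in training.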
Knowledge discovery in databases at a conceptual level
Knowledge Discovery in Databases (KDD) is a nontrivial process
centered around one or more applications of a Machine Learning
algorithm to real world data. Steps leading towards this central
step prepare the examples from which the algorithm learns, and
thus create the example representation language. Steps following
the central step may deploy the learned results to new data. In
this thesis, the complete process is described from a conceptual
view, and the MiningMart software is presented which supports the
whole process, but puts its focus on data preparation for KDD. This
preparation phase is the most time-consuming part of the process,
and is comprehensively supported in new ways by the contributions
towards MiningMart made in this thesis. The result is a great reduction in user
effort for rapid prototyping, modelling, execution, publication, and re-use of
KDD processes.
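The preparation steps described above can be pictured as a chain of reusable operators over example tables; this tiny pipeline is an illustrative abstraction of such preprocessing, not MiningMart's actual operator model.

```python
def impute_missing(rows, column, default):
    """Replace missing (None) values in a column with a default."""
    return [{**r, column: r[column] if r[column] is not None else default}
            for r in rows]

def discretize(rows, column, threshold, low="low", high="high"):
    """Turn a numeric column into a two-valued symbolic one."""
    return [{**r, column: low if r[column] < threshold else high}
            for r in rows]

# Hypothetical example table with a gap in the "age" attribute
rows = [{"age": 25}, {"age": None}, {"age": 61}]
prepared = discretize(impute_missing(rows, "age", 40), "age", 50)
```

Chaining such declared operators, rather than writing ad hoc scripts, is what makes a preparation phase inspectable, publishable, and reusable across KDD processes.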
Novel approaches for hierarchical classification with case studies in protein function prediction
A very large amount of research in the data mining, machine learning, statistical pattern recognition, and related research communities has focused on flat classification problems. However, many real-world problems, such as hierarchical protein function prediction, have their classes naturally organised into hierarchies. The task of hierarchical classification, however, needs to be better defined, as researchers in one application domain are often unaware of similar efforts developed in other research areas.
The first contribution of this thesis is to survey the task of hierarchical classification across different application domains and to present a unifying framework for the task. After clearly defining the problem, we explore novel approaches to the task.
Based on the understanding gained by surveying the task of hierarchical classification, there are three major approaches to dealing with hierarchical classification problems. The first approach is to use one of the many existing flat classification algorithms to predict only the leaf classes in the hierarchy. Note that, in the training phase, this approach completely ignores the hierarchical class relationships, i.e. the parent-child and sibling class relationships, but in the testing phase the ancestral classes of an instance can be inferred from its predicted leaf classes. The second approach is to build a set of local models, by training one flat classification algorithm for each local view of the hierarchy. The two main variations of this approach are: (a) training a local flat multi-class classifier at each non-leaf class node, where each classifier discriminates among the child classes of its associated class; or (b) training a local flat binary classifier at each node of the class hierarchy, where each classifier predicts whether or not a new instance has the classifier's associated class. In both these variations, in the testing phase a procedure is used to combine the predictions of the set of local classifiers in a coherent way, avoiding inconsistent predictions. The third approach is to use a global-model hierarchical classification algorithm, which builds one single classification model by taking into account all the hierarchical class relationships in the training phase. In the context of this categorization of hierarchical classification approaches, the other contributions of this thesis are as follows.
The second contribution of this thesis is a novel algorithm based on the local-classifier-per-parent-node approach: a selective representation approach that automatically selects the best protein representation to use at each non-leaf class node.
The third contribution is a global-model hierarchical classification extension of the well-known Naive Bayes algorithm. Given the good predictive performance of the global-model hierarchical classification Naive Bayes algorithm, we relax Naive Bayes' assumption that attributes are independent of each other given the class by using the concept of k dependencies. Hence, we extend the flat classification k-Dependence Bayesian network classifier to the task of hierarchical classification, which is the fourth contribution of this thesis.
Both the proposed global-model hierarchical classification Naive Bayes and the proposed global-model hierarchical k-Dependence Bayesian network classifier have achieved predictive accuracies that were, overall, significantly higher than those obtained by their corresponding local hierarchical classification versions, across a number of datasets for the task of hierarchical protein function prediction.
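The first of the three approaches surveyed, predicting only a leaf class and inferring its ancestors at test time, can be sketched as a simple walk up the class hierarchy; the toy protein-function hierarchy is an illustrative assumption.

```python
def ancestors(cls, parent):
    """Walk up a class hierarchy given as child -> parent, collecting ancestors."""
    path = []
    while cls in parent:
        cls = parent[cls]
        path.append(cls)
    return path

# Hypothetical fragment of a protein-function class hierarchy
parent = {"oxidoreductase": "enzyme", "enzyme": "protein"}

predicted_leaf = "oxidoreductase"  # output of an ordinary flat classifier
full_prediction = [predicted_leaf] + ancestors(predicted_leaf, parent)
# leaf plus its inferred ancestor classes, guaranteed hierarchy-consistent
```

Because every predicted class set is a leaf plus its ancestors, this scheme can never produce the inconsistent predictions that the local-model variations must explicitly guard against.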