104 research outputs found
Fixed-Parameter Algorithms for Computing Kemeny Scores - Theory and Practice
The central problem in this work is to compute a ranking of a set of elements
which is "closest to" a given set of input rankings of the elements. We define
"closest to" in an established way as having the minimum sum of Kendall-Tau
distances to each input ranking. Unfortunately, the resulting problem Kemeny
consensus is NP-hard for instances with n input rankings, n being an even
integer greater than three. Nevertheless this problem plays a central role in
many rank aggregation problems. It was shown that one can compute the
corresponding Kemeny consensus list in f(k) + poly(n) time, where f(k) is a
computable function in one of the parameters "score of the consensus", "maximum
distance between two input rankings", "number of candidates" and "average
pairwise Kendall-Tau distance" and poly(n) a polynomial in the input size. This
work will demonstrate the practical usefulness of the corresponding algorithms
by applying them to randomly generated data and to several real-world data sets. Thus, we
show that these fixed-parameter algorithms are not only of theoretical
interest. In a more theoretical part of this work we will develop an improved
fixed-parameter algorithm for the parameter "score of the consensus" having a
better upper bound for the running time than previous algorithms. Comment: Studienarbei
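The definition of "closest to" used above (minimum sum of Kendall-Tau distances) can be made concrete with a small sketch. This is a brute-force illustration that tries every permutation, so it is only usable for a handful of candidates; it is not one of the fixed-parameter algorithms the abstract refers to. The function names `kendall_tau` and `kemeny_consensus` and the sample votes are assumptions made for this example.

```python
from itertools import combinations, permutations


def kendall_tau(r1, r2):
    """Count the candidate pairs that the two rankings order differently."""
    pos1 = {c: i for i, c in enumerate(r1)}
    pos2 = {c: i for i, c in enumerate(r2)}
    return sum(
        1
        for a, b in combinations(r1, 2)
        if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0
    )


def kemeny_consensus(rankings):
    """Return (consensus, score) by exhaustive search over all permutations.

    Assumes every input ranking is a full ranking of the same candidates.
    """
    candidates = sorted(rankings[0])
    best, best_score = None, None
    for perm in permutations(candidates):
        score = sum(kendall_tau(perm, r) for r in rankings)
        if best_score is None or score < best_score:
            best, best_score = list(perm), score
    return best, best_score


# Hypothetical input: three votes over three candidates.
votes = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
consensus, score = kemeny_consensus(votes)
print(consensus, score)
```

The exhaustive search costs time factorial in the number of candidates, which is the behaviour the parameterized algorithms are meant to improve on when one of the listed parameters is small.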
Graphlet based network analysis
Most existing work on network analysis studies properties that are related to the global topology of a network. Examples of such properties include diameter, power-law exponent, and spectra of graph Laplacians. Such work enhances our understanding of real-life networks, or enables us to generate synthetic graphs with real-life graph properties. However, many of the existing problems on networks require the study of local topological structures of a network.
Graphlets, which are small induced subgraphs, capture the local topological structure of a network effectively. They have become increasingly popular for characterizing large networks in recent years. Graphlet based network analysis can vary based on the types of topological structures considered and the kinds of analysis tasks. For example, one of the most popular and earliest graphlet analyses is based on triples (triangles or paths of length two). Graphlet analysis based on cycles and cliques has also been explored in several recent works. Another, more comprehensive class of graphlet analysis methods works with graphlets of specific sizes; graphlets with three, four or five nodes ({3, 4, 5}-Graphlets) are particularly popular. For all the above analysis tasks, excessive computational cost is a major challenge, which becomes severe when analyzing large networks with millions of vertices. To overcome this challenge, effective methodologies are urgently needed. Furthermore, the existence of efficient methods for graphlet analysis will encourage more work that broadens the scope of graphlet analysis.
For graphlet counting, we propose edge iteration based methods (ExactTC and ExactGC) for efficiently computing triple and graphlet counts. The proposed methods compute local graphlet statistics in the neighborhood of each edge in the network and then aggregate the local statistics to give the global characterization (transitivity, graphlet frequency distribution (GFD), etc.) of the network. Scalability of the proposed methods is further improved by iterating over a sampled set of edges and estimating the triangle count (ApproxTC) and graphlet count (Graft) by approximate rescaling of the aggregated statistics. Because the local feature vector of each edge is constructed independently, the methods are embarrassingly parallelizable. We show this by giving a parallel edge iteration method, ParApproxTC, for triangle counting.
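The edge iteration idea can be sketched as follows: for every edge, count the common neighbors of its endpoints, then aggregate the per-edge counts into a global triangle count and transitivity. This is a minimal sketch of the general technique under simple assumptions (undirected simple graph given as an edge list), not the ExactTC implementation itself; the function names are hypothetical.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map, ignoring self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def triangle_count_and_transitivity(edges):
    """Aggregate per-edge common-neighbor counts into global statistics."""
    adj = build_adjacency(edges)
    seen = set()
    per_edge_total = 0
    for u, v in edges:
        key = frozenset((u, v))
        if u == v or key in seen:
            continue  # skip self-loops and duplicate edges
        seen.add(key)
        # Local statistic: triangles that contain the edge (u, v).
        per_edge_total += len(adj[u] & adj[v])
    # Every triangle is seen once from each of its three edges.
    triangles = per_edge_total // 3
    # Triples (paths of length two, open or closed), counted by center node.
    triples = sum(len(n) * (len(n) - 1) // 2 for n in adj.values())
    transitivity = 3 * triangles / triples if triples else 0.0
    return triangles, transitivity


# Hypothetical example: one triangle (1, 2, 3) with a pendant node 4.
print(triangle_count_and_transitivity([(1, 2), (2, 3), (1, 3), (3, 4)]))
```

Since each edge's local count depends only on the two adjacency sets involved, the loop body can be evaluated independently per edge, which is what makes this style of computation easy to parallelize or to run on a sampled subset of edges.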
For graphlet sampling, we propose Markov Chain Monte Carlo (MCMC) sampling based methods for triple and graphlet analysis. The proposed triple analysis methods, Vertex-MCMC and Triple-MCMC, estimate triangle count and network transitivity. Vertex-MCMC samples triples in two steps. First, the method selects a node (using the MCMC method) with probability proportional to the number of triples of which the node is the center. Then Vertex-MCMC samples uniformly from the triples centered at the selected node. The method Triple-MCMC samples triples by performing an MCMC walk in a triple sample space. The triple sample space consists of all the possible triples in a network. The MCMC method performs triple sampling by walking from one triple to one of its neighboring triples in the triple space. We design the triple space in such a way that two triples are neighbors only if they share exactly two nodes.
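The two-step scheme described for Vertex-MCMC can be illustrated with a Metropolis-Hastings walk over nodes. The sketch below is an assumption-laden illustration, not the authors' implementation: the acceptance rule is the standard Metropolis-Hastings ratio for a uniform-neighbor proposal, there is no burn-in or thinning, and the function names are hypothetical.

```python
import random


def triples_centered(adj, v):
    """Number of triples (paths of length two) with v as the center."""
    d = len(adj[v])
    return d * (d - 1) // 2


def vertex_mcmc_transitivity(adj, steps, seed=0):
    """Estimate transitivity as the fraction of sampled triples that are closed."""
    rng = random.Random(seed)
    start = [v for v in adj if triples_centered(adj, v) > 0]
    if not start or steps <= 0:
        return 0.0
    current = start[0]
    closed = 0
    for _ in range(steps):
        # Step 1: MCMC move. Target distribution is proportional to the
        # number of triples centered at a node; proposal is a uniform neighbor.
        proposal = rng.choice(sorted(adj[current]))
        ratio = (triples_centered(adj, proposal) * len(adj[current])) / (
            triples_centered(adj, current) * len(adj[proposal])
        )
        if rng.random() < min(1.0, ratio):
            current = proposal
        # Step 2: sample a triple uniformly among those centered at `current`.
        a, b = rng.sample(sorted(adj[current]), 2)
        closed += b in adj[a]  # closed triple means a triangle
    return closed / steps


# Hypothetical example: the complete graph on four nodes.
k4 = {v: {u for u in range(4) if u != v} for v in range(4)}
print(vertex_mcmc_transitivity(k4, steps=200))
```

Note that each step only inspects the neighborhoods of the current node and the proposed node, which mirrors the local-access property the text goes on to emphasize.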
The proposed triple sampling algorithms Vertex-MCMC and Triple-MCMC are able to sample triples from any arbitrary distribution, as long as the weight of each triple is locally computable. The proposed methods are able to sample triples without knowledge of the complete network structure. Information about only the local neighborhood structure of the currently observed node or triple is enough to walk to the next node or triple. This gives the proposed methods a significant advantage: the capability to sample triples from networks that have restricted access, on which a direct sampling based method is simply not applicable. The proposed methods are also suitable for dynamic and large networks. Similar to the concept of Triple-MCMC, we propose Guise for sampling graphlets of sizes three, four and five ({3, 4, 5}-Graphlets). Guise samples graphlets by performing an MCMC walk on a graphlet sample space containing all the graphlets of sizes three, four and five in the network.
Despite the proven utility of graphlets in static network analysis, works harnessing the ability of graphlets for dynamic network analysis are yet to come. Dynamic networks contain additional time information for their edges. With time, the topological structure of a dynamic network changes: edges can appear, disappear and reappear over time. In this direction, predicting the link state of a network at a future time, given a collection of link states at earlier times, is an important task with many real-life applications. In the existing literature, this task is known as link prediction in dynamic networks. Performing this task is more difficult than its counterpart in static networks because an effective feature representation of node-pair instances for the case of a dynamic network is hard to obtain.
We design a novel graphlet transition based feature embedding for node-pair instances of a dynamic network. Our proposed method, GraTFEL, uses automatic feature learning methodologies on such graphlet transition based features to give a low-dimensional feature embedding of unlabeled node-pair instances. The feature learning task is modeled as an optimal coding task where the objective is to minimize the reconstruction error. GraTFEL solves this optimization task by using a gradient descent method. We validate the effectiveness of the learned optimal feature embedding by utilizing it for link prediction in real-life dynamic networks. Specifically, we show that GraTFEL, which uses the extracted feature embedding of graphlet transition events, outperforms existing methods that use well-known link prediction features
Higher-order structure in networks: construction and its impact on dynamics
Networks are often characterised in terms of their degree distribution and global clustering coefficient. It is assumed that these provide a sufficient parametrisation of networks. However, since the global clustering coefficient is only sensitive to the total number of triangles found in the network, it is evident that two networks could have the same number of triangles but significantly different higher-order structure, i.e., the topologies that result from the placement of closed subgraphs around nodes. The two main objectives of my work are: (1) developing network generating algorithms and network based epidemic models with controllable higher-order structure and (2) investigating the impact of higher-order structure on dynamics on networks. This thesis is based on three papers, corresponding to Chapters 3, 4 and 5. Chapter 3 presents a novel higher-order structure based network generating algorithm and subgraph counting algorithm. Chapter 4 generalises a previously proposed ODE model that accurately captures the time evolution of the susceptible-infected-recovered (SIR) dynamics on networks constructed using arbitrary subgraphs. Chapter 5 improves, extends and generalises the network generating algorithms proposed in the previous two papers. All three chapters demonstrate that for a fixed degree distribution and global clustering, diverse higher-order structure is still possible and that this structure will impact significantly on dynamics unfolding on networks. Hence, we suggest that higher-order structure should receive more attention when analysing network-based systems and dynamics
Degenerations of Lie algebras and pre-Lie algebras
In this thesis we are concerned with the orbit closure problem for algebras in algebraic transformation group theory. The general linear group Gl(V)
over a field K acts on the vector space Alg(C), the space of all K-algebra structures, by change of basis. For two K-algebra
structures A and B we say that B is a degeneration of A if B lies in the orbit closure of A with respect to the Zariski topology.
The orbit closure problem in this form is about the classification of all degenerations of a certain algebra structure
in a fixed dimension.
The main result of this work is the classification of all degenerations of Novikov algebras over C in dimension three. Such algebras form a subclass of
left-symmetric algebras, so-called pre-Lie algebras, for which we also determine all degenerations in dimension two.
This is surprisingly complicated. For example, in dimension two there are only two non-isomorphic Lie algebras, whereas there are already infinitely
many non-isomorphic 2-dimensional pre-Lie algebras. Both classifications turn out to be very extensive.
To reach these goals we generalize and extend methods that are known from the study of Lie algebra degenerations, for example the Cpq-invariant and
semi-invariants such as the dimension of the center of an algebra. We also introduce semi-invariants that apply specifically to
pre-Lie and Novikov algebras. Furthermore, we prove new results that relate degenerations in different dimensions. A substantial
statement in this direction is that if an algebra A degenerates to an algebra B, then every factor A / I by an arbitrary ideal I
in A has to degenerate to a corresponding factor of the algebra B
Uncertain Multi-Criteria Optimization Problems
Most real-world search and optimization problems naturally involve multiple criteria as objectives. Generally, symmetry, asymmetry, and anti-symmetry are basic characteristics of binary relationships used when modeling optimization problems. Moreover, the notion of symmetry has appeared in many articles about uncertainty theories that are employed in multi-criteria problems. Different solutions may produce trade-offs (conflicting scenarios) among different objectives. A better solution with respect to one objective may compromise other objectives. There are various factors that need to be considered to address the problems in multidisciplinary research, which is critical for the overall sustainability of human development and activity. In this regard, in recent decades, decision-making theory has been the subject of intense research activities due to its wide applications in different areas. The decision-making theory approach has become an important means to provide real-time solutions to uncertainty problems. Theories such as probability theory, fuzzy set theory, type-2 fuzzy set theory, rough set, and uncertainty theory, available in the existing literature, deal with such uncertainties. Nevertheless, the uncertain multi-criteria characteristics in such problems have not yet been explored in depth, and there is much left to be achieved in this direction. Hence, different mathematical models of real-life multi-criteria optimization problems can be developed in various uncertain frameworks with special emphasis on optimization problems
Efficient Axiomatization of OWL 2 EL Ontologies from Data by means of Formal Concept Analysis: (Extended Version)
We present an FCA-based axiomatization method that produces a complete EL TBox (the terminological part of an OWL 2 EL ontology) from a graph dataset in at most
exponential time. We describe technical details that allow for efficient implementation, as well as variations that dispense with the computation of extremely large axioms, thereby
rendering the approach applicable, albeit at the cost of some completeness. Moreover, we evaluate the prototype on real-world datasets. This is an extended version of an article accepted at AAAI 2024
Impact of Symmetries in Graph Clustering
This dissertation deals with the symmetry of graphs as defined by the automorphism group, and with how this symmetry affects a node partition obtained as the result of graph clustering. An analysis of nearly 1700 graphs from various application domains shows that more than 70 % of these graphs contain symmetries. This contrasts with the combinatorial proof that the probability of a random graph being symmetric tends to zero as its size increases. The result thus justifies the importance of further investigations into the possible effects of symmetry. The analysis covers graphs ranging from very small to very large (on the order of 10 000 000 nodes/>25 000 000 edges).
Furthermore, a theoretical framework is developed that allows a detailed quantification of graph symmetry and also formalizes the stability of node partitions with respect to this symmetry. A partition of the node set, defined by a division into disjoint subsets, is considered stable if no nodes are mapped by a symmetry from one subset into another in a way that changes the partition. In addition, it is defined how a possible decomposition of the automorphism group into independent subgroups can be interpreted as local symmetry, which then affects only a particular region of the graph. To quantify the effects of symmetry on the whole graph and on partitions, an entropy definition is also presented that is modeled on the analysis of dynamical systems. All definitions are general and can therefore be applied to arbitrary graphs. In part, they even apply to arbitrary cluster analyses, provided that the result is a partition and that a symmetry relation on the data points can be given as a permutation group.
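The stability notion can be illustrated with a small check: a partition is treated as stable when every given automorphism maps the set of blocks onto itself. This is a sketch of one reading of the definition, not the dissertation's formalization; it assumes the automorphisms (or generators of the automorphism group) are supplied as dictionaries mapping each node to its image, and the function names are hypothetical.

```python
def apply_permutation(partition, perm):
    """Map every block of the partition through the node permutation."""
    return {frozenset(perm[v] for v in block) for block in partition}


def is_stable(partition, generators):
    """True if each generator leaves the partition (as a set of blocks) unchanged.

    Checking generators is enough here, because the permutations that fix
    a partition form a subgroup.
    """
    blocks = {frozenset(b) for b in partition}
    return all(apply_permutation(blocks, g) == blocks for g in generators)


# Hypothetical example: the path 1-2-3-4, whose only non-trivial
# automorphism reverses the path.
reversal = {1: 4, 2: 3, 3: 2, 4: 1}

print(is_stable([{1, 2}, {3, 4}], [reversal]))  # blocks are swapped as wholes
print(is_stable([{1}, {2, 3, 4}], [reversal]))  # node 1 moves into the other block
```

In the first partition the reversal exchanges the two blocks, so the partition as a whole is unchanged; in the second, node 1 is mapped into the larger block and the partition changes.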
To investigate the actual effect of symmetry on graph clustering, a second analysis is carried out. It finds that 72 of the 629 symmetric graphs examined have an unstable partition. The analysis uses the definitions of the theoretical framework. It is also found that the locality of a graph's symmetry decisively influences whether its partition is stable or not. High locality usually results in a stable partition, and a stable partition usually implies high locality.
Before the above results are described and defined, a comprehensive introduction to the required fundamentals is given. It covers the formal definitions of graphs and statistical graph models, partitions, finite permutation groups, graph clustering and algorithms for it, and entropy. A separate chapter is devoted in detail to graph symmetry, which is described by a finite permutation group, the automorphism group. In addition, algorithms are presented that can determine the symmetry of graphs and, in part, also solve the closely related graph isomorphism problem.
Using graph clustering as an example, the dissertation thus provides insights into possible effects of symmetry in data analysis that have so far received little to no attention in the literature
Cooperation in self-organized heterogeneous swarms
Cooperation in self-organized heterogeneous swarms is a phenomenon from nature with many applications in autonomous robots. I specifically analyzed the problem of auto-regulated team formation in multi-agent systems and several strategies for learning socially how to make multi-objective decisions. To this end I proposed new multi-objective ranking relations and analyzed their properties theoretically and within multi-objective metaheuristics. The results showed that simple decision mechanisms suffice to build effective teams of heterogeneous agents and that diversity in groups is not a problem but can increase the efficiency of multi-agent systems