
    Fixed-Parameter Algorithms for Computing Kemeny Scores - Theory and Practice

    The central problem in this work is to compute a ranking of a set of elements that is "closest to" a given set of input rankings of the elements. We define "closest to" in the established way, as having the minimum sum of Kendall-Tau distances to each input ranking. Unfortunately, the resulting problem, Kemeny consensus, is NP-hard for instances with n input rankings, n being an even integer greater than three. Nevertheless, this problem plays a central role in many rank aggregation problems. It was shown that one can compute the corresponding Kemeny consensus list in f(k) + poly(n) time, where f(k) is a computable function in one of the parameters "score of the consensus", "maximum distance between two input rankings", "number of candidates" and "average pairwise Kendall-Tau distance", and poly(n) is a polynomial in the input size. This work demonstrates the practical usefulness of the corresponding algorithms by applying them to randomly generated data and to several real-world data sets. Thus, we show that these fixed-parameter algorithms are not only of theoretical interest. In a more theoretical part of this work, we develop an improved fixed-parameter algorithm for the parameter "score of the consensus" with a better upper bound on the running time than previous algorithms. Comment: Studienarbei
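To make the objective in the abstract above concrete, here is a minimal Python sketch of the Kemeny score as a sum of Kendall-Tau distances. It is not one of the fixed-parameter algorithms from the work: it simply tries every permutation, so it is only usable for a handful of candidates. The function names `kendall_tau` and `kemeny_consensus` are illustrative choices, and all input rankings are assumed to be complete orderings of the same candidates.

```python
from itertools import combinations, permutations


def kendall_tau(r1, r2):
    """Number of candidate pairs that r1 and r2 order differently."""
    pos1 = {c: i for i, c in enumerate(r1)}
    pos2 = {c: i for i, c in enumerate(r2)}
    return sum(
        1
        for a, b in combinations(r1, 2)
        # A pair is discordant when the two rankings disagree on its order.
        if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0
    )


def kemeny_consensus(rankings):
    """Brute-force search: return (consensus ranking, Kemeny score).

    Assumption: every ranking lists the same candidates exactly once.
    Runs in time factorial in the number of candidates.
    """
    candidates = rankings[0]

    def score(perm):
        return sum(kendall_tau(perm, r) for r in rankings)

    best = min(permutations(candidates), key=score)
    return list(best), score(best)


if __name__ == "__main__":
    votes = [["a", "b", "c"], ["a", "b", "c"], ["b", "a", "c"]]
    print(kemeny_consensus(votes))
```

The brute force is exponential in the "number of candidates" parameter only, while each score evaluation is polynomial in the input size.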

    Graphlet based network analysis

    The majority of existing work on network analysis studies properties that are related to the global topology of a network. Examples of such properties include diameter, power-law exponent, and spectra of graph Laplacians. Such work enhances our understanding of real-life networks, or enables us to generate synthetic graphs with real-life graph properties. However, many of the existing problems on networks require the study of local topological structures of a network. Graphlets, which are small induced subgraphs, capture the local topological structure of a network effectively. They have become increasingly popular for characterizing large networks in recent years. Graphlet based network analysis can vary based on the types of topological structures considered and the kinds of analysis tasks. For example, one of the most popular and earliest graphlet analyses is based on triples (triangles or paths of length two). Graphlet analysis based on cycles and cliques is also explored in several recent works. Another, more comprehensive class of graphlet analysis methods works with graphlets of specific sizes: graphlets with three, four or five nodes ({3, 4, 5}-Graphlets) are particularly popular. For all the above analysis tasks, excessive computational cost is a major challenge, which becomes severe when analyzing large networks with millions of vertices. To overcome this challenge, effective methodologies are urgently needed. Furthermore, the existence of efficient methods for graphlet analysis will encourage more work that broadens the scope of graphlet analysis. For graphlet counting, we propose edge iteration based methods (ExactTC and ExactGC) for efficiently computing triple and graphlet counts. The proposed methods compute local graphlet statistics in the neighborhood of each edge in the network and then aggregate the local statistics to give the global characterization (transitivity, graphlet frequency distribution (GFD), etc.) of the network. 
Scalability of the proposed methods is further improved by iterating over a sampled set of edges and estimating the triangle count (ApproxTC) and graphlet count (Graft) by approximate rescaling of the aggregated statistics. Because the local feature vector for each edge is constructed independently, the methods are embarrassingly parallelizable. We show this by giving a parallel edge iteration method, ParApproxTC, for triangle counting. For graphlet sampling, we propose Markov Chain Monte Carlo (MCMC) sampling based methods for triple and graphlet analysis. The proposed triple analysis methods, Vertex-MCMC and Triple-MCMC, estimate triangle count and network transitivity. Vertex-MCMC samples triples in two steps. First, the method selects a node (using the MCMC method) with probability proportional to the number of triples of which the node is the center. Then Vertex-MCMC samples uniformly from the triples centered at the selected node. The method Triple-MCMC samples triples by performing an MCMC walk in a triple sample space. The triple sample space consists of all the possible triples in a network. The MCMC method performs triple sampling by walking from one triple to one of its neighboring triples in the triple space. We design the triple space in such a way that two triples are neighbors only if they share exactly two nodes. The proposed triple sampling algorithms Vertex-MCMC and Triple-MCMC are able to sample triples from any arbitrary distribution, as long as the weight of each triple is locally computable. The proposed methods are able to sample triples without knowledge of the complete network structure. Information about only the local neighborhood structure of the currently observed node or triple is enough to walk to the next node or triple. This gives the proposed methods a significant advantage: the capability to sample triples from networks that have restricted access, on which a direct sampling based method is simply not applicable. 
The proposed methods are also suitable for dynamic and large networks. Similar to the concept of Triple-MCMC, we propose Guise for sampling graphlets of sizes three, four and five ({3, 4, 5}-Graphlets). Guise samples graphlets by performing an MCMC walk on a graphlet sample space containing all the graphlets of sizes three, four and five in the network. Despite the proven utility of graphlets in static network analysis, works harnessing the ability of graphlets for dynamic network analysis are yet to come. Dynamic networks contain additional time information for their edges. With time, the topological structure of a dynamic network changes: edges can appear, disappear and reappear. In this direction, predicting the link state of a network at a future time, given a collection of link states at earlier times, is an important task with many real-life applications. In the existing literature, this task is known as link prediction in dynamic networks. Performing this task is more difficult than its counterpart in static networks because an effective feature representation of node-pair instances is hard to obtain for a dynamic network. We design a novel graphlet transition based feature embedding for node-pair instances of a dynamic network. Our proposed method, GraTFEL, uses automatic feature learning methodologies on such graphlet transition based features to give a low-dimensional feature embedding of unlabeled node-pair instances. The feature learning task is modeled as an optimal coding task where the objective is to minimize the reconstruction error. GraTFEL solves this optimization task by using a gradient descent method. We validate the effectiveness of the learned optimal feature embedding by utilizing it for link prediction in real-life dynamic networks. Specifically, we show that GraTFEL, which uses the extracted feature embedding of graphlet transition events, outperforms existing methods that use well-known link prediction features
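The edge iteration idea described in the abstract above (compute a local statistic per edge, then aggregate) can be illustrated with a short Python sketch for triangles and transitivity. This is a generic textbook version written under stated assumptions, not the ExactTC implementation: the function name `triangle_stats` is hypothetical, the graph is assumed to be undirected and simple, and node labels are assumed to be mutually comparable.

```python
from collections import defaultdict


def triangle_stats(edges):
    """Return (triangle count, transitivity) of an undirected simple graph.

    For each edge (u, v) the local statistic is the number of common
    neighbors of u and v, i.e. the triangles that contain the edge.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    per_edge_total = 0
    for u in adj:
        for v in adj[u]:
            if u < v:  # visit each undirected edge once
                per_edge_total += len(adj[u] & adj[v])

    # Every triangle is seen once from each of its three edges.
    triangles = per_edge_total // 3

    # Triples (paths of length two, open or closed) centered at each node.
    triples = sum(len(nbrs) * (len(nbrs) - 1) // 2 for nbrs in adj.values())
    transitivity = 3 * triangles / triples if triples else 0.0
    return triangles, transitivity


if __name__ == "__main__":
    # A triangle 1-2-3 with a pendant node 4 attached to node 3.
    print(triangle_stats([(1, 2), (2, 3), (1, 3), (3, 4)]))
```

Because each edge's count depends only on the two endpoint neighborhoods, the inner loop could be run over a sample of edges or split across workers, which is the property the abstract relies on for its approximate and parallel variants.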

    An Algorithmic Walk from Static to Dynamic Graph Clustering


    Degenerations of Lie algebras and pre-Lie algebras

    In this thesis we are concerned with the orbit closure problem for algebras in algebraic transformation group theory. The general linear group Gl(V) over a field K acts on the vector space Alg(C), the space of K-algebra structures, by change of basis. For two K-algebra structures A and B, we say that B is a degeneration of A if B lies in the orbit closure of A with respect to the Zariski topology. The orbit closure problem in this form asks for the classification of all degenerations of a certain algebra structure in a fixed dimension. The main result of this work is the classification of all degenerations of Novikov algebras over C in dimension three. Such algebras form a subclass of left-symmetric algebras, so-called pre-Lie algebras. Along the way we also determine all degenerations of pre-Lie algebras in dimension two. This is surprisingly complicated. For example, in dimension two there are only two non-isomorphic Lie algebras, whereas there are already infinitely many non-isomorphic 2-dimensional pre-Lie algebras. Because of the large number of algebras and the even larger number of possible degenerations, both classifications turn out to be very extensive. To reach these goals, we generalize and extend methods that were applied in the case of Lie algebra degenerations. Examples are the Cpq-invariant and semi-invariants such as the dimension of the center of an algebra. We also introduce semi-invariants that apply specifically to pre-Lie and Novikov algebras. Furthermore, we prove new results that relate degenerations in different dimensions. A substantial statement in this direction is that, if A degenerates to B, then all factors A / I by an arbitrary ideal I in A have to degenerate to corresponding factors of the algebra B

    Uncertain Multi-Criteria Optimization Problems

    Most real-world search and optimization problems naturally involve multiple criteria as objectives. Generally, symmetry, asymmetry, and anti-symmetry are basic characteristics of binary relationships used when modeling optimization problems. Moreover, the notion of symmetry has appeared in many articles about uncertainty theories that are employed in multi-criteria problems. Different solutions may produce trade-offs (conflicting scenarios) among different objectives. A better solution with respect to one objective may compromise other objectives. There are various factors that need to be considered to address the problems in multidisciplinary research, which is critical for the overall sustainability of human development and activity. In this regard, in recent decades, decision-making theory has been the subject of intense research activities due to its wide applications in different areas. The decision-making theory approach has become an important means to provide real-time solutions to uncertainty problems. Theories such as probability theory, fuzzy set theory, type-2 fuzzy set theory, rough set, and uncertainty theory, available in the existing literature, deal with such uncertainties. Nevertheless, the uncertain multi-criteria characteristics in such problems have not yet been explored in depth, and there is much left to be achieved in this direction. Hence, different mathematical models of real-life multi-criteria optimization problems can be developed in various uncertain frameworks with special emphasis on optimization problems

    Efficient Axiomatization of OWL 2 EL Ontologies from Data by means of Formal Concept Analysis: (Extended Version)

    We present an FCA-based axiomatization method that produces a complete EL TBox (the terminological part of an OWL 2 EL ontology) from a graph dataset in at most exponential time. We describe technical details that allow for efficient implementation, as well as variations that dispense with the computation of extremely large axioms, thereby keeping the approach applicable at the cost of some completeness. Moreover, we evaluate the prototype on real-world datasets. This is an extended version of an article accepted at AAAI 2024

    Impact of Symmetries in Graph Clustering

    This dissertation deals with the symmetry of graphs as defined by the automorphism group, and with how this symmetry affects a vertex partition obtained as the result of graph clustering. An analysis of nearly 1700 graphs from various application domains shows that more than 70 % of these graphs contain symmetries. This contrasts with the combinatorial proof stating that the probability of a random graph being symmetric tends to zero as its size increases. The result thus justifies the importance of further investigations into the possible effects of symmetry. The analysis considers both very small graphs and very large graphs (>10 000 000 vertices / >25 000 000 edges). Furthermore, a theoretical framework is created that, on the one hand, allows the detailed quantification of graph symmetry and, on the other hand, formalizes the stability of vertex partitions with respect to this symmetry. A partition of the vertex set, defined by the division into disjoint subsets, is considered stable if no vertices are mapped by symmetry from one subset into another in a way that changes the partition. In addition, it is defined how a possible decomposability of the automorphism group into independent subgroups can be interpreted as local symmetry, which then affects only a particular region of the graph. To quantify the effects of symmetry on the whole graph and on partitions, an entropy definition is also presented that is modeled on the analysis of dynamical systems. All definitions are general and can therefore be applied to arbitrary graphs. In part, they are even applicable to arbitrary cluster analyses, as long as the result is a partition and a symmetry relation on the data points can be specified as a permutation group. To investigate the actual effect of symmetry on graph clustering, a second analysis is carried out. It finds that 72 of the 629 symmetric graphs examined have an unstable partition. The analysis uses the definitions of the theoretical framework. It is also found that the locality of a graph's symmetry decisively influences whether its partition is stable or not. High locality usually results in a stable partition, and a stable partition usually implies high locality. Before the above results are described and defined, a comprehensive introduction to the required foundations is given. It covers the formal definitions of graphs and statistical graph models, partitions, finite permutation groups, graph clustering and algorithms for it, as well as entropy. A separate chapter is devoted in detail to graph symmetry, which is described by a finite permutation group, the automorphism group. In addition, algorithms are presented that can determine the symmetry of graphs and, in part, also solve the closely related graph isomorphism problem. Using graph clustering as an example, the dissertation thus provides insights into possible effects of symmetry in data analysis that have so far received little to no attention in the literature
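The stability notion in the abstract above can be sketched in a few lines of Python. The sketch assumes one particular reading, which the abstract does not spell out in full: a partition counts as stable when every automorphism maps the partition onto itself, so that whole blocks may be swapped but no block is split. The names `image_partition` and `is_stable` are hypothetical, and automorphisms are assumed to be given as dictionaries mapping each moved vertex to its image.

```python
def image_partition(partition, perm):
    """Apply a vertex permutation to every block of a partition.

    Vertices missing from `perm` are treated as fixed points.
    """
    return {frozenset(perm.get(v, v) for v in block) for block in partition}


def is_stable(partition, generators):
    """Check the assumed stability condition against group generators.

    If every generator maps the partition onto itself, so does every
    element of the group they generate, so checking generators suffices.
    """
    original = {frozenset(block) for block in partition}
    return all(image_partition(original, g) == original for g in generators)


if __name__ == "__main__":
    # Path graph 1-2-3-4: its only non-trivial automorphism reverses the path.
    reverse = {1: 4, 2: 3, 3: 2, 4: 1}
    print(is_stable([{1, 2}, {3, 4}], [reverse]))   # blocks are swapped as wholes
    print(is_stable([{1, 2, 3}, {4}], [reverse]))   # block {1, 2, 3} is broken up
```

Under a stricter reading, in which each block must be mapped to itself, the comparison would be made block by block instead of on the set of blocks.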

    Cooperation in self-organized heterogeneous swarms

    Cooperation in self-organized heterogeneous swarms is a phenomenon from nature with many applications in autonomous robots. I specifically analyzed the problem of auto-regulated team formation in multi-agent systems and several strategies for learning socially how to make multi-objective decisions. To this end, I proposed new multi-objective ranking relations and analyzed their properties theoretically and within multi-objective metaheuristics. The results showed that simple decision mechanisms suffice to build effective teams of heterogeneous agents, and that diversity in groups is not a problem but can increase the efficiency of multi-agent systems