130 research outputs found

    Unique Perfect Phylogeny Characterizations via Uniquely Representable Chordal Graphs

    Full text link
    The perfect phylogeny problem is a classic problem in computational biology, where we seek an unrooted phylogeny that is compatible with a set of qualitative characters. Such a tree exists precisely when an intersection graph associated with the character set, called the partition intersection graph, can be triangulated using a restricted set of fill edges. Semple and Steel used the partition intersection graph to characterize when a character set has a unique perfect phylogeny. Bordewich, Huber, and Semple showed how to use the partition intersection graph to find a maximum compatible set of characters. In this paper, we build on these results, characterizing when a unique perfect phylogeny exists for a subset of partial characters. Our characterization is stated in terms of minimal triangulations of the partition intersection graph that are uniquely representable, also known as ur-chordal graphs. Our characterization is motivated by the structure of ur-chordal graphs, and the fact that the block structure of minimal triangulations is mirrored in the graph that has been triangulated

    Matchings, coverings, and Castelnuovo-Mumford regularity

    Full text link
    We show that the co-chordal cover number of a graph G gives an upper bound for the Castelnuovo-Mumford regularity of the associated edge ideal. Several known combinatorial upper bounds of regularity for edge ideals are then easy consequences of covering results from graph theory, and we derive new upper bounds by looking at additional covering results.Comment: 12 pages; v4 has minor changes for publicatio

    Minimal and minimal invariant Markov bases of decomposable models for contingency tables

    Get PDF
    We study Markov bases of decomposable graphical models consisting of primitive moves (i.e., square-free moves of degree two) by determining the structure of fibers of sample size two. We show that the number of elements of fibers of sample size two are powers of two and we characterize primitive moves in Markov bases in terms of connected components of induced subgraphs of the independence graph of a hierarchical model. This allows us to derive a complete description of minimal Markov bases and minimal invariant Markov bases for decomposable models.Comment: Published in at http://dx.doi.org/10.3150/09-BEJ207 the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm

    Building Information Filtering Networks with Topological Constraints: Algorithms and Applications

    Get PDF
    We propose a new methodology for learning the structure of sparse networks from data; in doing so we adopt a dual perspective where we consider networks both as weighted graphs and as simplicial complexes. The proposed learning methodology belongs to the family of preferential attachment algorithms, where a network is extended by iteratively adding new vertices. In the conventional preferential attachment algorithm a new vertex is added to the network by adding a single edge to another existing vertex; in our approach a new vertex is added to a set of vertices by adding one or more new simplices to the simplicial complex. We propose the use of a score function to quantify the strength of the association between the new vertex and the attachment points. The methodology performs a greedy optimisation of the total score by selecting, at each step, the new vertex and the attachment points that maximise the gain in the score. Sparsity is enforced by restricting the space of the feasible configurations through the imposition of topological constraints on the candidate networks; the constraint is fulfilled by allowing only topological operations that are invariant with respect to the required property. For instance, if the topological constraint requires the constructed network to be be planar, then only planarity-invariant operations are allowed; if the constraint is that the network must be a clique forest, then only simplicial vertices can be added. At each step of the algorithm, the vertex to be added and the attachment points are those that provide the maximum increase in score while maintaining the topological constraints. As a concrete but general realisation we propose the clique forest as a possible topological structure for the representation of sparse networks, and we allow to specify further constraints such as the allowed range of clique sizes and the saturation of the attachment points. In this thesis we originally introduce the Maximally Filtered Clique Forest (MFCF) algorithm: the MFCF builds a clique forest by repeated application of a suitably invariant operation that we call Clique Expansion operator and adds vertices according to a strategy that greedily maximises the gain in a local score function. The gains produced by the Clique Expansion operator can be validated in a number of ways, including statistical testing, cross-validation or value thresholding. The algorithm does not prescribe a specific form for the gain function, but allows the use of any number of gain functions as long as they are consistent with the Clique Expansion operator. We describe several examples of gain functions suited to different problems. As a specific practical realisation we study the extraction of planar networks with the Triangulated Maximally Filtered Graph (TMFG). The TMFG, in its simplest form, is a specialised version of the MFCF, but it can be made more powerful by allowing the use of specialised planarity invariant operators that are not based on the Clique Expansion operator. We provide applications to two well known applied problems: the Maximum Weight Planar Subgraph Problem (MWPSP) and the Covariance Selection problem. With regards to the Covariance Selection problem we compare our results to the state of the art solution (the Graphical Lasso) and we highlight the benefits of our methodology. Finally, we study the geometry of clique trees as simplicial complexes and note how the statistics based on cliques and separators provides information equivalent to the one that can be achieved by means of homological methods, such as the analysis of Betti numbers, however with our approach being computationally more efficient and intuitively simpler. Finally, we use the geometric tools developed to provide a possible methodology for inferring the size of a dataset generated by a factor model. As an example we show that our tools provide a solution for inferring the size of a dataset generated by a factor model

    Finding Optimal Tree Decompositions

    Get PDF
    The task of organizing a given graph into a structure called a tree decomposition is relevant in multiple areas of computer science. In particular, many NP-hard problems can be solved in polynomial time if a suitable tree decomposition of a graph describing the problem instance is given as a part of the input. This motivates the task of finding as good tree decompositions as possible, or ideally, optimal tree decompositions. This thesis is about finding optimal tree decompositions of graphs with respect to several notions of optimality. Each of the considered notions measures the quality of a tree decomposition in the context of an application. In particular, we consider a total of seven problems that are formulated as finding optimal tree decompositions: treewidth, minimum fill-in, generalized and fractional hypertreewidth, total table size, phylogenetic character compatibility, and treelength. For each of these problems we consider the BT algorithm of Bouchitté and Todinca as the method of finding optimal tree decompositions. The BT algorithm is well-known on the theoretical side, but to our knowledge the first time it was implemented was only recently for the 2nd Parameterized Algorithms and Computational Experiments Challenge (PACE 2017). The author’s implementation of the BT algorithm took the second place in the minimum fill-in track of PACE 2017. In this thesis we review and extend the BT algorithm and our implementation. In particular, we improve the eciency of the algorithm in terms of both theory and practice. We also implement the algorithm for each of the seven problems considered, introducing a novel adaptation of the algorithm for the maximum compatibility problem of phylogenetic characters. Our implementation outperforms alternative state-of-the-art approaches in terms of numbers of test instances solved on well-known benchmarks on minimum fill-in, generalized hypertreewidth, fractional hypertreewidth, total table size, and the maximum compatibility problem of phylogenetic characters. Furthermore, to our understanding the implementation is the first exact approach for the treelength problem

    Upper clique transversals in graphs

    Full text link
    A clique transversal in a graph is a set of vertices intersecting all maximal cliques. The problem of determining the minimum size of a clique transversal has received considerable attention in the literature. In this paper, we initiate the study of the "upper" variant of this parameter, the upper clique transversal number, defined as the maximum size of a minimal clique transversal. We investigate this parameter from the algorithmic and complexity points of view, with a focus on various graph classes. We show that the corresponding decision problem is NP-complete in the classes of chordal graphs, chordal bipartite graphs, and line graphs of bipartite graphs, but solvable in linear time in the classes of split graphs and proper interval graphs.Comment: Full version of a WG 2023 pape

    Graph Algorithms and Applications

    Get PDF
    The mixture of data in real-life exhibits structure or connection property in nature. Typical data include biological data, communication network data, image data, etc. Graphs provide a natural way to represent and analyze these types of data and their relationships. Unfortunately, the related algorithms usually suffer from high computational complexity, since some of these problems are NP-hard. Therefore, in recent years, many graph models and optimization algorithms have been proposed to achieve a better balance between efficacy and efficiency. This book contains some papers reporting recent achievements regarding graph models, algorithms, and applications to problems in the real world, with some focus on optimization and computational complexity

    Scheduling algorithms for throughput maximization in data networks

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007.Includes bibliographical references (p. 215-226).This thesis considers the performance implications of throughput optimal scheduling in physically and computationally constrained data networks. We study optical networks, packet switches, and wireless networks, each of which has an assortment of features and constraints that challenge the design decisions of network architects. In this work, each of these network settings are subsumed under a canonical model and scheduling framework. Tools of queueing analysis are used to evaluate network throughput properties, and demonstrate throughput optimality of scheduling and routing algorithms under stochastic traffic. Techniques of graph theory are used to study network topologies having desirable throughput properties. Combinatorial algorithms are proposed for efficient resource allocation. In the optical network setting, the key enabling technology is wavelength division multiplexing (WDM), which allows each optical fiber link to simultaneously carry a large number of independent data streams at high rate. To take advantage of this high data processing potential, engineers and physicists have developed numerous technologies, including wavelength converters, optical switches, and tunable transceivers.(cont.) While the functionality provided by these devices is of great importance in capitalizing upon the WDM resources, a major challenge exists in determining how to configure these devices to operate efficiently under time-varying data traffic. In the WDM setting, we make two main contributions. First, we develop throughput optimal joint WDM reconfiguration and electronic-layer routing algorithms, based on maxweight scheduling. To mitigate the service disruption associated with WDM reconfiguration, our algorithms make decisions at frame intervals. Second, we develop analytic tools to quantify the maximum throughput achievable in general network settings. Our approach is to characterize several geometric features of the maximum region of arrival rates that can be supported in the network. In the packet switch setting, we observe through numerical simulation the attractive throughput properties of a simple maximal weight scheduler. Subsequently, we consider small switches, and analytically demonstrate the attractive throughput properties achievable using maximal weight scheduling. We demonstrate that such throughput properties may not be sustained in larger switches.(cont.) In the wireless network setting, mesh networking is a promising technology for achieving connectivity in local and metropolitan area networks. Wireless access points and base stations adhering to the IEEE 802.11 wireless networking standard can be bought off the shelf at little cost, and can be configured to access the Internet in minutes. With ubiquitous low-cost Internet access perceived to be of tremendous societal value, such technology is naturally garnering strong interest. Enabling such wireless technology is thus of great importance. An important challenge in enabling mesh networks, and many other wireless network applications, results from the fact that wireless transmission is achieved by broadcasting signals through the air, which has the potential for interfering with other parts of the network. Furthermore, the scarcity of wireless transmission resources implies that link activation and packet routing should be effected using simple distributed algorithms. We make three main contributions in the wireless setting. First, we determine graph classes under which simple, distributed, maximal weight schedulers achieve throughput optimality.(cont.) Second, we use this acquired knowledge of graph classes to develop combinatorial algorithms, based on matroids, for allocating channels to wireless links, such that each channel can achieve maximum throughput using simple distributed schedulers. Third, we determine new conditions under which distributed algorithms for joint link activation and routing achieve throughput optimality.by Andrew Brzezinski.Ph.D
    • …
    corecore