Algorithmic Results for Clustering and Refined Physarum Analysis
In the first part of this thesis, we study the Binary ℓ_0-Rank-k problem which, given a binary matrix A and a positive integer k, seeks a rank-k binary matrix B minimizing the number of non-zero entries of A − B. A central open question is whether this problem admits a polynomial-time approximation scheme. We give an affirmative answer to this question by designing the first randomized almost-linear time approximation scheme for constant k over the reals, F_2, and the Boolean semiring. In addition, we give novel algorithms for important variants of ℓ_p-low rank approximation.
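To make the objective concrete, the following is a minimal brute-force sketch for the rank-1 case over the reals. It assumes the cost is the number of entries in which the input matrix and its binary approximation differ, and it uses the fact that a binary matrix of rank at most 1 is an outer product of two binary vectors. The function name `best_binary_rank1` and the example matrix are illustrative assumptions; exhaustive search is exponential and is not the approximation scheme developed in the thesis.

```python
from itertools import product

import numpy as np


def best_binary_rank1(a):
    """Exhaustively search binary vectors u, v minimizing the number of
    non-zero entries of a - outer(u, v). Exponential time; toy sizes only."""
    m, n = a.shape
    best_cost, best_u, best_v = None, None, None
    for u in product((0, 1), repeat=m):
        for v in product((0, 1), repeat=n):
            cost = int(np.count_nonzero(a - np.outer(u, v)))
            if best_cost is None or cost < best_cost:
                best_cost, best_u, best_v = cost, u, v
    return best_cost, best_u, best_v


# Hypothetical input: a 2x2 block of ones plus one isolated one (real rank 2).
a = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 0, 1]])
cost, u, v = best_binary_rank1(a)
```

On this input the best rank-1 binary matrix covers the 2x2 block and misses only the isolated entry, so the cost is 1.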
The second part of this dissertation studies a popular and successful heuristic, known as Approximate Spectral Clustering (ASC), for partitioning the nodes of a graph G into clusters with small conductance. We give a comprehensive analysis, showing that ASC runs efficiently and yields a good approximation of an optimal k-way node partition of G.
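As an illustration of the quantities involved, here is a minimal sketch of plain spectral clustering together with the conductance of a node set. It uses an exact eigendecomposition and a basic Lloyd iteration with farthest-point initialization, which is a simplification and not the approximate variant analyzed in the thesis. All function names and the toy graph (two triangles joined by one edge) are assumptions made for this example.

```python
import numpy as np


def conductance(adj, members):
    """Cut weight leaving `members` divided by the smaller side's volume."""
    mask = np.zeros(adj.shape[0], dtype=bool)
    mask[list(members)] = True
    cut = adj[mask][:, ~mask].sum()
    return cut / min(adj[mask].sum(), adj[~mask].sum())


def spectral_embedding(adj, k):
    """Rows of the k bottom eigenvectors of the normalized Laplacian,
    rescaled by D^{-1/2}. Assumes every node has positive degree."""
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    lap = np.eye(adj.shape[0]) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)  # eigenvalues in ascending order
    return vecs[:, :k] * d_inv_sqrt[:, None]


def kmeans(points, k, iters=20):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(k - 1):
        d = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = points[labels == j].mean(axis=0)
    return labels


# Toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3.
adj = np.zeros((6, 6))
for a, b in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    adj[a, b] = adj[b, a] = 1.0

labels = kmeans(spectral_embedding(adj, 2), 2)
phi = conductance(adj, [0, 1, 2])
```

On this toy graph the two triangles end up in different clusters, and the set {0, 1, 2} has conductance 1/7: one cut edge over a volume of 7.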
In the final part of this thesis, we present two results on slime mold computations: i) the continuous undirected Physarum dynamics converges for undirected linear programs with a non-negative cost vector; and ii) for the discrete directed Physarum dynamics, we give a refined analysis that yields strengthened and close to optimal convergence rate bounds, and shows that the model can be initialized with any strongly dominating point.
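The discrete directed dynamics can be sketched on a toy shortest-path instance. The sketch below assumes the commonly stated update x ← (1 − h)·x + h·q, where q is the electrical flow obtained with conductances x_e / c_e. The exact formulation, step-size conditions, and initialization requirements used in the thesis are not given in this abstract, so the function name `physarum_step`, the step size h = 0.5, the all-ones start, and the three-edge graph are assumptions chosen for illustration.

```python
import numpy as np


def physarum_step(A, b, c, x, h):
    """One step x <- (1 - h) * x + h * q, with q the electrical flow
    for conductances x_e / c_e (assumed form of the directed dynamics)."""
    w = x / c                    # conductance of each edge
    L = (A * w) @ A.T            # A W A^T; A * w scales column e by w_e
    p = np.linalg.solve(L, b)    # node potentials (sink grounded at 0)
    q = w * (A.T @ p)            # flow on each edge; satisfies A q = b
    return (1.0 - h) * x + h * q


# Toy instance with nodes s, a, t and unit costs:
#   e0: s -> t (direct, total cost 1), e1: s -> a, e2: a -> t (total cost 2).
# Rows of A are the nodes s and a; the row of the sink t is dropped.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, -1.0, 1.0]])
b = np.array([1.0, 0.0])         # one unit of flow leaves s
c = np.array([1.0, 1.0, 1.0])

x = np.ones(3)
for _ in range(100):
    x = physarum_step(A, b, c, x, h=0.5)
```

On this instance the iterate concentrates on the cheaper direct edge: x approaches (1, 0, 0), which is the shortest s-t path written as a flow.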
Analyzing, Mining, and Predicting Networked Behaviors
Network structure exists in many types of real-world data, such as online and offline social networks, traffic networks, computer networks, brain networks, and countless other cases where there are relationships between entities in the data. What roles do network structures play in these data? First, the network captures inherent characteristics of the data themselves. This is clear from the definition of the network, which represents the relationships between entities: e.g., the social links among people in a social network describe how they interact with each other; a road network summarizes how the roads are laid out geographically; a brain network obtained from fMRI images represents pairs of brain regions that are active at the same time; a computer network constrains the paths along which internet packets, and thus information or viruses, can spread. Second, the network structures affect the evolution of the data over time. For example, new friendship links in an online social network are frequently created between friends of friends. Similarly, the current road network structure is without a doubt taken into consideration when roads are added or temporarily closed. As we grow, our brains also grow, adding useful links and cleaning up unnecessary links between brain regions. Third, the network structures guide many different processes happening in the data. For instance, the links between users on a social network dictate how gossip can spread; the roads influence how traffic flows in a city; the links between brain regions affect the way we think and how effectively we do things; the connections between computers route the transfer of any information on the internet.
In this thesis, I studied the network effect in various networked behaviors: analyzing this effect, finding its patterns, and predicting future networked behaviors. First, I gained insights into the data by analyzing the accompanying network structures as well as their evolution. Second, I proposed algorithms for mining different network patterns that help summarize the effect of the network structures on different networked behaviors. Finally, I proposed models to predict the evolution of networked behaviors over time. Toward these tasks, I explored a wide variety of network data, including protein-protein interaction networks, online social networks, collaboration networks, chemical compounds, and traffic networks. Overall, I tackled these network data from different aspects and developed a number of methods for effectively mining and forecasting networked behaviors in data.
Statistical Analysis of Structured Latent Attribute Models
In modern psychological and biomedical research with diagnostic purposes, scientists often formulate the key task as inferring fine-grained latent information under structural constraints. These structural constraints usually come from the domain experts' prior knowledge or insight. The emerging family of Structured Latent Attribute Models (SLAMs) accommodates these modeling needs and has received substantial attention in psychology, education, and epidemiology. SLAMs bring exciting opportunities and unique challenges. In particular, with high-dimensional discrete latent attributes and structural constraints encoded by a structural matrix, one needs to balance the gain in the model's explanatory power and interpretability against the difficulty of understanding and handling the complex model structure. This dissertation studies this family of structured latent attribute models from theoretical, methodological, and computational perspectives. On the theoretical front, we present identifiability results that advance the theoretical knowledge of how the structural matrix influences the estimability of SLAMs. The new identifiability conditions guide real-world practices of designing diagnostic tests and also lay the foundation for drawing valid statistical conclusions. On the methodology side, we propose a statistically consistent penalized likelihood approach to selecting significant latent patterns in the population in high dimensions. Computationally, we develop scalable algorithms to simultaneously recover both the structural matrix and the dependence structure of the latent attributes in ultrahigh-dimensional scenarios. These developments explore an exponentially large model space involving many discrete latent variables, and they address the estimation and computation challenges of high-dimensional SLAMs arising from large-scale scientific measurements.
The application of the proposed methodology to the data from international educational assessments reveals meaningful knowledge structures of the student population.
PhD, Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies
https://deepblue.lib.umich.edu/bitstream/2027.42/155196/1/yuqigu_1.pd