11 research outputs found
Energy saving in fixed wireless broadband networks
In this paper, we present an integer linear programming formulation for saving energy in fixed broadband wireless networks by selectively turning off idle communication devices in low-demand scenarios. The problem reduces to fixed-charge capacitated network design (FCCND), which is very hard to optimize. We therefore propose heuristic algorithms that produce feasible solutions in a short time.
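To make the switch-off idea concrete, here is a toy sketch (not the paper's formulation or heuristic): on a tiny invented backhaul graph, it brute-forces which links to keep powered so that demand remains routable, checking feasibility with a max-flow computation. All names, capacities, and power figures are illustrative assumptions.

```python
# Toy illustration of energy saving by switching off idle links: keep the
# cheapest subset of links that can still carry the low-demand traffic.
from itertools import combinations

def max_flow(cap, s, t):
    # Edmonds-Karp max flow on an adjacency-dict capacity map {u: {v: cap}}.
    flow = 0
    residual = {u: dict(vs) for u, vs in cap.items()}
    while True:
        # BFS for an augmenting path from s to t.
        parent = {s: None}
        queue = [s]
        while queue and t not in parent:
            u = queue.pop(0)
            for v, c in residual.get(u, {}).items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        # Find the bottleneck and push flow along the path.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= push
            residual.setdefault(v, {}).setdefault(u, 0)
            residual[v][u] += push
        flow += push

# Invented example: (capacity, power draw in W) per directed link.
links = {('s', 'a'): (10, 3.0), ('s', 'b'): (10, 3.0),
         ('a', 't'): (10, 2.0), ('b', 't'): (10, 2.0)}
demand = 8  # traffic from 's' to 't' in the low-demand scenario

best = (float('inf'), None)
edges = list(links)
for k in range(len(edges) + 1):
    for on in combinations(edges, k):           # links left powered on
        cap = {}
        for (u, v) in on:
            cap.setdefault(u, {})[v] = links[(u, v)][0]
        if max_flow(cap, 's', 't') >= demand:   # is demand still routable?
            power = sum(links[e][1] for e in on)
            best = min(best, (power, on))

print(best)  # cheapest feasible set of powered-on links
```

Brute force is only viable on toy instances; the hardness of FCCND is exactly why the paper resorts to heuristics.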
Approximate Computation and Implicit Regularization for Very Large-scale Data Analysis
Database theory and database practice are typically the domain of computer scientists who adopt what may be termed an algorithmic perspective on their data. This perspective is very different from the more statistical perspective adopted by statisticians, scientific computing researchers, machine learning researchers, and others who work on what may be broadly termed statistical data analysis. In this article, I will address fundamental aspects of this algorithmic-statistical disconnect, with an eye to bridging the gap between these two very different approaches. A concept that lies at the heart of this disconnect is that of statistical regularization, a notion that has to do with how robust the output of an algorithm is to the noise properties of the input data. Although it is nearly completely absent from computer science, which historically has taken the input data as given and modeled algorithms discretely, regularization in one form or another is central to nearly every application domain that applies algorithms to noisy data. Using several case studies, I will illustrate, both theoretically and empirically, the nonobvious fact that approximate computation, in and of itself, can implicitly lead to statistical regularization. This and other recent work suggests that, by exploiting in a more principled way the statistical properties implicit in worst-case algorithms, one can in many cases satisfy the bicriteria of having algorithms that are scalable to very large-scale databases and that also have good inferential or predictive properties.
Comment: To appear in the Proceedings of the 2012 ACM Symposium on Principles of Database Systems (PODS 2012).
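The central claim, that approximate computation can itself regularize, can be sketched on a standard toy problem (not one of the article's case studies): on an ill-conditioned least-squares instance, an "approximate" solver in the form of early-stopped gradient descent tends to recover the true signal better than the exact minimizer, which amplifies noise. Problem sizes and noise levels below are arbitrary choices.

```python
# Sketch: early-stopped gradient descent as implicit regularization on an
# ill-conditioned least-squares problem with noisy observations.
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 40
# Column scaling makes the design matrix badly conditioned.
A = rng.normal(size=(n, d)) @ np.diag(np.linspace(1.0, 1e-3, d))
x_true = rng.normal(size=d)
b = A @ x_true + 0.1 * rng.normal(size=n)

x_exact = np.linalg.lstsq(A, b, rcond=None)[0]   # exact minimizer

x = np.zeros(d)                                  # approximate: early-stopped GD
step = 1.0 / np.linalg.norm(A, 2) ** 2
for _ in range(200):
    x = x - step * A.T @ (A @ x - b)

err_exact = np.linalg.norm(x_exact - x_true)
err_early = np.linalg.norm(x - x_true)
print(err_exact, err_early)  # early stopping typically recovers x_true better
```

The stopped iteration never fully inverts the small singular directions where the noise lives, which is exactly the filtering effect an explicit ridge penalty would impose.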
Bisect and Conquer: Hierarchical Clustering via Max-Uncut Bisection
Hierarchical clustering is an unsupervised data analysis method that has been widely used for decades. Despite its popularity, it long lacked a solid analytical foundation; to address this, Dasgupta recently introduced an optimization viewpoint of hierarchical clustering with pairwise similarity information, which spurred a line of work shedding light on old algorithms (e.g., Average-Linkage) as well as designing new ones. Here, for the maximization dual of Dasgupta's objective (introduced by Moseley and Wang), we present polynomial-time 0.4246-approximation algorithms that use Max-Uncut Bisection as a subroutine. The previous best worst-case approximation factor achievable in polynomial time was 0.336, improving only slightly over Average-Linkage, which achieves 1/3. Finally, we complement our positive results with APX-hardness (even for 0-1 similarities) under the Small Set Expansion hypothesis.
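The objective being approximated can be stated in a few lines of code. The sketch below (not the paper's algorithm) evaluates the Moseley-Wang objective, which rewards each similarity w(i, j) by the number of leaves outside the subtree rooted at the lowest common ancestor of i and j; trees are nested tuples and the weights are invented for illustration.

```python
# Evaluate the Moseley-Wang objective of a binary hierarchical clustering tree.

def leaves(tree):
    # Collect the leaf labels under a nested-tuple tree.
    return [tree] if not isinstance(tree, tuple) else leaves(tree[0]) + leaves(tree[1])

def moseley_wang(tree, w, n):
    # Sum w(i, j) * (n - |leaves(lca(i, j))|) over all weighted pairs.
    if not isinstance(tree, tuple):
        return 0.0
    left, right = tree
    total = moseley_wang(left, w, n) + moseley_wang(right, w, n)
    here = leaves(tree)  # pairs split across this node have their LCA here
    for i in leaves(left):
        for j in leaves(right):
            total += w.get((min(i, j), max(i, j)), 0.0) * (n - len(here))
    return total

w = {(0, 1): 1.0, (2, 3): 1.0, (1, 2): 0.2}  # invented pairwise similarities
print(moseley_wang(((0, 1), (2, 3)), w, 4))  # tree that keeps similar pairs deep
print(moseley_wang(((0, 2), (1, 3)), w, 4))  # tree that splits them early scores lower
```

Maximizing this quantity over all binary trees is the hard part; the paper's contribution is the 0.4246 guarantee via Max-Uncut Bisection.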
Optimisation de la consommation énergétique dans les réseaux sans fil fixes
We study the problem of minimizing energy consumption in fixed wireless networks when traffic demand is low relative to network capacity. We propose a linear program to solve the problem, then present a heuristic that quickly finds a good solution.
A modified multilevel k-way partitioning algorithm for trip-based road networks
In today's world, traffic volume on urban road networks is growing rapidly due to heavy vehicle usage and mobility-on-demand services. Migration of people towards urban areas results in urban road networks of increasing size and complexity. When handling such complex traffic systems, partitioning the road network into multiple sub-regions and managing the identified sub-regions is a popular approach.
In this paper, we propose an algorithm to identify sub-regions of a road network that exhibit homogeneous traffic flow patterns. In a stage-wise manner, we model the road network graph using taxi-trip data obtained for the selected region. We then apply the proposed modified multilevel k-way partitioning algorithm to obtain an optimal number of partitions from the constructed road graph. An interesting feature of this algorithm is that the resulting partitions are geographically connected and carry minimal inter-partition trip flow. Our results show that the proposed algorithm outperforms state-of-the-art multilevel partitioning algorithms for trip-based road networks. Through this research, we demonstrate that a road network can be partitioned using trip data while preserving partition homogeneity and connectivity.
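As a stand-in for one bisection step of such a partitioner (this is plain spectral bisection, not the paper's modified multilevel algorithm), the sketch below splits a trip-weighted graph by the sign of the Fiedler vector; the trip counts are invented, whereas real input would come from taxi-trip data.

```python
# Spectral bisection of a trip-weighted road graph: the sign pattern of the
# Fiedler vector yields two connected groups with little inter-group trip flow.
import numpy as np

# Symmetric trip counts between 6 road segments: two communities {0,1,2} and
# {3,4,5} with heavy internal flow and a single weak link between them.
W = np.zeros((6, 6))
for i, j, trips in [(0, 1, 20), (1, 2, 18), (0, 2, 15),
                    (3, 4, 22), (4, 5, 19), (3, 5, 17), (2, 3, 1)]:
    W[i, j] = W[j, i] = trips

L = np.diag(W.sum(axis=1)) - W       # graph Laplacian of the trip graph
vals, vecs = np.linalg.eigh(L)
fiedler = vecs[:, 1]                 # eigenvector of the 2nd-smallest eigenvalue
part = fiedler > 0                   # sign split = bisection with a small cut
print(part)
```

A multilevel scheme would coarsen the graph first, bisect the small coarse graph, then refine while projecting back, which is what makes it scale to city-sized networks.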
Algorithmic and Statistical Perspectives on Large-Scale Data Analysis
In recent years, ideas from statistics and scientific computing have begun to
interact in increasingly sophisticated and fruitful ways with ideas from
computer science and the theory of algorithms to aid in the development of
improved worst-case algorithms that are useful for large-scale scientific and
Internet data analysis problems. In this chapter, I will describe two recent
examples---one having to do with selecting good columns or features from a (DNA
Single Nucleotide Polymorphism) data matrix, and the other having to do with
selecting good clusters or communities from a data graph (representing a social
or information network)---that drew on ideas from both areas and that may serve
as a model for exploiting complementary algorithmic and statistical
perspectives in order to solve applied large-scale data analysis problems.Comment: 33 pages. To appear in Uwe Naumann and Olaf Schenk, editors,
"Combinatorial Scientific Computing," Chapman and Hall/CRC Press, 201
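The first example, selecting good columns from a data matrix, is commonly done with statistical leverage scores. The sketch below is a hedged illustration of that general idea, not the chapter's specific procedure: it ranks columns by squared row norms of the top-k right singular vectors on a synthetic matrix; k and the number of kept columns are arbitrary.

```python
# Leverage-score column selection: keep the columns most influential in the
# matrix's dominant rank-k structure.
import numpy as np

rng = np.random.default_rng(1)
n, d, k = 100, 10, 2
# Build a matrix whose first two columns carry its rank-2 structure.
basis = rng.normal(size=(n, k))
A = np.hstack([basis, 0.05 * rng.normal(size=(n, d - k))])

_, _, Vt = np.linalg.svd(A, full_matrices=False)
leverage = (Vt[:k] ** 2).sum(axis=0)   # leverage score of each column w.r.t. rank k
chosen = np.argsort(leverage)[-k:]     # keep the k highest-scoring columns
print(sorted(chosen.tolist()))         # the two structure-carrying columns
```

In randomized variants one samples columns with probability proportional to these scores rather than taking a deterministic top-k, which is what yields provable reconstruction guarantees.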
Escape times for subgraph detection and graph partitioning
We provide a rearrangement-based algorithm for fast detection of subgraphs of vertices with long escape times, for directed or undirected networks. Complementing other notions of densest subgraphs and graph cuts, our method is based on the mean hitting time required for a random walker to leave a designated set and hit its complement. We provide a new relaxation of this notion of hitting time on a given subgraph and use that relaxation to construct a fast subgraph detection algorithm and a generalization to k-partitioning schemes. Using a modification of the subgraph detector on each component, we propose a graph partitioner that identifies regions where random walks live for comparably long times. Importantly, our method implicitly respects the directed nature of the data for directed graphs while also being applicable to undirected graphs. We apply the partitioning method for community detection to a large class of model and real-world data sets.
Comment: 22 pages, 10 figures, 1 table, comments welcome!
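The escape-time quantity the paper builds on has a compact linear-algebra form: for a random walk with transition matrix P, the mean time to leave a vertex set S solves (I - P_SS) t = 1, where P_SS restricts P to S. The sketch below computes it on an invented 4-vertex directed walk; it illustrates the underlying quantity, not the paper's relaxation or rearrangement algorithm.

```python
# Mean escape time of a random walker from a designated vertex set S,
# obtained by solving the linear system (I - P_SS) t = 1.
import numpy as np

# Row-stochastic transition matrix of a 4-vertex directed random walk.
P = np.array([[0.0, 0.9, 0.1, 0.0],
              [0.9, 0.0, 0.1, 0.0],
              [0.1, 0.1, 0.0, 0.8],
              [0.0, 0.0, 1.0, 0.0]])
S = [0, 1]                                # candidate subgraph: a "sticky" pair
P_SS = P[np.ix_(S, S)]
t = np.linalg.solve(np.eye(len(S)) - P_SS, np.ones(len(S)))
print(t)   # large entries mean walkers escape S slowly: a dense, well-knit set
```

Note that nothing here symmetrizes P, which is the sense in which hitting-time-based detection naturally respects edge directions.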
Programação matemática e imersões métricas para aproximações em problemas de corte
Advisor: Prof. Dr. André Luiz Pires Guedes. Master's dissertation, Universidade Federal do Paraná, Setor de Ciências Exatas, Graduate Program in Informatics. Defense: Curitiba, 05/05/2014. Includes references.
Abstract: Approximation algorithms generate near-optimal solutions in polynomial time. Many approximation algorithms in the literature work with numerical values obtained from mathematical programming formulations, usually linear or vector programs, of the problem under consideration. The maximum cut, multiway cut, and sparsest cut problems can be modeled with integer mathematical programs whose solutions are optimal. However, solving an integer mathematical program is NP-hard, so a usual approach is to derive new mathematical programs through relaxations. A relaxation admits continuous solutions, which makes it possible to find the relaxed program's optimal solution in polynomial time. To obtain near-optimal solutions, however, one must make decisions based on analyses of the solutions of the relaxed programs. The three problems presented here are modeled by relaxed mathematical programs and solved with algorithms based on geometric analyses of the solutions of these programs. The solutions of the relaxed linear programs that model the multiway cut and sparsest cut problems are viewed as points whose distances are computed in the l1 norm. Subroutines that embed general finite metric spaces into metric spaces defined by the l1 norm proved to be important in these approximation algorithms. To solve the sparsest cut problem, embedding algorithms with a certain distortion of the distances are used, this distortion being the main factor determining the performance guarantee of the overall algorithm.
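As a concrete taste of the approximation-algorithm theme, and much simpler than the dissertation's LP and vector-program methods, here is the classic randomized 0.5-approximation for Max-Cut: assign each vertex a uniformly random side, so each edge is cut with probability 1/2 and the expected cut weight is half the total weight. The graph and its weights are invented.

```python
# Randomized 0.5-approximation for Max-Cut by uniform random vertex assignment.
import random

edges = {(0, 1): 3.0, (1, 2): 2.0, (0, 2): 1.0, (2, 3): 4.0}

def random_cut(vertices, edges, rng):
    # Put each vertex on a random side; sum the weight of crossing edges.
    side = {v: rng.random() < 0.5 for v in vertices}
    return sum(w for (u, v), w in edges.items() if side[u] != side[v])

rng = random.Random(42)
trials = [random_cut(range(4), edges, rng) for _ in range(2000)]
avg = sum(trials) / len(trials)
total = sum(edges.values())
print(avg, total / 2)  # empirical average matches the expected half of total weight
```

Improving on this trivial 1/2 guarantee is precisely where the relaxation-and-rounding machinery (linear and vector programs, metric embeddings) studied in the dissertation comes in.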
Contributions to Robust Graph Clustering: Spectral Analysis and Algorithms
This dissertation details the design of fast, parameter-free graph clustering methods that robustly determine cluster assignments. It provides spectral analysis as well as algorithms that adapt the obtained theoretical results to the implementation of robust graph clustering techniques. Sparsity is important in graph clustering, and a first contribution of the thesis is the definition of a sparse graph model consistent with the graph clustering objectives. This model is based on an advantageous property, arising from a block-diagonal representation, of a matrix that promotes dense connections within clusters and sparsity between them. Spectral analysis of the sparse graph model, including the eigendecomposition of the Laplacian matrix, is conducted. The analysis of the Laplacian matrix is simplified by defining a vector that carries all the relevant information contained in the Laplacian matrix. The obtained spectral properties of sparse graphs are adapted to sparsity-aware clustering via two methods that formulate the determination of the sparsity level as approximations to spectral properties of the sparse graph models.
A second contribution of this thesis is to analyze the effects of outliers on graph clustering and to propose algorithms that address robustness and the level of sparsity jointly. The basis for this contribution is the specification of fundamental outlier types that occur in cases of extreme sparsity, together with the mathematical analysis of their effects on sparse graphs, used to develop graph clustering algorithms that are robust against the investigated outlier effects. Based on the obtained results, two different robust and sparsity-aware affinity matrix construction methods are proposed. Motivated by the outliers' effects on eigenvectors, a robust Fiedler vector estimation method and a robust spectral clustering method are proposed. Finally, an outlier detection algorithm built upon the vertex degree is proposed and applied to gait analysis.
The results of this thesis demonstrate the importance of jointly addressing robustness and the level of sparsity in graph clustering algorithms. Additionally, the simplified Laplacian matrix analysis provides promising results for designing graph construction methods that can be computed efficiently through optimization in a vector space instead of the usual matrix space.
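The degree-based outlier idea mentioned at the end admits a very small illustration (an illustrative reading only, not the thesis's algorithm): in a sparse affinity graph, outlier vertices connect weakly everywhere, so an unusually small vertex degree flags them. The affinity matrix and the mean-minus-one-standard-deviation threshold are invented for this example.

```python
# Flag outlier vertices of a sparse affinity graph by their unusually low degree.
import numpy as np

# Block-diagonal-ish affinity: two clusters of 3 vertices, plus vertex 6 as an
# outlier with a single weak connection into the graph.
W = np.zeros((7, 7))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    W[i, j] = W[j, i] = 1.0
W[6, 0] = W[0, 6] = 0.1

deg = W.sum(axis=1)                            # vertex degrees of the affinity graph
outliers = np.where(deg < deg.mean() - deg.std())[0]
print(outliers)                                # the weakly connected vertex
```

Removing or down-weighting such vertices before the eigendecomposition is one way to keep the Fiedler vector, and hence the clustering, from being distorted by outliers.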