
    Efficient parameterized algorithms on structured graphs

    In classical complexity theory, the worst-case running time of an algorithm is typically stated solely as a function of the input size. In parameterized complexity, the goal is to refine this analysis by additionally considering a parameter that measures how structured the input is with respect to some property. A parameterized algorithm then exploits the structure described by the parameter and achieves a running time that is faster than the best general (unparameterized) algorithm on instances with a small parameter value. In the first part of this thesis, we carry this direction forward and investigate the influence of several parameters on the running times of well-known tractable problems. Several of the presented algorithms are adaptive algorithms, meaning that they match the running time of a best unparameterized algorithm for worst-case parameter values. Thus, an adaptive parameterized algorithm is asymptotically never worse than the best unparameterized algorithm, while it already outperforms the best general algorithm for slightly non-trivial parameter values. As illustrated in the first part of this thesis, for many problems there exist efficient parameterized algorithms with respect to multiple parameters, each describing a different kind of structure. In the second part of this thesis, we explore how to combine such homogeneous structures into more general, heterogeneous structures. Using algebraic expressions, which can be used to define the structures described by parameters, we define new combined graph classes of heterogeneous structure in a clean and robust way, and we showcase this for the heterogeneous merge of the parameters tree-depth and modular-width by presenting parameterized algorithms on such heterogeneous graph classes whose running times match the homogeneous cases throughout.
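
    To make the notion of an adaptive algorithm concrete (purely illustrative bounds, not results from the thesis): suppose the best unparameterized algorithm for some problem runs in time O(n^2), and a parameterized algorithm runs in time O(kn + n log n) for a structural parameter k ≤ n. Then the parameterized algorithm is adaptive in the sense above:

        \[
          T_{\mathrm{param}}(n,k) = O(k\,n + n\log n), \qquad k \le n,
        \]
        \[
          \text{so } T_{\mathrm{param}}(n,k) = O(n^2) = T_{\mathrm{best}}(n) \text{ for } k = \Theta(n),
          \quad\text{while } T_{\mathrm{param}}(n,k) = o(n^2) \text{ for } k = o(n).
        \]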

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    The Geometry of Tree-Based Sorting

    We study the connections between sorting and the binary search tree (BST) model, with an aim towards showing that the fields are connected more deeply than is currently appreciated. While any BST can be used to sort by inserting the keys one-by-one, this is a very limited relationship and importantly says nothing about parallel sorting. We show what we believe to be the first formal relationship between the BST model and sorting. Namely, we show that a large class of sorting algorithms, which includes mergesort, quicksort, insertion sort, and almost every instance-optimal sorting algorithm, are equivalent in cost to offline BST algorithms. Our main theoretical tool is the geometric interpretation of the BST model introduced by Demaine et al. [Demaine et al., 2009], which finds an equivalence between searches on a BST and point sets in the plane satisfying a certain property. To give an example of the utility of our approach, we introduce the log-interleave bound, a measure of the information-theoretic complexity of a permutation π, which is within a lg lg n multiplicative factor of a known lower bound in the BST model; we also devise a parallel sorting algorithm with polylogarithmic span that sorts a permutation π using comparisons proportional to its log-interleave bound. Our aforementioned result on sorting and offline BST algorithms can be used to show existence of an offline BST algorithm whose cost is within a constant factor of the log-interleave bound of any permutation π.
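
    The "certain property" in Demaine et al.'s geometric view is arboral satisfaction: plotting one point per access (key on one axis, access time on the other), the point set corresponds to a valid BST execution iff every axis-aligned rectangle spanned by two points that share neither coordinate contains a third point of the set. A minimal brute-force check of that property, as a sketch for intuition (not code from the paper):

        def is_arborally_satisfied(points):
            # points: a set of (key, time) pairs, one per access.
            # For every pair of points differing in both coordinates, the closed
            # rectangle they span must contain some other point of the set.
            pts = set(points)
            for (x1, y1) in pts:
                for (x2, y2) in pts:
                    if x1 == x2 or y1 == y2:
                        continue  # points sharing a row or column impose no constraint
                    lo_x, hi_x = min(x1, x2), max(x1, x2)
                    lo_y, hi_y = min(y1, y2), max(y1, y2)
                    if not any((x, y) not in {(x1, y1), (x2, y2)}
                               and lo_x <= x <= hi_x and lo_y <= y <= hi_y
                               for (x, y) in pts):
                        return False
            return True

    For instance, {(1, 1), (2, 2)} fails the check, while adding (1, 2) or (2, 1) repairs it; Demaine et al. show that minimum-size arborally satisfied supersets of the access points correspond to optimal offline BST executions.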

    Addressing caveats of neural persistence with deep graph persistence

    Neural Persistence is a prominent measure for quantifying neural network complexity, proposed in the emerging field of topological data analysis in deep learning. In this work, however, we find both theoretically and empirically that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence. Whilst this captures useful information for linear classifiers, we find that no relevant spatial structure is present in later layers of deep neural networks, making neural persistence roughly equivalent to the variance of weights. Additionally, the proposed averaging procedure across layers for deep neural networks does not consider interaction between layers. Based on our analysis, we propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers, which is equivalent to calculating neural persistence on one particular matrix. This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues through standardisation. Code is available at https://github.com/ExplainableML/Deep-Graph-Persistence
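
    For background, neural persistence is built, roughly, on zero-dimensional persistent homology of a superlevel-set filtration over normalised absolute edge weights. The sketch below shows that underlying construction with a plain union-find; it is purely illustrative, is not the authors' code (see the linked repository), and omits the normalisation, layer averaging and standardisation discussed above.

        def zero_dim_persistence(num_vertices, edges):
            # edges: list of (u, v, w) with filtration value w in [0, 1],
            # e.g. w = |weight| / max|weight| for a layer's bipartite graph.
            # All vertices are present from filtration value 1.0 downwards; an edge
            # enters once the threshold drops to its value.  Each merge of two
            # connected components yields a persistence pair (1.0, w).
            parent = list(range(num_vertices))

            def find(x):
                while parent[x] != x:
                    parent[x] = parent[parent[x]]  # path halving
                    x = parent[x]
                return x

            pairs = []
            for u, v, w in sorted(edges, key=lambda e: e[2], reverse=True):
                ru, rv = find(u), find(v)
                if ru != rv:
                    parent[ru] = rv
                    pairs.append((1.0, w))  # a component born at 1.0 dies at value w
            return pairs  # surviving components (one per connected piece) are omitted

    Deep graph persistence, as described in the abstract, would roughly correspond to running one such filtration over a single matrix assembled from all layers rather than one per layer.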

    An Empirical Evaluation of Columnar Storage Formats

    Columnar storage is one of the core components of a modern data analytics system. Although many database management systems (DBMSs) have proprietary storage formats, most provide extensive support to open-source storage formats such as Parquet and ORC to facilitate cross-platform data sharing. But these formats were developed over a decade ago, in the early 2010s, for the Hadoop ecosystem. Since then, both the hardware and workload landscapes have changed significantly. In this paper, we revisit the most widely adopted open-source columnar storage formats (Parquet and ORC) with a deep dive into their internals. We designed a benchmark to stress-test the formats' performance and space efficiency under different workload configurations. From our comprehensive evaluation of Parquet and ORC, we identify design decisions advantageous with modern hardware and real-world data distributions. These include using dictionary encoding by default, favoring decoding speed over compression ratio for integer encoding algorithms, making block compression optional, and embedding finer-grained auxiliary data structures. Our analysis identifies important considerations that may guide future formats to better fit modern technology trends.
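
    Several of these recommendations map directly onto writer options exposed by common Parquet libraries. A minimal sketch with pyarrow (the option names are pyarrow's, not terminology from the paper):

        import pyarrow as pa
        import pyarrow.parquet as pq

        table = pa.table({
            "user_id": [1, 2, 3, 4],
            "country": ["DE", "DE", "US", "US"],  # low-cardinality column: dictionary encoding pays off
        })

        pq.write_table(
            table,
            "events.parquet",
            use_dictionary=True,    # dictionary encoding on by default
            compression="NONE",     # treat block compression as optional when decode speed dominates
            write_statistics=True,  # finer-grained auxiliary data: per-column min/max statistics
        )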

    Combinatorial algorithms in the approximate computing paradigm

    Data-intensive computing has led to the emergence of data centers with massive processor counts and main memory sizes. However, the demand for shared resources has surpassed their capacity, resulting in increased costs and limited access. Commodity hardware, although accessible, has limited computational resources. This poses a challenge when performing computationally intensive tasks with large amounts of data on systems with restricted memory. To address these issues, Approximate Computing offers a solution by allowing selective solution approximation, leading to improved resource efficiency. This dissertation focuses on the trade-off between output quality and computational resource usage in sorting and searching problems. It introduces the concept of Approximate Sorting, which aims to reduce resource usage while maintaining an acceptable level of sorting quality. Quality metrics are defined to assess the "sortedness" of approximately sorted arrays. The dissertation also proposes a general framework for incorporating approximate computing into sorting algorithms, presenting an algorithm for approximate sorting with guaranteed upper bounds. The algorithms operate under a constraint on the number of comparisons performed. The dissertation then explores searching algorithms, specifically binary search algorithms on approximately sorted arrays. It addresses cases where metrics are given for the input array and cases where metrics are not available. Efficient and optimal algorithms are developed for multidimensional range searches and catalog searches on approximately sorted input. The dissertation further proposes algorithms that analyze patterns in input order to optimize sorting. These algorithms identify underlying patterns and sequences, facilitating faster sorting approaches. Additionally, the dissertation discusses the growing popularity of approximate computing in the field of High-Performance Computing (HPC). It presents a novel approach to comparison-based sorting by incorporating parallel approximate computing. The dissertation also proposes algorithms for various queries on approximately sorted arrays, such as determining the rank or position of an element. The time complexity of these querying algorithms is proportional to the input metric. The dissertation concludes by emphasizing the wide range of applications for sorting and searching algorithms. In the context of packet classification in router buffers, approximate sorting offers advantages by reducing the time-consuming sorting step. By capping the number of comparisons, approximate sorting becomes a practical solution for efficiently handling the large volume of incoming packets. This dissertation contributes to the field of approximate computing by addressing resource limitations and cost issues in data-intensive computing. It provides insights into approximate sorting and searching algorithms, and their application in various domains, offering a valuable contribution to the advancement of efficient, scalable, and accessible data processing.
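
    As a deliberately simple illustration of how a sortedness metric trades against the work needed afterwards (not the dissertation's algorithms): take the maximum displacement of an element from its sorted position as the metric; an array with displacement at most k can then be fully sorted with O(n log k) comparisons using a sliding min-heap.

        import heapq

        def max_displacement(a):
            # Sortedness metric: how far any element sits from its position in the
            # fully sorted order (0 means the array is already sorted).
            order = sorted(range(len(a)), key=lambda i: a[i])
            return max((abs(pos - i) for pos, i in enumerate(order)), default=0)

        def sort_k_displaced(a, k):
            # Sort an array whose every element is at most k slots from its sorted
            # position.  The smallest remaining element always lies in the next k+1
            # unread slots, so a heap of size k+1 suffices: O(n log k) comparisons.
            heap = list(a[: k + 1])
            heapq.heapify(heap)
            out = []
            for x in a[k + 1:]:
                out.append(heapq.heappop(heap))
                heapq.heappush(heap, x)
            while heap:
                out.append(heapq.heappop(heap))
            return out

    Deciding how many comparisons to spend, which metric to report, and how to answer rank or membership queries directly on the approximately sorted output are exactly the kinds of trade-offs the dissertation formalises.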

    Intelligent Systems

    This book is dedicated to intelligent systems of broad-spectrum application, such as personal and social biosafety or the use of intelligent sensory micro-nanosystems such as "e-nose", "e-tongue" and "e-eye". In addition, effective information acquisition, knowledge management and improved knowledge transfer in any medium, as well as modeling of information content using meta- and hyper-heuristics and semantic reasoning, all benefit from the systems covered in this book. Intelligent systems can also be applied in education and in generating intelligent distributed eLearning architectures, as well as in a large number of technical fields, such as industrial design, manufacturing and utilization, e.g., in precision agriculture, cartography, electric power distribution systems, intelligent building management systems, drilling operations, etc. Furthermore, decision making using fuzzy logic models, computational recognition of comprehension uncertainty and the joint synthesis of goals and means of intelligent behavior in biosystems, as well as diagnostic and human support in the healthcare environment, have also been made easier.

    The Log-Interleave Bound: Towards the Unification of Sorting and the BST Model

    We study the connections between sorting and the binary search tree model, with an aim towards showing that the fields are connected more deeply than is currently known. The main vehicle of our study is the log-interleave bound, a measure of the information-theoretic complexity of a permutation π. When viewed through the lens of adaptive sorting -- the study of lists which are nearly sorted according to some measure of disorder -- the log-interleave bound is comparable to the most powerful known measure of disorder. Many of these measures of disorder are themselves virtually identical to well-known upper bounds in the BST model, such as the working set bound or the dynamic finger bound, suggesting a connection between BSTs and sorting. We present three results about the log-interleave bound which solidify the aforementioned connections. The first is a proof that the log-interleave bound is always within a lg lg n multiplicative factor of a known lower bound in the BST model, meaning that an online BST algorithm matching the log-interleave bound would perform within the same bounds as the state-of-the-art lg lg n-competitive BST. The second result is an offline algorithm in the BST model which uses O(LIB(π)) accesses to search for any permutation π. The technique used to design this algorithm also serves as a general way to show whether a sorting algorithm can be transformed into an offline BST algorithm. The final result is a mergesort algorithm which performs work within the log-interleave bound of a permutation π. This mergesort also happens to be highly parallel, adding to a line of work in parallel BST operations.
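
    As a small, classical illustration of the "measures of disorder" mentioned above (not the log-interleave bound itself, and not the paper's parallel mergesort): the Runs measure counts maximal ascending runs, and natural mergesort sorts a sequence with r runs in O(n log r) comparisons by repeatedly merging neighbouring runs.

        from heapq import merge

        def ascending_runs(a):
            # Split a into its maximal ascending runs; len(result) is the classical
            # "Runs" measure of disorder from adaptive sorting.
            runs, i = [], 0
            while i < len(a):
                j = i + 1
                while j < len(a) and a[j - 1] <= a[j]:
                    j += 1
                runs.append(list(a[i:j]))
                i = j
            return runs

        def natural_mergesort(a):
            # Adaptive mergesort: r runs need only about log r rounds of pairwise merging.
            runs = ascending_runs(a)
            if not runs:
                return []
            while len(runs) > 1:
                runs = [list(merge(runs[i], runs[i + 1])) if i + 1 < len(runs) else runs[i]
                        for i in range(0, len(runs), 2)]
            return runs[0]

    The log-interleave bound plays the analogous role for a much finer notion of presortedness, which is what lets it connect adaptive sorting to the BST model.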