
    Approximation and online algorithms in scheduling and coloring

    In the last three decades, approximation and online algorithms have become a major area of theoretical computer science and discrete mathematics. Scheduling and coloring problems are among the most popular ones for which approximation and online algorithms have been analyzed. On the one hand, owing to the well-known difficulty of obtaining good lower bounds for these problems, it is particularly hard to prove results on the online and offline performance of algorithms. On the other hand, the theoretically oriented study of approximation and online algorithms for scheduling and coloring also has an impact on the development of better algorithms for real-world applications. In this thesis we present approximation algorithms and online algorithms for a number of scheduling and labeling (coloring) problems. The first part of the thesis is devoted to scheduling problems with the average weighted completion time objective function, primarily motivated by theoretical questions that had remained open for several years. Here we present a general method that leads to the design of polynomial time approximation schemes (PTASs), the best possible approximation results. In contrast, the second part of the thesis is motivated by practical applications. We consider a number of new labeling and scheduling problems that occur in the design of communication networks. Here we present and analyze efficient approximation and online algorithms. We use very simple techniques which do not require large computational resources.
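    The thesis's PTAS constructions are not reproduced here, but the objective they target can be illustrated with a classic baseline: Smith's weighted-shortest-processing-time rule, which orders jobs by weight-to-processing-time ratio and is exact for minimizing the sum of weighted completion times on a single machine. The sketch below (plain Python, hypothetical instance data) shows only this baseline, not the approximation schemes from the thesis.

```python
# Illustrative sketch (not from the thesis): Smith's rule, the classic exact
# algorithm for minimizing the sum of weighted completion times on a single
# machine (1 || sum w_j C_j).

def weighted_completion_time_schedule(jobs):
    """jobs: list of (processing_time, weight) pairs.
    Returns the job order and the total weighted completion time."""
    # Sort by non-increasing weight-to-processing-time ratio (Smith's rule).
    order = sorted(range(len(jobs)), key=lambda j: jobs[j][1] / jobs[j][0], reverse=True)
    total, finish = 0.0, 0.0
    for j in order:
        p, w = jobs[j]
        finish += p          # completion time of job j
        total += w * finish  # its weighted contribution to the objective
    return order, total

if __name__ == "__main__":
    example_jobs = [(3.0, 1.0), (1.0, 4.0), (2.0, 2.0)]  # hypothetical instance
    print(weighted_completion_time_schedule(example_jobs))  # ([1, 2, 0], 16.0)
```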

    Top-k Route Search through Submodularity Modeling of Recurrent POI Features

    We consider a practical top-k route search problem: given a collection of points of interest (POIs) with rated features and traveling costs between POIs, a user wants to find k routes from a source to a destination, within a given cost budget, that best match her feature preferences. One challenge is the personalized diversity requirement: different users strike different trade-offs between quantity (the number of POIs with a specified feature) and variety (the coverage of the specified features). Another challenge is the large scale of the POI map and the huge number of alternative routes to search. We model the personalized diversity requirement by the whole class of submodular functions, and present an optimal solution to the top-k route search problem through indices for retrieving relevant POIs in both feature and route spaces, together with various strategies for pruning the search space using user preferences and constraints. We also present promising heuristic solutions and evaluate all the solutions on real-life data.
    Comment: 11 pages, 7 figures, 2 tables
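    As a rough illustration of how a submodular function can encode the quantity-versus-variety trade-off, the sketch below scores a route by a saturated sum of feature ratings. The function names, the cap parameter, and the toy data are assumptions for exposition, not the paper's actual scoring model.

```python
# Illustrative sketch (not the paper's exact model): a saturated-coverage
# score over a route's POIs. Capping each feature's contribution makes the
# set function monotone submodular, which captures the quantity-vs-variety
# trade-off: a low cap rewards covering many distinct features (variety),
# a high cap rewards repeating a preferred feature (quantity).

def route_score(route_pois, preferences, cap=2.0):
    """route_pois: list of dicts mapping feature -> rating for each POI.
    preferences: dict mapping feature -> user weight.
    Returns the submodular score of the route."""
    totals = {}
    for poi in route_pois:
        for feature, rating in poi.items():
            totals[feature] = totals.get(feature, 0.0) + rating
    # min(total, cap) is concave in the accumulated rating, so adding POIs
    # yields diminishing returns, i.e., submodularity.
    return sum(w * min(totals.get(f, 0.0), cap) for f, w in preferences.items())

if __name__ == "__main__":
    pois = [{"coffee": 0.9, "wifi": 0.5}, {"coffee": 0.8}, {"museum": 1.0}]
    prefs = {"coffee": 1.0, "museum": 0.7}
    print(route_score(pois, prefs))
```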

    Improving the performance of Virtualized Network Services based on NFV and SDN

    Network Functions Virtualisation (NFV) proposes to move the traditional network appliances, which require dedicated physical machines, into virtualised environments (e.g., virtual machines). In this way, many of the physical devices currently present in the infrastructure are replaced with standard high-volume servers, which can be located in datacenters, at the edge of the network, and on the end users' premises. This enables a reduction of the required physical resources, thanks to the use of virtualization technologies already employed in cloud computing, and allows services to be more dynamic and scalable. However, unlike traditional cloud applications, which are rather demanding in terms of CPU power, network applications are mostly I/O bound, hence the virtualization technologies in use (either standard VM-based or lightweight ones) need to be improved to maximize network performance. A series of Virtual Network Functions (VNFs) can be connected to each other through Software-Defined Networking (SDN) technologies (e.g., OpenFlow) to create a Network Function Forwarding Graph (NF-FG) that processes the network traffic in the order configured in the graph. Using NF-FGs it is possible to create arbitrary chains of services and transparently configure different virtualized network services, which can be dynamically instantiated and rearranged depending on the requested service and its requirements. However, the above virtualization technologies are rather demanding in terms of hardware resources (mainly CPU and memory), which may have a non-negligible impact on the cost of providing services according to this paradigm. This thesis investigates this problem, proposing a set of solutions that enable the novel NFV paradigm to be used efficiently, thereby guaranteeing both flexibility and efficiency in future network services.
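    To make the notion of an NF-FG concrete, the following minimal sketch models a service chain as an ordered list of VNF callables applied to a packet. All names and the packet representation are hypothetical; real NFV/SDN deployments steer traffic through OpenFlow rules and virtualized data planes rather than Python functions.

```python
# Minimal sketch (hypothetical names, not a specific NFV framework's API):
# a Network Function Forwarding Graph modeled as an ordered chain of VNFs,
# each a callable that transforms or drops a packet.

def firewall(packet):
    # Drop traffic to a blocked port; otherwise pass the packet through.
    return None if packet.get("dst_port") == 23 else packet

def nat(packet):
    # Rewrite the private source address to a public one.
    packet["src_ip"] = "203.0.113.10"
    return packet

def monitor(packet):
    print("forwarding", packet)
    return packet

def run_chain(packet, chain):
    """Apply each VNF in the configured order; stop if one drops the packet."""
    for vnf in chain:
        packet = vnf(packet)
        if packet is None:
            return None
    return packet

if __name__ == "__main__":
    nf_fg = [firewall, nat, monitor]  # the forwarding graph as a simple chain
    run_chain({"src_ip": "10.0.0.5", "dst_port": 80}, nf_fg)
```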

    Doctor of Philosophy

    Interactive editing and manipulation of digital media is a fundamental component of digital content creation. One medium in particular, digital imagery, has recently seen its large or even massive image formats grow in popularity. Unfortunately, current systems and techniques are rarely concerned with scalability or usability for these large images. Moreover, processing massive (or even large) imagery is assumed to be an off-line, automatic process, although many problems associated with these datasets require human intervention for high-quality results. This dissertation details how to design interactive image techniques that scale. In particular, massive imagery is typically constructed as a seamless mosaic of many smaller images. The focus of this work is the creation of new technologies that enable user interaction in the formation of these large mosaics. While an interactive system for all stages of the mosaic creation pipeline is a long-term research goal, this dissertation concentrates on the last phase of the pipeline: the composition of registered images into a seamless composite. The work detailed in this dissertation provides the technologies to fully realize interactive editing in mosaic composition on image collections ranging from the very small to the massive in scale.
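    As a much-simplified stand-in for the compositing phase discussed above, the sketch below feather-blends two registered tiles that overlap horizontally. It is only meant to illustrate turning registered images into a seamless composite; the dissertation's interactive techniques are considerably more sophisticated.

```python
# Illustrative sketch only: feathered (linear-ramp) blending of two
# registered tiles that overlap horizontally.
import numpy as np

def feather_blend(left, right, overlap):
    """left, right: H x W arrays registered so the last/first `overlap`
    columns cover the same scene region. Returns the blended mosaic."""
    alpha = np.linspace(1.0, 0.0, overlap)  # weight for the left tile across the seam
    blended = left[:, -overlap:] * alpha + right[:, :overlap] * (1.0 - alpha)
    return np.hstack([left[:, :-overlap], blended, right[:, overlap:]])

if __name__ == "__main__":
    a = np.full((4, 6), 0.2)   # toy "dark" tile
    b = np.full((4, 6), 0.8)   # toy "bright" tile
    print(feather_blend(a, b, overlap=3).shape)  # (4, 9)
```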

    THE SCALABLE AND ACCOUNTABLE BINARY CODE SEARCH AND ITS APPLICATIONS

    The past decade has witnessed an explosion of applications and devices. This big-data era challenges the existing security technologies: new analysis techniques should scale to handle codebases of “big data” size; they should become smart and proactive, using the data to understand what the vulnerable points are and where they are located; and effective protection should be provided for the dissemination and analysis of data involving sensitive information on an unprecedented scale. In this dissertation, I argue that code search techniques can boost existing security analysis techniques (vulnerability identification and memory analysis) in terms of scalability and accuracy. To demonstrate these benefits, I address two issues of code search by means of code analysis: scalability and accountability. I further demonstrate the benefit of code search by applying it to scalable vulnerability identification [57] and to the cross-version memory analysis problem [55, 56]. Firstly, I address the scalability problem of code search by learning “higher-level” semantic features from code [57]. Instead of conducting fine-grained testing on a single device or program, it becomes much more crucial to achieve quick vulnerability scanning across devices or programs at a “big data” scale. However, discovering vulnerabilities in “big code” is like finding a needle in a haystack, even when dealing with known vulnerabilities. This new challenge demands a scalable code search approach. To this end, I leverage successful techniques from image search in the computer vision community and propose a novel code encoding method for scalable vulnerability search in binary code. The evaluation results show that this approach achieves comparable or even better accuracy and efficiency than the baseline techniques. Secondly, I tackle the accountability issues left in the vulnerability search problem by designing vulnerability-oriented raw features [58]. Similar code does not always contain a similar vulnerability, so feature engineering for code search should focus on semantic-level features rather than syntactic ones. I propose to extract conditional formulas as higher-level semantic features from the raw binary code and to conduct the code search on them. A conditional formula explicitly captures two cardinal factors of a vulnerability: 1) erroneous data dependencies and 2) missing or invalid condition checks. As a result, binary code search on conditional formulas produces significantly higher accuracy and provides meaningful evidence for human analysts to further examine the search results. The evaluation results show that this approach further improves the search accuracy of existing bug search techniques with very reasonable performance overhead. Finally, I demonstrate the potential of the code search technique in the field of memory analysis and apply it to address the cross-version issue in memory forensics [55, 56]. Memory analysis techniques for COTS software usually rely on so-called “data structure profiles” of their binaries. Constructing such profiles requires expert knowledge about the internal workings of a specific software version, and it is still a cumbersome manual effort most of the time.
    I propose to leverage the code search technique to enable a notion named “cross-version memory analysis”, which can update a profile for a new version of a piece of software by transferring knowledge from a model that has already been trained on its old version. The evaluation results show that the code-search-based approach advances existing memory analysis methods by reducing the manual effort while maintaining reasonable accuracy. With the help of collaborators, I further developed two plugins for the Volatility memory forensic framework [2], and show that each of the two plugins can construct a localized profile to perform specified memory forensic tasks on the same memory dump, without the need for manual effort in creating the corresponding profile.
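    A hypothetical illustration of why condition-check features help: if functions are summarized by the set of condition checks they perform, a query built from a known-vulnerable routine ranks a closely related (patched) routine above unrelated code, and the differing feature pinpoints the missing bounds check. The feature triples and the Jaccard comparison below are assumptions for exposition, not the dissertation's actual conditional-formula extraction.

```python
# Hypothetical illustration: comparing functions by the set of condition
# checks they perform, using Jaccard similarity. The dissertation's
# conditional formulas also encode data dependencies; this only sketches the
# idea that semantic features beat raw syntactic matching for bug search.

def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 1.0

# Toy condition-check features that might be lifted from three binaries.
patched    = {("len", "<=", "dst_size"), ("src", "!=", "NULL")}
vulnerable = {("src", "!=", "NULL")}                       # missing bounds check
unrelated  = {("fd", ">=", "0"), ("mode", "==", "READ")}

query = vulnerable  # searching for code similar to a known-vulnerable routine
for name, feats in [("patched", patched), ("unrelated", unrelated)]:
    print(name, round(jaccard(query, feats), 2))
```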

    DeepFT: Fault-tolerant edge computing using a self-supervised deep surrogate model

    The emergence of latency-critical AI applications has been supported by the evolution of the edge computing paradigm. However, edge solutions are typically resource-constrained, posing reliability challenges due to heightened contention for compute capacity and faulty application behavior under overload conditions. Although the large amount of generated log data can be mined for fault prediction, labeling this data for training is a manual process and thus a limiting factor for automation. Because of this, many companies resort to unsupervised fault-tolerance models. Yet, failure models of this kind can lose accuracy when they need to adapt to non-stationary workloads and diverse host characteristics. We therefore propose a novel modeling approach, DeepFT, which proactively avoids system overloads and their adverse effects by optimizing task scheduling decisions. DeepFT uses a deep surrogate model to accurately predict and diagnose faults in the system, and co-simulation-based self-supervised learning to dynamically adapt the model in volatile settings. Experimentation on an edge cluster shows that DeepFT can outperform state-of-the-art methods in fault-detection and QoS metrics. Specifically, DeepFT gives the highest F1 scores for fault detection, reducing service deadline violations by up to 37% while also improving response time by up to 9%.
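    The following sketch conveys only the scheduling side of the idea: given a surrogate that predicts fault risk for a candidate placement, choose the host with the lowest predicted risk. The surrogate here is a stub utilization heuristic and every name is hypothetical; DeepFT's actual surrogate is a learned deep model adapted through co-simulated self-supervision.

```python
# Minimal sketch (stubbed surrogate, hypothetical interface): pick the task
# placement whose predicted fault risk is lowest.

def predict_fault_risk(host_utilization, task_demand):
    # Stand-in for the deep surrogate: risk grows sharply as a host nears overload.
    projected = host_utilization + task_demand
    return max(0.0, projected - 0.8) ** 2

def schedule_task(task_demand, hosts):
    """hosts: dict host_name -> current CPU utilization in [0, 1]."""
    return min(hosts, key=lambda h: predict_fault_risk(hosts[h], task_demand))

if __name__ == "__main__":
    cluster = {"edge-a": 0.70, "edge-b": 0.35, "edge-c": 0.90}
    print(schedule_task(0.2, cluster))  # -> "edge-b"
```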

    Indexer++: Workload-Aware Online Index Tuning With Transformers and Reinforcement Learning

    With the increasing workload complexity in modern databases, the manual process of index selection is a challenging task. There is a growing need for databases that can learn and adapt to evolving workloads. This paper proposes Indexer++, an autonomous, workload-aware, online index tuner. Unlike existing approaches, Indexer++ imposes low overhead on the DBMS, is responsive to changes in query workloads, and swiftly selects indexes. Our approach uses a combination of text analytics and reinforcement learning. Indexer++ consists of two phases: Phase (i) learns workload trends using a novel trend detection technique based on a pre-trained transformer model. Phase (ii) performs online index selection, i.e., continuously while the DBMS is processing workloads, using a novel online deep reinforcement learning technique with our proposed priority experience sweeping. This paper provides an experimental evaluation of Indexer++ in multiple scenarios using benchmark (TPC-H) and real-world (IMDB) datasets. In our experiments, Indexer++ effectively identifies changes in workload trends and selects an optimal set of indexes.
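    As a toy rendering of the "observe workload, act, learn" cycle behind online index tuning, the sketch below runs an epsilon-greedy loop over candidate indexes with running reward estimates. It is not Indexer++'s deep RL agent or its priority experience sweeping; the candidate list and the observe_speedup callback are assumed to be supplied by the environment.

```python
# Illustrative sketch only: an epsilon-greedy online index tuner that keeps
# running reward estimates per candidate index.
import random

def online_index_tuning(candidates, observe_speedup, rounds=100, epsilon=0.1):
    """candidates: list of candidate index names.
    observe_speedup(index): measured reward of having that index for the
    current workload batch (assumed to be provided by the environment)."""
    value = {c: 0.0 for c in candidates}
    count = {c: 0 for c in candidates}
    for _ in range(rounds):
        if random.random() < epsilon:
            choice = random.choice(candidates)        # explore a random candidate
        else:
            choice = max(candidates, key=value.get)   # exploit the best so far
        reward = observe_speedup(choice)
        count[choice] += 1
        value[choice] += (reward - value[choice]) / count[choice]  # running mean
    return max(candidates, key=value.get)
```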