    Polynomial algorithms for partitioning a tree into single-center subtrees to minimize flat service costs

    This paper deals with the following graph partitioning problem. Consider a connected graph with n nodes, p of which are centers, while the remaining ones are units. For each unit-center pair there is a fixed service cost, and the goal is to find a partition into connected components such that each component contains exactly one center and the total service cost is minimized. This problem is known to be NP-hard on general graphs, and here we show that it remains NP-hard even if the service cost is monotone and the graph is bipartite. However, we derive polynomial-time algorithms for trees. For this class of graphs we provide several reformulations of the problem as integer linear programs, proving the integrality of the corresponding polyhedra. As a consequence, the tree partitioning problem can be solved in polynomial time either by linear programming or by suitable convex nondifferentiable optimization algorithms. Moreover, we develop a dynamic programming algorithm, whose recursion is based on sequences of minimum weight closure problems, and which solves the problem on trees in O(np) time.
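    The paper's recursion is built from sequences of minimum weight closure problems; as a point of reference only, the sketch below implements a simpler textbook-style O(np) dynamic program for the same tree problem, where dp[v][k] is the cheapest cost of v's subtree given that v lies in center k's component (all names are illustrative, not taken from the paper).

```python
from math import inf

def partition_tree(n, centers, adj, cost):
    """O(n*p) DP sketch: partition a tree into connected components,
    each containing exactly one center, at minimum total service cost.

    n       -- number of nodes, labelled 0..n-1
    centers -- list of center nodes
    adj     -- adjacency lists of the tree
    cost    -- cost[u][k]: service cost of unit u served by center k
    """
    is_center = [False] * n
    for c in centers:
        is_center[c] = True

    root = centers[0]                    # root the tree at some center
    parent = [-1] * n
    order = [root]
    for v in order:                      # iterative DFS; list grows as we go
        for u in adj[v]:
            if u != parent[v]:
                parent[u] = v
                order.append(u)

    dp = [dict() for _ in range(n)]      # dp[v][k] as described above
    sub_centers = [set() for _ in range(n)]
    best = [inf] * n                     # min over centers k in v's subtree

    for v in reversed(order):            # children before parents
        if is_center[v]:
            sub_centers[v].add(v)
        for u in adj[v]:
            if u != parent[v]:
                sub_centers[v] |= sub_centers[u]
        # A center must serve itself; a unit may join any center's component.
        for k in ([v] if is_center[v] else centers):
            total = 0 if is_center[v] else cost[v][k]
            for u in adj[v]:
                if u == parent[v]:
                    continue
                if k in sub_centers[u]:
                    total += dp[u][k]    # path from v to center k runs via u
                else:
                    # child joins k's component or opens its own component
                    total += min(dp[u].get(k, inf), best[u])
            dp[v][k] = total
            best[v] = min(best[v], total)
    return best[root]
```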

    Performance Improvement of Distributed Computing Framework and Scientific Big Data Analysis

    Analysis of Big Data to gain better insights has been the focus of researchers in the recent past. Traditional desktop computers or database management systems may not be suitable for efficient and timely analysis, due to the requirement of massive parallel processing. Distributed computing frameworks are being explored as a viable solution. For example, Google proposed MapReduce, which is becoming a de facto computing architecture for Big Data solutions. However, scheduling in MapReduce is coarse-grained and remains a challenge for improvement. For the MapReduce scheduler configured over distributed clusters, we identify two issues: data locality disruption and random assignment of non-local map tasks. We propose a network-aware scheduler that extends the existing rack awareness. Tasks are scheduled in the order of node, rack, and any other rack within the same cluster to achieve cluster-level data locality. The issue of random assignment of non-local map tasks is handled by enhancing the scheduler to consider network parameters, such as delay, bandwidth, and packet loss between remote clusters. As part of Big Data analysis in computational biology, we consider two major data-intensive applications: indexing genome sequences and de novo assembly. Both applications deal with the massive amounts of data generated by DNA sequencers. We developed a scalable algorithm that constructs sub-trees of a suffix tree in parallel to address the huge memory requirements of indexing the human genome. For de novo assembly, we propose the Parallel Giraph based Assembler (PGA) to address the challenges associated with assembling large genomes over commodity hardware. PGA uses the de Bruijn graph to represent the data generated by sequencers. Huge memory demands and performance expectations are addressed by developing parallel algorithms on top of the distributed graph-processing framework Apache Giraph.
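    As a minimal illustration of the locality ordering described above (node, then rack, then cluster, with remote clusters ranked by measured link quality instead of being chosen at random), here is a hedged sketch; the types, scoring weights, and function names are hypothetical, not taken from the thesis.

```python
from dataclasses import dataclass

@dataclass
class Task:
    task_id: str
    data_node: str       # node holding the task's input split
    data_rack: str
    data_cluster: str

@dataclass
class Slot:
    node: str
    rack: str
    cluster: str

def locality_rank(task: Task, slot: Slot) -> int:
    """0 = node-local, 1 = rack-local, 2 = cluster-local, 3 = remote."""
    if task.data_node == slot.node:
        return 0
    if task.data_rack == slot.rack:
        return 1
    if task.data_cluster == slot.cluster:
        return 2
    return 3

def network_score(delay_ms: float, bandwidth_mbps: float, loss: float) -> float:
    # Hypothetical weighting, lower is better: penalize delay and packet
    # loss, reward bandwidth. A real scheduler would calibrate these weights.
    return delay_ms + 1000.0 * loss - 0.1 * bandwidth_mbps

def pick_task(tasks, slot, link_stats):
    """Pick the task with the best locality for a free slot; break remote
    ties by link quality between the slot's cluster and the data's cluster."""
    def key(t):
        rank = locality_rank(t, slot)
        if rank < 3:
            return (rank, 0.0)
        delay, bw, loss = link_stats[(slot.cluster, t.data_cluster)]
        return (rank, network_score(delay, bw, loss))
    return min(tasks, key=key)
```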

    LIPIcs, Volume 248, ISAAC 2022, Complete Volume

    LIPIcs, Volume 248, ISAAC 2022, Complete Volume

    Doctor of Philosophy

    This dissertation explores three key facets of software algorithms for custom hardware ray tracing: primitive intersection, shading, and acceleration structure construction. For the first, primitive intersection, we show how nearly all of the existing direct three-dimensional (3D) ray-triangle intersection tests are mathematically equivalent. Based on this, a genetic algorithm can automatically tune a ray-triangle intersection test for maximum speed on a particular architecture. We also analyze the components of the intersection test to determine how much floating point precision is required and design a numerically robust intersection algorithm. Next, for shading, we deconstruct Perlin noise into its basic parts and show how these can be modified to produce a gradient noise algorithm with improved visual appearance. This improved algorithm serves as the basis for a hardware noise unit. Lastly, we show how an existing bounding volume hierarchy can be postprocessed using tree rotations to further reduce the expected cost of traversing a ray through it. This postprocessing also serves as the basis for an efficient update algorithm for animated geometry. Together, these contributions should improve the efficiency of both software- and hardware-based ray tracers.
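    For reference, below is a sketch of the widely used Möller-Trumbore test, one member of the family of mathematically equivalent direct 3D ray-triangle tests analyzed here; its fixed epsilon guard is a common but naive robustness measure, not the dissertation's tuned, numerically robust variant.

```python
def ray_triangle(orig, d, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore ray-triangle intersection: returns the distance t
    along ray orig + t*d to the hit point, or None on a miss."""
    sub = lambda a, b: (a[0] - b[0], a[1] - b[1], a[2] - b[2])
    dot = lambda a, b: a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
    cross = lambda a, b: (a[1] * b[2] - a[2] * b[1],
                          a[2] * b[0] - a[0] * b[2],
                          a[0] * b[1] - a[1] * b[0])

    e1, e2 = sub(v1, v0), sub(v2, v0)    # triangle edge vectors
    pvec = cross(d, e2)
    det = dot(e1, pvec)
    if abs(det) < eps:                   # ray parallel to triangle plane
        return None
    inv_det = 1.0 / det
    tvec = sub(orig, v0)
    u = dot(tvec, pvec) * inv_det        # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    qvec = cross(tvec, e1)
    v = dot(d, qvec) * inv_det           # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, qvec) * inv_det          # distance along the ray
    return t if t > eps else None
```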

    A Polyhedral Study of Mixed 0-1 Set

    We consider a variant of the well-known single-node fixed-charge network flow set with constant capacities. This set arises from the relaxation of more general mixed integer sets such as lot-sizing problems with multiple suppliers. We provide a complete polyhedral characterization of the convex hull of the given set.
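    For context, a textbook formulation of the single-node fixed-charge flow set with a constant capacity C is sketched below; the variant actually studied in the paper may carry additional side constraints.

```latex
% Textbook single-node fixed-charge network flow set with constant
% capacity C (illustrative; the paper's variant may differ):
X = \left\{ (x, y) \in \mathbb{R}^{n}_{+} \times \{0,1\}^{n} \;:\;
      \sum_{j=1}^{n} x_j \le b,\quad
      x_j \le C\, y_j,\ j = 1, \dots, n \right\}
```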

    Performance optimization of wireless sensor networks for remote monitoring

    Wireless sensor networks (WSNs) have gained worldwide attention in recent years because of their great potential for a variety of applications such as hazardous environment exploration, military surveillance, habitat monitoring, and seismic sensing. In this thesis we study the use of WSNs for remote monitoring, where a wireless sensor network is deployed in a remote region to sense phenomena of interest while its data monitoring center is located in a metropolitan area geographically distant from the monitored region. This scenario poses great challenges, since such monitoring is typically large scale and expected to operate for a prolonged period without human involvement. Also, the long distance between the monitored region and the data monitoring center requires that the sensed data be transferred through a third-party communication service, which incurs service costs. Existing methodologies for performance optimization of WSNs assume that the sensor network and its data monitoring center are co-located, and are therefore no longer applicable to the remote monitoring scenario. New techniques and approaches for severely resource-constrained WSNs are thus needed to sustain unattended remote monitoring at low cost. Specifically, this thesis addresses the key issues and tackles problems in the deployment of WSNs for remote monitoring from the following aspects. To maximize the lifetime of large-scale monitoring, we deal with the energy consumption imbalance issue by exploiting multiple sinks. We develop scalable algorithms that determine the optimal number of sinks and their locations, thereby dynamically identifying the energy bottlenecks and balancing the data relay workload throughout the network. Our experimental results demonstrate that the proposed algorithms significantly prolong the network lifetime. To further reduce the imbalance of energy consumption among sensor nodes, a complementary strategy is to introduce a mobile sink for data gathering. However, the limited communication time between the mobile sink and the nodes means that only part of the sensed data can be collected and the rest is lost; to address this, we propose the concept of monitoring quality, exploiting the correlation of sensed data among nodes. We devise a heuristic for monitoring quality maximization, which schedules the sink to collect data from selected nodes and uses the collected data to recover the missing ones. We study the performance of the proposed heuristic and validate its effectiveness in improving the monitoring quality. To strike a fine trade-off between two performance metrics, throughput and cost, we investigate the novel problems of minimizing cost with guaranteed throughput and maximizing throughput with minimal cost. We develop approximation algorithms that find reliable data routing in the WSN and strategically balance workload on the sinks, and we prove that the delivered solutions are within a constant factor of the optimum. We finally conclude our work and discuss potential research topics arising from the studies of this thesis.
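    The thesis's sink-placement algorithms are not reproduced here; as a minimal illustration of the multiple-sink idea, the greedy sketch below repeatedly adds the candidate sink that most reduces the total hop distance from nodes to their nearest sink, a crude proxy for relay workload (all names and the scoring proxy are illustrative).

```python
from collections import deque

def bfs_hops(adj, src):
    """Hop distance from src to every reachable node (unweighted graph)."""
    dist = {src: 0}
    q = deque([src])
    while q:
        v = q.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                q.append(u)
    return dist

def greedy_sinks(adj, candidates, k):
    """Pick k sink locations greedily to minimize total node-to-sink hops."""
    hops = {c: bfs_hops(adj, c) for c in candidates}
    nodes = list(adj)
    best = {v: float("inf") for v in nodes}   # hops to nearest chosen sink
    sinks = []
    for _ in range(k):
        def total_if_added(c):
            return sum(min(best[v], hops[c].get(v, float("inf")))
                       for v in nodes)
        c = min((c for c in candidates if c not in sinks), key=total_if_added)
        sinks.append(c)
        for v in nodes:                       # update nearest-sink distances
            best[v] = min(best[v], hops[c].get(v, float("inf")))
    return sinks
```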

    Function-specific schemes for verifiable computation

    An integral component of modern computing is the ability to outsource data and computation to powerful remote servers, for instance, in the context of cloud computing or remote file storage. While participants can benefit from this interaction, a fundamental security issue that arises is that of integrity of computation: How can the end-user be certain that the result of a computation over the outsourced data has not been tampered with (not even by a compromised or adversarial server)? Cryptographic schemes for verifiable computation address this problem by accompanying each result with a proof that can be used to check the correctness of the performed computation. Recent advances in the field have led to the first implementations of schemes that can verify arbitrary computations. However, in practice the overhead of these general-purpose constructions remains prohibitive for most applications, with proof computation times (at the server) on the order of minutes or even hours for real-world problem instances. A different approach for designing such schemes targets specific types of computation and builds custom-made protocols, sacrificing generality for efficiency. An important representative of this function-specific approach is an authenticated data structure (ADS), where a specialized protocol is designed to support the query types associated with a particular outsourced dataset. This thesis presents three novel ADS constructions for the important query types of set operations, multi-dimensional range search, and pattern matching, and proves their security under cryptographic assumptions over bilinear groups. The scheme for set operations can support nested queries (e.g., two unions followed by an intersection of the results), extending previous works that only accommodate a single operation. The range search ADS provides an asymptotic improvement over previous schemes in storage and computation costs that is exponential in the number of attributes in the dataset. Finally, the pattern matching ADS supports text pattern and XML path queries with minimal cost; e.g., the overhead at the server is less than 4% compared to simply computing the result, for all our tested settings. The experimental evaluation of all three constructions shows significant improvements in proof-computation time over general-purpose schemes.
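    The three constructions above rely on bilinear-group assumptions and are not reproduced here; as a simpler point of reference, the sketch below shows the classic Merkle-tree ADS for membership queries, where the client keeps only a constant-size root digest and checks logarithmic-size proofs (function names are illustrative).

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build(leaves):
    """Return all tree levels bottom-up; level 0 holds the hashed leaves."""
    level = [h(x) for x in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:               # duplicate the last node if odd
            level = level + [level[-1]]
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def prove(levels, index):
    """Collect sibling hashes from leaf to root: the membership proof."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        proof.append((level[index ^ 1], index % 2))  # (sibling, is-right-child)
        index //= 2
    return proof

def verify(root, leaf, proof):
    """Recompute the root from the leaf and its proof; compare digests."""
    digest = h(leaf)
    for sibling, is_right in proof:
        digest = h(sibling + digest) if is_right else h(digest + sibling)
    return digest == root

# Usage: the server keeps `levels`; the client keeps only the root digest.
# levels = build(records); root = levels[-1][0]
# assert verify(root, records[3], prove(levels, 3))
```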