100 research outputs found

    Network Flows

    Compiling dataflow graphs into hardware

    Department Head: L. Darrell Whitley. 2005 Fall. Includes bibliographical references (pages 121-126). Conventional computers are programmed by supplying a sequence of instructions that perform the desired task. A reconfigurable processor is "programmed" by specifying the interconnections between hardware components, thereby creating a "hardwired" system for the particular task. For some applications, such as image processing, reconfigurable processors can produce dramatic execution speedups. However, programming a reconfigurable processor is essentially a hardware design discipline, which makes programming difficult for application programmers who are familiar only with software design techniques. To bridge this gap, a programming language called SA-C (Single Assignment C, pronounced "sassy") has been designed for programming reconfigurable processors. The process involves two main steps: first, the SA-C compiler analyzes the input source code and produces a hardware-independent intermediate representation of the program, called a dataflow graph (DFG); second, this DFG is combined with hardware-specific information to create the final configuration. This dissertation describes the design and implementation of a system that performs the DFG-to-hardware translation. The DFG is broken up into three sections: the data generators, the inner loop body, and the data collectors. The second of these, the inner loop body, is used to create a computational structure that is unique to each program. The other two sections are implemented using prebuilt modules, parameterized for the particular problem. Finally, a "glue module" is created to connect the various pieces into a complete interconnection specification. The dissertation also explores optimizations that can be applied while processing the DFG to improve performance. A technique for pipelining the inner loop body is described that uses an estimation tool for the propagation delay of the nodes within the dataflow graph. A scheme is also described that identifies subgraphs within the dataflow graph that can be replaced with lookup tables; in some instances the lookup tables provide a faster implementation than random logic.
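
    As a rough illustration of the pipelining optimization mentioned above, the sketch below walks a dataflow graph in topological order, accumulates the worst-case propagation delay since the last register, and inserts a pipeline register on any edge that would violate a target clock period. The node names, delays, and clock period are invented for illustration; this is not the SA-C compiler's actual algorithm.

```python
# A minimal sketch of the pipelining idea described above. All node names,
# delays, and the clock period are assumptions for illustration only.

CLOCK_PERIOD_NS = 5.0  # assumed target clock period

# (node, predecessors, propagation delay in ns), already in topological order
dfg = [
    ("in",    [],        0.0),
    ("mul",   ["in"],    4.5),
    ("add",   ["mul"],   2.0),
    ("shift", ["add"],   0.5),
    ("out",   ["shift"], 0.0),
]

def place_pipeline_registers(dfg, period):
    arrival = {}    # delay accumulated since the last register, per node
    registers = []  # edges (pred, node) that receive a pipeline register
    for node, preds, d in dfg:
        worst = 0.0
        for p in preds:
            if arrival[p] + d > period:       # path would miss the clock
                registers.append((p, node))   # cut it with a register
            else:
                worst = max(worst, arrival[p])
        arrival[node] = worst + d
        if arrival[node] > period:            # single node slower than the clock
            raise ValueError(f"{node} cannot meet the clock period")
    return registers

print(place_pipeline_registers(dfg, CLOCK_PERIOD_NS))
```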

    Implementing primal-dual network flow algorithms

    "Most of the results of this report were presented at the National ORSA/TIMS meeting, Chicago, May 1975." "7-113-77." Cover title.Bibliography: p. 35-38.Supported in part by the U.S. Deaprtment of Transportation, Transportation Advanced Research Program (TARP) contract no. DOT-TSC-1058 Supported in part by the Office of Naval Research under contract. N00014-75-C-0556by H. A. [i.e. Z] Aashtiani and T. L. Magnanti

    Dynamic Assignation of Roles and Tasks in Virtual Organizations of Agents

    Nowadays, a common problem that affects the workflow and results of an entity is the planning and distribution of tasks. Doing this manually requires anticipating workloads and employee characteristics, which is inefficient and practically incalculable in highly dynamic environments. This paper presents a model that generates a task plan, minimizing the resources necessary for its accomplishment while obtaining the maximum benefit. Within this proposal, genetic algorithms, queuing theory, and CBR are used at different stages to obtain an efficient distribution. To test the system, the chosen case study that fits this scenario is e-Government, where a large number of tasks must be solved within a precise time frame using minimal resources.
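
    As a toy illustration of the genetic-algorithm stage mentioned in the abstract, the sketch below evolves assignments of tasks to agents so that the heaviest workload (makespan) is minimized. The task durations, population settings, and fitness function are assumptions, not the model proposed in the paper.

```python
# A toy genetic algorithm for task assignment; all parameters are assumed.
import random

TASKS = [3, 7, 2, 8, 4, 6, 1, 5]   # assumed task durations
N_AGENTS = 3

def makespan(assign):
    loads = [0] * N_AGENTS
    for task, agent in zip(TASKS, assign):
        loads[agent] += task
    return max(loads)

def evolve(pop_size=30, generations=200, mutation_rate=0.1):
    pop = [[random.randrange(N_AGENTS) for _ in TASKS] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=makespan)                      # lower makespan is fitter
        survivors = pop[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, len(TASKS))   # one-point crossover
            child = a[:cut] + b[cut:]
            for i in range(len(child)):             # random mutation
                if random.random() < mutation_rate:
                    child[i] = random.randrange(N_AGENTS)
            children.append(child)
        pop = survivors + children
    return min(pop, key=makespan)

best = evolve()
print(best, makespan(best))
```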

    Algorithmic issues in visual object recognition

    This thesis is divided into two parts covering two aspects of research in the area of visual object recognition. Part I is about human detection in still images. Human detection is a challenging computer vision task due to the wide variability in human visual appearances and body poses. In this part, we present several enhancements to human detection algorithms. First, we present an extension to the integral images framework that allows constant-time computation of non-uniformly weighted summations over rectangular regions using a bundle of integral images. Such a computational element is commonly used in constructing gradient-based feature descriptors, which are the most successful in shape-based human detection. Second, we introduce deformable features as an alternative to the conventional static features used in classifiers based on boosted ensembles. Deformable features can enhance the accuracy of human detection by adapting to pose changes that can be described as translations of body features. Third, we present a comprehensive evaluation framework for cascade-based human detectors. The presented framework facilitates comparison between cascade-based detection algorithms, provides a confidence measure for each result, and deploys a practical evaluation scenario. Part II explores the possibilities of enhancing the speed of core algorithms used in visual object recognition using the computing capabilities of Graphics Processing Units (GPUs). First, we present an implementation of Graph Cut on GPUs, which achieves up to a 4x speedup compared to a CPU implementation. The Graph Cut algorithm has many applications related to visual object recognition, such as segmentation and 3D point matching. Second, we present an efficient sparse approximation of kernel matrices for GPUs that can significantly speed up kernel-based learning algorithms, which are widely used in object detection and recognition. We present an implementation of the Affinity Propagation clustering algorithm based on this representation, which is about 6 times faster than another GPU implementation based on a conventional sparse matrix representation.
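
    The constant-time rectangular summation that the first contribution extends is the classic integral image (summed-area table). The sketch below shows only this standard, uniform-weight primitive, with NumPy assumed; the bundle-of-integral-images extension for non-uniform weights is not reproduced here.

```python
# Standard integral-image primitive: one pass over the image, then any
# rectangular sum costs four table lookups. NumPy is assumed.
import numpy as np

def integral_image(img):
    # ii[y, x] = sum of img[0:y, 0:x]; padded with a leading zero row/column
    return np.pad(img, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, top, left, bottom, right):
    # Sum of img[top:bottom, left:right] via inclusion-exclusion.
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

img = np.arange(16, dtype=float).reshape(4, 4)
ii = integral_image(img)
assert rect_sum(ii, 1, 1, 3, 4) == img[1:3, 1:4].sum()
```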

    Segmentation, tracking, and kinematics of lung parenchyma and lung tumors from 4D CT with application to radiation treatment planning.

    This thesis is concerned with the development of techniques for efficient computerized analysis of 4-D CT data. The goal is a highly automated approach to segmentation of the lung boundary and lung nodules inside the lung. The determination of exact lung tumor location over space and time by image segmentation is an essential step to track thoracic malignancies. Accurate image segmentation helps clinical experts examine the anatomy and structure and determine the disease progress. Since 4-D CT provides structural and anatomical information during tidal breathing, we use the same data to also measure mechanical properties related to deformation of the lung tissue, including Jacobian and strain, at high resolution and as a function of time. Radiation treatment of patients with lung cancer can benefit from knowledge of these measures of regional ventilation. Graph-cuts techniques have been popular for image segmentation since they are able to treat highly textured data via robust global optimization, avoiding local minima in graph-based optimization. Graph-cuts methods have been used to extract globally optimal boundaries from images by an s/t cut, with an energy function based on model-specific visual cues and useful topological constraints. The method makes N-dimensional globally optimal segmentation possible with good computational efficiency. Even though the graph-cuts method can extract objects where there is a clear intensity difference, segmentation of organs or tumors poses a challenge. For organ segmentation, many segmentation methods using a shape prior have been proposed. However, in the case of lung tumors, the shape varies from patient to patient and with location. In this thesis, we use a shape prior for tumors through a training step and PCA analysis based on the Active Shape Model (ASM). The method has been tested on real patient data from the Brown Cancer Center at the University of Louisville. We performed temporal B-spline deformable registration of the 4-D CT data; this yielded 3-D deformation fields between successive respiratory phases from which measures of regional lung function were determined. During the respiratory cycle, the lung volume changes, and the five lobes of the lung (two in the left and three in the right lung) deform differently, yielding different strain and Jacobian maps. In this thesis, we determine the regional lung mechanics in the Lagrangian frame of reference through different respiratory phases, for example, Phase10 to 20, Phase10 to 30, Phase10 to 40, and Phase10 to 50. Single photon emission computed tomography (SPECT) lung imaging using radioactive tracers with SPECT ventilation and SPECT perfusion imaging also provides functional information. Therefore, as part of an IRB-approved study, we registered the max-inhale CT volume to both VSPECT and QSPECT data sets using the Demon's non-rigid registration algorithm in patient subjects. Subsequently, statistical correlation between CT ventilation images (Jacobian and strain values) and both VSPECT and QSPECT was undertaken. Through statistical analysis with Spearman's rank correlation coefficient, we found that Jacobian values have the highest correlation with both VSPECT and QSPECT.
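
    As a brief illustration of the Jacobian map used as a ventilation measure above: given a displacement field u(x) between two respiratory phases, the pointwise determinant of (I + grad u) indicates local expansion (>1) or contraction (<1). The random displacement field and unit voxel spacing below are assumptions; this is not the thesis pipeline itself.

```python
# Jacobian-determinant map from a 3-D displacement field (toy data, unit spacing).
import numpy as np

def jacobian_determinant(disp):
    # disp has shape (3, Z, Y, X): displacement components in voxel units
    grads = [np.gradient(disp[c]) for c in range(3)]   # grads[c][a] = d u_c / d x_a
    J = np.empty(disp.shape[1:] + (3, 3))
    for c in range(3):
        for a in range(3):
            J[..., c, a] = grads[c][a]
    J += np.eye(3)                                      # deformation gradient I + grad u
    return np.linalg.det(J)                             # >1 expansion, <1 contraction

disp = 0.05 * np.random.randn(3, 8, 8, 8)              # toy displacement field
jac = jacobian_determinant(disp)
print(jac.mean(), jac.min(), jac.max())
```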

    Design and analysis of sequential and parallel single-source shortest-paths algorithms

    We study the performance of algorithms for the Single-Source Shortest-Paths (SSSP) problem on graphs with n nodes and m edges with nonnegative random weights. All previously known SSSP algorithms for directed graphs required superlinear time. We give the first SSSP algorithms that provably achieve linear O(n + m) average-case execution time on arbitrary directed graphs with random edge weights. For independent edge weights, the linear-time bound also holds with high probability. Additionally, our result implies improved average-case bounds for the All-Pairs Shortest-Paths (APSP) problem on sparse graphs, and it yields the first theoretical average-case analysis for the "Approximate Bucket Implementation" of Dijkstra's SSSP algorithm (ABI-Dijkstra). Furthermore, we give constructive proofs for the existence of graph classes with random edge weights on which ABI-Dijkstra and several other well-known SSSP algorithms require superlinear average-case time. Besides the classical sequential (single-processor) model of computation, we also consider parallel computing: we give the currently fastest average-case linear-work parallel SSSP algorithms for large graph classes with random edge weights, e.g., sparse random graphs and graphs modeling the WWW, telephone calls, or social networks.
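
    As a minimal sketch of the bucket structure behind ABI-Dijkstra mentioned above: tentative distances are kept in buckets of width delta instead of a priority queue, and the lowest non-empty bucket is repeatedly emptied, rescanning nodes whose distances improve within the same bucket. The example graph, weights, and delta are invented; this illustrates the data structure only, not the thesis's implementation or its average-case analysis.

```python
# Bucket-based shortest paths in the spirit of ABI-Dijkstra (illustrative only).
import math
from collections import defaultdict

def abi_dijkstra(graph, source, delta):
    # graph: node -> list of (neighbor, nonnegative weight)
    dist = defaultdict(lambda: math.inf)
    dist[source] = 0.0
    buckets = defaultdict(set)
    buckets[0].add(source)
    i = 0
    while buckets:
        while i not in buckets:                 # find lowest non-empty bucket
            i += 1
        while buckets[i]:                       # bucket may refill while scanning
            u = buckets[i].pop()
            for v, w in graph.get(u, ()):
                nd = dist[u] + w
                if nd < dist[v]:                # relax edge (u, v)
                    old_b = int(dist[v] // delta) if dist[v] < math.inf else None
                    if old_b is not None and v in buckets.get(old_b, ()):
                        buckets[old_b].discard(v)
                    dist[v] = nd
                    buckets[int(nd // delta)].add(v)
        del buckets[i]
    return dict(dist)

graph = {"s": [("a", 2.0), ("b", 7.0)], "a": [("b", 3.0), ("c", 6.0)], "b": [("c", 1.0)]}
print(abi_dijkstra(graph, "s", delta=2.5))
```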

    Matching

    Design of Heuristic Algorithms for Hard Optimization

    This open access book demonstrates all the steps required to design heuristic algorithms for difficult optimization problems. The classic travelling salesman problem is used as a common thread to illustrate all the techniques discussed. This problem is ideal for introducing readers to the subject because it is very intuitive and its solutions can be represented graphically. The book features a wealth of illustrations that allow the concepts to be understood at a glance. The book approaches the main metaheuristics from a new angle, deconstructing them into a few key concepts presented in separate chapters: construction, improvement, decomposition, randomization and learning methods. Each metaheuristic can then be presented in simplified form as a combination of these concepts. This approach avoids giving the impression that metaheuristics is a non-formal discipline, a kind of cloud sculpture. Moreover, it provides concrete applications to the travelling salesman problem, which illustrate in just a few lines of code how to design a new heuristic and remove all the ambiguities left by a general framework. Two chapters reviewing the basics of combinatorial optimization and complexity theory make the book self-contained. As such, even readers with a very limited background in the field will be able to follow all the content.
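
    In the spirit of the book's construction/improvement decomposition, the sketch below builds a travelling salesman tour with a nearest-neighbour construction and then improves it with 2-opt moves. The random cities are an assumption, and the code is not taken from the book.

```python
# Nearest-neighbour construction followed by 2-opt improvement (toy instance).
import math, random

random.seed(1)
cities = [(random.random(), random.random()) for _ in range(30)]

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def tour_length(tour):
    return sum(dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def nearest_neighbour(start=0):                 # construction method
    tour, remaining = [start], set(range(len(cities))) - {start}
    while remaining:
        last = tour[-1]
        tour.append(min(remaining, key=lambda c: dist(cities[last], cities[c])))
        remaining.remove(tour[-1])
    return tour

def two_opt(tour):                              # improvement method
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 1):
            for j in range(i + 1, len(tour)):
                cand = tour[:i] + tour[i:j][::-1] + tour[j:]   # reverse a segment
                if tour_length(cand) < tour_length(tour):
                    tour, improved = cand, True
    return tour

t = nearest_neighbour()
print(tour_length(t), tour_length(two_opt(t)))
```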

    Evaluation of a Flow-Based Hypergraph Bipartitioning Algorithm

    In this paper, we propose HyperFlowCutter, an algorithm for balanced hypergraph bipartitioning that is based on minimum S-T hyperedge cuts and maximum flows. It computes a sequence of bipartitions that optimize cut size and balance in the Pareto sense, being able to trade one for the other. HyperFlowCutter builds on the FlowCutter algorithm for partitioning graphs. We propose additional features, such as handling disconnected hypergraphs, novel methods for obtaining starting S,T pairs, as well as an approach to refine a given partition with HyperFlowCutter. Our main contribution is ReBaHFC, a new algorithm which obtains an initial partition with the fast multilevel hypergraph partitioner PaToH and then improves it using HyperFlowCutter as a refinement algorithm. ReBaHFC is able to significantly improve the solution quality of PaToH at little additional running time. The solution quality is only marginally worse than that of the best-performing hypergraph partitioners KaHyPar and hMETIS, while being an order of magnitude faster. Thus, ReBaHFC offers a new time-quality trade-off in the current spectrum of hypergraph partitioners. For the special case of perfectly balanced bipartitioning, only the much slower plain HyperFlowCutter yields slightly better solutions than ReBaHFC, while only PaToH is faster than ReBaHFC.
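
    The S-T cut primitive underlying this line of work can be illustrated on an ordinary graph: the sketch below runs Edmonds-Karp max-flow and reads a minimum cut off the residual graph. Hyperedges, balance constraints, and the Pareto trade-off are precisely what HyperFlowCutter adds and are not reproduced here; the example graph is invented.

```python
# Edmonds-Karp max-flow on an undirected graph, then a minimum S-T cut from the
# residual graph. Example data is invented for illustration.
from collections import deque, defaultdict

def min_st_cut(n, edges, s, t):
    cap = defaultdict(int)
    adj = defaultdict(set)
    for u, v, c in edges:            # undirected capacities
        cap[u, v] += c
        cap[v, u] += c
        adj[u].add(v)
        adj[v].add(u)
    while True:                      # augment along shortest residual paths
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in adj[u]:
                if v not in parent and cap[u, v] > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            break
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v], v]); v = parent[v]
        v = t
        while parent[v] is not None:
            cap[parent[v], v] -= bottleneck; cap[v, parent[v]] += bottleneck
            v = parent[v]
    side_s = set(parent)             # reachable from s in the final residual graph
    return side_s, set(range(n)) - side_s

S, T = min_st_cut(6, [(0, 1, 3), (1, 2, 1), (2, 3, 4), (0, 4, 2), (4, 2, 2), (3, 5, 5)], 0, 5)
print(S, T)
```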