
    Book of Abstracts of the Sixth SIAM Workshop on Combinatorial Scientific Computing

    Book of Abstracts of CSC14, edited by Bora Uçar. The Sixth SIAM Workshop on Combinatorial Scientific Computing, CSC14, was organized at the Ecole Normale Supérieure de Lyon, France, from 21st to 23rd July, 2014. This two-and-a-half-day event marked the sixth in a series that started ten years ago in San Francisco, USA. The CSC14 Workshop's focus was on combinatorial mathematics and algorithms in high performance computing, broadly interpreted. The workshop featured three invited talks, 27 contributed talks and eight poster presentations. The three invited talks focused on two fields of research: randomized algorithms for numerical linear algebra and network analysis. The contributed talks and the posters targeted modeling, analysis, bisection, clustering, and partitioning of graphs, applied in the context of networks, sparse matrix factorizations, iterative solvers, fast multipole methods, automatic differentiation, high-performance computing, and linear programming. The workshop was held at the premises of the LIP laboratory of ENS Lyon and was generously supported by the LABEX MILYON (ANR-10-LABX-0070, Université de Lyon, within the program "Investissements d'Avenir" ANR-11-IDEX-0007 operated by the French National Research Agency), and by SIAM.

    Parallel solution of linear programs

    The factors limiting the performance of computer software periodically undergo sudden shifts, resulting from technological progress, and these shifts can have profound implications for the design of high performance codes. At the present time, the speed with which hardware can execute a single stream of instructions has reached a plateau. It is now the number of instruction streams that may be executed concurrently which underpins estimates of compute power, and with this change, a critical limitation on the performance of software has come to be the degree to which it can be parallelised. The research in this thesis is concerned with the means by which codes for linear programming may be adapted to this new hardware. For the most part, it is codes implementing the simplex method which will be discussed, though these have typically lower performance for single solves than those implementing interior point methods. However, the ability of the simplex method to rapidly re-solve a problem makes it at present indispensable as a subroutine for mixed integer programming. The long history of the simplex method as a practical technique, with applications in many industries and government, has led to such codes reaching a great level of sophistication. It would be unexpected in a research project such as this one to match the performance of top commercial codes with many years of development behind them. The simplex codes described in this thesis are, however, able to solve real problems of small to moderate size, rather than being confined to random or otherwise artificially generated instances. The remainder of this thesis is structured as follows. The rest of this chapter gives a brief overview of the essential elements of modern parallel hardware and of the linear programming problem. Both the simplex method and interior point methods are discussed, along with some of the key algorithmic enhancements required for such systems to solve real-world problems. 
Some background on the parallelisation of both types of code is given. The next chapter describes two standard simplex codes designed to exploit the current generation of hardware. i6 is a parallel standard simplex solver capable of being applied to a range of real problems, and showing exceptional performance for dense, square programs. i8 is also a parallel standard simplex solver, but implemented for graphics processing units (GPUs).

    Advanced Timing and Synchronization Methodologies for Digital VLSI Integrated Circuits

    This dissertation addresses timing and synchronization methodologies that are critical to the design, analysis and optimization of high-performance, integrated digital VLSI systems. As process sizes shrink and design complexities increase, achieving timing closure for digital VLSI circuits becomes a significant bottleneck in the integrated circuit design flow. Circuit designers are motivated to investigate and employ alternative methods to satisfy the timing and physical design performance targets. Such novel methods for the timing and synchronization of complex circuitry are developed in this dissertation and analyzed for performance and applicability. Mainstream integrated circuit design flow is normally tuned for zero clock skew, edge-triggered circuit design. Non-zero clock skew or multi-phase clock synchronization is seldom used because the lack of design automation tools increases the length and cost of the design cycle. For similar reasons, level-sensitive registers have not become an industry standard despite their superior size, speed and power consumption characteristics compared to conventional edge-triggered flip-flops. In this dissertation, novel techniques that fully automate the design and analysis of non-zero clock skew circuits are presented. Clock skew scheduling of both edge-triggered and level-sensitive circuits is investigated in order to maximize circuit performance. The effects of multi-phase clocking on non-zero clock skew, level-sensitive circuits are investigated, leading to advanced synchronization methodologies. Improvements in the scalability of the computational timing analysis process with clock skew scheduling are explored through partitioning and parallelization. The integration of the proposed design and analysis methods into the physical design flow of integrated circuits synchronized with a next-generation clocking technology, resonant rotary clocking, is also presented. 
Based on the design and analysis methods presented in this dissertation, a computer-aided design tool for the design of rotary clock synchronized integrated circuits is developed.

    High performance simplex solver

    The dual simplex method is frequently the most efficient technique for solving linear programming (LP) problems. This thesis describes an efficient implementation of the sequential dual simplex method and the design and development of two parallel dual simplex solvers. In serial, many advanced techniques for the (dual) simplex method are implemented, including sparse LU factorization, hyper-sparse linear system solution techniques, efficient approaches to updating LU factors and sophisticated dual simplex pivoting rules. These techniques, some of which are novel, lead to serial performance which is comparable with the best public domain dual simplex solver, providing a solid foundation for the simplex parallelization. During the implementation of the sequential dual simplex solver, the study of classic LU factor update techniques leads to the development of three novel update variants. One of them is comparable to the most efficient established approach but is much simpler to implement, and the other two are especially useful for one of the parallel simplex solvers. In addition, the study of the dual simplex pivoting rules identifies and motivates further investigation of how hyper-sparsity may be promoted. In parallel, two high performance simplex solvers are designed and developed. One approach, based on a lesser-known dual pivoting rule called suboptimization, exploits parallelism across multiple iterations (PAMI). The other, based on the regular dual pivoting rule, exploits purely single iteration parallelism (SIP). The performance of PAMI is comparable to a world-leading commercial simplex solver. SIP is frequently complementary to PAMI in achieving speedup when PAMI results in slowdown.

    Performance controls for distributed telecommunication services

    As the Internet and Telecommunications domains merge, open telecommunication service architectures such as TINA, PARLAY and PINT are becoming prevalent. Distributed Computing is a common engineering component in these technologies and promises to bring improvements to the scalability, reliability and flexibility of telecommunications service delivery systems. This distributed approach to service delivery introduces new performance concerns. As service logic is decomposed into software components and distributed across network resources, significant additional resource loading is incurred due to inter-node communications. This makes the choice of distribution of components in the network, and the distribution of load between these components, critical design and operational issues which must be resolved to guarantee a high level of service for the customer and a profitable network for the service operator. Previous research in the computer science domain has addressed optimal placement of components from the perspectives of minimising run time, minimising communications costs or balancing of load between network resources. This thesis proposes a more extensive optimisation model which, we argue, is more useful for addressing concerns pertinent to the telecommunications domain. The model focuses on providing optimal throughput and profitability of network resources and on overload protection whilst allowing flexibility in terms of the cost of installation of component copies and differentiation in the treatment of service types, in terms of fairness to the customer and profitability to the operator. Both static (design-time) component distribution and dynamic (run-time) load distribution algorithms are developed using Linear and Mixed Integer Programming techniques. An efficient, but sub-optimal, run-time solution, employing Market-based control, is also proposed. 
The performance of these algorithms is investigated using a simulation model of a distributed service platform, which is based on TINA service components interacting with the Intelligent Network through gateways. Simulation results are verified using Layered Queuing Network analytic modelling. Results show significant performance gains over simpler methods of performance control and demonstrate how trade-offs in network profitability, fairness and network cost are possible.

    A task-based approach to parallel parametric linear programming solving, and application to polyhedral computations

    Parametric linear programming is a central operation for polyhedral computations, as well as in certain control applications. Here we propose a task-based scheme for parallelizing it, with quasi-linear speedup over large problems. This type of parallel application is challenging, because several tasks might be computing the same region. In this paper, we present the algorithm itself together with a parallel redundancy elimination algorithm, and conduct a thorough performance analysis. Comment: arXiv admin note: text overlap with arXiv:1904.0607

    An Adaptive Linear Approximation Algorithm for Copositive Programs

    We study linear optimization problems over the cone of copositive matrices. These problems appear in nonconvex quadratic and binary optimization; for instance, the maximum clique problem and other combinatorial problems can be reformulated as such problems. We present new polyhedral inner and outer approximations of the copositive cone which we show to be exact in the limit. In contrast to previous approximation schemes, our approximation is not necessarily uniform for the whole cone but can be guided adaptively through the objective function, yielding a good approximation in those parts of the cone that are relevant for the optimization and only a coarse approximation in those parts that are not. Using these approximations, we derive an adaptive linear approximation algorithm for copositive programs. Numerical experiments show that our algorithm gives very good results for certain nonconvex quadratic problems.
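
    For readers unfamiliar with the idea of inner and outer approximations of the copositive cone, the following is a minimal illustrative sketch, not the algorithm of the paper. It assumes one common construction based on simplicial partitions of the standard simplex: a symmetric matrix A is certified copositive if u^T A v >= 0 for every pair of vertices u, v of every simplex in the partition (inner condition), and certified not copositive if v^T A v < 0 for some vertex v (outer condition). The function names (`quad`, `bisect_longest_edge`, `classify`) and the plain longest-edge bisection rule are assumptions made for this example; the abstract's objective-guided refinement is not reproduced here.

```python
from itertools import combinations


def quad(A, u, v):
    """Bilinear form u^T A v for a square matrix given as nested lists."""
    n = len(A)
    return sum(u[i] * A[i][j] * v[j] for i in range(n) for j in range(n))


def bisect_longest_edge(simplex):
    """Split a simplex (list of vertex tuples) at the midpoint of its longest edge."""
    i, j = max(
        combinations(range(len(simplex)), 2),
        key=lambda p: sum((a - b) ** 2 for a, b in zip(simplex[p[0]], simplex[p[1]])),
    )
    mid = tuple((a + b) / 2 for a, b in zip(simplex[i], simplex[j]))
    left = [mid if k == i else v for k, v in enumerate(simplex)]
    right = [mid if k == j else v for k, v in enumerate(simplex)]
    return left, right


def classify(A, depth=6):
    """Return 'copositive', 'not copositive', or 'undecided' after `depth` refinements."""
    n = len(A)
    # Start from the standard simplex, whose vertices are the unit vectors.
    partition = [[tuple(1.0 if r == c else 0.0 for c in range(n)) for r in range(n)]]
    for _ in range(depth + 1):
        # Outer condition: a vertex with negative quadratic form refutes copositivity.
        if any(quad(A, v, v) < 0 for s in partition for v in s):
            return "not copositive"
        # Inner condition: keep only simplices that are not yet certified.
        unresolved = [
            s for s in partition
            if any(quad(A, u, v) < 0 for u, v in combinations(s, 2))
        ]
        if not unresolved:
            return "copositive"
        partition = [half for s in unresolved for half in bisect_longest_edge(s)]
    return "undecided"
```

    A matrix on the boundary of the cone may remain undecided at any finite depth, which is one reason a refinement rule that concentrates effort on the relevant part of the cone can matter in practice.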
