Book of Abstracts of the Sixth SIAM Workshop on Combinatorial Scientific Computing
Book of Abstracts of CSC14, edited by Bora Uçar. The Sixth SIAM Workshop on Combinatorial Scientific Computing, CSC14, was organized at the Ecole Normale Supérieure de Lyon, France, from 21 to 23 July 2014. This two-and-a-half-day event marked the sixth in a series that started ten years ago in San Francisco, USA. The focus of CSC14 was on combinatorial mathematics and algorithms in high performance computing, broadly interpreted. The workshop featured three invited talks, 27 contributed talks and eight poster presentations. The three invited talks focused on two active fields of research: randomized algorithms for numerical linear algebra and network analysis. The contributed talks and the posters targeted modeling, analysis, bisection, clustering, and partitioning of graphs, applied in the context of networks, sparse matrix factorizations, iterative solvers, fast multipole methods, automatic differentiation, high-performance computing, and linear programming. The workshop was held at the premises of the LIP laboratory of ENS Lyon and was generously supported by the LABEX MILYON (ANR-10-LABX-0070, Université de Lyon, within the program ''Investissements d'Avenir'' ANR-11-IDEX-0007 operated by the French National Research Agency) and by SIAM.
Software tools for stochastic programming: A Stochastic Programming Integrated Environment (SPInE)
Stochastic programming (SP) models combine the paradigm of dynamic linear programming with modelling of random parameters, providing optimal decisions which hedge against future uncertainties. Advances in hardware as well as software techniques and solution methods have made SP a viable optimisation tool. We identify a growing need for modelling systems which support the creation and investigation of SP problems. Our SPInE system integrates a number of components which include a flexible modelling tool (based on stochastic extensions of the algebraic modelling languages AMPL and MPL), stochastic solvers, as well as special-purpose scenario generators and database tools. We introduce an asset/liability management model and illustrate how SPInE can be used to create and process this model as a multistage SP application.
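The hedging behaviour that the abstract describes can be illustrated with a toy two-stage problem. The scenario data and the brute-force enumeration below are invented for illustration only; SPInE itself would formulate such a model in AMPL/MPL and pass it to a stochastic solver.

```python
# Toy two-stage stochastic program (newsvendor style).  First stage: choose an
# order quantity before demand is known; second stage: sell, then salvage any
# leftover stock.  All numbers here are made up for illustration.
scenarios = [(0.3, 50), (0.5, 80), (0.2, 120)]   # (probability, demand)
cost, price, salvage = 6.0, 10.0, 2.0            # unit economics

def expected_profit(order):
    """Expected second-stage outcome for a given first-stage order quantity."""
    total = 0.0
    for prob, demand in scenarios:
        sold = min(order, demand)
        leftover = order - sold
        total += prob * (price * sold + salvage * leftover - cost * order)
    return total

# Enumerate first-stage decisions instead of solving an LP: the optimal order
# hedges between the low- and high-demand scenarios.
best = max(range(0, 151), key=expected_profit)
```

Ordering for the middle scenario (80 units) balances the risk of unsold stock in the low-demand scenario against lost sales in the high-demand one, which is exactly the hedging effect a deterministic model would miss.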
Parallel solution of linear programs
The factors limiting the performance of computer software periodically undergo sudden shifts, resulting from technological progress, and these shifts can have profound implications for the design of high performance codes. At the present time, the speed with which hardware can execute a single stream of instructions has reached a plateau. It is now the number of instruction streams that may be executed concurrently which underpins estimates of compute power, and with this change, a critical limitation on the performance of software has come to be the degree to which it can be parallelised. The research in this thesis is concerned with the means by which codes for linear programming may be adapted to this new hardware. For the most part, it is codes implementing the simplex method which will be discussed, though these typically have lower performance for single solves than codes implementing interior point methods. However, the ability of the simplex method to rapidly re-solve a problem makes it at present indispensable as a subroutine for mixed integer programming. The long history of the simplex method as a practical technique, with applications in many industries and in government, has led to such codes reaching a great level of sophistication. It would be unexpected in a research project such as this one to match the performance of top commercial codes with many years of development behind them. The simplex codes described in this thesis are, however, able to solve real problems of small to moderate size, rather than being confined to random or otherwise artificially generated instances.
The remainder of this thesis is structured as follows. The rest of this chapter gives a brief overview of the essential elements of modern parallel hardware and of the linear programming problem. Both the simplex method and interior point methods are discussed, along with some of the key algorithmic enhancements required for such systems to solve real-world problems. Some background on the parallelisation of both types of code is given. The next chapter describes two standard simplex codes designed to exploit the current generation of hardware. i6 is a parallel standard simplex solver capable of being applied to a range of real problems, and showing exceptional performance for dense, square programs. i8 is also a parallel, standard simplex solver, but now implemented for graphics processing units (GPUs).
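As a concrete reference point for the standard simplex method that codes such as i6 and i8 parallelise, here is a minimal dense tableau implementation. This is a pedagogical sketch, not the thesis codes: it assumes b >= 0 (so the slack basis is feasible), uses Dantzig pricing, and has no degeneracy safeguards.

```python
def simplex(c, A, b):
    """Maximise c.x subject to A.x <= b, x >= 0, assuming b >= 0 so that the
    slack basis is feasible.  Dantzig pricing, no degeneracy handling."""
    m, n = len(A), len(c)
    # Build the tableau [A | I | b], with the negated objective row last.
    T = [list(map(float, A[i])) + [1.0 if j == i else 0.0 for j in range(m)]
         + [float(b[i])] for i in range(m)]
    T.append([-float(ci) for ci in c] + [0.0] * (m + 1))
    basis = list(range(n, n + m))            # start from the slack basis
    while True:
        # Dantzig rule: enter the column with the most negative reduced cost.
        piv_col = min(range(n + m), key=lambda j: T[-1][j])
        if T[-1][piv_col] >= -1e-9:
            break                            # no negative reduced cost: optimal
        # Ratio test chooses the leaving row, guarding feasibility.
        ratios = [(T[i][-1] / T[i][piv_col], i)
                  for i in range(m) if T[i][piv_col] > 1e-9]
        if not ratios:
            raise ValueError("problem is unbounded")
        _, piv_row = min(ratios)
        # Gaussian elimination step on the pivot column.
        p = T[piv_row][piv_col]
        T[piv_row] = [v / p for v in T[piv_row]]
        for i in range(m + 1):
            if i != piv_row and T[i][piv_col] != 0.0:
                f = T[i][piv_col]
                T[i] = [a - f * v for a, v in zip(T[i], T[piv_row])]
        basis[piv_row] = piv_col
    x = [0.0] * n
    for i, bi in enumerate(basis):
        if bi < n:
            x[bi] = T[i][-1]
    return x, T[-1][-1]
```

The dense row operations in the inner loop are what make the standard simplex method attractive on parallel hardware: each elimination step is an embarrassingly parallel update of independent rows.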
Advanced Timing and Synchronization Methodologies for Digital VLSI Integrated Circuits
This dissertation addresses timing and synchronization methodologies that are critical to the design, analysis and optimization of high-performance, integrated digital VLSI systems. As process sizes shrink and design complexities increase, achieving timing closure for digital VLSI circuits becomes a significant bottleneck in the integrated circuit design flow. Circuit designers are motivated to investigate and employ alternative methods to satisfy the timing and physical design performance targets. Such novel methods for the timing and synchronization of complex circuitry are developed in this dissertation and analyzed for performance and applicability.Mainstream integrated circuit design flow is normally tuned for zero clock skew, edge-triggered circuit design. Non-zero clock skew or multi-phase clock synchronization is seldom used because the lack of design automation tools increases the length and cost of the design cycle. For similar reasons, level-sensitive registers have not become an industry standard despite their superior size, speed and power consumption characteristics compared to conventional edge-triggered flip-flops.In this dissertation, novel design and analysis techniques that fully automate the design and analysis of non-zero clock skew circuits are presented. Clock skew scheduling of both edge-triggered and level-sensitive circuits are investigated in order to exploit maximum circuit performances. The effects of multi-phase clocking on non-zero clock skew, level-sensitive circuits are investigated leading to advanced synchronization methodologies. Improvements in the scalability of the computational timing analysis process with clock skew scheduling are explored through partitioning and parallelization.The integration of the proposed design and analysis methods to the physical design flow of integrated circuits synchronized with a next-generation clocking technology-resonant rotary clocking technology-is also presented. 
Based on the design and analysis methods presented in this dissertation, a computer-aided design tool for the design of rotary clock synchronized integrated circuits is developed
High performance simplex solver
The dual simplex method is frequently the most efficient technique for solving linear programming
(LP) problems. This thesis describes an efficient implementation of the sequential dual
simplex method and the design and development of two parallel dual simplex solvers.
In serial, many advanced techniques for the (dual) simplex method are implemented, including
sparse LU factorization, hyper-sparse linear system solution technique, efficient approaches
to updating LU factors and sophisticated dual simplex pivoting rules. These techniques, some
of which are novel, lead to serial performance which is comparable with the best public domain
dual simplex solver, providing a solid foundation for the simplex parallelization.
During the implementation of the sequential dual simplex solver, the study of classic LU
factor update techniques leads to the development of three novel update variants. One of them
is comparable to the most efficient established approach but is much simpler in terms of implementation,
and the other two are especially useful for one of the parallel simplex solvers. In
addition, the study of the dual simplex pivoting rules identifies and motivates further investigation
of how hyper-sparsity may be promoted.
In parallel, two high performance simplex solvers are designed and developed. One approach,
based on a lesser-known dual pivoting rule called suboptimization, exploits parallelism across
multiple iterations (PAMI). The other, based on the regular dual pivoting rule, exploits purely
single iteration parallelism (SIP). The performance of PAMI is comparable to a world-leading
commercial simplex solver. SIP is frequently complementary to PAMI in achieving speedup
when PAMI results in slowdown.
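The hyper-sparse solution technique mentioned in this abstract can be illustrated for a unit lower triangular solve: when the right-hand side is sparse, only the components of x reachable from its nonzeros need to be computed. The sketch below is in the spirit of the Gilbert–Peierls approach; the column-wise storage format is invented for illustration and is not the thesis data structure.

```python
def hyper_sparse_lower_solve(L_cols, b_nz):
    """Solve L x = b for unit lower triangular L stored column-wise:
    L_cols[j] is a list of (i, L[i][j]) pairs with i > j; b_nz is a dict of
    the nonzeros of b.  Work is proportional to the reachable set, not to
    the dimension of the system (the essence of hyper-sparsity)."""
    # Phase 1: find the nonzero pattern of x by DFS over column dependencies.
    reach, stack = set(), list(b_nz)
    while stack:
        j = stack.pop()
        if j not in reach:
            reach.add(j)
            stack.extend(i for i, _ in L_cols.get(j, ()))
    # Phase 2: eliminate in increasing index order (valid since L is lower
    # triangular); untouched components of x are implicitly zero.
    x = dict(b_nz)
    for j in sorted(reach):
        xj = x.get(j, 0.0)
        for i, lij in L_cols.get(j, ()):
            x[i] = x.get(i, 0.0) - lij * xj
    return x
```

In simplex implementations the right-hand sides are often extremely sparse, so restricting both the symbolic and numeric work to the reachable set can reduce a solve from O(n) to near the size of the result.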
Performance controls for distributed telecommunication services
As the Internet and Telecommunications domains merge, open telecommunication service architectures such as TINA, PARLAY and PINT are becoming prevalent. Distributed Computing is a common engineering component in these technologies and promises to bring improvements to the scalability, reliability and flexibility of telecommunications service delivery systems. This distributed approach to service delivery introduces new performance concerns. As service logic is decomposed into software components and distributed across network resources, significant additional resource loading is incurred due to inter-node communications. This fact makes the choice of distribution of components in the network and the distribution of load between these components critical design and operational issues which must be resolved to guarantee a high level of service for the customer and a profitable network for the service operator.
Previous research in the computer science domain has addressed optimal placement of components from the perspectives of minimising run time, minimising communications costs or balancing of load between network resources. This thesis proposes a more extensive optimisation model, which we argue, is more useful for addressing concerns pertinent to the telecommunications domain. The model focuses on providing optimal throughput and profitability of network resources and on overload protection whilst allowing flexibility in terms of the cost of installation of component copies and differentiation in the treatment of service types, in terms of fairness to the customer and profitability to the operator. Both static (design-time) component distribution and dynamic (run-time) load distribution algorithms are developed using Linear and Mixed Integer Programming techniques. An efficient, but sub-optimal, run-time solution, employing Market-based control, is also proposed.
The performance of these algorithms is investigated using a simulation model of a distributed service platform, which is based on TINA service components interacting with the Intelligent Network through gateways. Simulation results are verified using Layered Queuing Network analytic modelling. Results show significant performance gains over simpler methods of performance control and demonstrate how trade-offs in network profitability, fairness and network cost are possible.
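The market-based run-time alternative mentioned above can be sketched as a simple auction: each node bids its spare capacity and each arriving task is assigned to the highest bidder. This is a greatly simplified illustration with an invented bid function, not the thesis mechanism.

```python
def market_assign(task_loads, capacities):
    """Greedy market-based load distribution: each node bids its spare
    capacity and every arriving task goes to the highest bidder.
    Sub-optimal compared with an LP/MIP placement, but cheap at run time."""
    load = [0.0] * len(capacities)
    assignment = []
    for t in task_loads:
        # Each node's bid is its remaining headroom.
        bids = [cap - ld for cap, ld in zip(capacities, load)]
        winner = max(range(len(bids)), key=bids.__getitem__)
        assignment.append(winner)
        load[winner] += t
    return assignment, load
```

Because bids shrink as nodes fill up, the auction naturally steers load away from busy nodes, giving a degree of overload protection without solving a global optimisation at every arrival.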
Parallel simplex algorithms and loop spreading
Parallel solutions for two classes of linear programs are presented. First we parallelized the two-phase revised simplex algorithm and showed that it is possible to obtain linear speedup. The simplex algorithm is the best-known algorithm for solving linear programs, and we argue that our result is the best that can be achieved.
Next we study the parallelization of the decomposed simplex algorithm. One of our new parallel algorithms achieves a 2P-fold performance improvement over the decomposed simplex algorithm using P processors. In the course of this work, we discovered a variation of the decomposed simplex algorithm which runs twice as fast as the original one; the new parallel algorithm achieves linear speedup over this faster sequential algorithm.
As in any parallel program, unbalanced processor load causes the performance of the parallel decomposed simplex algorithm to drop significantly when the size of the input data is not a multiple of the number of available processors. To remove this limitation, we developed a load-balancing technique called Loop Spreading that evenly distributes parallel tasks over multiple processors without a drop in performance even when the size of the input data is not a multiple of the number of processors. Loop Spreading is a general technique that can be applied automatically by a compiler to balance processor load in any language that supports parallel loop constructs.
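The balancing idea behind Loop Spreading can be sketched as even chunking of an iteration space. The actual technique is a compiler transformation over parallel loop constructs; the helper below only illustrates why chunk sizes differing by at most one avoid the idle-processor problem when the iteration count is not a multiple of the processor count.

```python
def spread(n_iterations, n_procs):
    """Partition n_iterations into n_procs contiguous chunks whose sizes
    differ by at most one, so no processor sits idle for a whole chunk even
    when n_iterations is not a multiple of n_procs."""
    base, extra = divmod(n_iterations, n_procs)
    chunks, start = [], 0
    for k in range(n_procs):
        size = base + (1 if k < extra else 0)   # first 'extra' chunks get one more
        chunks.append(range(start, start + size))
        start += size
    return chunks
```

With naive fixed-size chunks of ceil(n/p), 10 iterations on 4 processors would give chunks of 3, 3, 3, 1, leaving one processor nearly idle; the spread above gives 3, 3, 2, 2 instead.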
A task-based approach to parallel parametric linear programming solving, and application to polyhedral computations
Parametric linear programming is a central operation for polyhedral computations, as well as in certain control applications. Here we propose a task-based scheme for parallelizing it, with quasi-linear speedup over large problems. This type of parallel application is challenging, because several tasks might be computing the same region. In this paper, we present the algorithm itself together with a parallel redundancy elimination algorithm, and conduct a thorough performance analysis.
An Adaptive Linear Approximation Algorithm for Copositive Programs
We study linear optimization problems over the cone of copositive matrices. These problems appear in nonconvex quadratic and binary optimization; for instance, the maximum clique problem and other combinatorial problems can be reformulated as such problems. We present new polyhedral inner and outer approximations of the copositive cone which we show to be exact in the limit. In contrast to previous approximation schemes, our approximation is not necessarily uniform for the whole cone but can be guided adaptively through the objective function, yielding a good approximation in those parts of the cone that are relevant for the optimization and only a coarse approximation in those parts that are not. Using these approximations, we derive an adaptive linear approximation algorithm for copositive programs. Numerical experiments show that our algorithm gives very good results for certain nonconvex quadratic problems.
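A polyhedral outer approximation of the copositive cone can be made concrete with a crude, non-adaptive example: a matrix M is copositive iff v'Mv >= 0 for all v >= 0, so evaluating the quadratic form on a finite grid of the standard simplex gives a relaxed (outer) membership test. This uniform grid is for illustration only; the paper's scheme refines adaptively where the objective makes it worthwhile.

```python
import itertools

def simplex_grid_min(M, depth):
    """Minimise v' M v over the rational grid points of the standard simplex
    with denominator 'depth'.  A negative value certifies that M is NOT
    copositive; a nonnegative value is only suggestive, since the polyhedral
    test is exact only in the limit depth -> infinity."""
    n = len(M)
    best = float("inf")
    for parts in itertools.product(range(depth + 1), repeat=n):
        if sum(parts) != depth:
            continue                      # keep only points summing to 1
        v = [p / depth for p in parts]
        q = sum(M[i][j] * v[i] * v[j] for i in range(n) for j in range(n))
        best = min(best, q)
    return best
```

The uniform grid grows combinatorially with depth and dimension, which is exactly the cost the adaptive scheme in the abstract avoids by refining only in the parts of the cone relevant to the objective.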