
4 research outputs found

Finding Sources of Synchronization-free Slices in Perfectly Nested Loops

Author: Bielecki W.
Siedlecki K.
Publication venue: Інститут проблем моделювання в енергетиці ім. Г.Є. Пухова НАН України
Publication date: 01/01/2007

Algorithms for finding the sources of synchronization-free slices in perfectly nested uniform and non-uniform loops are presented. The extracted sources are used to create synchronization-free slices that can be executed independently while preserving the lexicographic order of iterations within each slice. Our approach requires exact dependence analysis and is based on operations on relations and sets. To describe and implement the algorithms, the dependence analysis of Pugh and Wonnacott was chosen, in which dependences are expressed as tuple relations. The proposed algorithms have been implemented and verified by means of the Omega project software.

Наукова електронна бібліотека періодичних видань НАН України (Vernadsky National Library of Ukraine)

Extracting Synchronization-free Slices in Perfectly Nested Loops

Author: Bielecki W.
Siedlecki K.
Publication venue: Інститут проблем моделювання в енергетиці ім. Г.Є. Пухова НАН України
Publication date: 01/01/2007

An algorithm is presented that extracts the iterations belonging to synchronization-free slices and generates code enumerating the sources of such slices and the iterations of each slice in lexicographic order. Synchronization-free slices can be executed independently while preserving the lexicographic order of iterations within each slice. Our approach requires exact dependence analysis and is based on operations on relations and sets. To describe and implement the algorithms, the dependence analysis of Pugh and Wonnacott was chosen, in which dependences are expressed as tuple relations. The proposed algorithms have been implemented and verified by means of the Omega project software. Presburger arithmetic limitations are discussed. Results of experiments are presented. Tasks for future research are outlined.

Наукова електронна бібліотека періодичних видань НАН України (Vernadsky National Library of Ukraine)

Statement-Level Communication-Free Partitioning Techniques for Parallelizing Compilers

Author: Kuei-Ping Shih et al.
Publication date: 01/01/2000

This paper addresses the problem of communication-free partitioning of iteration spaces and data spaces along hyperplanes. To find more communication-free hyperplane partitions, we treat the statements within a loop body as separate schedulable units. Instead of using information about data dependence distance or direction vectors, our technique explicitly formulates array references as transformations from statement-iteration spaces to data spaces. Based on these transformations, necessary and sufficient conditions for the feasibility of communication-free partitioning along hyperplanes are proposed. The approach can be applied to all programs with an imperfectly nested loop or sequences of imperfectly nested loops whose array references are affine functions of outer loop indices or loop-invariant variables. The proposed approach is more practical than existing methods for finding data and computation distribution patterns that let processors execute fully in parallel on multicomputers without any interprocessor communication.

CiteSeerX

Automatic Data and Computation Mapping for Distributed-Memory Machines.

Author: Couvertier-Reyes, Isidoro
Publication venue: LSU Digital Commons
Publication date: 01/01/1996

Distributed memory parallel computers offer enormous computation power, scalability and flexibility. However, these machines are difficult to program, and this limits their widespread use. An important characteristic of these machines is the difference in access time for data in local versus non-local memory; non-local memory accesses are much slower than local memory accesses. This is also a characteristic of shared memory machines, but to a lesser degree. It is therefore essential that, as far as possible, the data a processor needs to access during the execution of the computation assigned to it reside in its local memory rather than in some other processor's memory. Several research projects have concluded that proper mapping of data is key to realizing the performance potential of distributed memory machines. Current language design efforts such as Fortran D and High Performance Fortran (HPF) are based on this. It is our thesis that for many practical codes, it is possible to derive good mappings through a combination of algorithms and systematic procedures. We view mapping as consisting of two phases, alignment followed by distribution. For the alignment phase we present three constraint-based methods: the first is based on a linear programming formulation of the problem; the second formulates the alignment problem as a constrained optimization problem using Lagrange multipliers; the third uses a heuristic to decide which constraints to leave unsatisfied (based on the penalty of increased communication incurred in doing so) in order to find a mapping. In addressing the distribution phase, we have developed two methods that integrate the placement of computation (loop nests, in our case) with the mapping of data. In the first method, for one distributed dimension, our approach finds the best combination of data and computation mapping that results in low communication overhead; this is done by choosing a loop order that allows message vectorization. In the second method, we introduce the distribution preference graph; operations on this graph allow us to integrate loop restructuring transformations and data mapping. These techniques produce mappings that have been used in efficient hand-coded implementations of several benchmark codes.

Louisiana State University