
    Vienna FORTRAN: A FORTRAN language extension for distributed memory multiprocessors

    Exploiting the performance potential of distributed memory machines requires a careful distribution of data across the processors. Vienna FORTRAN is a language extension of FORTRAN that provides the user with a wide range of facilities for such mapping of data structures. However, programs in Vienna FORTRAN are written using global data references. Thus, the user has the advantage of a shared memory programming paradigm while explicitly controlling the placement of data. The basic features of Vienna FORTRAN are presented along with a set of examples illustrating the use of these features.
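
    To make the idea of mapping global data references onto processors concrete, here is a minimal sketch in Python. It is not Vienna FORTRAN syntax; the function names `block_owner` and `cyclic_owner` and the two distribution schemes shown are assumptions chosen for illustration.

```python
def block_owner(i, n, p):
    """Map global index i of an n-element array, split into contiguous
    blocks over p processors, to (processor, local_index).
    Hypothetical helper for illustration only."""
    if not 0 <= i < n:
        raise IndexError(i)
    block = -(-n // p)  # ceil(n / p): elements held by each processor
    return i // block, i % block


def cyclic_owner(i, n, p):
    """Map global index i to (processor, local_index) when elements are
    dealt out round-robin over p processors. Hypothetical helper."""
    if not 0 <= i < n:
        raise IndexError(i)
    return i % p, i // p
```

    The program keeps using the global index `i`; only the mapping decides where the element physically lives.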

    Macroservers: An Execution Model for DRAM Processor-In-Memory Arrays

    The emergence of semiconductor fabrication technology allowing a tight coupling between high-density DRAM and CMOS logic on the same chip has led to the important new class of Processor-In-Memory (PIM) architectures. Newer developments provide powerful parallel processing capabilities on the chip, exploiting the facility to load wide words in single memory accesses and supporting complex address manipulations in the memory. Furthermore, large arrays of PIMs can be arranged into a massively parallel architecture. In this report, we describe an object-based programming model based on the notion of a macroserver. Macroservers encapsulate a set of variables and methods; threads, spawned by the activation of methods, operate asynchronously on the variables' state space. Data distributions provide a mechanism for mapping large data structures across the memory region of a macroserver, while work distributions allow explicit control of bindings between threads and data. Both data and work distributions are first-class objects of the model, supporting the dynamic management of data and threads in memory. This offers the flexibility required for fully exploiting the processing power and memory bandwidth of a PIM array, in particular for irregular and adaptive applications. Thread synchronization is based on atomic methods, condition variables, and futures. A special type of lightweight macroserver allows the formulation of flexible scheduling strategies for the access to resources, using a monitor-like mechanism.

    A survey of parallel execution strategies for transitive closure and logic programs

    An important feature of database technology of the nineties is the use of parallelism for speeding up the execution of complex queries. This technology is being tested in several experimental database architectures and a few commercial systems for conventional select-project-join queries. In particular, hash-based fragmentation is used to distribute data to disks under the control of different processors in order to perform selections and joins in parallel. With the development of new query languages, and in particular with the definition of transitive closure queries and of more general logic programming queries, the new dimension of recursion has been added to query processing. Recursive queries are complex; at the same time, their regular structure is particularly suited for parallel execution, and parallelism may give a high efficiency gain. We survey the approaches to parallel execution of recursive queries that have been presented in the recent literature. We observe that research on parallel execution of recursive queries is separated into two distinct subareas, one focused on the transitive closure of Relational Algebra expressions, the other one focused on optimization of more general Datalog queries. Though the subareas seem radically different because of the approach and formalism used, they have many common features. This is not surprising, because most typical Datalog queries can be solved by means of the transitive closure of simple algebraic expressions. We first analyze the relationship between the transitive closure of expressions in Relational Algebra and Datalog programs. We then review sequential methods for evaluating transitive closure, distinguishing iterative and direct methods. We address the parallelization of these methods, by discussing various forms of parallelization. Data fragmentation plays an important role in obtaining parallel execution; we describe hash-based and semantic fragmentation. 
    Finally, we consider Datalog queries, and present general methods for parallel rule execution; we recognize the similarities between these methods and the methods reviewed previously, when the former are applied to linear Datalog queries. We also provide a quantitative analysis that shows the impact of the initial data distribution on the performance of methods.
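
    As an illustration of the iterative evaluation and hash-based fragmentation mentioned in this abstract, the following Python sketch computes a transitive closure with a semi-naive loop and then repeats the computation per hash fragment. It is one simple strategy under stated assumptions (the edge relation is replicated to every fragment, and fragments are processed in a sequential loop standing in for parallel workers), not necessarily any method from the survey; all function names are hypothetical.

```python
def _successors(edges):
    """Index the edge relation by source node."""
    succ = {}
    for a, b in edges:
        succ.setdefault(a, set()).add(b)
    return succ


def _close(start, succ):
    """Semi-naive iteration: only tuples that were new in the previous
    round (delta) are joined with the edge relation again."""
    closure = set(start)
    delta = set(start)
    while delta:
        derived = {(a, c) for (a, b) in delta for c in succ.get(b, ())}
        delta = derived - closure
        closure |= delta
    return closure


def transitive_closure(edges):
    edges = set(edges)
    return _close(edges, _successors(edges))


def fragmented_closure(edges, n_fragments):
    """Hash-fragment the starting tuples by source node. Each fragment
    could be closed by a separate processor; here a loop stands in."""
    edges = set(edges)
    succ = _successors(edges)  # assumption: full relation on every worker
    fragments = [set() for _ in range(n_fragments)]
    for a, b in edges:
        fragments[hash(a) % n_fragments].add((a, b))
    result = set()
    for fragment in fragments:
        result |= _close(fragment, succ)
    return result
```

    Because every path starting at a given node is derived inside the fragment that owns that node, the fragments need no communication in this variant; the price is replicating the edge relation.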

    A cyclic time-dependent Markov process to model daily patterns in wind turbine power production

    Wind energy is becoming a top contributor to the renewable energy mix, which raises potential reliability issues for the grid due to the fluctuating nature of its source. To achieve adequate reserve commitment and to promote market participation, it is necessary to provide models that can capture daily patterns in wind power production. This paper presents a cyclic inhomogeneous Markov process, which is based on a three-dimensional state-space (wind power, speed and direction). Each time-dependent transition probability is expressed as a Bernstein polynomial. The model parameters are estimated by solving a constrained optimization problem: the objective function combines two maximum likelihood estimators, one to ensure that the Markov process long-term behavior reproduces the data accurately and another to capture daily fluctuations. A convex formulation for the overall optimization problem is presented and its applicability demonstrated through the analysis of a case study. The proposed model is capable of reproducing the diurnal patterns of a three-year dataset collected from a wind turbine located in a mountainous region in Portugal. In addition, it is shown how to compute persistence statistics directly from the Markov process transition matrices. Based on the case study, the power production persistence through the daily cycle is analysed and discussed.
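
    A rough Python sketch of the two mechanisms named in this abstract follows: transition probabilities written as Bernstein polynomials of the time of day, and a persistence probability read off the resulting transition matrices. The coefficient layout, the function names and the two-state example are assumptions made for illustration; the paper's own parametrization and estimation procedure are not reproduced here.

```python
from math import comb


def bernstein(coeffs, t):
    """Evaluate sum_k coeffs[k] * C(n, k) * t**k * (1 - t)**(n - k)
    for t in [0, 1], where n = len(coeffs) - 1."""
    n = len(coeffs) - 1
    return sum(c * comb(n, k) * t**k * (1 - t)**(n - k)
               for k, c in enumerate(coeffs))


def transition_matrix(coeff_table, t):
    """coeff_table[i][j] holds the coefficients for the i -> j transition
    (hypothetical layout). If, for every i and k, the k-th coefficients
    sum to 1 over j, each row of the result sums to 1 for all t."""
    return [[bernstein(c, t) for c in row] for row in coeff_table]


def persistence(matrices, states, start):
    """Probability of staying inside `states` during every step described
    by the successive `matrices`, starting from state `start`."""
    prob = {start: 1.0}
    for P in matrices:
        nxt = {}
        for i, p in prob.items():
            for j in states:
                nxt[j] = nxt.get(j, 0.0) + p * P[i][j]
        prob = nxt
    return sum(prob.values())
```

    Choosing equal first and last coefficients makes each polynomial take the same value at t = 0 and t = 1, which is one simple way to obtain a transition probability that closes up over the daily cycle.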

    Compiling global name-space programs for distributed execution

    Distributed memory machines do not provide hardware support for a global address space. Thus programmers are forced to partition the data across the memories of the architecture and use explicit message passing to communicate data between processors. The compiler support required to allow programmers to express their algorithms using a global name-space is examined. A general method is presented for the analysis of a high-level source program and its translation into a set of independently executing tasks communicating via messages. If the compiler has enough information, this translation can be carried out at compile-time. Otherwise run-time code is generated to implement the required data movement. The analysis required in both situations is described and the performance of the generated code on the Intel iPSC/2 is presented.
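
    The kind of analysis described here can be pictured with a small Python sketch that derives the messages needed for a global statement `A[i] = B[i + shift]`. It assumes a block distribution of both arrays and an owner-computes rule (the processor holding `A[i]` performs the assignment); neither assumption, nor the names `owner` and `communication_sets`, is taken from the paper.

```python
from collections import defaultdict


def owner(i, n, p):
    """Processor holding global index i under a block distribution of
    n elements over p processors (assumed distribution)."""
    block = -(-n // p)  # ceil(n / p)
    return i // block


def communication_sets(n, p, shift):
    """For A[i] = B[i + shift], 0 <= i < n - shift, return a dict mapping
    (sender, receiver) to the global indices of B that must be sent."""
    messages = defaultdict(list)
    for i in range(n - shift):
        dst = owner(i, n, p)          # owner of A[i] executes the statement
        src = owner(i + shift, n, p)  # owner of the value being read
        if src != dst:
            messages[(src, dst)].append(i + shift)
    return dict(messages)
```

    When `n`, `p` and `shift` are known in advance, this table can be computed once ahead of execution; when they are not, the same loop corresponds to the bookkeeping that generated run-time code would have to perform.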

    Single-Carrier Modulation versus OFDM for Millimeter-Wave Wireless MIMO

    This paper presents results on the achievable spectral efficiency and on the energy efficiency for a wireless multiple-input-multiple-output (MIMO) link operating at millimeter wave frequencies (mmWave) in a typical 5G scenario. Two different single-carrier modem schemes are considered, i.e., a traditional modulation scheme with linear equalization at the receiver, and a single-carrier modulation with cyclic prefix, frequency-domain equalization and FFT-based processing at the receiver; these two schemes are compared with a conventional MIMO-OFDM transceiver structure. Our analysis jointly takes into account the peculiar characteristics of MIMO channels at mmWave frequencies, the use of hybrid (analog-digital) pre-coding and post-coding beamformers, the finite cardinality of the modulation structure, and the non-linear behavior of the transmitter power amplifiers. Our results show that the best performance is achieved by single-carrier modulation with time-domain equalization, which exhibits the smallest loss due to the non-linear distortion, and whose performance can be further improved by using advanced equalization schemes. Results also confirm that performance gets severely degraded when the link length exceeds 90-100 meters and the transmit power falls below 0 dBW.
    Comment: accepted for publication in IEEE Transactions on Communications

    Development of a dc-ac power conditioner for wind generator by using neural network

    This project presents the development of a single-phase DC-AC converter for wind generator applications. A mathematical model of the wind generator and an Artificial Neural Network controller for the DC-AC converter are derived. The controller is designed to stabilize the output voltage of the DC-AC converter. To verify the effectiveness of the proposed controller, both simulations and experiments are carried out. The simulation and experimental results show that the amplitude of the output voltage of the DC-AC converter can be controlled.