4 research outputs found

    Accurate and efficient explicit approximations of the Colebrook flow friction equation based on the Wright ω-function

    The Colebrook equation is a popular model for estimating friction loss coefficients in water and gas pipes. The model is implicit in the unknown flow friction factor, f. To date, f can be extracted analytically from the logarithmic form only in terms of the Lambert W-function. The purpose of this study is to find an accurate and computationally efficient solution based on the shifted Lambert W-function, also known as the Wright ω-function. The Wright ω-function is more suitable because it avoids overflow errors by replacing the fast-growing term, y = W(e^x), of the Lambert W-function with series expansions that can be evaluated easily on computers without causing overflow run-time errors. Although the Colebrook equation transformed through the Lambert W-function is identical to the original expression in terms of accuracy, any further evaluation of the Lambert W-function can only be approximate. Very accurate explicit approximations of the Colebrook equation that contain only one or two logarithms are shown. The final result is an accurate explicit approximation of the Colebrook equation with a relative error of no more than 0.0096%. The presented approximations are in a form suitable for everyday engineering use, and are both accurate and computationally efficient.
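
As a rough illustration of the reformulation described in this abstract, the sketch below solves the Colebrook equation, 1/sqrt(f) = -2 log10(ε/(3.7 D) + 2.51/(Re sqrt(f))), through the Wright ω-function, defined here by ω + ln ω = z. It is not the paper's method: ω is evaluated with a plain Newton iteration instead of the paper's series expansions, and the function names (`wright_omega`, `colebrook_friction`, `colebrook_residual`) are hypothetical. The argument z is assumed to be greater than 1, which holds for turbulent pipe flow.

```python
import math

def wright_omega(z, tol=1e-14, max_iter=50):
    """Solve w + ln(w) = z for real w > 0 by Newton iteration (assumes z > 1)."""
    w = z - math.log(z)  # starting guess just below the root
    for _ in range(max_iter):
        # Newton step for g(w) = w + ln(w) - z, rearranged algebraically.
        w_next = w * (1.0 + z - math.log(w)) / (1.0 + w)
        if abs(w_next - w) <= tol * abs(w_next):
            return w_next
        w = w_next
    return w

def colebrook_friction(reynolds, rel_roughness):
    """Darcy friction factor f from the Colebrook equation via Wright omega."""
    a = rel_roughness / 3.7           # roughness term, eps / (3.7 D)
    b = 2.51 / reynolds               # viscous term coefficient
    c = 2.0 / math.log(10.0)          # converts log10 to the natural log
    z = a / (b * c) - math.log(b * c)
    x = c * wright_omega(z) - a / b   # x = 1 / sqrt(f)
    return 1.0 / (x * x)

def colebrook_residual(f, reynolds, rel_roughness):
    """Left side minus right side of the original implicit equation."""
    x = 1.0 / math.sqrt(f)
    return x + 2.0 * math.log10(rel_roughness / 3.7 + 2.51 * x / reynolds)

if __name__ == "__main__":
    f = colebrook_friction(1e5, 1e-4)
    print(f, colebrook_residual(f, 1e5, 1e-4))
```

Because ω takes z directly, no term of the form e^z is ever formed. For a rough pipe at a high Reynolds number (for example Re = 1e8 with ε/D = 0.05), z is of the order of 10^5, and computing W(e^z) literally would overflow double-precision arithmetic.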

    Floating-Point Division and Square Root using a Taylor-Series Expansion Algorithm

    Hardware support for floating-point (FP) arithmetic is a mandatory feature of modern microprocessor design. Although division and square root are relatively infrequent operations in traditional general-purpose applications, they are indispensable and are becoming increasingly important in many modern applications. Overall performance can therefore be greatly affected by the algorithms and implementations used in designing FP-div and FP-sqrt units. In this paper, a fused floating-point multiply/divide/square-root unit based on a Taylor-series expansion algorithm is proposed. We extended an existing fused multiply/divide unit to incorporate the square root function with little area and latency overhead, since Taylor's theorem enables us to compute approximations for many well-known functions with very similar forms. The proposed arithmetic unit exhibits a reasonably good balance between area and performance.

    The pipelined HIP processor in FPGA with the debugging environment

    This thesis describes the implementation of a pipelined central processing unit called the Hypothetical processor (HIP), which is described in the book [1]. It contains logic for data forwarding and an adder for floating-point numbers, and it has an instruction cache and a data cache. Through the debug unit it is possible to read from and write to all general-purpose and other registers in the HIP pipeline, and therefore to monitor the flow of the compiled program. HIP runs in the FPGA chip on the Spartan 3E development board, where supporting logic for monitoring is present. The external monitoring program is written in Java and runs on different operating systems. It contains a text editor in which programs can be written in assembly language, and a compiler which translates assembly code to HIP machine code. Operations and data are sent to HIP's debug unit. Each clock cycle, the monitoring program reads the content of every register in the CPU. The content of the main memory and the cache is also visible.

    An FPGA implementation of an investigative many-core processor, Fynbos : in support of a Fortran autoparallelising software pipeline

    Includes bibliographical references. In light of the power, memory, ILP, and utilisation walls facing the computing industry, this work examines the hypothetical many-core approach to finding greater compute performance and efficiency. In order to achieve greater efficiency in an environment in which Moore's law continues but TDP has been capped, a means of deriving performance from dark and dim silicon is needed. The many-core hypothesis is one approach to exploiting these available transistors efficiently. As understood in this work, it involves trading hardware control complexity for hundreds to thousands of simple parallel processing elements, and operating at a clock speed sufficiently low to allow the efficiency gains of near-threshold-voltage operation. Performance is therefore dependent on exploiting a new degree of fine-grained parallelism, such as is currently found only in GPGPUs, but in a manner that is not as restrictive in application domain range. While removing the complex control hardware of traditional CPUs provides space for more arithmetic hardware, a basic level of control is still required. For a number of reasons this work chooses to replace this control largely with static scheduling. This pushes the burden of control primarily to the software, and specifically to the compiler, rather than to the programmer or to an application-specific means of control simplification. A legacy tool chain capable of autoparallelising sequential Fortran code to the degree of parallelism necessary for many-core already exists. This work implements a many-core architecture to match it. By prototyping the design on an FPGA, it is possible to examine the real-world performance of the compiler-architecture system to a greater degree than simulation alone would allow.
    Comparing theoretical peak performance and real performance in a case-study application, the system is found to be more efficient than any other reviewed, but also to underperform significantly relative to current competing architectures. This failing is attributed to taking the need for simple hardware too far, and to an inability to implement static-scheduling mitigation tactics because the compiler lacks support for them.