16 research outputs found

    Parallel Out-of-Core Sorting: The Third Way

    Sorting very large datasets is a key subroutine in almost any application that is built on top of a large database. Two ways to sort out-of-core data dominate the literature: merging-based algorithms and partitioning-based algorithms. Within these two paradigms, all the programs that sort out-of-core data on a cluster rely on assumptions about the input distribution. We propose a third way of out-of-core sorting: oblivious algorithms. In all, we have developed six programs that sort out-of-core data on a cluster. The first three programs, based completely on Leighton's columnsort algorithm, have a restriction on the maximum problem size that they can sort. The other three programs relax this restriction; two are based on our original algorithmic extensions to columnsort. We present experimental results to show that our algorithms perform well. To the best of our knowledge, the programs presented in this thesis are the first to sort out-of-core data on a cluster without making any simplifying assumptions about the distribution of the data to be sorted.
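    For orientation, the sketch below shows Leighton's eight-step columnsort running entirely in memory on an r-by-s matrix stored in column-major order. It is not one of the thesis's out-of-core programs; the function name `columnsort` and the flat-list layout are assumptions made for illustration. The precondition r >= 2(s-1)^2 checked at the top is the kind of constraint that bounds the problem size for a fixed column height.

```python
def columnsort(values, r, s):
    """Sort r*s values with Leighton's columnsort (in-memory sketch).

    The matrix has r rows and s columns and is stored column-major,
    so column j occupies a[j*r:(j+1)*r].
    """
    if len(values) != r * s:
        raise ValueError("need exactly r * s values")
    if r % 2 or r % s or r < 2 * (s - 1) ** 2:
        raise ValueError("need r even, s dividing r, and r >= 2(s-1)^2")

    a = list(values)
    n = r * s

    def sort_columns(offset, count):
        # Sort `count` consecutive windows of r items, starting at `offset`.
        for j in range(count):
            lo = offset + j * r
            a[lo:lo + r] = sorted(a[lo:lo + r])

    sort_columns(0, s)                                   # step 1: sort columns
    b = [None] * n
    for k in range(n):                                   # step 2: read column-major,
        b[(k % s) * r + k // s] = a[k]                   #         write row-major
    a[:] = b
    sort_columns(0, s)                                   # step 3: sort columns
    a[:] = [a[(k % s) * r + k // s] for k in range(n)]   # step 4: inverse of step 2
    sort_columns(0, s)                                   # step 5: sort columns
    # Steps 6-8 (shift by r/2, sort, unshift) amount to sorting the s-1
    # windows of r items that straddle adjacent columns.
    sort_columns(r // 2, s - 1)
    return a
```

    The sequence of operations does not depend on the key values, only on r and s, which is what makes the algorithm oblivious.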

    Enhancing Asynchronous Parallel Computing

    In applications using large amounts of data, hiding the latency inherent in accessing data far from the processor is often necessary in order to achieve high performance. Several researchers have observed that one way to address the challenge of latency is by using a common structure: in a series of passes, the program reads in the data, performs various operations on it, and writes out the data. Passes often consist of a pipeline structure composed of different stages. In order to achieve high performance, the stages are frequently overlapped, for example, by using asynchronous threads. Out-of-core parallel programs provide one such example of this pattern. The development and debugging time resulting from coordinating overlapping stages, however, can be substantial. Moreover, modifying the structure of the overlap in an attempt to achieve higher performance can require significant additional time on the part of the programmer. This thesis presents FG, a Framework Generator designed to coordinate the stages of a pipeline and allow the programmer to easily experiment with the pipeline's structure, thus significantly reducing time to solution. We also discuss preliminary results of using FG in an out-of-core sorting program.
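    The overlapped-stage pattern described in this abstract can be sketched with one thread per stage and bounded queues between stages. This is a generic illustration, not FG's interface; `run_stage`, `run_pipeline`, and the `depth` parameter are hypothetical names chosen for this example.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the stream of buffers


def run_stage(func, inbox, outbox):
    """Apply func to each buffer taken from inbox and pass the result on."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)
            return
        outbox.put(func(item))


def run_pipeline(blocks, stages, depth=2):
    """Run each stage in its own thread so that the stages overlap.

    `depth` bounds how many buffers may wait between two stages.
    """
    queues = [queue.Queue(maxsize=depth) for _ in range(len(stages) + 1)]
    workers = [
        threading.Thread(target=run_stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(stages)
    ]
    results = []

    def drain():
        # Collect the output of the last stage until the stream ends.
        while True:
            item = queues[-1].get()
            if item is SENTINEL:
                return
            results.append(item)

    sink = threading.Thread(target=drain)
    for t in workers + [sink]:
        t.start()
    for block in blocks:
        queues[0].put(block)
    queues[0].put(SENTINEL)
    for t in workers + [sink]:
        t.join()
    return results
```

    For example, `run_pipeline(blocks, [sorted, compress])` would sort one block while a hypothetical `compress` stage works on the previous one. Changing the overlap structure means editing only the `stages` list.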

    Algorithmic ramifications of prefetching in memory hierarchy

    External memory models, most notably the I-O Model [3], capture the effects of the memory hierarchy and aid in algorithm design. More than a decade of architectural advancements has led to new features not captured in the I-O model, most notably the prefetching capability. We propose a relatively simple Prefetch model that incorporates data prefetching into the traditional I-O models and show how to design algorithms that can attain close to peak memory bandwidth. Unlike (the inverse of) memory latency, memory bandwidth is much closer to the processing speed; intelligent use of prefetching can therefore considerably mitigate the I-O bottleneck. For some fundamental problems, our algorithms attain running times approaching those of the idealized Random Access Machine under reasonable assumptions. Our work also explains the significantly superior performance of I-O efficient algorithms on systems that support prefetching compared to ones that do not.

    Department of Computer Science Activity 1998-2004

    This report summarizes much of the research and teaching activity of the Department of Computer Science at Dartmouth College between late 1998 and late 2004. The material for this report was collected as part of the final report for NSF Institutional Infrastructure award EIA-9802068, which funded equipment and technical staff during that six-year period. This equipment and staff supported essentially all of the department's research activity during that period.

    3-Dimensional Automated Heat Flux Calibration Device

    This document describes the problems in current heat flux calibration techniques for radiant heat sources and the approach our team took to solve them through automation. The first sections outline the basic premise of the problem we addressed and who our end product benefited. The subsequent sections address the research that we performed regarding heat flux measurements and automation. This research includes current solutions, mostly partial solutions for problems that are similar but not identical to ours. Following the background research, we define objectives, with specific details that outline how we evaluated different possible solutions and how we decided upon our final approach. We discuss the detailed plan used to accomplish each of these objectives, the design process leading to a final design, and the analysis that led to and verified this design. Finally, the results and conclusions of the completed project are presented.

    A Fundamental Model Methodology for the Analysis, Design and Fabrication of a Narrow Transparency Window in a Bulk Meta-Material

    The optical valley of water, where water is transparent only in the visible range, is a fascinating phenomenon that cannot be captured by conventional dielectric material modeling. While the dielectric properties of most materials can be modeled as a sum of Lorentz or Debye simple harmonic oscillators, water is the exception. In 1992 Diaz and Alexopoulos published a causal and passive circuit model that predicted the window of water by adding a “zero shunt” circuit in parallel with every Debye and Lorentz circuit branch. Other than the Diaz model, an extensive literature survey yielded no universal dielectric material model that included water or offered an explanation for this window phenomenon. A hybrid phenomenological model of water, proposed by Shubitidze and Osterberg, was the only model other than the Diaz-Alexopoulos model that tried to predict and match the optical valley of water. However, we show that when we apply the requirement that the permittivity function must be a complex analytic function, that model fails our test of causality and its terms lack physical meaning, exhibiting various mathematical and physical contradictions. Left with the Diaz fundamental model as the only causal model, this dissertation explores its physical implications. Specifically, the theoretical prescription of Kyriazidou et al. for creating artificial dielectric materials with a narrow-band transparency is experimentally demonstrated for the first time at radio frequencies. It is proposed that the most general component of the model of the frequency-dependent permittivity of materials is not the simple harmonic oscillator but rather the harmonic oscillator augmented by the presence of a zero shunt circuit. The experimental demonstration illustrates the synthesis and design of a new generation of window materials based on that model.
    Physically realizable Lorentz coatings and RF Debye “molecules” for creating the desired window material are designed using a full-physics computational electromagnetics code. The prescribed material is then implemented in printed circuit board technology combined with composite manufacturing to successfully fabricate a lab demonstrator that exhibits a narrow RF window at a preselected frequency of interest. Demonstrator test data show good agreement with HFSS predictions.
    Dissertation/Thesis. Doctoral Dissertation, Materials Science and Engineering, 201

    Adaptation of multiway-merge sorting algorithm to MIMD architectures with an experimental study

    Ankara: The Department of Computer Engineering and the Institute of Engineering and Science of Bilkent University, 2002. Thesis (Master's), Bilkent University, 2002. Includes bibliographical references (leaves 73-78).
    Sorting is perhaps one of the most widely studied problems in computing. Numerous asymptotically optimal sequential algorithms have been discovered, and asymptotically optimal algorithms have been presented for various parallel models as well. Parallel sorting algorithms have already been proposed for a variety of multiple instruction, multiple data stream (MIMD) architectures. In this thesis, we adapt the multiway-merge sorting algorithm, originally designed for product networks, to MIMD architectures. It has good load-balancing properties, modest communication needs, and good performance. The multiway-merge sort algorithm requires only two all-to-all personalized communications (AAPC) and two one-to-one communications, independent of the input size. In addition to an evenly balanced load, the algorithm requires only 2N/P local memory per processor in the worst case, where N is the number of items to be sorted and P is the number of processors. We have implemented the algorithm on the PC cluster established at the Computer Engineering Department of Bilkent University. To compare the results, we have implemented a sample sort algorithm (PSRS, Parallel Sorting by Regular Sampling) by X. Liu et al. and a parallel quicksort algorithm (HyperQuickSort) on the same cluster. In the experimental studies we used three benchmarks: uniformly, Gaussian, and zero-distributed inputs. Although the multiway-merge algorithm did not achieve better results overall than the other two, which are theoretically cost-optimal algorithms, there are cases, such as the zero-distributed input, in which it outperforms both. The results of the experiments are reported in detail.
    The multiway-merge sort algorithm is not necessarily the best parallel sorting algorithm, but it is expected to achieve acceptable performance on a wide spectrum of MIMD architectures.
    Cantürk, Levent. M.S.
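    As a point of reference for the comparison above, the following is a sequential simulation of Parallel Sorting by Regular Sampling, the sample sort used as a baseline. It is a minimal sketch, not the thesis's cluster implementation and not the multiway-merge algorithm itself: the p "processors" are simulated by plain lists, the all-to-all exchange becomes list slicing, and the name `psrs_sort` and the fallback for very small inputs are assumptions of this example.

```python
import bisect
import heapq


def psrs_sort(data, p):
    """Sequentially simulate PSRS on p hypothetical processors."""
    n = len(data)
    if p <= 1 or n < p * p:
        return sorted(data)  # too small for regular sampling to make sense

    # Phase 1: each processor sorts its contiguous chunk and
    # picks p regularly spaced samples from it.
    bounds = [i * n // p for i in range(p + 1)]
    chunks = [sorted(data[bounds[i]:bounds[i + 1]]) for i in range(p)]
    samples = []
    for chunk in chunks:
        step = len(chunk) // p
        samples.extend(chunk[j * step] for j in range(p))

    # Phase 2: sort the p*p samples and choose p-1 pivots.
    samples.sort()
    pivots = [samples[j * p + p // 2] for j in range(1, p)]

    # Phase 3: every processor splits its chunk at the pivots; piece k
    # is sent to processor k (the all-to-all personalized exchange).
    received = [[] for _ in range(p)]
    for chunk in chunks:
        cuts = [0] + [bisect.bisect_right(chunk, v) for v in pivots] + [len(chunk)]
        for k in range(p):
            received[k].append(chunk[cuts[k]:cuts[k + 1]])

    # Phase 4: each processor merges the p sorted runs it received.
    out = []
    for runs in received:
        out.extend(heapq.merge(*runs))
    return out
```

    On an input consisting entirely of equal keys (the zero-distributed benchmark), every pivot is the same value, so this sketch routes all items to a single simulated processor. That shows why such inputs can stress a partitioning-based sort's load balance.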

    Edge of the network device for a low power wide area network

    Master's dissertation in Industrial Electronics and Computer Engineering.
    The spread of Internet connectivity, particularly on small devices (embedded systems), has allowed the development of the Internet of Things (IoT) concept through the connection of these devices to web micro services (Cloud), and has played a major role in Industry 4.0 [1]. Advances in wireless technologies made it possible to connect these devices to the Internet, making them accessible everywhere. The creation of Wireless Sensor Networks (WSNs) has enabled the use of networks composed of independent devices (nodes or edge devices), equipped with sensors and actuators, and made it possible to collect information about the environment where they are deployed [2]. The growing need for a wider coverage area for Wireless Sensor Networks, along with demanding low-power requirements on devices, has led to the emergence of Low Power Wide Area (LPWA) technologies. These technologies reach further than conventional wireless technologies (such as Bluetooth, Wi-Fi, and ZigBee) and increase the energy autonomy of the devices [3], which makes them ideal for wide areas. The recent wildfire tragedies in Portugal, in both 2017 and 2018, had a great economic and social impact. Early detection of and alerts about wildfires are crucial to prevent them from spreading [4]. Using LPWA technologies in forests therefore provides a case study for wildfire occurrences. Independent devices equipped with sensors can collect data from the environment that might indicate that a fire is starting, and then send alerts to firefighting units. In this Master's thesis, the architecture of sensor nodes to be integrated in a Low Power Wide Area Network (LPWAN) was developed.
    By using LoRa technology to achieve a long range between the sensor nodes and the network coordinator, edge devices can collect data and send it to the upper levels of the network. Using sensors on the nodes, it was possible to gather information about the environment and to further understand LoRa's potential for sending this data to the upper levels of the network.

    Algorithm Libraries for Multi-Core Processors

    By providing parallelized versions of established algorithm libraries, we ease the exploitation of the multiple cores on modern processors for the programmer. The Multi-Core STL provides basic algorithms for internal memory, while the parallelized STXXL enables multi-core acceleration for algorithms on large data sets stored on disk. Some parallelized geometric algorithms are introduced into CGAL. Further, we design and implement sorting algorithms for huge data in distributed external memory