
    FPGA-based architectures for acoustic beamforming with microphone arrays: trends, challenges and research opportunities

    Over the past decades, many systems composed of arrays of microphones have been developed to satisfy the quality demanded by acoustic applications. Such microphone arrays are sound acquisition systems composed of multiple microphones used to sample the sound field with spatial diversity. The relatively recent adoption of Field-Programmable Gate Arrays (FPGAs) to manage the audio data samples and to perform signal processing operations such as filtering or beamforming has led to customizable architectures able to satisfy acoustic applications with the most demanding computational, power, or performance requirements. The presented work provides an overview of current FPGA-based architectures and how FPGAs are exploited for different acoustic applications. Current trends in the use of this technology, pending challenges, and open research opportunities in the use of FPGAs for acoustic applications using microphone arrays are presented and discussed.
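    The survey itself contains no code; as a concrete reference point, below is a minimal NumPy sketch of delay-and-sum beamforming, the baseline operation that such FPGA architectures typically accelerate. The function name and the plane-wave delay model are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def delay_and_sum(signals, mic_positions, direction, fs, c=343.0):
    """Minimal delay-and-sum beamformer (illustrative sketch).

    signals:       (n_mics, n_samples) array of microphone samples
    mic_positions: (n_mics, 3) microphone coordinates in metres
    direction:     unit vector pointing towards the sound source
    fs:            sampling rate in Hz
    c:             speed of sound in m/s
    """
    # Under a plane-wave model, the wavefront reaches each microphone
    # at a slightly different time; project positions onto the arrival
    # direction to estimate those per-channel delays.
    delays = mic_positions @ direction / c            # seconds
    shifts = np.round(delays * fs).astype(int)
    shifts -= shifts.min()                            # non-negative shifts

    n_mics, n_samples = signals.shape
    out = np.zeros(n_samples)
    for m in range(n_mics):
        # Advance each channel so the wavefronts align, then average:
        # coherent signals from `direction` add up, others cancel.
        out += np.roll(signals[m], -shifts[m])
    return out / n_mics
```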

    A Distributed Processing Platform With Reconfigurable Autonomous Nodes

    Distributed processing is a fast-growing area of interest due to the exploding popularity of Internet of Things (IoT) and Unmanned Aerial Vehicle (UAV) technologies. IoT is a distributed processing structure by nature, while UAVs are evolving from single-UAV applications towards multiple-UAV teams. The demand for processing capabilities is expanding as well. General-purpose processors (e.g. CPUs) can be used for any type of application, but this flexibility comes at the cost of operational efficiency. Application-Specific Integrated Circuits (ASICs) are designed for certain types of application and have great operational efficiency, but they can rarely be used for other applications. Reconfigurable chips, Field-Programmable Gate Arrays (FPGAs), provide high operational efficiency along with application flexibility, as they can be reprogrammed with the functionality required at a given time. All of the above aspects are combined in a distributed processing system that is expected to consume a low amount of electrical energy. This dissertation proposes a comprehensive solution to the problem of distributed processing with reconfigurable units. A complete and detailed architecture is provided for each element. The design includes operational algorithms that, together with the architecture, constitute a complete solution to the stated problem. The design of the units is flexible and allows any number and combination of CPUs, ASICs, or FPGAs. Units in the proposed design are autonomous: decisions are taken by individual units rather than by the central node, whose role is marginalized. The decentralized and autonomous approach yields a more flexible and reliable design, which is especially important for IoT and teamed-UAV applications. The efficiency of the proposed solutions is defined in terms of electrical energy consumption and operation timespan, and is measured using a dedicated experimentation system through numerous simulations.
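    The abstract describes the architecture only at a high level, so the following is a hypothetical Python sketch of the kind of local decision an autonomous node might make: choosing among CPU, ASIC, and FPGA units by estimated energy, with FPGAs paying a one-off reconfiguration cost. The `Unit` fields and the cost model are invented for illustration, not the dissertation's design.

```python
from dataclasses import dataclass

@dataclass
class Unit:
    kind: str              # "CPU", "ASIC", or "FPGA"
    supports: set          # task types the unit can run right now
    energy_per_task: dict  # estimated joules per task type
    reconfig_cost: float   # joules to reprogram (FPGAs only)

def choose_unit(units, task):
    """Pick the unit with the lowest estimated energy for `task`.

    An FPGA that does not currently implement the task may still bid,
    but pays a one-off reconfiguration cost. The decision is made
    locally, mirroring autonomous nodes with no central scheduler.
    """
    best, best_cost = None, float("inf")
    for u in units:
        if task in u.supports:
            cost = u.energy_per_task[task]
        elif u.kind == "FPGA" and task in u.energy_per_task:
            cost = u.energy_per_task[task] + u.reconfig_cost
        else:
            continue  # fixed-function units cannot take unsupported tasks
        if cost < best_cost:
            best, best_cost = u, cost
    return best, best_cost
```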

    Characterization and optimization of network traffic in cortical simulation

    Considering the great variety of obstacles that Exascale systems will have to face in the near future, this thesis pays particular attention to the interconnect and to power consumption. The data movement challenge involves the whole hierarchical organization of components in HPC systems, i.e. registers, cache, memory, and disks. Running scientific applications requires the most effective methods of data transport among the levels of this hierarchy. On current petaflop systems, memory access at all levels is the limiting factor in almost all applications. This drives the requirement for an interconnect achieving adequate rates of data transfer, or throughput, and reducing time delays, or latency, between the levels. Power consumption is identified as the largest hardware research challenge: the annual power cost to operate an Exascale system built with current technology would be above $2.5B per year. Research into alternative power-efficient computing devices is mandatory for the procurement of future HPC systems. In this thesis, a preliminary approach to the critical process of co-design is offered. Co-design is defined as the simultaneous design of both hardware and software to implement a desired function. This process both integrates all components of the Exascale initiative and illuminates the trade-offs that must be made within this complex undertaking.
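    As a worked illustration of the throughput/latency argument above, the sketch below applies the standard first-order data-movement model, time = latency + bytes / bandwidth, across a memory hierarchy. The numeric values are rough orders of magnitude chosen for illustration, not measurements from the thesis.

```python
# First-order data-movement model: time = latency + bytes / bandwidth.
# The figures below are illustrative orders of magnitude, not measurements.
hierarchy = {
    # level:        (latency in s, bandwidth in bytes/s)
    "L1 cache":     (1e-9, 1e12),
    "DRAM":         (1e-7, 1e11),
    "interconnect": (1e-6, 1e10),
    "disk":         (1e-4, 1e9),
}

def transfer_time(level, n_bytes):
    latency, bandwidth = hierarchy[level]
    return latency + n_bytes / bandwidth

for level in hierarchy:
    # Moving 1 MB: latency dominates small transfers high in the
    # hierarchy, bandwidth dominates bulk transfers lower down.
    print(f"{level:>12}: {transfer_time(level, 1_000_000):.2e} s")
```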

    Pathfinding Algorithm Optimization Via Evolution

    Pathfinding is a popular computer science problem in both academic research and industrial development. The objective of pathfinding is to search for a path, often the shortest path, from one location to another on a graph. Many real-world applications can be framed as pathfinding problems, including motion planning, video games, logistics, and decision making. Computer scientists have proposed different algorithms to efficiently search for the shortest path. The A* search algorithm is the de facto pathfinding algorithm; it uses a heuristic function to determine the best action to take based on the given information. It is the most popular pathfinding algorithm due to its simplicity and efficiency. The performance of A* depends heavily on the quality of the heuristic function, which determines search speed, accuracy, and memory consumption. Hence, designing good heuristic functions for specific domains has become the primary research focus in pathfinding algorithm optimization. In this dissertation, we address and solve several commonly known challenges in pathfinding problems and the A* algorithm. First, designing new heuristic functions is a difficult and time-consuming task, especially when they are used to solve complex problems: the task requires expert knowledge of the problem, and a single heuristic function might not be enough to digest all the provided information and return the best guidance during the search. Previous works suggest that using multiple heuristics for complex problems can dramatically speed up the search; however, choosing the appropriate combination of heuristic functions is tricky, and current optimization approaches rely on engineers hand-tuning the parameters via trial and error over many iterations. There is a need to reduce the difficulty of designing heuristic functions for search-performance maximization. Our first contribution is an improved A* with a self-evolving heuristic function, named Evolutionary Heuristic A* (EHA*), which reduces the engineering effort needed to design the heuristic function for A* and maximizes search performance. Our experimental results show that EHA* (i) preserves path optimality; (ii) is not limited to a particular application; (iii) speeds up the path-searching process; and (iv) most importantly, dramatically reduces the difficulty for software engineers of designing heuristic functions for A* search. Moreover, our work can be applied to other existing works on the performance improvement of A* search. Second, A* search suffers from poor performance on large search spaces. Although EHA* improves the quality of heuristic functions, a large search space still leads to many unnecessary searches. Our second contribution is the Regions Discovery Algorithm (RDA), a map-clustering technique that partitions a grid-based map into different categories to reduce search spaces and increase search speed. Our approach reduces the size of search spaces by partitioning a graph into many segments and identifying the segments by their characteristics. By identifying segments in different categories, we can easily eliminate regions of the search space, such as rooms, that cannot be part of the optimal solution. Unlike existing approaches that might result in non-optimal solutions, our experimental results show that RDA guarantees optimal solutions.
Our third contribution, Hierarchical Evolutionary Heuristic A* (HEHA*), further improves the ability to handle complex pathfinding problems and boosts search performance by reducing search spaces and exploiting parallelism. HEHA* combines the strengths of EHA* and RDA to reduce search spaces and improve search speed, and it provides better search performance with less memory consumption. In the pre-processing phase, HEHA* first partitions a graph into different segments and then applies a different optimized heuristic function to each segment to maximize search performance. During the online process, HEHA* searches on the abstract level first to reduce the search area, and exploits parallelism to speed up the search. Fourth, we improve and apply HEHA* to Multi-Agent Pathfinding (MAPF) problems. MAPF is the fundamental problem of many robotic and logistics applications, where the main constraint is that all agents must find their shortest paths while not colliding with each other. While the current trend favors centrally controlled systems, our approach is to develop a distributed version of HEHA* that can efficiently plan the optimal path for each agent. Such a system requires data sharing and exchange among the agents, so that each agent can make its own decisions without a supervising system. Our experimental results show that the multi-agent version of HEHA* maintains a high success rate as the number of agents increases. While EHA* and HEHA* provide a novel approach to heuristic-function design, their pre-processing times are not trivial. To boost the performance of the pre-processing steps in EHA* and HEHA*, our fifth contribution is an FPGA-based reconfigurable hardware accelerator that is not bound to any specific application. Since the genetic algorithm (GA) used for evolution consists of many independent processes, it is well suited to implementation in a hardware accelerator for maximum performance. We apply the following techniques to enhance performance: deep pipelining, reconfigurable computing, massive parallel processing, and maximization of the degree of parallelism. Our results show that the FPGA accelerator for EHA* improves scalability, throughput, and latency.
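    The dissertation's algorithms are not given in the abstract; the sketch below illustrates the core idea behind EHA* under stated assumptions: a textbook A* whose heuristic is a weighted sum of base heuristics, plus a toy evolutionary loop that tunes the weights by how few nodes are expanded on sample problems. `evolve_weights`, `trials`, and the mutation scheme are hypothetical stand-ins, not the authors' implementation.

```python
import heapq, itertools, random

def a_star(start, goal, neighbors, h):
    """Textbook A*: `neighbors(n)` yields (next_node, step_cost);
    `h(n, goal)` estimates the remaining cost to the goal."""
    counter = itertools.count()  # tie-breaker so the heap never compares nodes
    open_set = [(h(start, goal), next(counter), 0.0, start, [start])]
    best_g = {start: 0.0}
    while open_set:
        _, _, g, node, path = heapq.heappop(open_set)
        if node == goal:
            return path, g
        for nxt, cost in neighbors(node):
            ng = g + cost
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                heapq.heappush(open_set,
                               (ng + h(nxt, goal), next(counter), ng, nxt, path + [nxt]))
    return None, float("inf")

def evolve_weights(base_heuristics, trials, generations=20, pop=10):
    """Toy evolutionary loop for a combined heuristic
    h(n, g) = sum(w_i * h_i(n, g)).

    `trials` is a list of callables, each taking a heuristic and
    returning the number of nodes expanded on one sample problem;
    fewer expansions means a fitter weight vector.
    """
    def fitness(w):
        h = lambda n, g: sum(wi * hi(n, g) for wi, hi in zip(w, base_heuristics))
        return -sum(run_trial(h) for run_trial in trials)

    population = [[random.random() for _ in base_heuristics] for _ in range(pop)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)       # fittest first
        parents = population[: pop // 2]
        children = [[wi * random.uniform(0.8, 1.2)        # mutate a parent
                     for wi in random.choice(parents)]
                    for _ in range(pop - len(parents))]
        population = parents + children
    return max(population, key=fitness)
```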

    Runtime methods for energy-efficient image processing using significance-driven learning

    Ph.D. Thesis. Image and video processing applications are opening up a whole range of opportunities for processing at the "edge" and in IoT applications, as the demand for high-accuracy processing of high-resolution images increases. However, this comes with an increase in the quantity of data to be processed and stored, causing a significant increase in the computational challenges. There is growing interest in developing hardware systems that provide energy-efficient solutions to this challenge. The challenges in image processing are unique because an increase in resolution not only increases the data to be processed but also greatly increases the amount of information detail that can be scavenged from the data. This thesis addresses the concept of extracting the significant image information to enable intelligent processing of the data within a heterogeneous system. We propose a unique way of defining image significance, based on what causes us to react when something "catches our eye", whether it be static or dynamic, in our central field of focus or in our peripheral vision. This significance technique proves to be a relatively economical process in terms of energy and computational effort. We investigate opportunities for further computational and energy efficiency available through elective use of heterogeneous system elements. We utilise significance to adaptively select regions of interest for selective levels of processing dependent on their relative significance. We further demonstrate that, by exploiting the computational slack time released by this process, we can throttle the processor speed to achieve greater energy savings. This demonstrates a reduction in computational effort and an improvement in energy efficiency, a process that we term adaptive approximate computing. We demonstrate that our approach reduces energy consumption by 50% to 75%, dependent on the user quality demand, for a real-time performance requirement of 10 fps on a WQXGA image, when compared with an existing approach that is agnostic of significance. We further hypothesise that, by use of heterogeneous elements, savings of up to 90% could be achievable in both performance and energy when compared with running OpenCV on the CPU alone.
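    The thesis's significance measure is its own contribution and is not specified in the abstract, so the sketch below substitutes a crude gradient-energy proxy: score each image block, then run an expensive kernel only on the most significant blocks and a cheap one elsewhere. `block_significance`, `heavy`, and `light` are hypothetical names introduced for illustration.

```python
import numpy as np

def block_significance(frame, block=32):
    """Crude stand-in for a significance measure: gradient energy per
    block, so busy regions score higher than flat ones.
    `frame` is a 2-D grayscale array."""
    gy, gx = np.gradient(frame.astype(float))
    energy = gx * gx + gy * gy
    h, w = frame.shape
    trimmed = energy[: h - h % block, : w - w % block]
    return trimmed.reshape(h // block, block, w // block, block).sum(axis=(1, 3))

def process_adaptively(frame, heavy, light, keep=0.25, block=32):
    """Run the expensive kernel `heavy` only on the top `keep` fraction
    of blocks by significance; all other blocks get the cheap `light`
    kernel, freeing slack time for processor throttling."""
    scores = block_significance(frame, block)
    cutoff = np.quantile(scores, 1.0 - keep)
    out = frame.copy()
    for by in range(scores.shape[0]):
        for bx in range(scores.shape[1]):
            ys = slice(by * block, (by + 1) * block)
            xs = slice(bx * block, (bx + 1) * block)
            kernel = heavy if scores[by, bx] >= cutoff else light
            out[ys, xs] = kernel(frame[ys, xs])
    return out
```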