57 research outputs found

    Interface-based hierarchy for synchronous data-flow graphs

    Get PDF
    International audienceDataflow has proven to be an attractive computation model for programming digital signal processing (DSP) applications. A restricted version of dataflow, termed synchronous dataflow (SDF), offers strong compile-time predictability properties, but has limited expressive power. In this paper we propose a new type of hierarchy in the SDF domain allowing more expressivity while maintaining its predictability. This new hierarchy semantic is based on interfaces that fix the number of tokens consumed/produced by a hierarchical vertex in a manner that is independent or separate from the specified internal dataflow structure of the encapsulated subsystem. This interface-based hierarchy gives the application designer more flexibility in iterative construction of hierarchical representations, and experimentation with different optimization choices at different levels of the design hierarchy

    Hyperspectral Remote Sensing of Coastal Environment

    Get PDF
    Remotely sensed earth observation (EO) has revolutionized our understanding of our dynamic environment. Hyperspectral remote sensing is, in many ways, the ultimate optical remote sensing technology. Hyperspectral remote sensing is suited especially well for environmental studies due to it’s capability to discriminate between species and quantify the abundance of different materials and chemicals. In this thesis remote sensing methods for hyperspectral data, applicable in environmental monitoring of coastal environment are developed and tested.The planning of hyperspectral flight campaign HYPE08 in South-West Finland raised the need to validate and develop data processing methods for HYPE08 as well as for future hyperspectral flight campaigns. The research presented in this thesis can be largely considered as a response to this need. The study concentrates on four main topics: wetlands mapping, benthic mapping, water quality and atmospheric correction. The study was done using airborne hyperspectral data and field spectroscopy measurements.The results of this study emphasize the importance of local calibration and validation of methods used. Water quality retrieval algorithms developed in local environmental conditions outperformed those validated elsewhere. The results also show that hyperspectral remote sensing of benthic cover types is limited to rather shallow areas, indicating the need to use state of the art methodology in order to increase operational depth range. The proposed atmospheric correction algorithm produced very good results. The accuracy of model-based algorithm increases when Empirical-Line (EL) correction using spectral field measurements was applied to spectra generated by the model

    Design and development from single core reconfigurable accelerators to a heterogeneous accelerator-rich platform

    Get PDF
    The performance of a platform is evaluated based on its ability to deal with the processing of multiple applications of different nature. In this context, the platform under evaluation can be of homogeneous, heterogeneous or of hybrid architecture. The selection of an architecture type is generally based on the set of different target applications and performance parameters, where the applications can be of serial or parallel nature. The evaluation is normally based on different performance metrics, e.g., resource/area utilization, execution time, power and energy consumption. This process can also include high-level performance metrics, e.g., Operations Per Second (OPS), OPS/Watt, OPS/Hz, Watt/Area etc. An example of architecture selection can be related to a wireless communication system where the processing of computationally-intensive signal-processing algorithms has strict execution-time constraints and in this case, a platform with special-purpose accelerators is relatively more suitable than a typical homogeneous platform. A couple of decades ago, it was expensive to plant many special-purpose accelerators on a chip as the cost per unit area was relatively higher than today. The utilization wall is also becoming a limiting factor in homogeneous multicore scaling which means that all the cores on a platform cannot be operated at their maximum frequency due to a possible thermal meltdown. In this case, some of the processing cores have to be turned-off or to be operated at very low frequencies making most of the part of the chip to stay underutilized. A possible solution lies in the use of heterogeneous multicore platforms where many application-specific cores operate at lower frequencies, therefore reducing power dissipation density and increasing other performance parameters. However, to achieve maximum flexibility in processing, a general-purpose flavor can also be introduced by adding a few Reduced Instruction-Set Computing (RISC) cores. A power class of heterogeneous multicore platforms is an accelerator-rich platform where many application-specific accelerators are loosely connected with each other for work load distribution or to execute the tasks independently. This research work spans from the design and development of three different types of template-based Coarse-Grain Reconfigurable Arrays (CGRAs), i.e., CREMA, AVATAR and SCREMA to a Heterogeneous Accelerator-Rich Platform (HARP). The accelerators generated from the three CGRAs could perform different lengths and types of Fast Fourier Transform (FFT), real and complex Matrix-Vector Multiplication (MVM) algorithms. CREMA and AVATAR were fixed CGRAs with eight and sixteen number of Processing Element (PE) columns, respectively. SCREMA could flex between four, eight, sixteen and thirty two number of PE columns. Many case studies were conducted to evaluate the performance of the reconfigurable accelerators generated from these CGRA templates. All of these CGRAs work in a processor/coprocessor model tightly integrated with a Direct Memory Access (DMA) device. Apart from these platforms, a reconfigurable Application-Specific Instruction-set Processor (rASIP) is also designed, tested for FFT execution under IEEE-802.11n timing constraints and evaluated against a processor/coprocessor model. It was designed by integrating AVATAR generated radix-(2, 4) FFT accelerator into the datapath of a RISC processor. The instruction set of the RISC processor was extended to perform additional operations related to AVATAR. As mentioned earlier, the underutilized part of the chip, now-a-days called Dark Silicon is posing many challenges for the designers. Apart from software optimizations, clock gating, dynamic voltage/frequency scaling and other high-level techniques, one way of dealing with this problem is to use many application-specific cores. In an effort to maximize the number of reconfigurable processing resources on a platform, the accelerator-rich architecture HARP was designed and evaluated in terms of different performance metrics. HARP is constructed on a Network-on-Chip (NoC) of 3x3 nodes where with every node, a CGRA of application-specific size is integrated other than the central node which is attached to a RISC processor. The RISC establishes synchronization between the nodes for data transfer and also performs the supervisory control. While using the NoC as the backbone of communication between the cores, it becomes possible for all the cores to address each other and also perform execution simultaneously and independently of each other. The performance of accelerators generated from CREMA, AVATAR and SCREMA templates were evaluated individually and also when attached to HARP's NoC nodes. The individual CGRAs show promising results in their own capacity but when integrated all together in the framework of HARP, interesting comparisons were established in terms of overall execution times, resource utilization, operating frequencies, power and energy consumption. In evaluating HARP, estimates and measurements were also made in some advanced performance metrics, e.g., in MOPS/mW and MOPS/MHz. The overall research work promotes the idea of heterogeneous accelerator-rich platform as a solution to current problems and future needs of industry and academia

    Abstracting Application Development for Resource Constrained Wireless Sensor Networks

    Get PDF
    Ubiquitous computing is a concept whereby computing is distributed across smart objects surrounding users, creating ambient intelligence. Ubiquitous applications use technologies such as the Internet, sensors, actuators, embedded computers, wireless communication, and new user interfaces. The Internet-of-Things (IoT) is one of the key concepts in the realization of ubiquitous computing, whereby smart objects communicate with each other and the Internet. Further, Wireless Sensor Networks (WSNs) are a sub-group of IoT technologies that consist of geographically distributed devices or nodes, capable of sensing and actuating the environment.WSNs typically contain tens to thousands of nodes that organize and operate autonomously to perform application-dependent sensing and sensor data processing tasks. The projected applications require nodes to be small in physical size and low-cost, and have a long lifetime with limited energy resources, while performing complex computing and communications tasks. As a result, WSNs are complex distributed systems that are constrained by communications, computing and energy resources. WSN functionality is dynamic according to the environment and application requirements. Dynamic multitasking, task distribution, task injection, and software updates are required in field experiments for possibly thousands of nodes functioning in harsh environments.The development of WSN application software requires the abstraction of computing, communication, data access, and heterogeneous sensor data sources to reduce the complexities. Abstractions enable the faster development of new applications with a better reuse of existing software, as applications are composed of high-level tasks that use the services provided by the devices to execute the application logic.The main research question of this thesis is: What abstractions are needed for application development for resource constrained WSNs? This thesis models WSN abstractions with three levels that build on top of each other: 1) node abstraction, 2) network abstraction, and 3) infrastructure abstraction. The node abstraction hides the details in the use of the sensing, communication, and processing hardware. The network abstraction specifies methods of discovering and accessing services, and distributing processing in the network. The infrastructure abstraction unifies different sensing technologies and infrastructure computing platforms.As a contribution, this thesis presents the abstraction model with a review of each abstraction level. Several designs for each of the levels are tested and verified with proofs of concept and analyses of field experiments. The resulting designs consist of an operating system kernel, a software update method, a data unification interface, and all abstraction levels combining abstraction called an embedded cloud.The presented operating system kernel has a scalable overhead and provides a programming approach similar to a desktop computer operating system with threads and processes. An over-the-air update method combines low overhead and robust software updating with application task dissemination. The data unification interface homogenizes the access to the data of heterogeneous sensor networks. A unification model is used for various use cases by mapping everything as measurements. The embedded cloud allows resource constrained WSNs to share services and data, and expand resources with other technologies. The embedded cloud allows the distributed processing of applications according to the available services. The applications are implemented as processes using a hardware independent description language that can be executed on resource constrained WSNs. The lessons of practical field experimenting are analyzed to study the importance of the abstractions. Software complexities encountered in the field experiments highlight the need for suitable abstractions.The results of this thesis are tested using proof of concept implementations on real WSN hardware which is constrained by computing power in the order of a few MIPS, memory sizes of a few kilobytes, and small sized batteries. The results will remain usable in the future, as the vast amount, tight integration, and low-cost of future IoT devices require the combination of complex computation with resource constrained platforms

    Abstracting Application Development for Resource Constrained Wireless Sensor Networks

    Get PDF
    Ubiquitous computing is a concept whereby computing is distributed across smart objects surrounding users, creating ambient intelligence. Ubiquitous applications use technologies such as the Internet, sensors, actuators, embedded computers, wireless communication, and new user interfaces. The Internet-of-Things (IoT) is one of the key concepts in the realization of ubiquitous computing, whereby smart objects communicate with each other and the Internet. Further, Wireless Sensor Networks (WSNs) are a sub-group of IoT technologies that consist of geographically distributed devices or nodes, capable of sensing and actuating the environment.WSNs typically contain tens to thousands of nodes that organize and operate autonomously to perform application-dependent sensing and sensor data processing tasks. The projected applications require nodes to be small in physical size and low-cost, and have a long lifetime with limited energy resources, while performing complex computing and communications tasks. As a result, WSNs are complex distributed systems that are constrained by communications, computing and energy resources. WSN functionality is dynamic according to the environment and application requirements. Dynamic multitasking, task distribution, task injection, and software updates are required in field experiments for possibly thousands of nodes functioning in harsh environments.The development of WSN application software requires the abstraction of computing, communication, data access, and heterogeneous sensor data sources to reduce the complexities. Abstractions enable the faster development of new applications with a better reuse of existing software, as applications are composed of high-level tasks that use the services provided by the devices to execute the application logic.The main research question of this thesis is: What abstractions are needed for application development for resource constrained WSNs? This thesis models WSN abstractions with three levels that build on top of each other: 1) node abstraction, 2) network abstraction, and 3) infrastructure abstraction. The node abstraction hides the details in the use of the sensing, communication, and processing hardware. The network abstraction specifies methods of discovering and accessing services, and distributing processing in the network. The infrastructure abstraction unifies different sensing technologies and infrastructure computing platforms.As a contribution, this thesis presents the abstraction model with a review of each abstraction level. Several designs for each of the levels are tested and verified with proofs of concept and analyses of field experiments. The resulting designs consist of an operating system kernel, a software update method, a data unification interface, and all abstraction levels combining abstraction called an embedded cloud.The presented operating system kernel has a scalable overhead and provides a programming approach similar to a desktop computer operating system with threads and processes. An over-the-air update method combines low overhead and robust software updating with application task dissemination. The data unification interface homogenizes the access to the data of heterogeneous sensor networks. A unification model is used for various use cases by mapping everything as measurements. The embedded cloud allows resource constrained WSNs to share services and data, and expand resources with other technologies. The embedded cloud allows the distributed processing of applications according to the available services. The applications are implemented as processes using a hardware independent description language that can be executed on resource constrained WSNs. The lessons of practical field experimenting are analyzed to study the importance of the abstractions. Software complexities encountered in the field experiments highlight the need for suitable abstractions.The results of this thesis are tested using proof of concept implementations on real WSN hardware which is constrained by computing power in the order of a few MIPS, memory sizes of a few kilobytes, and small sized batteries. The results will remain usable in the future, as the vast amount, tight integration, and low-cost of future IoT devices require the combination of complex computation with resource constrained platforms

    Improving 3D-Turbo Code's BER Performance with a BICM System over Rayleigh Fading Channel

    Get PDF
    Classical Turbo code suffers from high error floor due to its small Minimum Hamming Distance (MHD). Newly-proposed 3D-Turbo code can effectively increase the MHD and achieve a lower error floor by adding a rate-1 post encoder. In 3D-Turbo codes, part of the parity bits from the classical Turbo encoder are further encoded through the post encoder. In this paper, a novel Bit-Interleaved Coded Modulation (BICM) system is proposed by combining rotated mapping Quadrature Amplitude Modulation (QAM) and 3D-Turbo code to improve the Bit Error Rate (BER) performance of 3D-Turbo code over Raleigh fading channel. A key-bit protection scheme and a Two-Dimension (2D) iterative soft demodulating-decoding algorithm are developed for the proposed BICM system. Simulation results show that the proposed system can obtain about 0.8-1.0 dB gain at BER of 10^{-6}, compared with the existing BICM system with Gray mapping QAM

    Software Defined Radio Solutions for Wireless Communications Systems

    Get PDF
    Wireless technologies have been advancing rapidly, especially in the recent years. Design, implementation, and manufacturing of devices supporting the continuously evolving technologies require great efforts. Thus, building platforms compatible with different generations of standards and technologies has gained a lot of interest. As a result, software defined radios (SDRs) are investigated to offer more flexibility and scalability, and reduce the design efforts, compared to the conventional fixed-function hardware-based solutions.This thesis mainly addresses the challenges related to SDR-based implementation of today’s wireless devices. One of the main targets of most of the wireless standards has been to improve the achievable data rates, which imposes strict requirements on the processing platforms. Realizing real-time processing of high throughput signal processing algorithms using SDR-based platforms while maintaining energy consumption close to conventional approaches is a challenging topic that is addressed in this thesis.Firstly, this thesis concentrates on the challenges of a real-time software-based implementation for the very high throughput (VHT) Institute of Electrical and Electronics Engineers (IEEE) 802.11ac amendment from the wireless local area networks (WLAN) family, where an SDR-based solution is introduced for the frequency-domain baseband processing of a multiple-input multipleoutput (MIMO) transmitter and receiver. The feasibility of the implementation is evaluated with respect to the number of clock cycles and the consumed power. Furthermore, a digital front-end (DFE) concept is developed for the IEEE 802.11ac receiver, where the 80 MHz waveform is divided to two 40 MHz signals. This is carried out through time-domain digital filtering and decimation, which is challenging due to the latency and cyclic prefix (CP) budget of the receiver. Different multi-rate channelization architectures are developed, and the software implementation is presented and evaluated in terms of execution time, number of clock cycles, power, and energy consumption on different multi-core platforms.Secondly, this thesis addresses selected advanced techniques developed to realize inband fullduplex (IBFD) systems, which aim at improving spectral efficiency in today’s congested radio spectrum. IBFD refers to concurrent transmission and reception on the same frequency band, where the main challenge to combat is the strong self-interference (SI). In this thesis, an SDRbased solution is introduced, which is capable of real-time mitigation of the SI signal. The implementation results show possibility of achieving real-time sufficient SI suppression under time-varying environments using low-power, mobile-scale multi-core processing platforms. To investigate the challenges associated with SDR implementations for mobile-scale devices with limited processing and power resources, processing platforms suitable for hand-held devices are selected in this thesis work. On the baseband processing side, a very long instruction word (VLIW) processor, optimized for wireless communication applications, is utilized. Furthermore, in the solutions presented for the DFE processing and the digital SI canceller, commercial off-the-shelf (COTS) multi-core central processing units (CPUs) and graphics processing units (GPUs) are used with the aim of investigating the performance enhancement achieved by utilizing parallel processing.Overall, this thesis provides solutions to the challenges of low-power, and real-time software-based implementation of computationally intensive signal processing algorithms for the current and future communications systems

    Design of Intellectual Property-Based Hardware Blocks Integrable with Embedded RISC Processors

    Get PDF
    The main focus of this thesis is to research methods, architecture, and implementation of hardware acceleration for a Reduced Instruction Set Computer (RISC) platform. The target platform is a single-core general-purpose embedded processor (the COFFEE core) which was developed by our group at Tampere University of Technology. The COFFEE core alone cannot meet the requirements of the modern applications due to the lack of several components of which the Memory Management Unit (MMU) is one of the prominent ones. Since the MMU is one of the main requirements of today’s processors, COFFEE with no MMU was not able to run an operating system. In the design of the MMU, we employed two additional micro-Translation-Lookaside Buffers (TLBs) to speed up the translation process, as well as minimizing congestions of the data/instruction address translations with a unified TLB. The MMU is tightly-coupled with the COFFEE RISC core through the Peripheral Control Block (PCB) interface of the core. The hardware implementation, alongside some optimization techniques and post synthesis results are presented, as well.Another intention of this work is to prepare a reconfigurable platform to send and receive data packets of the next generation wireless communications. Hence, we will further discuss a recently emerged wireless modulation technique known as Non-Contiguous Orthogonal Frequency Division Multiplexing (NC-OFDM), a promising technique to alleviate spectrum scarcity problem. However, one of the primary concerns in such systems is the synchronization. To that end, we developed a reconfigurable hardware component to perform as a synchronizer. The developed module exploits Partial Reconfiguration (PR) feature in order to reconfigure itself. Eventually, we will come up with several architectural choices for systems with different limiting factors such as power consumption, operating frequency, and silicon area. The synchronizer can be loosely-coupled via one of the available co-processor slots of the target processor, the COFFEE RISC core.In addition, we are willing to improve the versatility of the COFFEE core even in industrial use cases. Hence, we developed a reconfigurable hardware component capable of operating in the Controller Area Network (CAN) protocol. In the first step of this implementation, we mainly concentrate on receiving, decoding, and extracting the data segment of a CAN-based packet. Moreover, this hardware block can reconfigure itself on-the-fly to operate on different data frames. More details regarding hardware implementation issues, as well as post synthesis results are also presented. The CAN module is loosely-coupled with the COFFEE RISC processor through one of the available co-processor block

    Turku Centre for Computer Science – Annual Report 2013

    Get PDF
    Due to a major reform of organization and responsibilities of TUCS, its role, activities, and even structures have been under reconsideration in 2013. The traditional pillar of collaboration at TUCS, doctoral training, was reorganized due to changes at both universities according to the renewed national system for doctoral education. Computer Science and Engineering and Information Systems Science are now accompanied by Mathematics and Statistics in newly established doctoral programs at both University of Turku and &Aring;bo Akademi University. Moreover, both universities granted sufficient resources to their respective programmes for doctoral training in these fields, so that joint activities at TUCS can continue. The outcome of this reorganization has the potential of proving out to be a success in terms of scientific profile as well as the quality and quantity of scientific and educational results.&nbsp; International activities that have been characteristic to TUCS since its inception continue strong. TUCS&rsquo; participation in European collaboration through EIT ICT Labs Master&rsquo;s and Doctoral School is now more active than ever. The new double degree programs at MSc and PhD level between University of Turku and Fudan University in Shaghai, P.R.China were succesfully set up and are&nbsp; now running for their first year. The joint students will add to the already international athmosphere of the ICT House.&nbsp; The four new thematic reseach programmes set up acccording to the decision by the TUCS Board have now established themselves, and a number of events and other activities saw the light in 2013. The TUCS Distinguished Lecture Series managed to gather a large audience with its several prominent speakers. The development of these and other research centre activities continue, and&nbsp; new practices and structures will be initiated to support the tradition of close academic collaboration.&nbsp; The TUCS&rsquo; slogan Where Academic Tradition Meets the Exciting Future has proven true throughout these changes. Despite of the dark clouds on the national and European economic sky, science and higher education in the field have managed to retain all the key ingredients for success. Indeed, the future of ICT and Mathematics in Turku seems exciting.</p

    Contributions to Positioning Methods on Low-Cost Devices

    Get PDF
    Global Navigation Satellite System (GNSS) receivers are common in modern consumer devices that make use of position information, e.g., smartphones and personal navigation assistants. With a GNSS receiver, a position solution with an accuracy in the order of five meters is usually available if the reception conditions are benign, but the performance degrades rapidly in less favorable environments and, on the other hand, a better accuracy would be beneficial in some applications. This thesis studies advanced methods for processing the measurements of low-cost devices that can be used for improving the positioning performance. The focus is on GNSS receivers and microelectromechanical (MEMS) inertial sensors which have become common in mobile devices such as smartphones. First, methods to compensate for the additive bias of a MEMS gyroscope are investigated. Both physical slewing of the sensor and mathematical modeling of the bias instability process are considered. The use of MEMS inertial sensors for pedestrian navigation indoors is studied in the context of map matching using a particle filter. A high-sensitivity GNSS receiver is used to produce coarse initialization information for the filter to decrease the computational burden without the need to exploit local building infrastructure. Finally, a cycle slip detection scheme for stand-alone single-frequency GNSS receivers is proposed. Experimental results show that even a MEMS gyroscope can reach an accuracy suitable for North seeking if the measurement errors are carefully modeled and eliminated. Furthermore, it is seen that even a relatively coarse initialization can be adequate for long-term indoor navigation without an excessive computational burden if a detailed map is available. The cycle slip detection results suggest that even small cycle slips can be detected with mass-market GNSS receivers, but the detection rate needs to be improved
    corecore