A Contribution to Resource-Aware Architectures for Humanoid Robots
The goal of this work is to provide building blocks for resource-aware robot architectures. The topics of these blocks are data-driven generation of context-sensitive resource models, prediction of future resource utilization, and resource-aware computer vision and motion planning algorithms. The implementation of these algorithms is based on resource-aware concepts and methodologies originating from the Transregional Collaborative Research Center "Invasive Computing" (SFB/TR 89).
Novel Hybrid-Learning Algorithms for Improved Millimeter-Wave Imaging Systems
Increasing attention is being paid to millimeter-wave (mmWave), 30 GHz to 300
GHz, and terahertz (THz), 300 GHz to 10 THz, sensing applications including
security sensing, industrial packaging, medical imaging, and non-destructive
testing. Traditional methods for perception and imaging are challenged by novel
data-driven algorithms that offer improved resolution, localization, and
detection rates. Over the past decade, deep learning technology has garnered
substantial popularity, particularly in perception and computer vision
applications. Whereas conventional signal processing techniques are more easily
generalized to various applications, hybrid approaches where signal processing
and learning-based algorithms are interleaved pose a promising compromise
between performance and generalizability. Furthermore, such hybrid algorithms
improve model training by leveraging the known characteristics of radio
frequency (RF) waveforms, thus yielding more efficiently trained deep learning
algorithms and offering higher performance than conventional methods. This
dissertation introduces novel hybrid-learning algorithms for improved mmWave
imaging systems applicable to a host of problems in perception and sensing.
Various problem spaces are explored, including static and dynamic gesture
classification; precise hand localization for human-computer interaction;
high-resolution near-field mmWave imaging using forward synthetic aperture
radar (SAR); SAR under irregular scanning geometries; mmWave image
super-resolution using deep neural network (DNN) and Vision Transformer (ViT)
architectures; and data-level multiband radar fusion using a novel
hybrid-learning architecture. Furthermore, we introduce several novel
approaches for deep learning model training and dataset synthesis.
Comment: PhD dissertation submitted to UTD ECE Department
On hard real-time scheduling of cyclo-static dataflow and its application in system-level design
This dissertation addresses the problem of designing, in an automated way, hard real-time streaming systems that run a set of parallel streaming programs such that the programs provably meet their timing requirements. A scheduling framework is proposed with which it is analytically proven that any streaming program, modeled as an acyclic Cyclo-Static Dataflow (CSDF) graph, can be executed as a set of real-time periodic tasks. The proposed framework computes the parameters of the periodic tasks corresponding to the graph actors and the minimum buffer sizes of the communication channels such that a valid periodic schedule is guaranteed to exist. To demonstrate the effectiveness of the scheduling framework, a system-level design flow that incorporates it is proposed. This design flow accepts, as input, algorithmic sequential specifications of streaming programs, and then applies a set of systematic and automated steps that produce, as output, the final system implementation, which provably meets the timing requirements of the programs. The final system implementation consists of the parallelized versions of the input streaming programs together with the hardware needed to run them. The proposed scheduling framework and design flow are evaluated through a set of experiments that illustrate their effectiveness.
Computer Systems, Imagery and Medi
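As a rough illustration of what "computing the parameters of the periodic tasks" can mean, the sketch below derives one period per actor from a repetition vector and worst-case execution times. It is not taken from the dissertation: the function name, the actor names, and the numbers are hypothetical, and it assumes integer time units, implicit deadlines, and one aggregate repetition count per actor (phases of a CSDF actor are not modeled). The only constraints used are that every actor completes its repetitions within one common iteration period and that each period is at least the actor's execution time.

```python
from math import lcm  # multi-argument lcm requires Python 3.9+


def periodic_task_periods(repetitions, wcets):
    """Return an integer period per actor (hypothetical helper).

    repetitions: actor -> number of firings per graph iteration
    wcets:       actor -> worst-case execution time of one firing

    All actors share one iteration period H = repetitions[a] * period[a].
    H must be a multiple of lcm(repetitions) so that every period is an
    integer, and H must be at least repetitions[a] * wcets[a] for every
    actor so that each period is no shorter than the execution time.
    """
    q_lcm = lcm(*repetitions.values())
    # Heaviest actor: firings per iteration times execution time.
    max_load = max(repetitions[a] * wcets[a] for a in repetitions)
    # Smallest multiple of q_lcm that is >= max_load (integer ceiling).
    iteration_period = q_lcm * -(-max_load // q_lcm)
    return {a: iteration_period // repetitions[a] for a in repetitions}


# Hypothetical three-actor pipeline.
reps = {"src": 3, "filter": 2, "sink": 1}
wcet = {"src": 2, "filter": 5, "sink": 4}
periods = periodic_task_periods(reps, wcet)
print(periods)
```

With these example values the common iteration period is 12 time units, so the faster-firing actors receive proportionally shorter periods. Buffer sizing and start times, which the abstract also mentions, are outside the scope of this sketch.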
Modeling Algorithm Performance on Highly-threaded Many-core Architectures
The rapid growth of data processing required in various arenas of computation over the past decades necessitates extensive use of parallel computing engines. Among those, highly-threaded many-core machines, such as GPUs, have become increasingly popular for accelerating a diverse range of data-intensive applications. They feature a large number of hardware threads with low-overhead context switches to hide memory access latencies and therefore provide high computational throughput. However, understanding and harnessing such machines poses great challenges for algorithm designers and performance tuners because of the complex interaction between threads and the hierarchical memory subsystems of these machines. The achieved performance jointly depends on the parallelism exploited by the algorithm, the effectiveness of latency hiding, and the utilization of multiprocessors (occupancy). Contemporary work models the performance of GPUs from various aspects with different emphasis and granularity. However, no model considers all of these factors together.
This dissertation presents an analytical framework that jointly addresses parallelism, latency hiding, and occupancy for both theoretical and empirical performance analysis of algorithms on highly-threaded many-core machines, so that it can guide both algorithm design and performance tuning. In particular, this framework not only helps to explore and reduce the runtime configuration space for tuning kernel execution on GPUs, but also reflects performance bottlenecks and predicts how the runtime will trend as the problem and other parameters scale. The framework consists of a pair of analytical models, one focusing on higher-level asymptotic algorithm performance on GPUs and the other emphasizing lower-level details about scheduling and runtime configuration. Based on the two models, we have conducted extensive analysis of a large set of algorithms. The analysis provides interesting results and explains previously unexplained data. In addition, the two models are bridged and combined into a consistent framework. The framework provides an end-to-end methodology for algorithm design, evaluation, comparison, implementation, and fairly accurate prediction of real runtime on GPUs.
To demonstrate the viability of our methods, the models are validated through data from implementations of a variety of classic algorithms, including hashing, Bloom filters, all-pairs shortest path, matrix multiplication, FFT, merge sort, list ranking, and string matching via suffix tree/array. We evaluate the models' performance across a wide spectrum of parameters, data values, and machines. The results indicate that the models can be effectively used for algorithm performance analysis and runtime prediction on highly-threaded many-core machines.
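To make the notion of occupancy mentioned in this abstract concrete, the following is a minimal sketch of a commonly used estimate: the fraction of a multiprocessor's warp slots that can be filled, given per-block resource demands. It is not the dissertation's model. The function name and all hardware limits are hypothetical placeholders, and allocation granularity for registers and shared memory is ignored.

```python
def estimate_occupancy(threads_per_block, regs_per_thread, smem_per_block, *,
                       warp_size=32, max_warps_per_sm=64, max_blocks_per_sm=32,
                       regs_per_sm=65536, smem_per_sm=98304):
    """Estimate occupancy as resident warps / maximum warps per multiprocessor.

    All keyword defaults are assumed, illustrative hardware limits.
    threads_per_block and regs_per_thread are assumed to be positive.
    """
    # Warps needed by one block (integer ceiling).
    warps_per_block = -(-threads_per_block // warp_size)
    # How many blocks fit under each resource limit.
    by_warps = max_warps_per_sm // warps_per_block
    by_regs = regs_per_sm // (regs_per_thread * warps_per_block * warp_size)
    by_smem = smem_per_sm // smem_per_block if smem_per_block else max_blocks_per_sm
    resident_blocks = min(max_blocks_per_sm, by_warps, by_regs, by_smem)
    return resident_blocks * warps_per_block / max_warps_per_sm


# Doubling register use per thread halves the estimate under these assumed limits.
low_regs = estimate_occupancy(256, 32, 0)
high_regs = estimate_occupancy(256, 64, 0)
print(low_regs, high_regs)
```

An estimate like this only narrows the launch-configuration space. As the abstract notes, achieved performance also depends on parallelism and latency hiding, which this sketch does not capture.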
Intelligent Transportation Related Complex Systems and Sensors
Built around innovative services related to different modes of transport and traffic management, intelligent transport systems (ITS) are being widely adopted worldwide to improve the efficiency and safety of the transportation system. They enable users to be better informed and make safer, more coordinated, and smarter decisions on the use of transport networks. Current ITSs are complex systems, made up of several components/sub-systems characterized by time-dependent interactions among themselves. Some examples of these transportation-related complex systems include: road traffic sensors, autonomous/automated cars, smart cities, smart sensors, virtual sensors, traffic control systems, smart roads, logistics systems, smart mobility systems, and many others that are emerging from niche areas. The efficient operation of these complex systems requires: i) efficient solutions to the issues of sensors/actuators used to capture and control the physical parameters of these systems, as well as the quality of data collected from these systems; ii) tackling complexities using simulations and analytical modelling techniques; and iii) applying optimization techniques to improve the performance of these systems. This collection includes twenty-four papers, which cover scientific concepts, frameworks, architectures and various other ideas on analytics, trends and applications of transportation-related data.
The 2011 International Planning Competition
After a three-year gap, the 2011 edition of the IPC involved a total of 55 planners,
some of them versions of the same planner, distributed among four tracks: the sequential
satisficing track (27 planners submitted out of 38 registered), the sequential multicore
track (8 planners submitted out of 12 registered), the sequential optimal track (12
planners submitted out of 24 registered) and the temporal satisficing track (8 planners
submitted out of 14 registered). Three more tracks were open to participation: temporal
optimal, preferences satisficing and preferences optimal. Unfortunately, the number of submitted planners was too low for these tracks to be included in the competition.
A total of 55 people participated, grouped in 31 teams. Participants came
from Australia, Canada, China, France, Germany, India, Israel, Italy, Spain, UK and
USA.
For the sequential tracks 14 domains, with 20 problems each, were selected, while
the temporal one had 12 domains, also with 20 problems each. Both new and past
domains were included. As in previous competitions, domains and problems were
unknown to participants and all the experimentation was carried out by the organizers.
To run the competition, a cluster of eleven 64-bit computers (Intel Xeon 2.93 GHz
quad-core processors) running Linux was set up. Up to 1800 seconds, 6 GB of RAM and 750 GB of hard disk were available for each planner to solve a problem. This resulted in 7540 computing hours (about 315 days), plus a large number of hours devoted to preliminary experimentation with new domains, reruns and bug fixing.
The detailed results of the competition, the software used for automating most
tasks, the source code of all the participating planners and the description of domains and problems can be found at the competition's web page:
http://www.plg.inf.uc3m.es/ipc2011-deterministic
This booklet summarizes the participants in the Deterministic Track of the International
Planning Competition (IPC) 2011. Papers describing all the participating planners
are included.