Exploring Adaptive Implementation of On-Chip Networks
As technology geometries have shrunk to the deep submicron regime, the communication delay and power consumption of global interconnections in high performance Multi-Processor Systems-on-Chip (MPSoCs) are becoming a major bottleneck. The Network-on-Chip (NoC) architecture paradigm, based on a modular packet-switched mechanism, can address many of the on-chip communication issues such as performance limitations of long interconnects and integration of a large number of Processing Elements (PEs) on a chip. The choice of routing protocol and NoC structure can have a significant impact on performance and power consumption in on-chip networks. In addition, building a high performance, area and energy efficient on-chip network for multicore architectures requires a novel on-chip router allowing a larger network to be integrated on a single die with reduced power consumption. On top of that, network interfaces are employed to decouple computation resources from communication resources, to provide the synchronization between them, and to achieve backward compatibility with existing IP cores.
Three adaptive routing algorithms are presented as a part of this thesis. The first presented routing protocol is a congestion-aware adaptive routing algorithm for 2D mesh NoCs which does not support multicast (one-to-many) traffic while the other two protocols are adaptive routing models supporting both unicast (one-to-one) and multicast traffic. A streamlined on-chip router architecture is also presented for avoiding congested areas in 2D mesh NoCs via employing efficient input and output selection. The output selection utilizes an adaptive routing algorithm based on the congestion condition of neighboring routers while the input selection allows packets to be serviced from each input port according to its congestion level. Moreover, in order to increase memory parallelism and bring compatibility with existing IP cores in network-based multiprocessor architectures, adaptive network interface architectures are presented to use multiple SDRAMs which can be accessed simultaneously. In addition, a smart memory controller is integrated in the adaptive network interface to improve the memory utilization and reduce both memory and network latencies.
Three Dimensional Integrated Circuits (3D ICs) have been emerging as a viable candidate to achieve better performance and package density as compared to traditional 2D ICs. In addition, combining the benefits of 3D IC and NoC schemes provides a significant performance gain for 3D architectures. In recent years, inter-layer communication across multiple stacked layers (vertical channel) has attracted a lot of interest. In this thesis, a novel adaptive pipeline bus structure is proposed for inter-layer communication to improve the performance by reducing the delay and complexity of traditional bus arbitration. In addition, two mesh-based topologies for 3D architectures are also introduced to mitigate the inter-layer footprint and power dissipation on each layer with a small performance penalty.
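The congestion-aware output selection described in the thesis abstract above can be illustrated with a minimal sketch. It assumes a 2D mesh with (x, y) coordinates and a hypothetical congestion metric (for example, occupied buffer slots reported by each neighboring router); the function names are illustrative and not taken from the thesis, and the sketch ignores the deadlock-avoidance restrictions a real adaptive router needs.

```python
def minimal_directions(cur, dst):
    """Return the productive (minimal-path) directions from cur to dst in a 2D mesh."""
    (cx, cy), (dx, dy) = cur, dst
    dirs = []
    if dx > cx:
        dirs.append("E")
    elif dx < cx:
        dirs.append("W")
    if dy > cy:
        dirs.append("N")
    elif dy < cy:
        dirs.append("S")
    return dirs


def select_output(cur, dst, congestion):
    """Pick the minimal direction whose neighbor reports the lowest congestion.

    congestion maps a direction to a hypothetical congestion value of the
    neighboring router in that direction; missing entries count as 0.
    """
    candidates = minimal_directions(cur, dst)
    if not candidates:
        return "LOCAL"  # packet has arrived; deliver to the local core
    return min(candidates, key=lambda d: congestion.get(d, 0))
```

For instance, `select_output((1, 1), (3, 3), {"E": 4, "N": 1})` chooses the northern port because the eastern neighbor is more congested.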
Contrastive Learning for Lane Detection via Cross-Similarity
Detecting road lanes is challenging due to intricate markings vulnerable to
unfavorable conditions. Lane markings have strong shape priors, but their
visibility is easily compromised. Factors like lighting, weather, vehicles,
pedestrians, and aging colors challenge the detection. A large amount of data
is required to train a lane detection approach that can withstand natural
variations caused by low visibility, because lane shapes and natural
variations are numerous. Our solution, Contrastive Learning
for Lane Detection via cross-similarity (CLLD), is a self-supervised learning
method that tackles this challenge by enhancing lane detection models'
resilience to real-world conditions that reduce lane visibility. CLLD is a
novel multitask contrastive learning method that trains lane detection approaches to
detect lane markings even in low-visibility situations by integrating local
feature contrastive learning (CL) with our newly proposed operation,
cross-similarity. Local feature CL focuses on extracting features for small
image parts, which is necessary to localize lane segments, while
cross-similarity captures global features to detect obscured lane segments
using their surroundings. We enhance cross-similarity by randomly masking parts
of input images for augmentation. Evaluated on benchmark datasets, CLLD
outperforms state-of-the-art contrastive learning methods, especially in
visibility-impairing conditions like shadows. Compared to supervised learning,
CLLD excels in scenarios like shadows and crowded scenes.
Comment: 10 pages
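As a rough illustration of the two ingredients named in the abstract above (random masking of input parts and a similarity computed across views), the following sketch builds a pairwise cosine-similarity matrix between local feature vectors of two views. This is an assumption-laden simplification: the abstract does not define the cross-similarity operation, so `similarity_matrix` and `mask_patches` are hypothetical helpers and not the paper's formulation.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def similarity_matrix(feats_a, feats_b):
    """Pairwise cosine similarity between local features of two views."""
    return [[cosine(u, v) for v in feats_b] for u in feats_a]


def mask_patches(patches, ratio, rng):
    """Zero out a random fraction of patches (a simple masking augmentation)."""
    n_mask = int(len(patches) * ratio)
    masked = set(rng.sample(range(len(patches)), n_mask))
    return [[0.0] * len(p) if i in masked else list(p)
            for i, p in enumerate(patches)]
```

A contrastive objective would then pull matching entries of the matrix toward 1 and push the rest down; that loss is left out here because the abstract gives no details about it.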
PR-DARTS: Pruning-Based Differentiable Architecture Search
The deployment of Convolutional Neural Networks (CNNs) on edge devices is
hindered by the substantial gap between performance requirements and available
processing power. While recent research has made large strides in developing
network pruning methods for reducing the computing overhead of CNNs, there
remains considerable accuracy loss, especially at high pruning ratios.
Hypothesizing that the architectures designed for non-pruned networks might not
be effective for pruned networks, we propose to search architectures for
pruning methods by defining a new search space and a novel search objective. To
improve the generalization of the pruned networks, we propose two novel
operations, PrunedConv and PrunedLinear. Specifically, these operations mitigate
the problem of unstable gradients by regularizing the objective function of the
pruned networks. The proposed search objective enables us to train architecture
parameters with respect to the pruned weight elements. Quantitative analyses
demonstrate that our searched architectures outperform those used in the
state-of-the-art pruning networks on CIFAR-10 and ImageNet. In terms of
hardware effectiveness, PR-DARTS increases MobileNet-v2's accuracy from 73.44%
to 81.35% (+7.91% improvement) and runs 3.87× faster.
Comment: 18 pages with 11 figures
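To make the notion of a pruning ratio in the abstract above concrete, here is a minimal sketch of generic magnitude-based weight pruning. It is not the paper's PrunedConv or PrunedLinear operation, whose definitions the abstract does not give; `magnitude_mask` and `apply_mask` are illustrative names.

```python
def magnitude_mask(weights, ratio):
    """Return a 0/1 mask that drops the `ratio` fraction of smallest-magnitude weights."""
    n_prune = int(len(weights) * ratio)
    # Indices sorted from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(weights))]


def apply_mask(weights, mask):
    """Zero out the pruned weight elements."""
    return [w * m for w, m in zip(weights, mask)]
```

With `weights = [0.5, -0.1, 0.03, -2.0]` and a pruning ratio of 0.5, the two smallest-magnitude entries are masked and only `0.5` and `-2.0` survive.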
Improving motion safety and efficiency of intelligent autonomous swarm of drones
Interest is growing in the use of autonomous swarms of drones in various mission-physical applications such as surveillance, intelligent monitoring, and rescue operations. Swarm systems should fulfill safety and efficiency constraints in order to guarantee dependable operations. To maximize motion safety, we should design the swarm system in such a way that drones do not collide with each other and/or other objects in the operating environment. On the other hand, to ensure that the drones have sufficient resources to complete the required task reliably, we should also achieve efficiency while implementing the mission, by minimizing the travelling distance of the drones. In this paper, we propose a novel integrated approach that maximizes motion safety and efficiency while planning and controlling the operation of the swarm of drones. To achieve this goal, we propose a novel parallel evolutionary-based swarm mission planning algorithm. The evolutionary computing allows us to plan and optimize the routes of the drones at run-time to maximize safety while minimizing travelling distance as the efficiency objective. In order to fulfill the defined constraints efficiently, our solution promotes a holistic approach that considers the whole design process from the definition of formal requirements through the software development. The results of benchmarking demonstrate that our approach improves route efficiency by up to 10% without any crashes in controlling swarms compared to state-of-the-art solutions.
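The trade-off stated in the abstract above (minimize travelling distance while avoiding collisions) can be sketched as a fitness function that an evolutionary planner might minimize. This is an assumed formulation: the penalty value, the per-time-step separation check, and the names `route_length`, `min_separation`, and `fitness` are hypothetical and do not come from the paper.

```python
import math


def route_length(route):
    """Total Euclidean length of a route given as a list of waypoints."""
    return sum(math.dist(a, b) for a, b in zip(route, route[1:]))


def min_separation(routes):
    """Smallest distance between any two drones at the same time step."""
    best = math.inf
    steps = min(len(r) for r in routes)
    for t in range(steps):
        for i in range(len(routes)):
            for j in range(i + 1, len(routes)):
                best = min(best, math.dist(routes[i][t], routes[j][t]))
    return best


def fitness(routes, safe_distance, penalty=1000.0):
    """Lower is better: total distance plus a penalty if drones get too close."""
    total = sum(route_length(r) for r in routes)
    if min_separation(routes) < safe_distance:
        total += penalty  # assumed fixed penalty for violating the safety margin
    return total
```

An evolutionary loop would mutate and recombine candidate route sets and keep those with the lowest fitness; only the scoring step is shown here.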
Path-Based partitioning methods for 3D Networks-on-Chip with minimal adaptive routing
Combining the benefits of 3D IC and Network-on-Chip (NoC) schemes provides a significant performance gain in Chip Multiprocessor (CMP) architectures. As multicast communication is commonly used in cache coherence protocols for CMPs and in various parallel applications, the performance of these systems can be significantly improved if multicast operations are supported at the hardware level. In this paper, we present several partitioning methods for the path-based multicast approach in 3D mesh-based NoCs, each with different levels of efficiency. In addition, we develop novel analytical models for unicast and multicast traffic to explore the efficiency of each approach. In order to distribute the unicast and multicast traffic more efficiently over the network, we propose the Minimal and Adaptive Routing (MAR) algorithm for the presented partitioning methods. The analytical and experimental results show that an advantageous method named Recursive Partitioning (RP) outperforms the other approaches. RP recursively partitions the network until all partitions contain a comparable number of switches and thus the multicast traffic is equally distributed among several subsets and the network latency is considerably decreased. The simulation results reveal that the RP method can achieve performance improvement across all workloads while performance can be further improved by utilizing the MAR algorithm. Nineteen percent average and 42 percent maximum latency reduction are obtained on SPLASH-2 and PARSEC benchmarks running on a 64-core CMP.
Ebrahimi, M.; Daneshtalab, M.; Liljeberg, P.; Plosila, J.; Flich Cardo, J.; Tenhunen, H. (2014). Path-Based partitioning methods for 3D Networks-on-Chip with minimal adaptive routing. IEEE Transactions on Computers. 63(3):718-733. doi:10.1109/TC.2012.255
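The idea of partitioning a 3D mesh recursively until all partitions hold a comparable number of switches can be sketched as a plain recursive bisection. This is only an illustration under stated assumptions: the paper's RP scheme is not specified in the abstract, so the split rule (halve the longest dimension), the `max_switches` threshold, and the 4x4x4 arrangement used for a 64-node example are all hypothetical.

```python
def box_size(box):
    """Number of switches in a box given as ((x0, x1), (y0, y1), (z0, z1)), half-open."""
    (x0, x1), (y0, y1), (z0, z1) = box
    return (x1 - x0) * (y1 - y0) * (z1 - z0)


def recursive_partition(box, max_switches):
    """Bisect the box along its longest dimension until every part is small enough."""
    if max_switches < 1:
        raise ValueError("max_switches must be at least 1")
    if box_size(box) <= max_switches:
        return [box]
    # Choose the longest dimension (first one wins on ties).
    axis = max(range(3), key=lambda a: box[a][1] - box[a][0])
    lo, hi = box[axis]
    mid = (lo + hi) // 2
    left, right = list(box), list(box)
    left[axis] = (lo, mid)
    right[axis] = (mid, hi)
    return (recursive_partition(tuple(left), max_switches)
            + recursive_partition(tuple(right), max_switches))
```

Calling `recursive_partition(((0, 4), (0, 4), (0, 4)), 8)` splits the 64 switches into eight 2x2x2 partitions, so a multicast message could be divided into one path per partition of equal size.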
A novel highly adaptive routing for networks-on-chip
The degree of adaptiveness has a major impact on the performance of an adaptive routing method. This research work presents a novel turn-model-based routing method that provides a high degree of adaptiveness for 2D mesh. The result is that the proposed method reduces restrictions on the routing turns significantly and hence can provide path diversity using additional routes (both minimal and non-minimal). Experimental results show that the proposed method provides better performance (average latency and throughput) in comparison with the recent routing methods.
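For readers unfamiliar with turn models, the sketch below checks a route against the classical west-first turn model, which forbids the two turns into the west direction. It is shown only as background: it is not the method proposed in the abstract above, which claims fewer turn restrictions, and the helper names are illustrative.

```python
# Turns are (current travel direction, next travel direction).
# West-first prohibits turning into west from north or south.
PROHIBITED_WEST_FIRST = {("N", "W"), ("S", "W")}
OPPOSITE = {"N": "S", "S": "N", "E": "W", "W": "E"}


def turn_allowed(incoming, outgoing, prohibited=PROHIBITED_WEST_FIRST):
    """Return True if a packet travelling `incoming` may continue in `outgoing`."""
    if OPPOSITE[incoming] == outgoing:
        return False  # 180-degree turns are not permitted in this sketch
    return (incoming, outgoing) not in prohibited


def route_is_legal(hops, prohibited=PROHIBITED_WEST_FIRST):
    """Check every consecutive pair of hops in a route against the turn model."""
    return all(turn_allowed(a, b, prohibited) for a, b in zip(hops, hops[1:]))
```

A route such as `["W", "W", "N", "E"]` is legal because all westward hops come first, while `["N", "W"]` is rejected. Shrinking the prohibited set without reintroducing deadlock is what raises the degree of adaptiveness.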