The Design and Implementation of a PCIe-based LESS Label Switch
With the explosion of the Internet of Things, the number of smart, embedded devices has grown exponentially in the last decade, with growth projected to continue at a commensurate rate. These devices strain the existing infrastructure of the Internet, creating challenges with the scalability of routing tables and the reliability of packet delivery. Various schemes based on Location-Based Forwarding and ID-based routing have been proposed to solve these problems, but thus far no complete solution has been achieved. This thesis seeks to improve currently proposed LORIF routers by designing, implementing, and testing a PCIe-based LESS switch to process unrouteable packets under the current LESS forwarding engine.
An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics
Near-sensor data analytics is a promising direction for IoT endpoints, as it
minimizes energy spent on communication and reduces network load - but it also
poses security concerns, as valuable data is stored or sent over the network at
various stages of the analytics pipeline. Using encryption to protect sensitive
data at the boundary of the on-chip analytics engine is a way to address data
security issues. To cope with the combined workload of analytics and encryption
in a tight power envelope, we propose Fulmine, a System-on-Chip based on a
tightly-coupled multi-core cluster augmented with specialized blocks for
compute-intensive data processing and encryption functions, supporting software
programmability for regular computing tasks. The Fulmine SoC, fabricated in
65nm technology, consumes less than 20mW on average at 0.8V achieving an
efficiency of up to 70pJ/B in encryption, 50pJ/px in convolution, or up to
25MIPS/mW in software. As a strong argument for real-life flexible application
of our platform, we show experimental results for three secure analytics use
cases: secure autonomous aerial surveillance with a state-of-the-art deep CNN
consuming 3.16pJ per equivalent RISC op; local CNN-based face detection with
secured remote recognition in 5.74pJ/op; and seizure detection with encrypted
data collection from EEG within 12.7pJ/op.
Comment: 15 pages, 12 figures, accepted for publication to the IEEE
Transactions on Circuits and Systems - I: Regular Paper
Implicit Decomposition for Write-Efficient Connectivity Algorithms
The future of main memory appears to lie in the direction of new technologies
that provide strong capacity-to-performance ratios, but have write operations
that are much more expensive than reads in terms of latency, bandwidth, and
energy. Motivated by this trend, we propose sequential and parallel algorithms
to solve graph connectivity problems using significantly fewer writes than
conventional algorithms. Our primary algorithmic tool is the construction of an
o(n)-sized "implicit decomposition" of a bounded-degree graph G on n
nodes, which combined with read-only access to G enables fast answers to
connectivity and biconnectivity queries on G. The construction breaks the
linear-write "barrier", resulting in costs that are asymptotically lower than
conventional algorithms while adding only a modest cost to querying time. For
general non-sparse graphs on m edges, we also provide the first o(m)-write
and O(m)-operation parallel algorithms for connectivity and biconnectivity.
These algorithms provide insight into how applications can efficiently process
computations on large graphs in systems with read-write asymmetry.
- …