224 research outputs found
Towards efficient on-board deployment of DNNs on intelligent autonomous systems
With their unprecedented performance in major AI tasks, deep neural networks (DNNs) have emerged as a primary building block in modern autonomous systems. Intelligent systems such as drones, mobile robots and driverless cars largely base their perception, planning and application-specific tasks on DNN models. Nevertheless, due to the nature of these applications, such systems require on-board local processing in order to retain their autonomy and meet latency and throughput constraints. In this respect, the large computational and memory demands of DNN workloads pose a significant barrier to their deployment on the resource- and power-constrained compute platforms that are available on-board. This paper presents an overview of recent methods and hardware architectures that address the system-level challenges of modern DNN-enabled autonomous systems at both the algorithmic and hardware design level. Spanning from latency-driven approximate computing techniques to high-throughput mixed-precision cascaded classifiers, the presented set of works paves the way for the on-board deployment of sophisticated DNN models on robots and autonomous systems.
A throughput-latency co-optimised cascade of convolutional neural network classifiers
Convolutional Neural Networks constitute a prominent AI model for classification tasks, serving a broad span of diverse application domains. To enable their efficient deployment in real-world tasks, the inherent redundancy of CNNs is frequently exploited to eliminate unnecessary computational costs. Driven by the fact that not all inputs require the same amount of computation to drive a confident prediction, multi-precision cascade classifiers have been recently introduced. FPGAs comprise a promising platform for the deployment of such input-dependent computation models, due to their enhanced customisation capabilities. Current literature, however, is limited to throughput-optimised cascade implementations, employing large batching at the expense of a substantial latency aggravation prohibiting their deployment on real-time scenarios. In this work, we introduce a novel methodology for throughput-latency co-optimised cascaded CNN classification, deployed on a custom FPGA architecture tailored to the target application and deployment platform, with respect to a set of user-specified requirements on accuracy and performance. Our experiments indicate that the proposed approach achieves comparable throughput gains with related state-of-the-art works, under substantially reduced overhead in latency, enabling its deployment on latency-sensitive applications.
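The input-dependent computation described in this abstract can be illustrated with a minimal two-stage cascade sketch. It assumes a confidence rule based on the maximum softmax probability and a fixed threshold; the function names, the threshold value and the toy models are hypothetical and are not taken from the paper, which targets a custom FPGA architecture rather than Python.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cascade_predict(x, low_precision_model, high_precision_model, threshold=0.9):
    """Classify x with the cheap model first; fall back to the costly one.

    Both models are assumed (hypothetically) to map an input to a list of
    logits. Returns (label, stage), where stage records which model answered.
    """
    probs = softmax(low_precision_model(x))
    confidence = max(probs)
    if confidence >= threshold:
        # Easy input: the low-precision prediction is confident enough.
        return probs.index(confidence), "low"
    # Hard input: re-process with the high-precision model.
    probs = softmax(high_precision_model(x))
    return probs.index(max(probs)), "high"


# Toy stand-ins for the two classifiers (illustrative only).
def toy_low(x):
    return [x, 0.0]


def toy_high(x):
    return [x, 0.5]


easy = cascade_predict(5.0, toy_low, toy_high)   # confident at the first stage
hard = cascade_predict(0.1, toy_low, toy_high)   # deferred to the second stage
```

Raising the threshold sends more inputs to the second stage, which tends to trade throughput for accuracy; how that trade-off is tuned in the actual work is not specified here.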
Large-distance behaviour of the graviton two-point function in de Sitter spacetime
It is known that the graviton two-point function for the de Sitter invariant
"Euclidean" vacuum in a physical gauge grows logarithmically with distance in
spatially-flat de Sitter spacetime. We show that this logarithmic behaviour is
a gauge artifact by explicitly demonstrating that the same behaviour can be
reproduced by a pure-gauge two-point function. Comment: 19 pages, no figures, misprints and minor errors corrected
A Note on Gradient/Fractional One-Dimensional Elasticity and Viscoelasticity
An introductory discussion of a (weakly non-local) gradient generalization of some one-dimensional elastic and viscoelastic models, and of their fractional extension, is provided. Emphasis is placed on the possible implications for micro- and nano-engineering problems, including small-scale structural mechanics and composite materials, as well as collagen biomechanics and nanomaterials.
Seismic retrofitting and health monitoring of school buildings of Cyprus
The vulnerability of existing buildings to seismic forces and their retrofitting is an international problem. The majority of structures in seismic-prone areas worldwide have been designed either without consideration of seismic forces, or under previous codes of practice specifying lower levels of seismic forces. In Cyprus, after the three earthquakes that occurred in 1995, 1996, and 1999, the Cyprus State, acting in a pioneering way internationally, decided to seismically retrofit all school buildings, taking into account the sensitivity of society towards these structures, which house its future generation. In this paper the overall assessment methodology is presented, along with details of the ongoing retrofitting program of the school buildings of Cyprus, which has been running for over 10 years, with emphasis on the description of the program and the development of a wireless monitoring system. In addition, mathematical models of selected school buildings are presented and a comparison is made with in-situ measurement.
On the scalar sector of the covariant graviton two-point function in de Sitter spacetime
We examine the scalar sector of the covariant graviton two-point function in
de Sitter spacetime. This sector consists of the pure-trace part and another
part described by a scalar field. We show that it does not contribute to
two-point functions of gauge-invariant quantities. We also demonstrate that the
long-distance growth present in some gauges is absent in this sector for a wide
range of gauge parameters. Comment: 15 pages, no figures, LaTeX, considerably shortened
HAPI: Hardware-Aware Progressive Inference
Convolutional neural networks (CNNs) have recently become the
state-of-the-art in a diversity of AI tasks. Despite their popularity, CNN
inference still comes at a high computational cost. A growing body of work aims
to alleviate this by exploiting the difference in the classification difficulty
among samples and early-exiting at different stages of the network.
Nevertheless, existing studies on early exiting have primarily focused on the
training scheme, without considering the use-case requirements or the
deployment platform. This work presents HAPI, a novel methodology for
generating high-performance early-exit networks by co-optimising the placement
of intermediate exits together with the early-exit strategy at inference time.
Furthermore, we propose an efficient design space exploration algorithm which
enables the faster traversal of a large number of alternative architectures and
generates the highest-performing design, tailored to the use-case requirements
and target hardware. Quantitative evaluation shows that our system consistently
outperforms alternative search mechanisms and state-of-the-art early-exit
schemes across various latency budgets. Moreover, it pushes further the
performance of highly optimised hand-crafted early-exit CNNs, delivering up to
5.11x speedup over lightweight models on imposed latency-driven SLAs for
embedded devices. Comment: Accepted at the 39th International Conference on Computer-Aided
Design (ICCAD), 202
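To make the notion of a latency budget for an early-exit network concrete, the sketch below computes the expected per-sample latency from the fraction of samples leaving at each exit. This is a generic back-of-the-envelope model, not HAPI's design space exploration algorithm; the function names and all numbers are hypothetical placeholders.

```python
def expected_latency(exit_rates, exit_latencies_ms):
    """Average latency when exit_rates[i] of samples stop at exit i.

    exit_latencies_ms[i] is the cumulative latency to reach exit i.
    The rates are assumed to sum to 1 (the final exit catches the rest).
    """
    if len(exit_rates) != len(exit_latencies_ms):
        raise ValueError("one latency is required per exit")
    if abs(sum(exit_rates) - 1.0) > 1e-9:
        raise ValueError("exit rates must sum to 1")
    return sum(p * t for p, t in zip(exit_rates, exit_latencies_ms))


def meets_budget(exit_rates, exit_latencies_ms, budget_ms):
    """Check a candidate early-exit design against an average-latency budget."""
    return expected_latency(exit_rates, exit_latencies_ms) <= budget_ms


# Illustrative numbers only: half of the samples exit after 4 ms,
# 30% after 7 ms, and the remaining 20% run the full network in 10 ms.
rates = [0.5, 0.3, 0.2]
latencies = [4.0, 7.0, 10.0]
average = expected_latency(rates, latencies)  # about 6.1 ms
```

A search over exit placements could use such a check to discard candidates that miss the budget before comparing accuracy, although the abstract does not state that HAPI works this way.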
Stochastic Dynamic Analysis of Cultural Heritage Towers up to Collapse
This paper deals with the seismic vulnerability of monumental unreinforced masonry (URM) towers, the fragility of which has not yet been sufficiently studied. The present paper fills this gap by developing models to investigate the seismic response of URM towers up to collapse. On Mount Athos, Greece, there exist more than a hundred medieval towers, having served mainly as campaniles or fortifications. Eight representative towers were selected for a thorough investigation to estimate their seismic response characteristics. Their history and architectural features are initially discussed and a two-step analysis follows: (i) limit analysis is performed to estimate the collapse mechanism and the locations of critical cracks, (ii) non-linear explicit dynamic analyses are then carried out, developing finite element (FE) simulations, with cracks modelled as interfacial surfaces to derive the capacity curves. A meaningful definition of the damage states is proposed based on the characteristics of their capacity curves, with the ultimate limit state related to collapse. The onset of the slight damage state is characterised by the formation and development of the cracks responsible for the collapse mechanism of the structure. Apart from these two, two additional limit states are specified: the moderate damage state and the extensive one. Fragility and vulnerability curves are finally generated, which can help the assessment and preservation of cultural heritage URM towers.
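As a rough illustration of what a fragility curve expresses, the sketch below uses the lognormal form that is common in earthquake engineering: the probability of reaching a damage state at a given intensity measure is a lognormal CDF with a median and a dispersion. The abstract does not say which functional form the authors use, so the lognormal assumption, the function name and every parameter value below are hypothetical placeholders rather than results from the paper.

```python
import math


def lognormal_fragility(intensity, median, beta):
    """P(damage state reached | intensity) under a lognormal assumption.

    median: intensity at which the probability equals 0.5.
    beta:   logarithmic standard deviation (dispersion), must be > 0.
    """
    if intensity <= 0.0:
        return 0.0
    z = math.log(intensity / median) / beta
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Placeholder medians (in g) for the four damage states named in the
# abstract; these values are invented for illustration only.
medians = {"slight": 0.10, "moderate": 0.20, "extensive": 0.35, "collapse": 0.50}
beta = 0.6  # placeholder dispersion

probabilities = {
    state: lognormal_fragility(0.20, median, beta)
    for state, median in medians.items()
}
```

With ordered medians and a shared dispersion the curves do not cross, so a more severe state is never more probable than a milder one at the same intensity.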
SPINN: Synergistic Progressive Inference of Neural Networks over Device and Cloud
Despite the soaring use of convolutional neural networks (CNNs) in mobile
applications, uniformly sustaining high-performance inference on mobile has
been elusive due to the excessive computational demands of modern CNNs and the
increasing diversity of deployed devices. A popular alternative comprises
offloading CNN processing to powerful cloud-based servers. Nevertheless, by
relying on the cloud to produce outputs, emerging mission-critical and
high-mobility applications, such as drone obstacle avoidance or interactive
applications, can suffer from the dynamic connectivity conditions and the
uncertain availability of the cloud. In this paper, we propose SPINN, a
distributed inference system that employs synergistic device-cloud computation
together with a progressive inference method to deliver fast and robust CNN
inference across diverse settings. The proposed system introduces a novel
scheduler that co-optimises the early-exit policy and the CNN splitting at run
time, in order to adapt to dynamic conditions and meet user-defined
service-level requirements. Quantitative evaluation illustrates that SPINN
outperforms its state-of-the-art collaborative inference counterparts by up to
2x in achieved throughput under varying network conditions, reduces the server
cost by up to 6.8x and improves accuracy by 20.7% under latency constraints,
while providing robust operation under uncertain connectivity conditions and
significant energy savings compared to cloud-centric execution. Comment: Accepted at the 26th Annual International Conference on Mobile
Computing and Networking (MobiCom), 202
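The idea of choosing where to split a CNN between device and cloud can be sketched with a simple additive latency model: device time for the first k layers, transfer time for the intermediate activation, and server time for the rest. This is a generic illustration and not SPINN's scheduler, which also co-optimises the early-exit policy; the function name, parameters and numbers are hypothetical.

```python
def best_split(device_ms, server_ms, transfer_bytes, bandwidth_bytes_per_ms):
    """Return (k, total_ms) minimising estimated end-to-end latency.

    device_ms[i], server_ms[i]: latency of layer i on device and on server.
    transfer_bytes[k]: bytes uploaded when the first k layers run on the
    device (k = 0 uploads the raw input). k = len(device_ms) means fully
    on-device execution, so nothing is uploaded.
    """
    n = len(device_ms)
    best = None
    for k in range(n + 1):
        transfer = 0.0 if k == n else transfer_bytes[k] / bandwidth_bytes_per_ms
        total = sum(device_ms[:k]) + transfer + sum(server_ms[k:])
        if best is None or total < best[1]:
            best = (k, total)
    return best


# Illustrative per-layer profile for a three-layer network.
device_ms = [4.0, 6.0, 8.0]
server_ms = [1.0, 1.0, 2.0]
transfer_bytes = [600.0, 200.0, 100.0]

fast_link = best_split(device_ms, server_ms, transfer_bytes, 100.0)
slow_link = best_split(device_ms, server_ms, transfer_bytes, 10.0)
```

On the fast link the sketch offloads after the first layer, while on the slow link it keeps everything on the device, which mirrors the adaptation to network conditions described in the abstract.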