LSDA: Large Scale Detection Through Adaptation
A major challenge in scaling object detection is the difficulty of obtaining
labeled images for large numbers of categories. Recently, deep convolutional
neural networks (CNNs) have emerged as clear winners on object classification
benchmarks, in part due to training with 1.2M+ labeled classification images.
Unfortunately, only a small fraction of those labels are available for the
detection task. It is much cheaper and easier to collect large quantities of
image-level labels from search engines than it is to collect detection data and
label it with precise bounding boxes. In this paper, we propose Large Scale
Detection through Adaptation (LSDA), an algorithm which learns the difference
between the two tasks and transfers this knowledge to classifiers for
categories without bounding box annotated data, turning them into detectors.
Our method has the potential to enable detection for the tens of thousands of
categories that lack bounding box annotations, yet have plenty of
classification data. Evaluation on the ImageNet LSVRC-2013 detection challenge
demonstrates the efficacy of our approach. This algorithm enables us to produce a detector for more than 7.6K categories by using available classification data from leaf nodes in the ImageNet tree. We additionally demonstrate how to modify our architecture to produce a fast detector (running at 2 fps for the 7.6K-category detector). Models and software are available a
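The abstract describes the transfer step only at a high level. As a hedged illustration, the NumPy sketch below shows one way such an output-layer adaptation can be instantiated: learn the classifier-to-detector weight change on categories that do have bounding boxes, then give each box-less category the average change of its nearest detector-trained neighbours. All function and variable names are ours, and the nearest-neighbour rule is an assumption about the method, not a statement of the paper's exact procedure.

    import numpy as np

    def adapt_classifiers_to_detectors(W_clf, W_det, have_bbox, k=5):
        # W_clf: (C, D) classifier weights for all C categories.
        # W_det: (C, D) detector weights, meaningful only where have_bbox is True.
        # have_bbox: boolean mask of categories fine-tuned with box annotations.
        W_out = W_det.copy()
        # The learned "difference between the two tasks" on supervised categories.
        delta = W_det[have_bbox] - W_clf[have_bbox]
        src = W_clf[have_bbox]
        for c in np.where(~have_bbox)[0]:
            # Nearest detector-trained categories in classifier-weight space (assumption).
            nn = np.argsort(np.linalg.norm(src - W_clf[c], axis=1))[:k]
            # Transfer the average classification-to-detection weight change.
            W_out[c] = W_clf[c] + delta[nn].mean(axis=0)
        return W_out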
Testing based on the RELAY model of error detection
RELAY, a model for error detection, defines revealing conditions that guarantee that a fault originates an error during execution and that the error transfers through computations and data flow until it is revealed. This model of error detection provides a fault-based criterion for test data selection. The model is applied by choosing a fault classification, instantiating the conditions for the classes of faults, and applying them to the program being tested. Such an application guarantees the detection of errors caused by any fault of the chosen classes. As a formal model of error detection, RELAY provides the basis for an automated testing tool. This paper presents the concepts behind RELAY, describes why it is better than other fault-based testing criteria, and discusses how RELAY could be used as the foundation for a testing system.
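To make the two revealing conditions concrete, here is a small Python example of our own (not from the paper) with a seeded operator fault. Origination requires inputs on which the faulty expression actually computes a wrong value; transfer requires that the surrounding computation not mask that error before it reaches the output.

    def correct(a, b, c):
        return (a + b) * c

    def faulty(a, b, c):        # seeded fault: '+' replaced by '-'
        return (a - b) * c

    # Origination: need b != 0, otherwise a+b == a-b and no error originates.
    # Transfer: need c != 0, otherwise multiplying by zero masks the error.
    revealing = (2, 3, 4)       # both conditions hold: 20 vs -4
    masking = (2, 3, 0)         # the error originates but is masked: 0 vs 0

    assert correct(*revealing) != faulty(*revealing)   # fault revealed
    assert correct(*masking) == faulty(*masking)       # fault hidden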
Performance evaluation over HW/SW co-design SoC memory transfers for a CNN accelerator
Many FPGA vendors have recently included embedded processors in their devices, like Xilinx with ARM Cortex-A cores, together with programmable logic cells. These devices are known as Programmable Systems on Chip (PSoC). Their ARM cores (embedded in the processing system, or PS) communicate with the programmable logic cells (PL) using ARM-standard AXI buses. In this paper we analyse the performance of exhaustive data transfers between PS and PL for a Xilinx Zynq FPGA in a realistic co-design scenario for a Convolutional Neural Network (CNN) accelerator, which processes, in dedicated hardware, a stream of visual information from a neuromorphic visual sensor for classification. On the PS side, a Linux operating system collects visual events from the neuromorphic sensor into a normalized frame, transfers these frames to the multi-layered CNN accelerator, and reads back the results, using an AXI-DMA bus in a per-layer way. As these kinds of accelerators try to process information as quickly as possible, data bandwidth becomes critical, and maintaining a well-balanced data throughput rate requires some care. We present and evaluate several data partitioning techniques to improve the balance between RX and TX transfers, and two different ways of managing the transfers: through a polling routine at the user level of the OS, and through a dedicated interrupt-based kernel-level driver. We demonstrate that for sufficiently long packets, the kernel-level driver solution achieves better timing when computing a CNN classification example. The main advantage of the kernel-level driver is that it is a safer solution and leaves task scheduling to the OS, which must also manage other important processes for our application, such as collecting frames from the sensor and normalizing them.

Ministerio de Economía y Competitividad TEC2016-77785-
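For reference, the user-level polling alternative evaluated above can be as simple as mapping the DMA controller registers into the process and spinning on a status word. The Python sketch below is a hedged illustration: the register offsets follow our reading of Xilinx's AXI DMA direct-register mode, and the base address is a placeholder from a typical Vivado address map, not a value taken from the paper.

    import mmap, os, struct

    DMA_BASE   = 0x40400000   # placeholder physical base address (assumption)
    MM2S_DMASR = 0x04         # TX (PS -> PL) channel status register
    S2MM_DMASR = 0x34         # RX (PL -> PS) channel status register
    IDLE_BIT   = 0x2          # bit 1 set: channel idle, transfer complete

    fd = os.open("/dev/mem", os.O_RDWR | os.O_SYNC)
    regs = mmap.mmap(fd, 0x1000, offset=DMA_BASE)

    def poll_idle(status_offset):
        # Busy-wait on the channel status word until the idle bit is set.
        while True:
            (status,) = struct.unpack_from("<I", regs, status_offset)
            if status & IDLE_BIT:
                return

    poll_idle(MM2S_DMASR)     # wait for the TX transfer of one layer to drain
    poll_idle(S2MM_DMASR)     # wait for the RX results of that layer

This is the kind of routine the interrupt-based kernel-level driver replaces: instead of burning a CPU core in the loop, the driver sleeps until the DMA completion interrupt fires, freeing the OS to schedule the other tasks the abstract mentions.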
Questions related to Bitcoin and other Informational Money
A collection of questions about Bitcoin and its hypothetical relatives
Bitguilder and Bitpenny is formulated. These questions concern technical issues
about protocols, security issues, issues about the formalizations of
informational monies in various contexts, and issues about forms of use and
misuse. Some questions are formulated in the more general setting of
informational monies and near-monies.
We also formulate questions about legal, psychological, and ethical aspects
of informational money. Finally we formulate a number of questions concerning
the economical merits of and outlooks for Bitcoin.Comment: 31 pages. In v2 the section on patterns for use and misuse has been
improved and expanded with so-called contaminations. Other small improvements
were made and 13 additional references have been include
Efficient DMA transfers management on embedded Linux PSoC for Deep-Learning gestures recognition: Using Dynamic Vision Sensor and NullHop one-layer CNN accelerator to play RoShamBo
This demonstration shows a Dynamic Vision Sensor able to capture visual motion at a speed equivalent to a high-speed camera (20k fps). The collected visual information is presented as a normalized histogram to a CNN accelerator hardware, called NullHop, that is able to process a pre-trained CNN to play RoShamBo against a human. The CNN designed for this purpose consists of 5 convolutional layers and a fully connected layer. The latency for processing one histogram is 8 ms. NullHop is deployed on the FPGA fabric of a PSoC from Xilinx, the Zynq 7100, which is based on a dual-core ARM computer and a Kintex-7 with 444K logic cells, integrated in the same chip. The ARM computer runs Linux, and a specific C++ controller runs the whole demo. This controller runs in user space in order to extract the maximum throughput, thanks to an efficient use of the AXI-Stream based on DMA transfers. The short delay needed to process one visual histogram allows us to average several consecutive classification outputs, which provides the best estimation of the symbol that the user presents to the visual sensor. This output is then mapped to present the winning symbol within the 60 ms latency that the brain considers acceptable before suspecting a trick.

Ministerio de Economía y Competitividad TEC2016-77785-
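The abstract does not say how the consecutive outputs are combined; a simple majority vote over a short sliding window is one plausible mechanism. In the hedged Python sketch below, the window size is our own choice, picked so that seven histograms at 8 ms each stay inside the 60 ms budget.

    from collections import Counter, deque

    WINDOW = 7                  # 7 x 8 ms = 56 ms, within the 60 ms budget (assumption)
    recent = deque(maxlen=WINDOW)

    def smooth(symbol):
        # Fold one per-histogram CNN output into a majority-vote estimate.
        recent.append(symbol)
        return Counter(recent).most_common(1)[0][0]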
The embeddedness of organizational performance: multiple membership multiple classification models for the analysis of multilevel networks
We present a Multiple Membership Multiple Classification (MMMC) model for analysing variation in the performance of organizational sub-units embedded in a multilevel network. The model postulates that the performance of organizational sub-units varies across network levels defined in terms of: (i) direct relations between organizational sub-units; (ii) relations between organizations containing the sub-units, and (iii) cross-level relations between sub-units and organizations. We demonstrate the empirical merits of the model in an analysis of inter-hospital patient mobility within a regional community of health care organizations. In the empirical case study we develop, organizational sub-units are departments of emergency medicine (EDs) located within hospitals (organizations). Networks within and across levels are delineated in terms of patient transfer relations between EDs (lower-level, emergency transfers), hospitals (higher-level, elective transfers), and between EDs and hospitals (cross-level, non-emergency transfers). Our main analytical objective is to examine the association of these interdependent and partially nested levels of action with variation in waiting time among EDs, one of the most commonly adopted and accepted measures of ED performance. We find evidence that variation in ED waiting time is associated with various components of the multilevel network in which the EDs are embedded. Before allowing for various characteristics of EDs and the hospitals in which they are located, we find, for the null models, that most of the network variation is at the hospital level. After adding these characteristics to the model, we find that hospital capacity and ED uncertainty are significantly associated with ED waiting time. We also find that the overall variation in ED waiting time is reduced to less than half of its estimated value from the null models, and that a greater share of the residual network variation for these models is at the ED level and cross level, rather than the hospital level. This suggests that the covariates explain some of the network variation, and shift the relative share of residual variation away from hospital networks. We discuss further extensions to the model for more general analyses of multilevel network dependencies in variables of interest for the lower level nodes of these social structures.
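As a sketch of the model structure (our generic reconstruction from the abstract, not the paper's exact specification), an MMMC model for the waiting time y_i of ED i along these lines can be written in LaTeX as

    y_i = \mathbf{x}_i^{\top}\boldsymbol{\beta}
        + \sum_{j \in \mathrm{ed}(i)} w^{(1)}_{ij}\, u^{(1)}_j
        + \sum_{k \in \mathrm{hosp}(i)} w^{(2)}_{ik}\, u^{(2)}_k
        + \sum_{l \in \mathrm{cross}(i)} w^{(3)}_{il}\, u^{(3)}_l
        + e_i,
    \qquad u^{(m)}_{\cdot} \sim N(0, \sigma^2_m), \quad e_i \sim N(0, \sigma^2_e),

where the three sums run over the ED-level, hospital-level, and cross-level network neighbourhoods, the membership weights in each classification sum to one, and the variance components \sigma^2_m are what the abstract compares across the null and covariate-adjusted models.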
An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics
Near-sensor data analytics is a promising direction for IoT endpoints, as it
minimizes energy spent on communication and reduces network load - but it also
poses security concerns, as valuable data is stored or sent over the network at
various stages of the analytics pipeline. Using encryption to protect sensitive
data at the boundary of the on-chip analytics engine is a way to address data
security issues. To cope with the combined workload of analytics and encryption
in a tight power envelope, we propose Fulmine, a System-on-Chip based on a
tightly-coupled multi-core cluster augmented with specialized blocks for
compute-intensive data processing and encryption functions, supporting software
programmability for regular computing tasks. The Fulmine SoC, fabricated in
65nm technology, consumes less than 20mW on average at 0.8V, achieving an
efficiency of up to 70pJ/B in encryption, 50pJ/px in convolution, or up to
25MIPS/mW in software. As a strong argument for real-life flexible application
of our platform, we show experimental results for three secure analytics use
cases: secure autonomous aerial surveillance with a state-of-the-art deep CNN
consuming 3.16pJ per equivalent RISC op; local CNN-based face detection with
secured remote recognition in 5.74pJ/op; and seizure detection with encrypted
data collection from EEG within 12.7pJ/op.

Comment: 15 pages, 12 figures, accepted for publication in the IEEE Transactions on Circuits and Systems I: Regular Papers.
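As a back-of-envelope check on these figures (our arithmetic, assuming the full average power budget is spent on the CNN workload), the quoted numbers imply a throughput on the order of

    \frac{20\ \text{mW}}{3.16\ \text{pJ/op}} \;\approx\; 6.3 \times 10^{9}\ \text{equivalent RISC op/s}.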