117,455 research outputs found

    LSDA: Large Scale Detection Through Adaptation

    Full text link
    A major challenge in scaling object detection is the difficulty of obtaining labeled images for large numbers of categories. Recently, deep convolutional neural networks (CNNs) have emerged as clear winners on object classification benchmarks, in part due to training with 1.2M+ labeled classification images. Unfortunately, only a small fraction of those labels are available for the detection task. It is much cheaper and easier to collect large quantities of image-level labels from search engines than it is to collect detection data and label it with precise bounding boxes. In this paper, we propose Large Scale Detection through Adaptation (LSDA), an algorithm which learns the difference between the two tasks and transfers this knowledge to classifiers for categories without bounding box annotated data, turning them into detectors. Our method has the potential to enable detection for the tens of thousands of categories that lack bounding box annotations, yet have plenty of classification data. Evaluation on the ImageNet LSVRC-2013 detection challenge demonstrates the efficacy of our approach. This algorithm enables us to produce a >7.6K detector by using available classification data from leaf nodes in the ImageNet tree. We additionally demonstrate how to modify our architecture to produce a fast detector (running at 2fps for the 7.6K detector). Models and software are available a

    Performance evaluation over HW/SW co-design SoC memory transfers for a CNN accelerator

    Get PDF
    Many FPGAs vendors have recently included embedded processors in their devices, like Xilinx with ARM-Cortex A cores, together with programmable logic cells. These devices are known as Programmable System on Chip (PSoC). Their ARM cores (embedded in the processing system or PS) communicates with the programmable logic cells (PL) using ARM-standard AXI buses. In this paper we analyses the performance of exhaustive data transfers between PS and PL for a Xilinx Zynq FPGA in a co-design real scenario for Convolutional Neural Networks (CNN) accelerator, which processes, in dedicated hardware, a stream of visual information from a neuromorphic visual sensor for classification. In the PS side, a Linux operating system is running, which recollects visual events from the neuromorphic sensor into a normalized frame, and then it transfers these frames to the accelerator of multi-layered CNNs, and read results, using an AXI-DMA bus in a per-layer way. As these kind of accelerators try to process information as quick as possible, data bandwidth becomes critical and maintaining a good balanced data throughput rate requires some considerations. We present and evaluate several data partitioning techniques to improve the balance between RX and TX transfer and two different ways of transfers management: through a polling routine at the userlevel of the OS, and through a dedicated interrupt-based kernellevel driver. We demonstrate that for longer enough packets, the kernel-level driver solution gets better timing in computing a CNN classification example. Main advantage of using kernel-level driver is to have safer solutions and to have tasks scheduling in the OS to manage other important processes for our application, like frames collection from sensors and their normalization.Ministerio de EconomĂ­a y Competitividad TEC2016-77785-

    Questions related to Bitcoin and other Informational Money

    Get PDF
    A collection of questions about Bitcoin and its hypothetical relatives Bitguilder and Bitpenny is formulated. These questions concern technical issues about protocols, security issues, issues about the formalizations of informational monies in various contexts, and issues about forms of use and misuse. Some questions are formulated in the more general setting of informational monies and near-monies. We also formulate questions about legal, psychological, and ethical aspects of informational money. Finally we formulate a number of questions concerning the economical merits of and outlooks for Bitcoin.Comment: 31 pages. In v2 the section on patterns for use and misuse has been improved and expanded with so-called contaminations. Other small improvements were made and 13 additional references have been include

    Efficient DMA transfers management on embedded Linux PSoC for Deep-Learning gestures recognition: Using Dynamic Vision Sensor and NullHop one-layer CNN accelerator to play RoShamBo

    Get PDF
    This demonstration shows a Dynamic Vision Sensor able to capture visual motion at a speed equivalent to a highspeed camera (20k fps). The collected visual information is presented as normalized histogram to a CNN accelerator hardware, called NullHop, that is able to process a pre-trained CNN to play Roshambo against a human. The CNN designed for this purpose consist of 5 convolutional layers and a fully connected layer. The latency for processing one histogram is 8ms. NullHop is deployed on the FPGA fabric of a PSoC from Xilinx, the Zynq 7100, which is based on a dual-core ARM computer and a Kintex-7 with 444K logic cells, integrated in the same chip. ARM computer is running Linux and a specific C++ controller is running the whole demo. This controller runs at user space in order to extract the maximum throughput thanks to an efficient use of the AXIStream, based of DMA transfers. This short delay needed to process one visual histogram, allows us to average several consecutive classification outputs. Therefore, it provides the best estimation of the symbol that the user presents to the visual sensor. This output is then mapped to present the winner symbol within the 60ms latency that the brain considers acceptable before thinking that there is a trick.Ministerio de EconomĂ­a y Competitividad TEC2016-77785-

    The embeddedness of organizational performance: multiple membership multiple classification models for the analysis of multilevel networks

    Get PDF
    We present a Multiple Membership Multiple Classification (MMMC) model for analysing variation in the performance of organizational sub-units embedded in a multilevel network. The model postulates that the performance of organizational sub-units varies across network levels defined in terms of: (i) direct relations between organizational sub-units; (ii) relations between organizations containing the sub-units, and (iii) cross-level relations between sub-units and organizations. We demonstrate the empirical mer- its of the model in an analysis of inter-hospital patient mobility within a regional community of health care organizations. In the empirical case study we develop, organizational sub-units are departments of emergency medicine (EDs) located within hospitals (organizations). Networks within and across levels are delineated in terms of patient transfer relations between EDs (lower-level, emergency transfers), hospitals (higher-level, elective transfers), and between EDs and hospitals (cross-level, non-emergency transfers). Our main analytical objective is to examine the association of these interdependent and par- tially nested levels of action with variation in waiting time among EDs – one of the most commonly adopted and accepted measures of ED performance. We find evidence that variation in ED waiting time is associated with various components of the multilevel network in which the EDs are embedded. Before allowing for various characteristics of EDs and the hospitals in which they are located, we find, for the null models, that most of the network variation is at the hospital level. After adding these characteris- tics to the model, we find that hospital capacity and ED uncertainty are significantly associated with ED waiting time. We also find that the overall variation in ED waiting time is reduced to less than a half of its estimated value from the null models, and that a greater share of the residual network variation for these models is at the ED level and cross level, rather than the hospital level. This suggests that the covari- ates explain some of the network variation, and shift the relative share of residual variation away from hospital networks. We discuss further extensions to the model for more general analyses of multilevel network dependencies in variables of interest for the lower level nodes of these social structures

    An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics

    Full text link
    Near-sensor data analytics is a promising direction for IoT endpoints, as it minimizes energy spent on communication and reduces network load - but it also poses security concerns, as valuable data is stored or sent over the network at various stages of the analytics pipeline. Using encryption to protect sensitive data at the boundary of the on-chip analytics engine is a way to address data security issues. To cope with the combined workload of analytics and encryption in a tight power envelope, we propose Fulmine, a System-on-Chip based on a tightly-coupled multi-core cluster augmented with specialized blocks for compute-intensive data processing and encryption functions, supporting software programmability for regular computing tasks. The Fulmine SoC, fabricated in 65nm technology, consumes less than 20mW on average at 0.8V achieving an efficiency of up to 70pJ/B in encryption, 50pJ/px in convolution, or up to 25MIPS/mW in software. As a strong argument for real-life flexible application of our platform, we show experimental results for three secure analytics use cases: secure autonomous aerial surveillance with a state-of-the-art deep CNN consuming 3.16pJ per equivalent RISC op; local CNN-based face detection with secured remote recognition in 5.74pJ/op; and seizure detection with encrypted data collection from EEG within 12.7pJ/op.Comment: 15 pages, 12 figures, accepted for publication to the IEEE Transactions on Circuits and Systems - I: Regular Paper
    • 

    corecore