17,586 research outputs found
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world
A sub-mW IoT-endnode for always-on visual monitoring and smart triggering
This work presents a fully-programmable Internet of Things (IoT) visual
sensing node that targets sub-mW power consumption in always-on monitoring
scenarios. The system features a spatial-contrast binary
pixel imager with focal-plane processing. The sensor, when working at its
lowest power mode ( at 10 fps), provides as output the number of
changed pixels. Based on this information, a dedicated camera interface,
implemented on a low-power FPGA, wakes up an ultra-low-power parallel
processing unit to extract context-aware visual information. We evaluate the
smart sensor on three always-on visual triggering application scenarios.
Triggering accuracy comparable to RGB image sensors is achieved at nominal
lighting conditions, while consuming an average power between and
, depending on context activity. The digital sub-system is extremely
flexible, thanks to a fully-programmable digital signal processing engine, but
still achieves 19x lower power consumption compared to MCU-based cameras with
significantly lower on-board computing capabilities.Comment: 11 pages, 9 figures, submitteted to IEEE IoT Journa
Design of multimedia processor based on metric computation
Media-processing applications, such as signal processing, 2D and 3D graphics
rendering, and image compression, are the dominant workloads in many embedded
systems today. The real-time constraints of those media applications have
taxing demands on today's processor performances with low cost, low power and
reduced design delay. To satisfy those challenges, a fast and efficient
strategy consists in upgrading a low cost general purpose processor core. This
approach is based on the personalization of a general RISC processor core
according the target multimedia application requirements. Thus, if the extra
cost is justified, the general purpose processor GPP core can be enforced with
instruction level coprocessors, coarse grain dedicated hardware, ad hoc
memories or new GPP cores. In this way the final design solution is tailored to
the application requirements. The proposed approach is based on three main
steps: the first one is the analysis of the targeted application using
efficient metrics. The second step is the selection of the appropriate
architecture template according to the first step results and recommendations.
The third step is the architecture generation. This approach is experimented
using various image and video algorithms showing its feasibility
A brief network analysis of Artificial Intelligence publication
In this paper, we present an illustration to the history of Artificial
Intelligence(AI) with a statistical analysis of publish since 1940. We
collected and mined through the IEEE publish data base to analysis the
geological and chronological variance of the activeness of research in AI. The
connections between different institutes are showed. The result shows that the
leading community of AI research are mainly in the USA, China, the Europe and
Japan. The key institutes, authors and the research hotspots are revealed. It
is found that the research institutes in the fields like Data Mining, Computer
Vision, Pattern Recognition and some other fields of Machine Learning are quite
consistent, implying a strong interaction between the community of each field.
It is also showed that the research of Electronic Engineering and Industrial or
Commercial applications are very active in California. Japan is also publishing
a lot of papers in robotics. Due to the limitation of data source, the result
might be overly influenced by the number of published articles, which is to our
best improved by applying network keynode analysis on the research community
instead of merely count the number of publish.Comment: 18 pages, 7 figure
A Large-scale Distributed Video Parsing and Evaluation Platform
Visual surveillance systems have become one of the largest data sources of
Big Visual Data in real world. However, existing systems for video analysis
still lack the ability to handle the problems of scalability, expansibility and
error-prone, though great advances have been achieved in a number of visual
recognition tasks and surveillance applications, e.g., pedestrian/vehicle
detection, people/vehicle counting. Moreover, few algorithms explore the
specific values/characteristics in large-scale surveillance videos. To address
these problems in large-scale video analysis, we develop a scalable video
parsing and evaluation platform through combining some advanced techniques for
Big Data processing, including Spark Streaming, Kafka and Hadoop Distributed
Filesystem (HDFS). Also, a Web User Interface is designed in the system, to
collect users' degrees of satisfaction on the recognition tasks so as to
evaluate the performance of the whole system. Furthermore, the highly
extensible platform running on the long-term surveillance videos makes it
possible to develop more intelligent incremental algorithms to enhance the
performance of various visual recognition tasks.Comment: Accepted by Chinese Conference on Intelligent Visual Surveillance
201
- …