A Survey of Deep Learning-Based Object Detection
Object detection is one of the most important and challenging branches of
computer vision. It aims to locate instances of semantic objects of a certain
class and has been widely applied in people's lives, for example in security
monitoring and autonomous driving. With the rapid development of
deep learning networks for detection tasks, the performance of object detectors
has been greatly improved. To give a thorough understanding of the main
developments in the object detection pipeline, in this survey we first
analyze the methods of existing typical detection models and describe the
benchmark datasets. We then provide, as the main part of the survey, a comprehensive
overview of a variety of object detection methods in a systematic manner,
covering the one-stage and two-stage detectors. Moreover, we list the
traditional and new applications. Some representative branches of object
detection are analyzed as well. Finally, we discuss architectures that
exploit these object detection methods to build an effective and efficient
system, and point out a set of development trends to better follow the
state-of-the-art algorithms and further research.
Comment: 30 pages, 12 figures
Unveiling the frontiers of deep learning: innovations shaping diverse domains
Deep learning (DL) enables the development of computer models that are
capable of learning, visualizing, optimizing, refining, and predicting data. In
recent years, DL has been applied in a range of fields, including audio-visual
data processing, agriculture, transportation prediction, natural language,
biomedicine, disaster management, bioinformatics, drug design, genomics, face
recognition, and ecology. To explore the current state of deep learning, it is
necessary to investigate the latest developments and applications of deep
learning in these disciplines. However, the literature is lacking in exploring
the applications of deep learning in all potential sectors. This paper thus
extensively investigates the potential applications of deep learning across all
major fields of study as well as the associated benefits and challenges. As
evidenced in the literature, DL is accurate in prediction and analysis, which
makes it a powerful computational tool, and it can articulate itself and
optimize, making it effective at processing data with no prior training. Even
so, deep learning requires massive amounts of data for effective analysis and
processing. To handle the challenge of compiling huge amounts of medical,
scientific, healthcare, and environmental data for use in deep learning, gated
architectures like LSTMs and GRUs can be utilized. For multimodal learning,
shared neurons in the neural network for all activities and specialized neurons
for particular tasks are necessary.
Comment: 64 pages, 3 figures, 3 tables
Affective Image Content Analysis: Two Decades Review and New Perspectives
Images can convey rich semantics and induce various emotions in viewers.
Recently, with the rapid advancement of emotional intelligence and the
explosive growth of visual data, extensive research efforts have been dedicated
to affective image content analysis (AICA). In this survey, we
comprehensively review the development of AICA over the past two decades,
focusing especially on state-of-the-art methods with respect to three main
challenges -- the affective gap, perception subjectivity, and label noise and
absence. We begin with an introduction to the key emotion representation models
that have been widely employed in AICA and a description of the datasets available
for evaluation, with a quantitative comparison of label noise and
dataset bias. We then summarize and compare the representative approaches on
(1) emotion feature extraction, including both handcrafted and deep features,
(2) learning methods on dominant emotion recognition, personalized emotion
prediction, emotion distribution learning, and learning from noisy data or few
labels, and (3) AICA based applications. Finally, we discuss some challenges
and promising research directions in the future, such as image content and
context understanding, group emotion clustering, and viewer-image interaction.
Comment: Accepted by IEEE TPAMI
Fuzzy Logic in Surveillance Big Video Data Analysis: Comprehensive Review, Challenges, and Research Directions
CCTV cameras installed for continuous surveillance generate enormous amounts of data daily, giving rise to the term “Big Video Data” (BVD). The active practice of BVD includes intelligent surveillance and activity recognition, among other challenging tasks. To efficiently address these tasks, the computer vision research community has provided monitoring systems, activity recognition methods, and many other computationally complex solutions for the purposeful usage of BVD. Unfortunately, the limited capabilities of these methods, their higher computational complexity, and their stringent installation requirements hinder practical implementation in real-world scenarios, which still demand human operators sitting in front of cameras to monitor activities or make actionable decisions based on BVD. Human-like logic, known as fuzzy logic, has been increasingly employed in various data science applications such as control systems, image processing, decision making, routing, and advanced safety-critical systems. This is due to its ability to handle various sources of real-world domain and data uncertainty, generating easily adaptable and explainable data-based models. Fuzzy logic can be effectively used for surveillance as a complement to very large artificial intelligence models and tiresome training procedures. In this paper, we draw researchers’ attention towards the usage of fuzzy logic for surveillance in the context of BVD. We carry out a comprehensive literature survey of methods for vision sensory data analytics that resort to fuzzy logic concepts. Our overview highlights the advantages, downsides, and challenges in existing video analysis methods based on fuzzy logic for surveillance applications. We enumerate and discuss the datasets used by these methods, and finally provide an outlook towards future research directions derived from our critical assessment of the efforts invested so far in this exciting field.
Bootstrap–CURE: A novel clustering approach for sensor data: an application to 3D printing industry
The agenda of Industry 4.0 highlights smart manufacturing by making machines smart enough to make data-driven decisions. Large-scale 3D printers, being one of the important pillars in Industry 4.0, are equipped with smart sensors to continuously monitor print processes and make automated decisions. One of the biggest challenges in decision autonomy is to consume data quickly along the process and extract knowledge from the printer, suitable for improving the printing process. This paper presents the innovative unsupervised learning approach, bootstrap–CURE, to decode the sensor patterns and operation modes of 3D printers by analyzing multivariate sensor data. An automatic technique to detect the suitable number of clusters using the dendrogram is developed. The proposed methodology is scalable and significantly reduces computational cost as compared to classical CURE. A distinct combination of the 3D printer’s sensors is found, and its impact on the printing process is also discussed. A real application is presented to illustrate the performance and usefulness of the proposal. In addition, a new state of the art for sensor data analysis is presented. This work was supported in part by KEMLG-at-IDEAI (UPC) under Grant SGR-2017-574 from the Catalan government.
Peer reviewed. Postprint (published version)