294 research outputs found

Volume 24 Number 2

Author: EIU College of Education
Publication venue: The Keep
Publication date: 01/10/1995

https://thekeep.eiu.edu/eej/1047/thumbnail.jp

Eastern Illinois University

Objects for spatio-temporal activity recognition in videos

Author: Mettes P.S.M.
Publication date: 01/01/2017

International Migration, Integration and Social Cohesion online publications

Data Science and Knowledge Discovery

Publication venue: 'MDPI AG'
Publication date: 21/06/2022

Data Science (DS) is gaining significant importance in the decision process due to a mix of various areas, including Computer Science, Machine Learning, Math and Statistics, domain/business knowledge, software development, and traditional research. In the business field, DS's application allows using scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data to support the decision process. After collecting the data, it is crucial to discover the knowledge. In this step, Knowledge Discovery (KD) tasks are used to create knowledge from structured and unstructured sources (e.g., text, data, and images). The output needs to be in a readable and interpretable format. It must represent knowledge in a manner that facilitates inferencing. KD is applied in several areas, such as education, health, accounting, energy, and public administration. This book includes fourteen excellent articles which discuss this trending topic and present innovative solutions to show the importance of Data Science and Knowledge Discovery to researchers, managers, industry, society, and other communities. The chapters address several topics like Data mining, Deep Learning, Data Visualization and Analytics, Semantic data, Geospatial and Spatio-Temporal Data, Data Augmentation and Text Mining

Directory of Open Access Books (DOAB)

A Novel Approach on Visual Question Answering by Parameter Prediction using Faster Region Based Convolutional Neural Network

Author: Dey Anirban, Jha Sudan, Kumar Raghvendra, Kumar-Solanki Vijender
Publication venue: 'Universidad Internacional de La Rioja'
Publication date: 24/02/2022

Visual Question Answering (VQA) is a stimulating task in the fields of Natural Language Processing (NLP) and Computer Vision (CV). In this task, a machine finds an answer to a natural language question about an image. Questions can be open-ended or multiple choice. VQA datasets contain three main components: questions, images, and answers. Researchers have addressed the VQA problem with deep learning architectures that combine two networks, a Convolutional Neural Network (CNN) for visual (image) representation and a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM) for textual (question) representation, and train the combined network end to end to generate the answer. Such models can answer common, simple questions that relate directly to the image's content. However, different types of questions require different levels of understanding to produce correct answers. To solve this problem, we use Faster Region-based CNN (Faster R-CNN) to extract image features, with an extra fully connected layer whose weights are dynamically obtained by LSTM cells according to the question. We claim in this paper that a single R-CNN architecture can solve the problems related to VQA by modifying weights in the parameter prediction layer. We trained the network end to end by Stochastic Gradient Descent (SGD) using pretrained Faster R-CNN and LSTM and tested it on benchmark VQA datasets.

Re-UNIR

Natural Language Description of Images and Videos

Author: Shetty Rakshith
Publication date: 26/09/2016

Understanding visual media, i.e. images and videos, has been a cornerstone topic in computer vision research for a long time. Recently, a new task within the purview of this research area, that of automatically captioning images and videos, has garnered widespread interest. The task involves generating a short natural language description of an image or a video. This thesis studies the automatic visual captioning problem in its entirety. A baseline visual captioning pipeline is examined, including its two constituent blocks, namely visual feature extraction and language modeling. We then discuss the challenges involved and the methods available to evaluate a visual captioning system. Building on this baseline model, several enhancements are proposed to improve the performance of both the visual feature extraction and the language modeling. Deep convolutional neural network based image features used in the baseline model are augmented with explicit object and scene detection features. In the case of videos, a combination of action recognition and static frame-level features is used. The long short-term memory network based language model used in the baseline is extended by the introduction of an additional input channel and residual connections. Finally, an efficient ensembling technique based on a caption evaluator network is presented. Results from extensive experiments conducted to evaluate each of the above-mentioned enhancements are reported. The image and video captioning architectures proposed in this thesis achieve state-of-the-art performance on the corresponding tasks. To support these claims, results from two video captioning challenges organized over the last year are reported, both of which were won by the models presented in the thesis. We also quantitatively analyze the automatic captions generated and identify several shortcomings of the current system. After having identified the deficiencies, we briefly look at a few interesting problems which could take the automatic visual captioning research forward.

Aaltodoc Publication Archive

Deep Learning for Logo Detection: A Survey

Author: Hou Qiang, Hou Sujuan, Jiang Shuqiang, Li Jiacheng, Min Weiqing, Zhao Yanna, Zheng Yuanjie
Publication date: 09/10/2022

As logos are increasingly created, logo detection has gradually become a research hotspot across many domains and tasks. Recent advances in this area are dominated by deep learning-based solutions, where many datasets, learning strategies, network architectures, etc. have been employed. This paper reviews the advances in applying deep learning techniques to logo detection. Firstly, we give a comprehensive account of public datasets designed to facilitate performance evaluation of logo detection algorithms, which tend to be more diverse, more challenging, and more reflective of real life. Next, we perform an in-depth analysis of the existing logo detection strategies and the strengths and weaknesses of each learning strategy. Subsequently, we summarize the applications of logo detection in various fields, from intelligent transportation and brand monitoring to copyright and trademark compliance. Finally, we analyze the potential challenges and present the future directions for the development of logo detection to complete this survey.

arXiv.org e-Print Archive

Handbook of Digital Face Manipulation and Detection

Publication venue: 'Springer Science and Business Media LLC'
Publication date: 14/02/2022

This open access book provides the first comprehensive collection of studies dealing with the hot topic of digital face manipulation such as DeepFakes, Face Morphing, or Reenactment. It combines the research fields of biometrics and media forensics including contributions from academia and industry. Appealing to a broad readership, introductory chapters provide a comprehensive overview of the topic, which address readers wishing to gain a brief overview of the state-of-the-art. Subsequent chapters, which delve deeper into various research challenges, are oriented towards advanced readers. Moreover, the book provides a good starting point for young researchers as well as a reference guide pointing at further literature. Hence, the primary readership is academic institutions and industry currently involved in digital face manipulation and detection. The book could easily be used as a recommended text for courses in image processing, machine learning, media forensics, biometrics, and the general security area

Directory of Open Access Books (DOAB)

The BG News February 22, 2010

Author: Bowling Green State University
Publication venue: ScholarWorks@BGSU
Publication date: 22/02/2010

The BGSU campus student newspaper February 22, 2010. Volume 100 - Issue 104

https://scholarworks.bgsu.edu/bg-news/9207/thumbnail.jp

Bowling Green State University: ScholarWorks@BGSU

Person Re-identification: Past, Present and Future

Author: Hauptmann AG, Yang Y, Zheng L
Publication date: 01/01/2016

Person re-identification (re-ID) has become increasingly popular in the community due to its application and research significance. It aims at spotting a person of interest in other cameras. In the early days, hand-crafted algorithms and small-scale evaluation were predominantly reported. Recent years have witnessed the emergence of large-scale datasets and deep learning systems which make use of large data volumes. Considering different tasks, we classify most current re-ID methods into two classes, i.e., image-based and video-based; in both tasks, hand-crafted and deep learning systems will be reviewed. Moreover, two new re-ID tasks which are much closer to real-world applications are described and discussed, i.e., end-to-end re-ID and fast re-ID in very large galleries. This paper: 1) introduces the history of person re-ID and its relationship with image classification and instance retrieval; 2) surveys a broad selection of the hand-crafted systems and the large-scale methods in both image- and video-based re-ID; 3) describes critical future directions in end-to-end re-ID and fast retrieval in large galleries; and 4) finally, briefly discusses some important yet under-developed issues.

arXiv.org e-Print Archive

OPUS - University of Technology Sydney