146,040 research outputs found
Guest editors' introduction to the special section on learning with Shared information for computer vision and multimedia analysis
The twelve papers in this special section focus on learning systems with shared information for computer vision and multimedia communication analysis. In the real world, a realistic setting for computer vision or multimedia recognition problems is that we have some classes containing lots of training data and many classes containing a small amount of training data. Therefore, how to use frequent classes to help learning rare classes for which it is harder to collect the training data is an open question. Learning with shared information is an emerging topic in machine learning, computer vision and multimedia analysis. There are different levels of components that can be shared during concept modeling and machine learning stages, such as sharing generic object parts, sharing attributes, sharing transformations, sharing regularization parameters and sharing training examples, etc. Regarding the specific methods, multi-task learning, transfer learning and deep learning can be seen as using different strategies to share information. These learning with shared information methods are very effective in solving real-world large-scale problems
TensorLayer: A Versatile Library for Efficient Deep Learning Development
Deep learning has enabled major advances in the fields of computer vision,
natural language processing, and multimedia among many others. Developing a
deep learning system is arduous and complex, as it involves constructing neural
network architectures, managing training/trained models, tuning optimization
process, preprocessing and organizing data, etc. TensorLayer is a versatile
Python library that aims at helping researchers and engineers efficiently
develop deep learning systems. It offers rich abstractions for neural networks,
model and data management, and parallel workflow mechanism. While boosting
efficiency, TensorLayer maintains both performance and scalability. TensorLayer
was released in September 2016 on GitHub, and has helped people from academia
and industry develop real-world applications of deep learning.Comment: ACM Multimedia 201
Reliable camera motion estimation from compressed MPEG videos using machine learning approach
As an important feature in characterizing video content, camera motion has been widely applied in various multimedia and computer vision applications. A novel method for fast and reliable estimation of camera motion from MPEG videos is proposed, using support vector machine for estimation in a regression model trained on a synthesized sequence. Experiments conducted on real sequences show that the proposed method yields much improved results in estimating camera motions while the difficulty in selecting valid macroblocks and motion vectors is skipped
Do GANs leave artificial fingerprints?
In the last few years, generative adversarial networks (GAN) have shown
tremendous potential for a number of applications in computer vision and
related fields. With the current pace of progress, it is a sure bet they will
soon be able to generate high-quality images and videos, virtually
indistinguishable from real ones. Unfortunately, realistic GAN-generated images
pose serious threats to security, to begin with a possible flood of fake
multimedia, and multimedia forensic countermeasures are in urgent need. In this
work, we show that each GAN leaves its specific fingerprint in the images it
generates, just like real-world cameras mark acquired images with traces of
their photo-response non-uniformity pattern. Source identification experiments
with several popular GANs show such fingerprints to represent a precious asset
for forensic analyses
A Survey of Multimedia Technologies and Robust Algorithms
Multimedia technologies are now more practical and deployable in real life,
and the algorithms are widely used in various researching areas such as deep
learning, signal processing, haptics, computer vision, robotics, and medical
multimedia processing. This survey provides an overview of multimedia
technologies and robust algorithms in multimedia data processing, medical
multimedia processing, human facial expression tracking and pose recognition,
and multimedia in education and training. This survey will also analyze and
propose a future research direction based on the overview of current robust
algorithms and multimedia technologies. We want to thank the research and
previous work done by the Multimedia Research Centre (MRC), the University of
Alberta, which is the inspiration and starting point for future research.Comment: arXiv admin note: text overlap with arXiv:2010.1296
Real time web-based toolbox for computer vision
The last few years have been strongly marked by the presence of multimedia data (images and videos) in our everyday lives. These data are characterized by a fast frequency of creation and sharing since images and videos can come from different devices such as cameras, smartphones or drones. The latter are generally used to illustrate objects in different situations (airports, hospitals, public areas, sport games, etc.). As result, image and video processing algorithms have got increasing importance for several computer vision applications such as motion tracking, event detection and recognition, multimedia indexation and medical computer-aided diagnosis methods. In this paper, we propose a real time cloud-based toolbox (platform) for computer vision applications. This platform integrates a toolbox of image and video processing algorithms that can be run in real time and in a secure way. The related libraries and hardware drivers are automatically integrated and configured in order to offer to users an access to the different algorithms without the need to download, install and configure software or hardware. Moreover, the platform offers the access to the integrated applications from multiple users thanks to the use of Docker (Merkel, 2014) containers and images. Experimentations were conducted within three kinds of algorithms: 1. image processing toolbox. 2. Video processing toolbox. 3. 3D medical methods such as computer-aided diagnosis for scoliosis and osteoporosis. These experimentations demonstrated the interest of our platform for sharing our scientific contributions related to computer vision domain. The scientific researchers could be able to develop and share easily their applications fastly and in a safe way
- …