Guest editors' introduction to the special section on learning with shared information for computer vision and multimedia analysis

Authors: Darrell Trevor, Lampert Christoph, Sebe Nico, Wu Ying, Yan Yan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

The twelve papers in this special section focus on learning systems with shared information for computer vision and multimedia communication analysis. A realistic setting for real-world computer vision or multimedia recognition problems is one in which a few classes have abundant training data while many classes have only a small amount. How to use frequent classes to help learn rare classes, for which training data is harder to collect, is therefore an open question. Learning with shared information is an emerging topic in machine learning, computer vision and multimedia analysis. Components can be shared at different levels during the concept modeling and machine learning stages, including generic object parts, attributes, transformations, regularization parameters and training examples. As for specific methods, multi-task learning, transfer learning and deep learning can be seen as different strategies for sharing information. These methods for learning with shared information are very effective in solving real-world large-scale problems.

IST Austria: PubRep (Institute of Science and Technology)

TensorLayer: A Versatile Library for Efficient Deep Learning Development

Authors: Dong Hao, Guo Yike, Liu Fangde, Mai Luo, Oehmichen Axel, Supratak Akara, Yu Simiao
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 03/08/2017

Deep learning has enabled major advances in the fields of computer vision, natural language processing, and multimedia, among many others. Developing a deep learning system is arduous and complex, as it involves constructing neural network architectures, managing training and trained models, tuning the optimization process, and preprocessing and organizing data. TensorLayer is a versatile Python library that aims to help researchers and engineers develop deep learning systems efficiently. It offers rich abstractions for neural networks, model and data management, and a parallel workflow mechanism. While boosting efficiency, TensorLayer maintains both performance and scalability. TensorLayer was released in September 2016 on GitHub, and has helped people from academia and industry develop real-world applications of deep learning. Comment: ACM Multimedia 201

arXiv.org e-Print Archive

Crossref

Reliable camera motion estimation from compressed MPEG videos using machine learning approach

Authors: Jiang Jianmin, Ren Jinchang, Sun Meijun, Wang Yubin, Wang Zheng
Publication venue: 'SPIE-Intl Soc Optical Eng'
Publication date: 25/07/2013

As an important feature in characterizing video content, camera motion has been widely applied in various multimedia and computer vision applications. A novel method for fast and reliable estimation of camera motion from MPEG videos is proposed, using a support vector machine for estimation in a regression model trained on a synthesized sequence. Experiments conducted on real sequences show that the proposed method yields much improved results in estimating camera motion, while avoiding the difficulty of selecting valid macroblocks and motion vectors.

University of Strathclyde Institutional Repository

Do GANs leave artificial fingerprints?

Authors: Gragnaniello Diego, Marra Francesco, Poggi Giovanni, Verdoliva Luisa
Publication date: 31/12/2018

In the last few years, generative adversarial networks (GANs) have shown tremendous potential for a number of applications in computer vision and related fields. With the current pace of progress, it is a sure bet they will soon be able to generate high-quality images and videos, virtually indistinguishable from real ones. Unfortunately, realistic GAN-generated images pose serious threats to security, beginning with a possible flood of fake multimedia, and multimedia forensic countermeasures are urgently needed. In this work, we show that each GAN leaves its specific fingerprint in the images it generates, just as real-world cameras mark acquired images with traces of their photo-response non-uniformity pattern. Source identification experiments with several popular GANs show that such fingerprints are a precious asset for forensic analyses.

arXiv.org e-Print Archive

Archivio della ricerca - Università degli studi di Napoli Federico II

Crossref

Archivio della Ricerca - Università di Salerno

A Survey of Multimedia Technologies and Robust Algorithms

Authors: Kuang Zijian, Tie Xinran
Publication date: 24/03/2021

Multimedia technologies are now more practical and deployable in real life, and the algorithms are widely used in various research areas such as deep learning, signal processing, haptics, computer vision, robotics, and medical multimedia processing. This survey provides an overview of multimedia technologies and robust algorithms in multimedia data processing, medical multimedia processing, human facial expression tracking and pose recognition, and multimedia in education and training. It also analyzes and proposes a future research direction based on this overview of current robust algorithms and multimedia technologies. We thank the Multimedia Research Centre (MRC) at the University of Alberta, whose research and previous work are the inspiration and starting point for future research. Comment: arXiv admin note: text overlap with arXiv:2010.1296

arXiv.org e-Print Archive

Hal-Diderot

Real time web-based toolbox for computer vision

Authors: Adoui Mohammed El, Belarbi Mohammed Amin, Larhmam Mohammed Amine, Lecron Fabian, Mahmoudi Sidi Ahmed
Publication venue: 'Universidade Catolica Portuguesa'
Publication date: 01/05/2018

The last few years have been strongly marked by the presence of multimedia data (images and videos) in our everyday lives. These data are created and shared at a high rate, since images and videos can come from different devices such as cameras, smartphones or drones. The latter are generally used to illustrate objects in different situations (airports, hospitals, public areas, sport games, etc.). As a result, image and video processing algorithms have gained increasing importance for several computer vision applications such as motion tracking, event detection and recognition, multimedia indexing and medical computer-aided diagnosis methods. In this paper, we propose a real-time cloud-based toolbox (platform) for computer vision applications. This platform integrates a toolbox of image and video processing algorithms that can be run in real time and in a secure way. The related libraries and hardware drivers are automatically integrated and configured to give users access to the different algorithms without the need to download, install and configure software or hardware. Moreover, the platform lets multiple users access the integrated applications through the use of Docker (Merkel, 2014) containers and images. Experiments were conducted with three kinds of algorithms: (1) an image processing toolbox, (2) a video processing toolbox, and (3) 3D medical methods such as computer-aided diagnosis for scoliosis and osteoporosis. These experiments demonstrated the value of our platform for sharing our scientific contributions in the computer vision domain. Researchers can develop and share their applications easily, quickly and safely.

Revistas Científicas da Universidade Católica Portuguesa