146,040 research outputs found

    Guest editors' introduction to the special section on learning with shared information for computer vision and multimedia analysis

    The twelve papers in this special section focus on learning systems with shared information for computer vision and multimedia communication analysis. In the real world, a realistic setting for computer vision or multimedia recognition problems is one in which a few classes contain a large amount of training data and many classes contain only a small amount. How to use frequent classes to help learn rare classes, for which training data is harder to collect, is therefore an open question. Learning with shared information is an emerging topic in machine learning, computer vision and multimedia analysis. Components can be shared at different levels during the concept modeling and machine learning stages: generic object parts, attributes, transformations, regularization parameters and training examples. As for specific methods, multi-task learning, transfer learning and deep learning can be seen as different strategies for sharing information. These methods for learning with shared information are very effective in solving real-world large-scale problems.

    TensorLayer: A Versatile Library for Efficient Deep Learning Development

    Deep learning has enabled major advances in the fields of computer vision, natural language processing, and multimedia, among many others. Developing a deep learning system is arduous and complex, as it involves constructing neural network architectures, managing training and trained models, tuning the optimization process, and preprocessing and organizing data. TensorLayer is a versatile Python library that aims to help researchers and engineers develop deep learning systems efficiently. It offers rich abstractions for neural networks, model and data management, and parallel workflow mechanisms. While boosting efficiency, TensorLayer maintains both performance and scalability. TensorLayer was released in September 2016 on GitHub, and has helped people from academia and industry develop real-world applications of deep learning. Comment: ACM Multimedia 201

    Reliable camera motion estimation from compressed MPEG videos using machine learning approach

    As an important feature in characterizing video content, camera motion has been widely applied in various multimedia and computer vision applications. A novel method for fast and reliable estimation of camera motion from MPEG videos is proposed, using a support vector machine for estimation in a regression model trained on a synthesized sequence. Experiments conducted on real sequences show that the proposed method yields much improved results in estimating camera motion, while avoiding the difficulty of selecting valid macroblocks and motion vectors.

    Do GANs leave artificial fingerprints?

    In the last few years, generative adversarial networks (GANs) have shown tremendous potential for a number of applications in computer vision and related fields. With the current pace of progress, it is a sure bet they will soon be able to generate high-quality images and videos, virtually indistinguishable from real ones. Unfortunately, realistic GAN-generated images pose serious threats to security, beginning with a possible flood of fake multimedia, and multimedia forensic countermeasures are urgently needed. In this work, we show that each GAN leaves its specific fingerprint in the images it generates, just like real-world cameras mark acquired images with traces of their photo-response non-uniformity pattern. Source identification experiments with several popular GANs show such fingerprints to represent a precious asset for forensic analyses.
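
The fingerprint idea in the abstract above can be illustrated with a toy sketch. It is not the authors' method: it assumes 1-D synthetic "images", a simple local-mean high-pass filter in place of a real denoiser, and two hypothetical sources named `gan_a` and `gan_b`. A fingerprint is estimated as the average residual over many images from one source, and a new image is attributed to the source whose fingerprint correlates best with its residual.

```python
import random

def residual(signal, radius=1):
    """High-pass residual: each sample minus the mean of its local window."""
    n = len(signal)
    out = []
    for i in range(n):
        window = signal[max(0, i - radius):min(n, i + radius + 1)]
        out.append(signal[i] - sum(window) / len(window))
    return out

def estimate_fingerprint(images):
    """Average the residuals of many images that share one source."""
    residuals = [residual(img) for img in images]
    return [sum(col) / len(residuals) for col in zip(*residuals)]

def correlation(a, b):
    """Pearson correlation between two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den_a = sum((x - mean_a) ** 2 for x in a) ** 0.5
    den_b = sum((y - mean_b) ** 2 for y in b) ** 0.5
    return num / (den_a * den_b) if den_a and den_b else 0.0

def identify_source(image, fingerprints):
    """Return the name of the fingerprint that best matches the image residual."""
    r = residual(image)
    return max(fingerprints, key=lambda name: correlation(r, fingerprints[name]))

def make_image(pattern, rng, noise=0.2):
    """Synthetic image: smooth content + fixed source pattern + random noise."""
    level, slope = rng.uniform(0, 10), rng.uniform(-0.05, 0.05)
    return [level + slope * i + p + rng.gauss(0, noise)
            for i, p in enumerate(pattern)]

# Hypothetical sources: each one stamps a fixed pseudo-random pattern.
rng = random.Random(0)
patterns = {
    name: [rng.choice((-0.5, 0.5)) for _ in range(256)]
    for name in ("gan_a", "gan_b")
}
fingerprints = {
    name: estimate_fingerprint([make_image(p, rng) for _ in range(50)])
    for name, p in patterns.items()
}

# Attribute a fresh image from each source.
for name, p in patterns.items():
    print(name, "->", identify_source(make_image(p, rng), fingerprints))
```

Averaging suppresses the image-specific content and noise while the shared pattern survives, which is why more training images per source give a cleaner fingerprint in this sketch.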

    A Survey of Multimedia Technologies and Robust Algorithms

    Multimedia technologies are now more practical and deployable in real life, and the algorithms are widely used in various research areas such as deep learning, signal processing, haptics, computer vision, robotics, and medical multimedia processing. This survey provides an overview of multimedia technologies and robust algorithms in multimedia data processing, medical multimedia processing, human facial expression tracking and pose recognition, and multimedia in education and training. Based on this overview of current robust algorithms and multimedia technologies, the survey also analyzes and proposes a future research direction. We acknowledge the research and previous work of the Multimedia Research Centre (MRC) at the University of Alberta, which is the inspiration and starting point for future research. Comment: arXiv admin note: text overlap with arXiv:2010.1296

    Real time web-based toolbox for computer vision

    The last few years have been strongly marked by the presence of multimedia data (images and videos) in our everyday lives. These data are created and shared at a high rate, since images and videos can come from different devices such as cameras, smartphones or drones. The latter are generally used to depict objects in different situations (airports, hospitals, public areas, sport games, etc.). As a result, image and video processing algorithms have gained increasing importance for several computer vision applications such as motion tracking, event detection and recognition, multimedia indexation and medical computer-aided diagnosis methods. In this paper, we propose a real-time cloud-based toolbox (platform) for computer vision applications. This platform integrates a toolbox of image and video processing algorithms that can be run in real time and in a secure way. The related libraries and hardware drivers are automatically integrated and configured so that users can access the different algorithms without needing to download, install and configure software or hardware. Moreover, the platform gives multiple users access to the integrated applications through the use of Docker (Merkel, 2014) containers and images. Experiments were conducted with three kinds of algorithms: (1) an image processing toolbox, (2) a video processing toolbox, and (3) 3D medical methods such as computer-aided diagnosis for scoliosis and osteoporosis. These experiments demonstrated the value of our platform for sharing our scientific contributions in the computer vision domain. Researchers can develop and share their applications easily, quickly and safely.