PIM: Video Coding using Perceptual Importance Maps
Human perception is at the core of lossy video compression, with numerous
approaches developed for perceptual quality assessment and improvement over the
past two decades. In the determination of perceptual quality, different
spatio-temporal regions of the video differ in their relative importance to the
human viewer. However, since it is challenging to infer or even collect such
fine-grained information, it is often not used during compression beyond
low-level heuristics. We present a framework which facilitates research into
fine-grained subjective importance in compressed videos, which we then utilize
to improve the rate-distortion performance of an existing video codec (x264).
The contributions of this work are threefold: (1) we introduce a web-tool which
allows scalable collection of fine-grained perceptual importance, by having
users interactively paint spatio-temporal maps over encoded videos; (2) we use
this tool to collect a dataset of 178 videos with a total of 14,443 frames of
human-annotated spatio-temporal importance maps; and (3) we use
our curated dataset to train a lightweight machine learning model which can
predict these spatio-temporal importance regions. We demonstrate via a
subjective study that encoding the videos in our dataset while taking into
account the importance maps leads to higher perceptual quality at the same
bitrate, with the videos encoded with importance maps preferred
over the baseline videos. Similarly, we show that for the 18 videos in the test
set, the importance maps predicted by our model lead to higher perceptual
quality videos, preferred over the baseline at the same bitrate.
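The abstract does not spell out how the importance maps are fed to x264. One plausible mechanism, sketched below under that assumption, is to convert a per-macroblock importance map into quantizer (QP) offsets: important blocks get finer quantization, and the offsets are centred so the average rate stays roughly unchanged. The function name and the max_delta parameter are hypothetical, not the paper's interface.

```python
import numpy as np

def qp_offsets_from_importance(importance, max_delta=6):
    """Map a per-macroblock importance map in [0, 1] to integer QP offsets.

    Hypothetical sketch: more important blocks receive negative offsets
    (finer quantization); offsets are centred to zero mean, so the
    overall bitrate is roughly preserved.
    """
    imp = np.clip(importance, 0.0, 1.0)
    centred = imp - imp.mean()          # zero-mean importance signal
    return np.round(-max_delta * centred).astype(int)

# Example: a 2x4 grid of macroblock importances.
imp = np.array([[0.9, 0.8, 0.1, 0.0],
                [0.7, 0.5, 0.2, 0.1]])
offsets = qp_offsets_from_importance(imp)
print(offsets)  # important blocks -> negative QP deltas
```

Offsets of this form could then be handed to an encoder's adaptive-quantization machinery; the mapping from importance to QP delta (linear here) is a design choice the paper may well make differently.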
Sculpting representations for deep learning
Thesis: Ph.D., Massachusetts Institute of Technology, Department of Mathematics, 2016. Cataloged from PDF version of thesis. Includes bibliographical references (pages 149-164).
In machine learning, the choice of space in which to represent our data is of vital importance to its effective and efficient analysis. In this thesis, we develop approaches to address a number of problems in representation learning. We employ deep learning as a means of sculpting our representations, and also develop improved representations for deep learning models. We present contributions that are based on five papers and make progress in several different research directions. First, we present techniques which leverage spatial and relational structure to achieve greater computational efficiency of model optimization and query retrieval. This allows us to train distance metric learning models 5-30 times faster; optimize convolutional neural networks 2-5 times faster; perform content-based image retrieval hundreds of times faster on codes hundreds of times longer than previously feasible; and reduce the complexity of Bayesian optimization to linear in the number of observations, in contrast to the cubic dependence of its naive Gaussian process formulation. Furthermore, we introduce ideas to facilitate the preservation of relevant information within the learned representations, and demonstrate that this leads to improved supervised learning results. Our approaches achieve state-of-the-art classification and transfer learning performance on a number of well-known machine learning benchmarks. In addition, while deep learning models are able to discover structure in high-dimensional input domains, they offer only implicit probabilistic descriptions. We develop an algorithm to enable probabilistic interpretability of deep representations. It constructs a transformation to a representation space under which the map of the distribution is approximately factorized and has known marginals.
This allows tractable density estimation and inference within this alternate domain.
by Oren Rippel, Ph.D.
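The last step described above resembles Gaussianization: push each coordinate through its empirical CDF and then the Gaussian quantile function, after which the marginals are approximately standard normal and a factorized density is tractable. A minimal sketch of that marginal-matching core, assuming NumPy and SciPy are available (the function name is hypothetical, and this rank-based transform only normalizes the marginals, not the full joint distribution the thesis's algorithm targets):

```python
import numpy as np
from scipy.stats import norm

def gaussianize_marginals(X):
    """Map each coordinate of X to approximately N(0, 1) by composing the
    per-column empirical CDF with the Gaussian quantile function.

    If the transformed distribution were exactly factorized with standard
    normal marginals, densities in the new space would be tractable
    products of one-dimensional Gaussians.
    """
    n = X.shape[0]
    ranks = X.argsort(axis=0).argsort(axis=0) + 1   # ranks 1..n per column
    u = ranks / (n + 1)                             # empirical CDF in (0, 1)
    return norm.ppf(u)                              # Gaussian quantiles

rng = np.random.default_rng(0)
X = rng.exponential(size=(1000, 3))   # heavily skewed input data
Z = gaussianize_marginals(X)
print(Z.mean(axis=0), Z.std(axis=0))  # each column ~0 mean, ~1 std
```

By the change-of-variables formula, a density in the original space equals the factorized density in the transformed space times the Jacobian of the transform; the thesis constructs a much richer transformation, of which this is only the simplest marginal piece.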
Recommended from our members
Avoiding pathologies in very deep networks
Choosing appropriate architectures and regularization strategies of deep networks is crucial to good predictive performance. To shed light on this problem, we analyze the analogous problem of constructing useful priors on compositions of functions. Specifically, we study the deep Gaussian process, a type of infinitely-wide, deep neural network. We show that in standard architectures, the representational capacity of the network tends to capture fewer degrees of freedom as the number of layers increases, retaining only a single degree of freedom in the limit. We propose an alternate network architecture which does not suffer from this pathology. We also examine deep covariance functions, obtained by composing infinitely many feature transforms. Lastly, we characterize the class of models obtained by performing dropout on Gaussian processes.
Engineering and Applied Science
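The collapse of degrees of freedom can be illustrated numerically: by the chain rule, the derivative of a composition of sample paths is a product of per-layer derivatives, which concentrates near zero as depth grows, so deep compositions look nearly piecewise-flat. A minimal sketch of that intuition, not the paper's construction (the grid size, lengthscale, depth, and interpolation-based composition are all illustrative choices):

```python
import numpy as np

def sample_gp(x, lengthscale=0.5, seed=None):
    """Draw one sample path of a zero-mean GP with an RBF kernel on grid x.

    A small jitter term keeps the Cholesky factorization numerically stable.
    """
    rng = np.random.default_rng(seed)
    K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / lengthscale ** 2)
    L = np.linalg.cholesky(K + 1e-6 * np.eye(len(x)))
    return L @ rng.standard_normal(len(x))

# Illustrative "deep GP" sample: compose independent 1-D sample paths,
# h = f_5 o ... o f_1, via interpolation on a shared grid (values that
# leave the grid are clamped by np.interp -- a simplification).
x = np.linspace(-3, 3, 100)
h = x.copy()
for seed in range(5):
    f = sample_gp(x, seed=seed)   # one layer's random warp
    h = np.interp(h, x, f)        # compose with the previous layers

# The composed derivative is a product of layer derivatives, so it is
# tiny almost everywhere: the path flattens out, retaining few
# effective degrees of freedom.
deriv = np.gradient(h, x)
print(float(np.median(np.abs(deriv))))
```

Plotting h against x for increasing depth makes the pathology visible directly: the curves develop large flat regions separated by abrupt jumps.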