113 research outputs found

    Deep learning-based image compression and quality assessment

    Waseda University degree number: Shin 8427 (新8427), Waseda University

    Video coding methods for bit-depth, color-gamut, and perceptual-quality scalability

    Waseda University degree number: Shin 8421 (新8421), Waseda University

    Audio compression via nonlinear transform coding and stochastic binary activation

    Engineers have pushed the boundaries of audio compression and designed numerous lossy audio codecs, such as AAC, WMA, and others, that have surpassed the longstanding MP3 coding format. However, most of these methods are laboriously engineered using psychoacoustic modeling, and some are proprietary and see only limited use. This thesis, inspired by recent major breakthroughs in lossy image compression via machine learning, explores the possibilities of a neural network trained for lossy audio compression. Currently there are few, if any, audio compression methods that use machine learning. The thesis gives a brief introduction to lossy transform compression and compares it to similar machine learning concepts, then systematically presents a convolutional autoencoder network with a stochastic binary activation that produces a sparse representation of the code space to achieve compression. A similar network is employed to encode the residual of the main network. Our network achieves average compression rates of roughly 5 to 2 and introduces few, if any, audible artifacts, presenting a promising opening for audio compression using machine learning.
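
The abstract does not spell out the stochastic binary activation, so the following is only a minimal sketch of one common formulation, assumed here for illustration: a code value x in [-1, 1] (for example a tanh output) is rounded to +1 with probability (1 + x) / 2 and to -1 otherwise, so that the expected output equals x. The names `stochastic_binarize` and `binarize_code` are hypothetical and are not taken from the thesis.

```python
import random


def stochastic_binarize(x, rng):
    """Map x in [-1, 1] to -1 or +1 so that the expected output equals x.

    Assumed formulation, not necessarily the one used in the thesis.
    """
    if not -1.0 <= x <= 1.0:
        raise ValueError("x must lie in [-1, 1]")
    # Probability of emitting +1; E[output] = p_plus - (1 - p_plus) = x.
    p_plus = (1.0 + x) / 2.0
    return 1 if rng.random() < p_plus else -1


def binarize_code(values, seed=0):
    """Binarize a list of code values with a seeded generator for repeatability."""
    rng = random.Random(seed)
    return [stochastic_binarize(v, rng) for v in values]


if __name__ == "__main__":
    # Each entry becomes a single bit (-1 or +1) of the compressed code.
    print(binarize_code([-1.0, -0.5, 0.0, 0.5, 1.0]))
```

Because each code entry collapses to one bit, the encoder output can be stored compactly, while the randomness keeps the binarized value unbiased on average during training.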