
    Recurrent Models of Visual Attention

    Applying convolutional neural networks to large images is computationally expensive because the amount of computation scales linearly with the number of image pixels. We present a novel recurrent neural network model that is capable of extracting information from an image or video by adaptively selecting a sequence of regions or locations and only processing the selected regions at high resolution. Like convolutional neural networks, the proposed model has a degree of translation invariance built-in, but the amount of computation it performs can be controlled independently of the input image size. While the model is non-differentiable, it can be trained using reinforcement learning methods to learn task-specific policies. We evaluate our model on several image classification tasks, where it significantly outperforms a convolutional neural network baseline on cluttered images, and on a dynamic visual control problem, where it learns to track a simple object without an explicit training signal for doing so.
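    The key cost argument in the abstract — the model processes only a fixed-size region per step, so per-step computation is independent of image resolution — can be sketched minimally. The function name and sizes below are illustrative, not taken from the paper:

```python
import numpy as np

def extract_glimpse(image, center, size):
    """Crop a fixed-size patch around `center` (row, col), zero-padding
    at the borders. The patch size is constant regardless of the image
    size, which keeps the per-step computation constant."""
    half = size // 2
    padded = np.pad(image, half, mode="constant")
    r, c = center[0] + half, center[1] + half
    return padded[r - half:r + half, c - half:c + half]

# The glimpse has the same shape whether the image is 100x100 or 1000x1000.
small = np.zeros((100, 100))
large = np.zeros((1000, 1000))
patch_a = extract_glimpse(small, (50, 50), 8)
patch_b = extract_glimpse(large, (500, 500), 8)
```

In the full model, an attention policy (trained with reinforcement learning, since the location choice is non-differentiable) would pick `center` at each step; here the locations are fixed for illustration.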

    Optimizing expected word error rate via sampling for speech recognition

    State-level minimum Bayes risk (sMBR) training has become the de facto standard for sequence-level training of speech recognition acoustic models. It has an elegant formulation using the expectation semiring, and gives large improvements in word error rate (WER) over models trained solely using cross-entropy (CE) or connectionist temporal classification (CTC). sMBR training optimizes the expected number of frames at which the reference and hypothesized acoustic states differ. It may be preferable to optimize the expected WER, but WER does not interact well with the expectation semiring, and previous approaches based on computing expected WER exactly involve expanding the lattices used during training. In this paper we show how to perform optimization of the expected WER by sampling paths from the lattices used during conventional sMBR training. The gradient of the expected WER is itself an expectation, and so may be approximated using Monte Carlo sampling. We show experimentally that optimizing WER during acoustic model training gives 5% relative improvement in WER over a well-tuned sMBR baseline on a 2-channel query recognition task (Google Home).
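    The core idea above — the expected WER is an expectation over lattice paths, so it can be approximated by sampling paths rather than computed exactly — can be illustrated on a toy lattice. The paths, posterior probabilities, and error counts below are made up for illustration and are not from the paper:

```python
import random

# Toy "lattice": three hypothesis paths, each with a posterior
# probability and a word error count (hypothetical values).
paths = [("a b c", 0.5, 0), ("a b d", 0.3, 1), ("x b c", 0.2, 2)]

def expected_wer_exact(paths):
    """Exact expectation: sum of probability * error over all paths."""
    return sum(p * err for _, p, err in paths)

def expected_wer_mc(paths, n_samples, seed=0):
    """Monte Carlo estimate: sample paths from the lattice posterior
    and average their error counts."""
    rng = random.Random(seed)
    _, probs, errs = zip(*paths)
    total = 0
    for _ in range(n_samples):
        total += rng.choices(errs, weights=probs)[0]
    return total / n_samples

exact = expected_wer_exact(paths)           # 0.5*0 + 0.3*1 + 0.2*2 = 0.7
approx = expected_wer_mc(paths, 100_000)    # close to 0.7
```

In training, the same sampling trick is applied to the gradient of the expected WER, which is itself an expectation; real lattices have exponentially many paths, which is why the exact sum is replaced by samples.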

    Deep Learning: Our Miraculous Year 1990-1991

    In 2020, we will celebrate that many of the basic ideas behind the deep learning revolution were published three decades ago within fewer than 12 months in our "Annus Mirabilis" or "Miraculous Year" 1990-1991 at TU Munich. Back then, few people were interested, but a quarter century later, neural networks based on these ideas were on over 3 billion devices such as smartphones, and used many billions of times per day, consuming a significant fraction of the world's compute.
    Comment: 37 pages, 188 references, based on work of 4 Oct 201

    Glimpses of the Octonions and Quaternions History and Todays Applications in Quantum Physics

    Before we dive into the stream of today's indicatory applications of octonions to computer and other sciences and to quantum physics, let us focus for a while on the events crucially relevant to today's revival of interest in nonassociativity. Our reflections keep wandering back to the Brahmagupta-Fibonacci two-square identity, then via the Euler four-square identity up to the Degen-Graves-Cayley eight-square identity. These glimpses of history incline and invite us to retell the story of how, about one month after the quaternions had been carved on Brougham Bridge, the octonions were discovered by John Thomas Graves, jurist and mathematician, a friend of William Rowan Hamilton. As for today, we just mention en passant quaternionic and octonionic quantum mechanics, the generalization of the Cauchy-Riemann equations to octonions, and the triality principle and the G_2 group in spinor language, in a descriptive way so as not to daunt non-specialists. The relation to finite geometries is recalled, and links to the seven sphere with its seven imaginary octonion units, in applications beyond the Plato cave reality, are pointed out. This way we are welcomed back to the primary ideas of Heisenberg, Wheeler and other distinguished fathers of quantum mechanics and quantum gravity foundations.
    Comment: 26 pages, 7 figures