Sound-to-imagination: an exploratory study on cross-modal translation using diverse audiovisual data
The motivation of our research is to explore the possibilities of automatic sound-to-image (S2I) translation for enabling a human receiver to visually infer occurrences of sound-related events. We expect the computer to "imagine" scenes from captured sounds, generating original images that depict the sound-emitting sources. Previous studies on similar topics opted for simplified approaches using data with low content diversity and/or supervision/self-supervision for training. In contrast, our approach involves performing S2I translation using thousands of distinct and unknown scenes, using sound class annotations solely for data preparation, just enough to ensure aural-visual semantic coherence. To model the translator, we employ an audio encoder and a conditional generative adversarial network (GAN) with a deep densely connected generator. Furthermore, we present a solution using informativity classifiers for quantitatively evaluating the generated images. This allows us to analyze the influence of network-bottleneck variation on the translation process, highlighting a potential trade-off between informativity and pixel-space convergence. Despite the complexity of the specified S2I translation task, we were able to generalize the model enough to obtain more than 14%, on average, of interpretable and semantically coherent images translated from unknown sounds.

The present work was supported in part by the Brazilian National Council for Scientific and Technological Development (CNPq) under PhD grant 200884/2015-8. The work was also partly supported by the Spanish State Research Agency (AEI), project PID2019-107579RBI00/AEI/10.13039/501100011033.
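The informativity evaluation described above can be sketched as a simple agreement score: what fraction of translated images does a classifier assign to the same class as the conditioning sound? The function below is a minimal illustrative sketch under that assumption; the paper's actual informativity classifiers are trained networks, and the class labels here are toy values.

```python
import numpy as np

def informativity_score(pred_classes, target_classes):
    """Fraction of translated images whose predicted visual class matches
    the class of the conditioning sound (illustrative sketch, not the
    paper's exact metric)."""
    pred = np.asarray(pred_classes)
    target = np.asarray(target_classes)
    return float(np.mean(pred == target))

# Toy example: 5 images translated from sounds of 3 classes.
preds   = [0, 1, 2, 1, 0]
targets = [0, 1, 1, 1, 2]
score = informativity_score(preds, targets)  # 3 of 5 match -> 0.6
```

A sweep of this score over different bottleneck widths would expose the informativity vs. pixel-space-convergence trade-off the abstract mentions.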
Extreme Situations Prediction by Multidimensional Heterogeneous Time Series Using Logical Decision Functions
* The work is supported by RFBR, grant 04-01-00858-a.

A method for predicting multidimensional heterogeneous time series using logical decision functions is suggested. The method performs simultaneous prediction of several goal variables. It uses a decision-function construction algorithm that carries out a directed search over partitionings of the variable space within the class of logical decision functions. To estimate the quality of a decision function, a realization of an informativity criterion for the conditional distribution in the goal variables' space is offered. The occurrence of a low-probability transition is suggested as an indicator of extreme states.
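The low-probability-transition indicator can be sketched on a discretized state sequence: estimate empirical transition probabilities and flag any transition rarer than a threshold. The state labels and threshold below are illustrative assumptions, not values from the paper.

```python
from collections import Counter

def rare_transitions(states, threshold=0.1):
    """Flag indices where the observed state transition has empirical
    conditional probability below `threshold` -- a minimal sketch of
    using low-probability transitions as extreme-state indicators."""
    pairs = list(zip(states[:-1], states[1:]))
    counts = Counter(pairs)                 # joint counts of (from, to)
    from_counts = Counter(states[:-1])      # marginal counts of 'from' states
    probs = {p: counts[p] / from_counts[p[0]] for p in counts}
    return [i for i, p in enumerate(pairs) if probs[p] < threshold]

seq = ['A', 'A', 'A', 'B', 'A', 'A', 'A', 'A', 'A', 'A']
idx = rare_transitions(seq, threshold=0.2)  # -> [2]: the rare A->B jump
```

In the multidimensional heterogeneous setting of the paper, the states would instead come from the learned partitioning of the variable space.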
Turbo: Informativity-Driven Acceleration Plug-In for Vision-Language Models
Vision-Language Large Models (VLMs) have become a primary backbone of AI due to their impressive performance. However, their high computational costs, in terms of throughput and latency, limit their potential in real-world scenarios. To accelerate VLMs, most existing methods focus on the model perspective (pruning, distillation, quantization) but overlook redundancy from the data perspective. To fill this gap, this paper examines the severity of data redundancy and designs a plug-and-play Turbo module, guided by an information degree, to prune inefficient tokens from visual or textual data. In pursuit of efficiency-performance trade-offs, the information degree takes two key factors into consideration: mutual redundancy and semantic value. Concretely, the former evaluates the data duplication between sequential tokens, while the latter evaluates each token by its contribution to the overall semantics. As a result, tokens with a high information degree carry less redundancy and stronger semantics. During VLM computation, Turbo works as a user-friendly plug-in that sorts data by information degree and uses only the top-ranked tokens to save costs. Its advantages are multifaceted: it is broadly compatible with various VLMs across understanding and generation, and it is simple to use, requiring no retraining and only trivial engineering effort. Extensive experiments on multiple public VLM benchmarks show that Turbo delivers gratifying acceleration with negligible performance drop.
Linear Order in Language:an Error-Driven Learning Account
Learners of German often struggle with learning the grammatical gender of nouns and their correct articles, for example, that it should be "die Gabel" (the fork) and not "der Gabel". Why is this so hard? And why do gender systems even exist?

I taught participants differently structured artificial languages and found that a gender system is especially difficult to learn when gender is marked before the noun (e.g., in German: "die Gabel", the fork, vs. "der Löffel", the spoon) as compared to when gender is marked after the noun (e.g., in Albanian: "pirun-i", the fork, vs. "lug-a", the spoon). With computational simulations I could show that this effect arises because human learning is sensitive to the order of words.

However, while gendered articles are hard to learn, they can facilitate communication: they can make following nouns more predictable and therefore easier to process. For example, after the German article "der", "Löffel" is quite likely to follow, whereas "Gabel" is very unlikely. This is a function that gendered suffixes, as in Albanian, or genderless articles, as in English, cannot fulfill. In a language production study, I observed that speakers produce more articles that can make following nouns predictable, such as German articles, than articles that cannot fulfill this function, such as the English article "the".

I conclude that the order in which gender is marked in a language affects language learning as well as communication. This makes German gender hard to learn but useful for communication.
Towards a text-linguistic definition of Qur'anic inimitability : a discourse perspective and problems of translation
Abstract unavailable; please refer to PDF.
Cost aware Inference for IoT Devices
Networked embedded devices (IoTs) with limited CPU, memory, and power resources are revolutionizing data gathering, remote monitoring, and planning in many consumer and business applications. Nevertheless, resource limitations place a significant burden on their service life and operation, warranting cost-aware methods that are capable of distributively screening redundancies in device information and transmitting informative data. We propose to train a decentralized gated network that, given an observed instance at test time, allows for activation of select devices to transmit information to a central node, which then performs inference. We analyze our proposed gradient descent algorithm for Gaussian features and establish convergence guarantees under good initialization. We conduct experiments on a number of real-world datasets arising in IoT applications and show that our model results in over 1.5X service life with negligible accuracy degradation relative to the performance achievable by a neural network.

http://proceedings.mlr.press/v89/zhu19d/zhu19d.pdf
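The gated-transmission scheme can be sketched in a few lines: each device transmits only when its gate fires, and the central node predicts from the transmitted subset. The linear predictor and boolean gates below are illustrative assumptions; the paper's gates are learned jointly with the predictor by gradient descent.

```python
import numpy as np

def gated_inference(readings, gates, weights, bias=0.0):
    """Minimal sketch of cost-aware gated inference: device i transmits
    its reading only when gates[i] is True; the central node applies a
    linear predictor to the transmitted values. The transmission count
    is the per-instance communication cost."""
    x = np.asarray(readings, dtype=float)
    g = np.asarray(gates, dtype=bool)
    transmitted = np.where(g, x, 0.0)   # silent devices contribute nothing
    cost = int(g.sum())                 # number of active transmissions
    prediction = float(transmitted @ np.asarray(weights, dtype=float) + bias)
    return prediction, cost

pred, cost = gated_inference([1.0, 2.0, 3.0], [True, False, True], [1.0, 1.0, 1.0])
# pred -> 4.0 using only 2 of 3 devices
```

Keeping `cost` low on average is what extends service life; the trade-off against accuracy is what the paper's experiments quantify.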
- …