    Quaternion generative adversarial networks

    Recent generative adversarial networks (GANs) achieve outstanding results through large-scale training, employing models with millions of parameters that require extensive computational capabilities. Building such huge models undermines their replicability and increases training instability. Moreover, multi-channel data, such as images or audio, are usually processed by real-valued convolutional networks that flatten and concatenate the input, often losing intra-channel spatial relations. To address these issues of complexity and information loss, we propose a family of quaternion-valued generative adversarial networks (QGANs). QGANs exploit the properties of quaternion algebra, e.g., the Hamilton product, which allows channels to be processed as a single entity and captures internal latent relations, while reducing the overall number of parameters by a factor of 4. We show how to design QGANs and how to extend the proposed approach even to advanced models. We compare the proposed QGANs with real-valued counterparts on several image generation benchmarks. Results show that QGANs obtain better FID scores than real-valued GANs and generate visually pleasing images. Furthermore, QGANs save up to 75% of the training parameters. We believe these results may pave the way to novel, more accessible GANs capable of improving performance and saving computational resources.
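
    The Hamilton product and the factor-of-4 parameter saving mentioned in the abstract can be illustrated with a minimal sketch. The function names and the fully connected layer sizes below are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
def hamilton_product(p, q):
    """Hamilton product of two quaternions given as (r, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


def real_layer_params(n_in, n_out):
    """Weights of a real-valued dense layer (bias ignored)."""
    return n_in * n_out


def quaternion_layer_params(n_in, n_out):
    """Weights of a quaternion dense layer over the same real dimensions.

    Assumption: n_in and n_out are multiples of 4, so the layer maps
    n_in/4 quaternions to n_out/4 quaternions, and each quaternion
    weight stores 4 real numbers.
    """
    return (n_in // 4) * (n_out // 4) * 4


if __name__ == "__main__":
    i = (0, 1, 0, 0)
    j = (0, 0, 1, 0)
    print(hamilton_product(i, j))  # i * j = k
    print(hamilton_product(j, i))  # j * i = -k (non-commutative)
    print(real_layer_params(64, 64), quaternion_layer_params(64, 64))
```

    Because the four components of one quaternion weight are reused across all four output components, a layer with the same real input and output sizes stores a quarter of the weights, which is consistent with the 75% saving quoted above.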

    PHNNs: Lightweight Neural Networks via Parameterized Hypercomplex Convolutions

    Hypercomplex neural networks have been shown to reduce the overall number of parameters while ensuring valuable performance by leveraging the properties of Clifford algebras. Recently, hypercomplex linear layers have been further improved by involving efficient parameterized Kronecker products. In this article, we define the parameterization of hypercomplex convolutional layers and introduce the family of parameterized hypercomplex neural networks (PHNNs), which are lightweight and efficient large-scale models. Our method grasps the convolution rules and the filter organization directly from data without requiring a rigidly predefined domain structure to follow. PHNNs can operate in any user-defined or tuned domain, from 1-D to nD, regardless of whether the algebra rules are preset. Such malleability allows processing multidimensional inputs in their natural domain without annexing further dimensions, as done, instead, in quaternion neural networks (QNNs) for 3-D inputs like color images. As a result, the proposed family of PHNNs operates with 1/n of the free parameters of its analog in the real domain. We demonstrate the versatility of this approach across multiple domains of application by performing experiments on various image datasets and audio datasets in which our method outperforms real and quaternion-valued counterparts.
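
    As a rough illustration of how parameterized Kronecker products lead to roughly 1/n of the parameters, the sketch below assumes the common formulation in which a weight matrix is built as a sum of n Kronecker products, W = sum_i kron(A_i, F_i), with A_i of size n x n and F_i of size (n_out/n) x (n_in/n). The sketch covers only the linear (matrix) case, the function names are hypothetical, and it is not the authors' code.

```python
def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def ph_weight(A, F):
    """Build W = sum_i kron(A[i], F[i]) from n small matrix pairs."""
    terms = [kron(a_i, f_i) for a_i, f_i in zip(A, F)]
    rows, cols = len(terms[0]), len(terms[0][0])
    return [[sum(t[r][c] for t in terms) for c in range(cols)] for r in range(rows)]


def ph_params(n, n_in, n_out):
    """Free parameters: n matrices A_i (n x n) plus n blocks F_i.

    Assumption: n divides both n_in and n_out.
    """
    return n * n * n + n * (n_out // n) * (n_in // n)


if __name__ == "__main__":
    # With n = 2 and these fixed A_i, the sum reproduces the matrix form
    # of multiplication by the complex number 3 + 5i.
    A = [[[1, 0], [0, 1]], [[0, -1], [1, 0]]]
    F = [[[3]], [[5]]]
    print(ph_weight(A, F))
    print(ph_params(4, 256, 256), 256 * 256)
```

    In a parameterized layer the A_i would be learned rather than fixed, which is what lets the algebra rules come from data. The count n^3 + n_in * n_out / n approaches 1/n of the real-valued count n_in * n_out once the layer is large relative to n.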

    Dual Quaternion Ambisonics Array for Six-Degree-of-Freedom Acoustic Representation

    Spatial audio methods are attracting growing interest due to the spread of immersive audio experiences and applications, such as virtual and augmented reality. For these purposes, 3D audio signals are often acquired through arrays of Ambisonics microphones, each comprising four capsules that decompose the sound field in spherical harmonics. In this paper, we propose a dual quaternion representation of the spatial sound field acquired through an array of two First Order Ambisonics (FOA) microphones. The audio signals are encapsulated in a dual quaternion that leverages quaternion algebra properties to exploit correlations among them. This augmented representation with 6 degrees of freedom (6DOF) provides more accurate coverage of the sound field, resulting in more precise sound localization and a more immersive audio experience. We evaluate our approach on a sound event localization and detection (SELD) benchmark. We show that our dual quaternion SELD model with temporal convolution blocks (DualQSELD-TCN) achieves better results than real and quaternion-valued baselines thanks to our augmented representation of the sound field. Full code is available at: https://github.com/ispamm/DualQSELD-TCN. Comment: Paper under consideration at Elsevier Pattern Recognition Letter

    Enhancing Semantic Communication with Deep Generative Models -- An ICASSP Special Session Overview

    Semantic communication is poised to play a pivotal role in shaping the landscape of future AI-driven communication systems. Its challenge of extracting semantic information from the original complex content and regenerating semantically consistent data at the receiver, possibly being robust to channel corruptions, can be addressed with deep generative models. This ICASSP special session overview paper discloses the semantic communication challenges from the machine learning perspective and unveils how deep generative models will significantly enhance semantic communication frameworks in dealing with real-world complex data, extracting and exploiting semantic information, and being robust to channel corruptions. Alongside establishing this emerging field, this paper charts novel research pathways for the next generative semantic communication frameworks. Comment: Submitted to IEEE ICASS

    Hypercomplex Image-to-Image Translation

    Image-to-image translation (I2I) aims at transferring the content representation from an input domain to an output one, bouncing along different target domains. Recent I2I generative models, which achieve outstanding results in this task, comprise a set of diverse deep networks, each with tens of millions of parameters. Moreover, images are usually three-dimensional, being composed of RGB channels, and common neural models do not take the correlations among dimensions into account, losing beneficial information. In this paper, we propose to leverage hypercomplex algebra properties to define lightweight I2I generative models capable of preserving pre-existing relations among image dimensions, thus exploiting additional input information. On multiple I2I benchmarks, we show how the proposed Quaternion StarGANv2 and parameterized hypercomplex StarGANv2 (PHStarGANv2) reduce parameters and storage memory while ensuring high domain translation performance and good image quality as measured by FID and LPIPS scores. Full code is available at: https://github.com/ispamm/HI2I

    Diffusion models for audio semantic communication

    Directly sending audio signals from a transmitter to a receiver across a noisy channel may consume considerable bandwidth and be prone to errors when trying to recover the transmitted bits. In contrast, the recent semantic communication approach proposes to send the semantics and then regenerate semantically consistent content at the receiver without exactly recovering the bitstream. In this paper, we propose a generative audio semantic communication framework that treats the communication problem as an inverse problem, and is therefore robust to different corruptions. Our method transmits lower-dimensional representations of the audio signal and of the associated semantics to the receiver, which generates the corresponding signal with a particular focus on its meaning (i.e., the semantics) thanks to the conditional diffusion model at its core. During the generation process, the diffusion model restores the received information from multiple degradations at the same time, including corruption noise and missing parts caused by the transmission over the noisy channel. We show that our framework outperforms competitors in a real-world scenario and with different channel conditions. Visit the project page to listen to samples and access the code: https://ispamm.github.io/diffusion-audio-semantic-communication/. Comment: Submitted to IEEE ICASSP 202

    Macromolecular structural dynamics visualized by pulsed dose control in 4D electron microscopy

    Macromolecular conformation dynamics, which span a wide range of time scales, are fundamental to the understanding of properties and functions of their structures. Here, we report direct imaging of structural dynamics of helical macromolecules over the time scales of conformational dynamics (ns to subsecond) by means of four-dimensional (4D) electron microscopy in the single-pulse and stroboscopic modes. With temporally controlled electron dosage, both diffraction and real-space images are obtained without irreversible radiation damage. In this way, the order-disorder transition is revealed for the organic chain polymer. Through a series of equilibrium-temperature and temperature-jump dependencies, it is shown that the metastable structures and entropy of conformations can be mapped in the nonequilibrium region of a “funnel-like” free-energy landscape. The T-jump is introduced through a substrate (a “hot plate” type arrangement) because only the substrate is made to absorb the pulsed energy. These results illustrate the promise of ultrafast 4D imaging for other applications in the study of polymer physics as well as in the visualization of biological phenomena.

    Semantic Communications Based on Adaptive Generative Models and Information Bottleneck

    Semantic communications represent a significant breakthrough with respect to the current communication paradigm, as they focus on recovering the meaning behind the transmitted sequence of symbols, rather than the symbols themselves. In semantic communications, the goal of the destination is not to recover a list of symbols identical to the transmitted ones, but rather to recover a message that is semantically equivalent to the message emitted by the source. This paradigm shift introduces many degrees of freedom to the encoding and decoding rules that can be exploited to make the design of communication systems much more efficient. In this paper, we present an approach to semantic communication building on three fundamental ideas: 1) represent data over a topological space as a formal way to capture semantics, as expressed through relations; 2) use the information bottleneck principle as a way to identify relevant information and adapt the information bottleneck online, as a function of the wireless channel state, in order to strike an optimal trade-off between transmit power, reconstruction accuracy and delay; 3) exploit probabilistic generative models as a general tool to adapt the transmission rate to the wireless channel state and to make it possible to regenerate the transmitted images or run classification tasks at the receiver side. Comment: To appear in IEEE Communications Magazine, special issue on Semantic Communications: Transmission beyond Shannon, 202

    Smoking E-CigaRette and HEat-noT-burn Products: the SECRHET study, a large observational survey among young people in Italy

    Aim: Electronic cigarettes (eCig), which heat a solution (e-liquid) to create vapour, and heated tobacco products (HTP), which heat tobacco at a temperature below the point of combustion, are emerging and widespread forms of smoking device. The aim of this study was to investigate knowledge, attitudes and behaviour toward HTP among young people in Italy. Methods: The Smoking E-CigaRette and HEat-noT-burn products (SECRHET) study was an online survey carried out in April 2019 using Skuola.net, a platform where 2.5 million students are registered. Questions were related to knowledge about new generation smoking products, such as “Do you know what happens to tobacco when you use a heat-not-burn product?”, “Do you think electronic cigarettes create addiction?”, “Are products that use heated tobacco harmful to health?”, “Are electronic cigarettes harmful to health?”, “Have you ever heard of products that use heated tobacco?”, “Is nicotine present in products that use heated tobacco?” Results: A total of 13882 people completed the questionnaire, of whom 8056 (58%) were females. Regarding smoking habits, 3393 (24.4%) declared themselves current cigarette smokers, while 802 (5.8%) and 3173 (22.9%) were current and former e-cigarette smokers, respectively. Moreover, 715 (5.2%) and 1148 (8.3%) declared themselves current and former heat-not-burn cigarette smokers, respectively. The variables associated with both eCig and HTP use were current smoking, age over 18 years, male gender, and residence in Central and Southern Regions. Concerning knowledge issues, almost half of respondents believe that electronic cigarettes are addictive and harmful to health. Moreover, most respondents do not know what happens to tobacco when using a heated tobacco device or whether heated tobacco products are harmful to health.
Conclusion: The prevalence of eCig and HTP use is higher among young people in Italy than among adults and older people, and requires adequate public health interventions.
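
    The percentages reported in the Results can be recomputed from the raw counts. The short check below uses only the figures quoted in the abstract; the dictionary keys and the helper name are hypothetical labels added for readability.

```python
TOTAL_RESPONDENTS = 13882

# Counts as quoted in the abstract; the keys are illustrative labels.
counts = {
    "female": 8056,
    "current_cigarette": 3393,
    "current_ecig": 802,
    "former_ecig": 3173,
    "current_htp": 715,
    "former_htp": 1148,
}


def percent(count, total=TOTAL_RESPONDENTS):
    """Share of respondents as a percentage, rounded to one decimal."""
    return round(100 * count / total, 1)


if __name__ == "__main__":
    for label, count in counts.items():
        print(f"{label}: {count} of {TOTAL_RESPONDENTS} = {percent(count)}%")
```

    Each recomputed value agrees with the rounded percentage given in the abstract.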

    How much do young Italians know about COVID-19 and what are their attitudes toward SARS-CoV-2? Results of a cross-sectional study

    Objectives: At the end of 2019, an outbreak of novel coronavirus pneumonia, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was first identified in Wuhan, Hubei Province, China. It subsequently spread throughout China and elsewhere, becoming a global health emergency. In February 2020, the World Health Organization (WHO) designated the disease coronavirus disease 2019 (COVID-19). The objective of this study was to investigate the degree of knowledge of young Italians about COVID-19 and their current attitudes toward SARS-CoV-2, and to determine whether prejudices were emerging toward Chinese people. Methods: An online survey was conducted on February 3, 4 and 5, 2020, with the collaboration of the Italian website “Skuola.net”. Young people had the opportunity to participate by answering an ad hoc questionnaire created to investigate knowledge and attitudes about the new coronavirus, using a link published on the homepage. Results: A total of 5234 responses were received, of which 3262 were from females and 1972 from males. Most of the participants showed generally moderate knowledge about COVID-19. Male students, middle school students, and those who do not attend school should increase their awareness of the disease; fewer than half of respondents say that their attitudes toward the Chinese population have worsened recently.