149 research outputs found
Quaternion generative adversarial networks
The latest Generative Adversarial Networks (GANs) achieve outstanding results through large-scale training, employing models composed of millions of parameters that require extensive computational capabilities. Building such huge models undermines their replicability and increases training instability. Moreover, multi-channel data, such as images or audio, are usually processed by real-valued convolutional networks that flatten and concatenate the input, often losing intra-channel spatial relations. To address these issues of complexity and information loss, we propose a family of quaternion-valued generative adversarial networks (QGANs). QGANs exploit the properties of quaternion algebra, e.g., the Hamilton product, which makes it possible to process channels as a single entity and capture internal latent relations while reducing the overall number of parameters by a factor of 4. We show how to design QGANs and how to extend the proposed approach even to advanced models. We compare the proposed QGANs with real-valued counterparts on several image generation benchmarks. Results show that QGANs obtain better FID scores than real-valued GANs and generate visually pleasing images. Furthermore, QGANs save up to 75% of the training parameters. We believe these results may pave the way to novel, more accessible GANs capable of improving performance and saving computational resources.
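The factor-of-4 reduction mentioned in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`hamilton_product`, `quaternion_weight_matrix`, `parameter_counts`) are hypothetical, biases are ignored, and a plain linear map stands in for a convolution. The idea is that four weight blocks are reused, with signs fixed by the Hamilton product, to fill a real matrix that would otherwise need sixteen independent blocks.

```python
import numpy as np

def hamilton_product(q, p):
    """Hamilton product of two quaternions given as (r, i, j, k) components."""
    a1, b1, c1, d1 = q
    a2, b2, c2, d2 = p
    return np.array([
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i part
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j part
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k part
    ])

def quaternion_weight_matrix(Wr, Wi, Wj, Wk):
    """Real (4*out, 4*in) matrix applying the left Hamilton product W * x.

    Only the four blocks Wr, Wi, Wj, Wk are free; the rest is sharing.
    """
    return np.block([
        [Wr, -Wi, -Wj, -Wk],
        [Wi,  Wr, -Wk,  Wj],
        [Wj,  Wk,  Wr, -Wi],
        [Wk, -Wj,  Wi,  Wr],
    ])

def parameter_counts(n_in, n_out):
    """Free weights (no biases) of a quaternion layer vs. a real layer
    mapping 4*n_in real features to 4*n_out real features."""
    quaternion = 4 * n_in * n_out
    real = (4 * n_in) * (4 * n_out)
    return quaternion, real
```

With `parameter_counts(16, 32)`, the quaternion layer holds one quarter of the real layer's weights, which corresponds to the 75% saving quoted above.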
PHNNs: Lightweight Neural Networks via Parameterized Hypercomplex Convolutions
Hypercomplex neural networks have been shown to reduce the overall number of parameters while ensuring valuable performance by leveraging the properties of Clifford algebras. Recently, hypercomplex linear layers have been further improved by involving efficient parameterized Kronecker products. In this article, we define the parameterization of hypercomplex convolutional layers and introduce the family of parameterized hypercomplex neural networks (PHNNs), which are lightweight and efficient large-scale models. Our method learns the convolution rules and the filter organization directly from data, without requiring a rigidly predefined domain structure to follow. PHNNs are flexible enough to operate in any user-defined or tuned domain, from 1-D to nD, regardless of whether the algebra rules are preset. Such malleability allows processing multidimensional inputs in their natural domain without annexing further dimensions, as is done instead in quaternion neural networks (QNNs) for 3-D inputs like color images. As a result, the proposed family of PHNNs operates with 1/n of the free parameters of its analog in the real domain. We demonstrate the versatility of this approach across multiple application domains by performing experiments on various image and audio datasets, in which our method outperforms real and quaternion-valued counterparts.
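The 1/n parameter figure can be sketched for the linear case, assuming the common formulation in which the weight matrix is a sum of n Kronecker products, W = sum_i A_i ⊗ F_i. The abstract only says "parameterized Kronecker products", so this exact form, the helper name `ph_weight`, and the shapes below are assumptions; the convolutional version would replace each F_i with a bank of filters.

```python
import numpy as np

def ph_weight(A, F):
    """Build a (d_out, d_in) weight as a sum of Kronecker products.

    A: shape (n, n, n), the learned "algebra rules" (assumed form).
    F: shape (n, d_out // n, d_in // n), the learned weight blocks.
    """
    return sum(np.kron(A[i], F[i]) for i in range(A.shape[0]))

rng = np.random.default_rng(0)
n, d_in, d_out = 4, 64, 128
A = rng.standard_normal((n, n, n))
F = rng.standard_normal((n, d_out // n, d_in // n))

W = ph_weight(A, F)               # same shape as a real-valued weight matrix
free_params = A.size + F.size     # n**3 + d_in * d_out / n
real_params = d_in * d_out        # weights of the real-valued analog
ratio = free_params / real_params
```

In this sketch the blocks `F` hold exactly 1/n of the real layer's weights, and the extra n**3 entries in `A` become negligible as the layer grows, so `ratio` approaches 1/n.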
Dual Quaternion Ambisonics Array for Six-Degree-of-Freedom Acoustic Representation
Spatial audio methods are gaining a growing interest due to the spread of
immersive audio experiences and applications, such as virtual and augmented
reality. For these purposes, 3D audio signals are often acquired through arrays
of Ambisonics microphones, each comprising four capsules that decompose the
sound field in spherical harmonics. In this paper, we propose a dual quaternion
representation of the spatial sound field acquired through an array of two
First Order Ambisonics (FOA) microphones. The audio signals are encapsulated in
a dual quaternion that leverages quaternion algebra properties to exploit
correlations among them. This augmented representation with 6 degrees of
freedom (6DOF) involves a more accurate coverage of the sound field, resulting
in a more precise sound localization and a more immersive audio experience. We
evaluate our approach on a sound event localization and detection (SELD)
benchmark. We show that our dual quaternion SELD model with temporal
convolution blocks (DualQSELD-TCN) achieves better results with respect to real
and quaternion-valued baselines thanks to our augmented representation of the
sound field. Full code is available at:
https://github.com/ispamm/DualQSELD-TCN. Comment: Paper under consideration at Elsevier Pattern Recognition Letter
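For readers unfamiliar with dual quaternions, the standard algebraic definition is sketched below. The assignment of one FOA microphone's four channels to each quaternion ($q_1$ for the first microphone, $q_2$ for the second) is an assumption made for illustration and is not taken from the abstract.

```latex
% Dual quaternion built from two quaternions, with dual unit \epsilon
\hat{q} = q_1 + \epsilon\, q_2, \qquad \epsilon^2 = 0, \quad \epsilon \neq 0

% Product of two dual quaternions (juxtaposition is the Hamilton product)
\hat{q}\,\hat{p} = q_1 p_1 + \epsilon\,\bigl(q_1 p_2 + q_2 p_1\bigr)
```

The cross terms $q_1 p_2 + q_2 p_1$ are where components of the two quaternions interact, which is one way to read the abstract's claim about exploiting correlations among the eight signals.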
Enhancing Semantic Communication with Deep Generative Models -- An ICASSP Special Session Overview
Semantic communication is poised to play a pivotal role in shaping the
landscape of future AI-driven communication systems. Its challenge of
extracting semantic information from the original complex content and
regenerating semantically consistent data at the receiver, possibly being
robust to channel corruptions, can be addressed with deep generative models.
This ICASSP special session overview paper discloses the semantic communication
challenges from the machine learning perspective and unveils how deep
generative models will significantly enhance semantic communication frameworks
in dealing with real-world complex data, extracting and exploiting semantic
information, and being robust to channel corruptions. Alongside establishing
this emerging field, this paper charts novel research pathways for the next
generative semantic communication frameworks. Comment: Submitted to IEEE ICASS
Hypercomplex Image-to-Image Translation
Image-to-image translation (I2I) aims at transferring the content
representation from an input domain to an output one, bouncing along different
target domains. Recent I2I generative models, which gain outstanding results in
this task, comprise a set of diverse deep networks, each with tens of millions of
parameters. Moreover, images are usually three-dimensional being composed of
RGB channels and common neural models do not take dimensions correlation into
account, losing beneficial information. In this paper, we propose to leverage
hypercomplex algebra properties to define lightweight I2I generative models
capable of preserving pre-existing relations among image dimensions, thus
exploiting additional input information. On manifold I2I benchmarks, we show
how the proposed Quaternion StarGANv2 and parameterized hypercomplex StarGANv2
(PHStarGANv2) reduce parameters and storage memory amount while ensuring high
domain translation performance and good image quality as measured by FID and
LPIPS scores. Full code is available at: https://github.com/ispamm/HI2I
Diffusion models for audio semantic communication
Directly sending audio signals from a transmitter to a receiver across a
noisy channel may consume considerable bandwidth and be prone to errors when
trying to recover the transmitted bits. On the contrary, the recent semantic
communication approach proposes to send the semantics and then regenerate
semantically consistent content at the receiver without exactly recovering the
bitstream. In this paper, we propose a generative audio semantic communication
framework that treats the communication problem as an inverse problem, therefore
being robust to different corruptions. Our method transmits lower-dimensional
representations of the audio signal and of the associated semantics to the
receiver, which generates the corresponding signal with a particular focus on
its meaning (i.e., the semantics) thanks to the conditional diffusion model at
its core. During the generation process, the diffusion model restores the
received information from multiple degradations at the same time including
corruption noise and missing parts caused by the transmission over the noisy
channel. We show that our framework outperforms competitors in a real-world
scenario and with different channel conditions. Visit the project page to
listen to samples and access the code:
https://ispamm.github.io/diffusion-audio-semantic-communication/. Comment: Submitted to IEEE ICASSP 202
Macromolecular structural dynamics visualized by pulsed dose control in 4D electron microscopy
Macromolecular conformation dynamics, which span a wide range of time scales, are fundamental to the understanding of properties and functions of their structures. Here, we report direct imaging of structural dynamics of helical macromolecules over the time scales of conformational dynamics (ns to subsecond) by means of four-dimensional (4D) electron microscopy in the single-pulse and stroboscopic modes. With temporally controlled electron dosage, both diffraction and real-space images are obtained without irreversible radiation damage. In this way, the order-disorder transition is revealed for the organic chain polymer. Through a series of equilibrium-temperature and temperature-jump dependencies, it is shown that the metastable structures and entropy of conformations can be mapped in the nonequilibrium region of a “funnel-like” free-energy landscape. The T-jump is introduced through a substrate (a “hot plate” type arrangement) because only the substrate is made to absorb the pulsed energy. These results illustrate the promise of ultrafast 4D imaging for other applications in the study of polymer physics as well as in the visualization of biological phenomena
Semantic Communications Based on Adaptive Generative Models and Information Bottleneck
Semantic communications represent a significant breakthrough with respect to
the current communication paradigm, as they focus on recovering the meaning
behind the transmitted sequence of symbols, rather than the symbols themselves.
In semantic communications, the goal of the destination is not to recover a
list of symbols symbolically identical to the transmitted ones, but rather to
recover a message that is semantically equivalent to the semantic message
emitted by the source. This paradigm shift introduces many degrees of freedom
to the encoding and decoding rules that can be exploited to make the design of
communication systems much more efficient. In this paper, we present an
approach to semantic communication building on three fundamental ideas: 1)
represent data over a topological space as a formal way to capture semantics,
as expressed through relations; 2) use the information bottleneck principle as
a way to identify relevant information and adapt the information bottleneck
online, as a function of the wireless channel state, in order to strike an
optimal trade-off between transmit power, reconstruction accuracy and delay; 3)
exploit probabilistic generative models as a general tool to adapt the
transmission rate to the wireless channel state and make possible the
regeneration of the transmitted images or run classification tasks at the
receiver side. Comment: To appear in IEEE Communications Magazine, special issue on Semantic
Communications: Transmission beyond Shannon, 202
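As background for the information bottleneck principle named in this abstract, its standard Lagrangian form is given below. The symbols are the usual textbook ones rather than the paper's notation: $X$ is the source, $T$ the compressed representation that is transmitted, $Y$ the variable carrying the relevant (semantic) information, and $\beta$ the trade-off weight. How $\beta$ would be tied to the wireless channel state is not specified in the abstract.

```latex
\min_{p(t \mid x)} \; I(X;T) \;-\; \beta\, I(T;Y)
```

A small $\beta$ favors compression (low $I(X;T)$), while a large $\beta$ favors preserving what is relevant to $Y$.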
Smoking E-CigaRette and HEat-noT-burn Products: the SECRHET study, a large observational survey among young people in Italy
Aim: Electronic cigarettes (eCig), which heat a solution (e-liquid) to create vapour, and heated tobacco products (HTP), which heat tobacco at a temperature below the point of combustion, are emerging and widely used forms of smoking device. The aim of this study was to investigate knowledge, attitudes and behaviour toward HTP among young people in Italy.
Methods: The Smoking E-CigaRette and HEat-noT-burn products (SECRHET) study was an online survey carried out in April 2019 using the platform Skuola.net, a platform where 2.5 million students are registered. Questions were related to knowledge about new generation smoking products, such as “Do you know what happens to tobacco when you use a heat-not-burn product?”, “Do you think electronic cigarettes create addiction?”, “Are products that use heated tobacco harmful to health?”, “Are electronic cigarettes harmful to health?”, “Have you ever heard of products that use heated tobacco?”, “Is nicotine present in products that use heated tobacco?”
Results: A total of 13882 people completed the questionnaire, of whom 8056 (58%) were females. Regarding smoking habits, 3393 (24.4%) reported being current cigarette smokers, while 802 (5.8%) and 3173 (22.9%) were current and former e-cigarette smokers, respectively. Moreover, 715 (5.2%) and 1148 (8.3%) reported being current and former heat-not-burn cigarette smokers, respectively. The variables associated with both eCig and HTP use were current smoking, age over 18 years, male gender, and residence in Central and Southern Regions. Concerning knowledge, almost half of respondents believe that electronic cigarettes are addictive and harmful to health. Moreover, most respondents do not know what happens to tobacco when using a heated tobacco device or whether heated tobacco products are harmful to health.
Conclusion: The prevalence of eCig and HTP use is higher among young people in Italy compared to adults and older people, and requires adequate public health interventions.
 
How much do young Italians know about COVID-19 and what are their attitudes toward SARS-CoV-2? Results of a cross-sectional study
Objectives:
At the end of 2019, an outbreak of novel coronavirus pneumonia, called severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was first identified in Wuhan, Hubei Province, China. It subsequently spread throughout China and elsewhere, becoming a global health emergency. In February 2020, the World Health Organization (WHO) designated the disease coronavirus disease 2019 (COVID-19). The objective of this study was to investigate the degree of knowledge of young Italians about COVID-19 and their current attitudes toward SARS-CoV-2, and to determine whether prejudices were emerging toward Chinese people.
Methods:
An online survey was conducted on February 3, 4, and 5, 2020, with the collaboration of the Italian website “Skuola.net”. Young people had the opportunity to participate by answering an ad hoc questionnaire created to investigate knowledge and attitudes about the new coronavirus, using a link published on the homepage.
Results:
A total of 5234 responses were received, of which 3262 were from females and 1972 were from males. Most of the participants showed generally moderate knowledge about COVID-19. Male students, middle school students, and those who do not attend school should increase their awareness of the disease; fewer than half of the respondents said that their attitudes toward the Chinese population had worsened in the recent period.
- …