Universal Captioner: Inducing Content-Style Separation in Vision-and-Language Model Training
While captioning models have obtained compelling results in describing
natural images, there is a growing effort to increase their capability of
dealing with real-world concepts. In this paper, we address the task of
generating fluent descriptions by training on a non-uniform combination of data
sources, containing both human- and automatically-collected captions. To this
end, we propose a model which induces a separation between content and
descriptive style through the incorporation of stylistic parameters and
keywords extracted from large-scale multi-modal models as pivotal data. In
terms of visual features, our model avoids the need for object detectors and
employs grid-like features together with a single prompt language modeling
objective. Experimentally, we consistently outperform existing methods in terms
of caption quality and capability of describing out-of-domain concepts.
Finally, our model obtains a new state of the art on both the COCO and nocaps
benchmarks.
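The abstract mentions conditioning generation on stylistic parameters and keywords extracted from large-scale multi-modal models. A minimal sketch of what such keyword-and-style prompting could look like (the token names and prompt layout here are illustrative assumptions, not the paper's actual format):

```python
# Hypothetical prompt construction: a style token plus detected keywords are
# prepended to the caption slot before language modeling (format is assumed).
def build_prompt(keywords, style="<factual>"):
    return f"{style} keywords: {', '.join(keywords)} caption:"

print(build_prompt(["dog", "frisbee", "park"]))
# <factual> keywords: dog, frisbee, park caption:
```

At inference time, swapping the style token would then steer the descriptive style while the keywords pin down the content.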
Efficient yet Competitive Speech Translation: FBK@IWSLT2022
The primary goal of the FBK systems submitted to the IWSLT 2022 offline and simultaneous speech translation tasks is to reduce model training costs without sacrificing translation quality. To this end, we first question the need for ASR pre-training, showing that it is not essential to achieve competitive results. Second, we focus on data filtering, showing that a simple method that looks at the ratio between source and target characters yields a quality improvement of 1 BLEU. Third, we compare different methods to reduce the detrimental effect of the audio segmentation mismatch between training data, which is manually segmented at sentence level, and inference data, which is automatically segmented. Towards the same goal of reducing training cost, we participate in the simultaneous task with the same model trained for offline ST. The effectiveness of our lightweight training strategy is shown by the high score obtained on the MuST-C en-de corpus (26.7 BLEU) and is confirmed in high-resource data conditions by a 1.6 BLEU improvement on the IWSLT2020 test set over last year's winning system.
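The character-ratio data filtering described above can be sketched in a few lines. The function name and thresholds below are illustrative assumptions; the abstract only states that pairs are filtered by the ratio between source and target character counts:

```python
# Hypothetical sketch of character-ratio filtering for parallel data:
# keep a sentence pair only if len(src)/len(tgt) falls in [low, high].
def char_ratio_ok(src: str, tgt: str, low: float = 0.5, high: float = 2.0) -> bool:
    if not src or not tgt:
        return False  # drop empty sides outright
    ratio = len(src) / len(tgt)
    return low <= ratio <= high

pairs = [
    ("Hello world", "Hallo Welt"),           # plausible pair -> kept
    ("Hi", "Das ist ein sehr langer Satz"),  # length mismatch -> filtered
]
kept = [p for p in pairs if char_ratio_ok(*p)]
print(len(kept))  # 1
```

Despite its simplicity, this kind of length-ratio heuristic is a common first-pass filter for noisy crawled corpora.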
LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language
Large Language Models are state-of-the-art linguistic models designed
to equip computers with the ability to comprehend natural language. With its
exceptional capacity to capture complex contextual relationships, the LLaMA
(Large Language Model Meta AI) family represents a novel advancement in the
field of natural language processing, releasing foundational models that
improve the natural language understanding abilities of the transformer
architecture thanks to their large number of trainable parameters (7, 13, and
70 billion). In many natural language understanding tasks, these models match
the performance of proprietary models such as OpenAI's ChatGPT, with the
advantage of making weights and code publicly available for research and
commercial use. In this work, we investigate the possibility of Language
Adaptation for LLaMA models, focusing explicitly on the challenge of Italian
language coverage. Adopting an open science approach, we explore various
tuning approaches to ensure high-quality Italian text generation suitable for
common tasks in this language, which is underrepresented in the original
models' training data. We aim to release effective text generation models
with strong linguistic properties for many tasks that remain challenging for
multilingual or general-purpose LLMs. By leveraging an open science philosophy,
this study contributes to Language Adaptation strategies for the Italian
language by introducing the novel LLaMAntino family of Italian LLMs.
Compositional Semantic Mix for Domain Adaptation in Point Cloud Segmentation
Deep-learning models for 3D point cloud semantic segmentation exhibit limited
generalization capabilities when trained and tested on data captured with
different sensors or in varying environments due to domain shift. Domain
adaptation methods can be employed to mitigate this domain shift, for instance,
by simulating sensor noise, developing domain-agnostic generators, or training
point cloud completion networks. Often, these methods are tailored for range
view maps or necessitate multi-modal input. In contrast, domain adaptation in
the image domain can be executed through sample mixing, which emphasizes input
data manipulation rather than employing distinct adaptation modules. In this
study, we introduce compositional semantic mixing for point cloud domain
adaptation, representing the first unsupervised domain adaptation technique for
point cloud segmentation based on semantic and geometric sample mixing. We
present a two-branch symmetric network architecture capable of concurrently
processing point clouds from a source domain (e.g. synthetic) and point clouds
from a target domain (e.g. real-world). Each branch operates within one domain
by integrating selected data fragments from the other domain and utilizing
semantic information derived from source labels and target (pseudo) labels.
Additionally, our method can leverage a limited number of human point-level
annotations (semi-supervised) to further enhance performance. We assess our
approach in both synthetic-to-real and real-to-real scenarios using LiDAR
datasets and demonstrate that it significantly outperforms state-of-the-art
methods in both unsupervised and semi-supervised settings.Comment: TPAMI. arXiv admin note: text overlap with arXiv:2207.0977
Integrated Reporting for SMEs (Il bilancio integrato per le PMI)
Alongside financial and manufactured capital, every firm also builds its
business and its success on intangible resources such as intellectual capital,
human capital, social and relational capital, and natural capital. Traditional
financial statements, however, are not suited to assessing and representing
these resources, since they were conceived for an industrial economy based
almost exclusively on tangible capital. Therefore, also with regard to SMEs,
new tools and indicators for measurement and reporting are needed today,
capable of capturing and valuing the intangible components of a company's
capital as well. In this context, the integrated report stands as an advanced
form of corporate communication, aimed at illustrating how strategy,
governance, business model, stakeholder relations, past performance and future
prospects, risks and opportunities enable even a small or medium-sized
enterprise to create value in the short, medium and long term.
Machine learning galaxy properties from 21 cm lightcones: impact of network architectures and signal contamination
Imaging the cosmic 21 cm signal will map out the first billion years of our Universe. The resulting 3D lightcone (LC) will encode the properties of the unseen first galaxies and physical cosmology. Unfortunately, there is no obvious summary statistic to use when interpreting this non-Gaussian image, and the commonly-used power spectrum may waste valuable information. Here we build on previous work using Convolutional Neural Networks (CNNs) to infer astrophysical parameters directly from 21 cm LC images. Guided by the properties of LCs, we combine recurrent layers characterizing evolution along the redshift axis with 2D convolutional layers characterizing local correlations in the sky-plane. Such Recurrent Neural Networks (RNNs) are known for efficiently learning temporal correlations in sequential data. Using a large database of simulated cosmic 21 cm LCs, we confirm that RNNs outperform previously-used CNNs in recovering UV and X-ray galaxy properties, reducing the mean squared parameter estimation error by factors of . We also corrupt the cosmic signal by adding noise expected from a 1000 h integration with the Square Kilometre Array, as well as excising a foreground-contaminated ''horizon wedge''. Parameter prediction errors increase when the NNs are trained on these contaminated LC images, though recovery is still good even in the most pessimistic case (with ). However, we find no notable differences in performance between network architectures on the contaminated images. We argue this is due to the size of our dataset, highlighting the need for larger datasets and/or better data augmentation in order to maximize the potential of NNs in 21 cm parameter estimation.
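The hybrid architecture described above, per-slice 2D convolutions in the sky-plane feeding a recurrence along the redshift axis, can be sketched schematically. Everything below (the pooling, the linear recurrence, the weights) is a toy stand-in for the trained layers, intended only to show the data flow over a (n_z, H, W) lightcone:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv_feature(slice2d, kernel):
    # valid 2D cross-correlation followed by global average pooling,
    # standing in for a trained convolutional feature extractor
    H, W = slice2d.shape
    kh, kw = kernel.shape
    out = np.array([
        [(slice2d[i:i + kh, j:j + kw] * kernel).sum()
         for j in range(W - kw + 1)]
        for i in range(H - kh + 1)
    ])
    return out.mean()  # one scalar feature per redshift slice

def run_lightcone(lc, kernel, w_in=0.5, w_rec=0.9):
    # toy linear recurrence along the redshift axis (in place of an RNN cell)
    h = 0.0
    for z_slice in lc:               # iterate over redshift slices
        f = conv_feature(z_slice, kernel)
        h = w_rec * h + w_in * f     # recurrent state update
    return h                          # summary fed to parameter regression

lc = rng.normal(size=(8, 16, 16))    # toy lightcone: (n_z, H, W)
kernel = np.ones((3, 3)) / 9.0
print(run_lightcone(lc, kernel))
```

A real implementation would stack several convolutional layers per slice and use a gated recurrent cell, but the slice-then-sequence ordering is the point: spatial correlations are summarized first, and evolution with redshift is modeled on top of those summaries.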