    The AEP algorithm for the fast computation of the distribution of the sum of dependent random variables

    We propose a new algorithm to compute numerically the distribution function of the sum of d dependent, non-negative random variables with given joint distribution. Comment: Published at http://dx.doi.org/10.3150/10-BEJ284 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
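    The quantity targeted here can be illustrated with a naive Monte Carlo baseline: estimate P(X1 + ... + Xd <= s) by simulating from some assumed joint distribution. The sketch below is only such a baseline for context, not the AEP algorithm itself; the Clayton copula, the exponential marginals, the parameter theta and the thresholds s are made-up assumptions for the example.

```python
import numpy as np

# Hypothetical illustration: estimate F_S(s) = P(X1 + ... + Xd <= s) for
# d dependent, non-negative risks by plain Monte Carlo.  This is NOT the
# AEP algorithm of the paper; it only shows the quantity being computed.

rng = np.random.default_rng(0)

def sample_clayton_exponential(n, d, theta=2.0):
    """Sample n points from a Clayton copula (Marshall-Olkin construction)
    with unit-exponential marginals.  All parameter choices are illustrative."""
    v = rng.gamma(shape=1.0 / theta, scale=1.0, size=(n, 1))
    e = rng.exponential(scale=1.0, size=(n, d))
    u = (1.0 + e / v) ** (-1.0 / theta)   # Clayton-dependent uniforms
    return -np.log1p(-u)                  # exponential quantile transform

def estimate_cdf_of_sum(s, n=200_000, d=3, theta=2.0):
    """Crude Monte Carlo estimate of P(X1 + ... + Xd <= s)."""
    x = sample_clayton_exponential(n, d, theta)
    return (x.sum(axis=1) <= s).mean()

if __name__ == "__main__":
    for s in (1.0, 3.0, 6.0):
        print(f"P(S <= {s}) ~ {estimate_cdf_of_sum(s):.4f}")
```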

    Transformer Models: From Model Inspection to Applications in Patents

    Natural Language Processing is used to address several tasks, both linguistic ones, e.g. part-of-speech tagging and dependency parsing, and downstream tasks, e.g. machine translation and sentiment analysis. To tackle these tasks, dedicated approaches have been developed over time. A methodology that increases performance on all tasks in a unified manner is language modeling: a model is pre-trained to replace masked tokens in large amounts of text, either randomly within chunks of text or sequentially one after the other, in order to develop general-purpose representations that can be used to improve performance on many downstream tasks at once. The neural network architecture that currently performs this task best is the transformer; moreover, model size and data scale are essential to the development of information-rich representations. The availability of large-scale datasets and the use of models with billions of parameters are currently the most effective path towards better representations of text. However, with large models comes difficulty in interpreting the output they provide. Therefore, several studies have been carried out to investigate the representations provided by transformer models trained on large-scale datasets. In this thesis I investigate these models from several perspectives. I study the linguistic properties of the representations provided by BERT, a language model mostly trained on the English Wikipedia, to understand whether the information it encodes is localized within specific entries of the vector representation. In doing so, I identify special weights that show high relevance to several distinct linguistic probing tasks. Subsequently, I investigate the cause of these special weights and link them to token distribution and special tokens. To complement this general-purpose analysis and extend it to more specific use cases, given the wide range of applications for language models, I study their effectiveness on technical documentation, specifically patents. I use both general-purpose and dedicated models to identify domain-specific entities, such as users of the inventions and technologies, or to segment patent text. I complement every performance analysis with careful measurements of data and model properties, to understand whether the conclusions drawn for general-purpose models hold in this context as well.
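    The masked-token pre-training described above can be seen with an off-the-shelf BERT model. The snippet below is a minimal sketch using the Hugging Face fill-mask pipeline; it is not the thesis's experimental code, and the example sentence is invented.

```python
# Minimal illustration of masked language modeling with BERT: the model is
# asked to recover a masked token from its context.  Uses the Hugging Face
# `transformers` fill-mask pipeline; the sentence is a made-up example, and
# this is not the probing or patent-analysis code from the thesis.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

for prediction in fill_mask("The patent describes a [MASK] for purifying water."):
    print(f"{prediction['token_str']:>12}  score={prediction['score']:.3f}")
```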

    An interview with Giorgio Dall'Aglio

    In the fourth interview of the series, Dependence Modeling presents a conversation with Giorgio Dall'Aglio, an Italian mathematician and probabilist who is internationally acknowledged as one of the main contributors to the theory of Distributions with Given Marginals. In addition to describing his career path and his achievements in mathematics and probability, Giorgio Dall'Aglio portrays several of the milestone mathematicians he met during his long career. In the following, our questions to Giorgio Dall'Aglio are typeset in bold-face.

    Computation of sharp bounds on the distribution of a function of dependent risks

    We propose a new algorithm to compute numerically sharp lower and upper bounds on the distribution of a function of d dependent random variables having fixed marginal distributions. Compared to the existing literature, the bounds are widely applicable, more accurate and more easily obtained.
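    For context, one well-known way to approximate such bounds numerically is the rearrangement idea: discretize each marginal into quantiles and repeatedly reorder each column so that it is oppositely ordered to the sum of the other columns, flattening the row sums. The sketch below illustrates that idea under made-up assumptions (three Pareto(2) risks, level 0.95); it is offered only as background and is not asserted to be the algorithm proposed in the paper.

```python
import numpy as np

def rearrange(X, n_iter=50):
    """Rearrangement sweep: make each column oppositely ordered to the sum
    of the remaining columns, repeating until no column changes.
    X is an (N, d) matrix whose columns are discretized marginals."""
    X = X.copy()
    for _ in range(n_iter):
        changed = False
        for j in range(X.shape[1]):
            others = X.sum(axis=1) - X[:, j]
            # largest entry of column j is paired with the smallest
            # partial sum of the other columns (antimonotonic pairing)
            order = np.argsort(np.argsort(-others))
            new_col = np.sort(X[:, j])[order]
            if not np.array_equal(new_col, X[:, j]):
                X[:, j] = new_col
                changed = True
        if not changed:
            break
    return X

if __name__ == "__main__":
    # Hypothetical example: three Pareto(2) risks, tail above alpha = 0.95.
    N, d, alpha = 1000, 3, 0.95
    p = alpha + (np.arange(N) + 0.5) * (1 - alpha) / N   # tail probabilities
    quantiles = (1 - p) ** (-1 / 2.0) - 1                # Pareto(2) quantile function
    X = np.column_stack([quantiles] * d)
    worst_var = rearrange(X).sum(axis=1).min()
    print(f"approx. worst-case VaR_{alpha}(X1+X2+X3) ~ {worst_var:.2f}")
```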

    Quantum-spacetime scenarios and soft spectral lags of the remarkable GRB130427A

    We process the Fermi-LAT data on GRB130427A using the Fermi Science Tools, and we summarize some of the key facts that render this observation truly remarkable, especially concerning the quality of information on high-energy emission by GRBs. We then exploit this richness for a search of spectral lags, of the type that has recently been of interest for its relevance in quantum-spacetime research. We do find some evidence of systematic soft spectral lags: when confining the analysis to photons of energies greater than 5 GeV, there is an early hard development of minibursts within this long burst. The effect turns out to be well characterized quantitatively by a linear dependence, within such a miniburst, of the detection time on energy. With the guidance of our findings for GRB130427A we can then recognize that some support for these features is noticeable also in earlier Fermi-LAT GRBs, particularly for the presence of hard minibursts whose onset is marked by the highest-energy photon observed for the GRB. A comparison of these features for GRBs at different redshifts provides some encouragement for a redshift dependence of the effects of the type expected for a quantum-spacetime interpretation, but other aspects of the analysis appear to invite an interpretation as intrinsic properties of GRBs.
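    The linear characterization mentioned above, with detection time growing linearly with photon energy within a miniburst, can be written as t = t0 + a*E and fitted by least squares. The photon list in the sketch below is made up for illustration; it is not Fermi-LAT data and this is not the paper's analysis pipeline.

```python
import numpy as np

# Hypothetical illustration of the linear-lag characterization: within a
# "miniburst", fit detection time t as a linear function of photon energy E,
# t = t0 + a * E.  The energies and times below are invented example values.
energy_gev = np.array([5.2, 7.8, 11.3, 15.0, 24.1, 32.5, 47.0, 73.0])
time_s     = np.array([18.4, 18.9, 19.6, 20.1, 21.8, 23.0, 25.6, 30.1])

slope, intercept = np.polyfit(energy_gev, time_s, deg=1)
print(f"fitted lag: t ~ {intercept:.2f} s + {slope:.3f} s/GeV * E")
```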

    How Do BERT Embeddings Organize Linguistic Knowledge?

    Several studies investigated the linguistic information implicitly encoded in Neural Language Models. Most of these works focused on quantifying the amount and type of information available within their internal representations and across their layers. In line with this scenario, we proposed a different study, based on Lasso regression, aimed at understanding how the information encoded by BERT sentence-level representations is arranged within its hidden units. Using a suite of several probing tasks, we showed the existence of a relationship between the implicit knowledge learned by the model and the number of individual units involved in the encodings of this competence. Moreover, we found that it is possible to identify groups of hidden units more relevant for specific linguistic properties. © 2021 Association for Computational Linguistics
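    A minimal sketch of the kind of Lasso-based probing described above: regress a sentence-level property on BERT sentence representations and inspect which hidden units receive non-zero weights. The probed property (sentence length), the tiny corpus and the regularization strength are illustrative assumptions, not the paper's actual probing suite.

```python
import numpy as np
import torch
from sklearn.linear_model import Lasso
from transformers import AutoModel, AutoTokenizer

# Illustrative sketch of Lasso-based probing of BERT sentence embeddings.
# The probed property (sentence length), the corpus and alpha are assumptions
# made for this example, not the setup used in the paper.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

sentences = [
    "The cat sat on the mat .",
    "A quick brown fox jumps over the lazy dog .",
    "She reads .",
    "Researchers trained a large language model on patents .",
]
targets = np.array([len(s.split()) for s in sentences], dtype=float)  # probed property

with torch.no_grad():
    enc = tokenizer(sentences, padding=True, return_tensors="pt")
    cls = model(**enc).last_hidden_state[:, 0, :].numpy()   # [CLS] sentence embeddings

probe = Lasso(alpha=0.1).fit(cls, targets)
relevant_units = np.flatnonzero(probe.coef_)
print(f"{relevant_units.size} hidden units received non-zero weight:",
      relevant_units[:10])
```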