50 research outputs found

Making Machines Learn. Applications of Cultural Analytics to the Humanities

Author: de la Rosa Pérez Javier
Publication venue: Scholarship@Western
Publication date: 04/02/2016

The digitization of several million books by Google in 2011 popularized a new kind of humanities research powered by the treatment of cultural objects as data. Culturomics, as it is called, was born, and other initiatives took up the same methodological approach, as is the case with the recently formed Digital Humanities or Cultural Analytics. These new quantitative approaches to culture all borrow techniques and methods developed in the exact sciences, such as computer science, machine learning, or statistics. Numerous studies take advantage of what treating objects as data has to offer for the understanding of the human. The data science now applied to current trends in culture can also be used to study the more traditional humanities. Led by proper intellectual inquiry, an adequate use of technology may bring answers to questions intractable by other means, or add evidence to long-held assumptions based on a canon built from few examples. This dissertation argues in favor of such an approach. Three case studies are considered. First, in the more general sense of big and smart data, we collected and analyzed more than 120,000 pictures of paintings from all periods of art history to gain insight into how the beauty of depicted faces, in the framework of neuroscience and evolutionary theory, has changed over time. A second study covers the nuances of the modes of emotion employed by the Spanish Golden Age playwright Calderón de la Barca to empathize with his audience. By means of sentiment analysis, a technique strongly supported by machine learning, we shed some light on the different fictional characters and on how they interact and convey messages otherwise invisible to the public. The last case is a study of non-traditional authorship attribution techniques applied to the forefather of the modern novel, the Lazarillo de Tormes.
In the end, we conclude that the successful application of cultural analytics and computer science techniques to traditional humanistic endeavours has been enriching and validating.

Scholarship@Western

Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020

Author
Publication venue: 'OpenEdition'
Publication date: 01/07/2022

On behalf of the Program Committee, a very warm welcome to the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020). This edition of the conference is held in Bologna and organised by the University of Bologna. The CLiC-it conference series is an initiative of the Italian Association for Computational Linguistics (AILC) which, after six years of activity, has clearly established itself as the premier national forum for research and development in the fields of Computational Linguistics and Natural Language Processing, where leading researchers and practitioners from academia and industry meet to share their research results, experiences, and challenges

Directory of Open Access Books (DOAB)

Proceedings of the Second Workshop on Computational Modeling of People’s Opinions, Personality, and Emotions in Social Media

Author
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/06/2018

The IT University of Copenhagen's Repository

Leveraging Longitudinal Data for Personalized Prediction and Word Representations

Author: Welch Charles
Publication venue
Publication date: 01/01/2021

This thesis focuses on personalization, word representations, and longitudinal dialog. We first look at users' expressions of individual preferences. In this targeted sentiment task, we find that we can improve entity extraction and sentiment classification using domain lexicons and linear term weighting. This task is important to personalization and dialog systems, as targets need to be identified in conversation and personal preferences affect how the system should react. Then we examine individuals with large amounts of personal conversational data in order to better predict what people will say. We consider extra-linguistic features that can be used to predict behavior and to predict the relationship between interlocutors. We show that these features improve over just using message content and that training on personal data leads to much better performance than training on a sample from all other users. We look not just at using personal data for these end-tasks, but also at constructing personalized word representations. When we have a lot of data for an individual, we create personalized word embeddings that improve performance on language modeling and authorship attribution. When we have limited data, but we have user demographics, we can instead construct demographic word embeddings. We show that these representations improve language modeling and word association performance. When we do not have demographic information, we show that, using a small amount of data from an individual, we can calculate similarity to existing users and interpolate or leverage data from these users to improve language modeling performance. Using these types of personalized word representations, we are able to provide insight into which words vary more across users and demographics. The kind of personalized representations that we introduce in this work allows for applications such as predictive typing, style transfer, and dialog systems. 
Importantly, they also have the potential to enable more equitable language models, with improved performance for those demographic groups that have little representation in the data.
PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/167971/1/cfwelch_1.pd

Deep Blue Documents at the University of Michigan

Unsupervised Pretraining of Neural Networks with Multiple Targets using Siamese Architectures

Author: Bryan Maximilian
Publication venue
Publication date: 08/10/2021

A model's response to a given input pattern depends on the patterns seen in the training data. The larger the amount of training data, the more likely edge cases are covered during training. However, the more complex the input patterns are, the larger the model has to be. For very simple use cases, a relatively small model can achieve very high test accuracy in a matter of minutes. A large model, on the other hand, has to be trained for multiple days. The actual time to develop a model of that size can be considered even greater, since many different architecture types and hyper-parameter configurations often have to be tried. An extreme case of a large model is the recently released GPT-3 model. This model consists of 175 billion parameters and was trained using 45 terabytes of text data. The model was trained to generate text and is able to write news articles and source code based only on a rough description. However, a model like this can only be created by researchers with access to special hardware or immense amounts of data. Thus, it is desirable to find less resource-intensive training approaches to enable other researchers to create well-performing models. This thesis investigates the use of pre-trained models. If a model has been trained on one dataset and is then trained on another, similar dataset, it learns to adjust to similar patterns faster than a model that has not yet seen any of the task's patterns. Thus, the lessons learned in one training run are transferred to another task. During pre-training, the model is trained to solve a specific task such as predicting the next word in a sequence or first encoding an input image before decoding it. Such models contain an encoder and a decoder part. When transferring that model to another task, parts of the model's layers will be removed. 
Consequently, the fewer weights that have to be discarded, the faster training is, since less time is spent on training parts of a model that are only needed to solve an auxiliary task. Throughout this thesis, the concept of siamese architectures will be discussed, since with that architecture no parameters have to be discarded when transferring a model trained with that approach to another task. Thus, the siamese pre-training approach reduces the need for resources like time and energy and drives the development of new models in the direction of Green AI. The models trained with this approach will be evaluated by comparing them to models trained with other pre-training approaches as well as to large existing models. It will be shown that the models trained for the tasks in this thesis perform as well as externally pre-trained models, given the right choice of data and training targets: the number and type of training targets during pre-training impact a model's performance on transfer learning tasks. The use cases presented in this thesis cover data from different domains to show that the siamese training approach is widely applicable. Consequently, researchers are motivated to create their own pre-trained models for data domains for which there are no existing pre-trained models.
[Translated from the German abstract] A model's prediction depends on which patterns are present in the data used during training. The larger the amount of training data, the more likely it is that edge cases occur in the data. However, the larger the number of patterns to be learned, the larger the model has to be. For simple use cases, a small model can be trained in a few minutes and already achieve good results on test data. For complex use cases, however, a correspondingly large model can need up to several days to become sufficiently good. An extreme case of a large model is the recently released GPT-3, which consists of 175 billion parameters and was trained on data on the order of 45 terabytes. The model was trained to generate text and is able to generate news articles based on a rough initial description. Such a model can be developed only by researchers who have access to the corresponding hardware and amounts of data. It is therefore of interest to improve training procedures so that models for complex use cases can be trained even with few available resources. This thesis deals with the pre-training of neural networks. When a neural network has been trained on one dataset and is then trained further on a second dataset, it learns the features of the second dataset faster, since it does not have to learn patterns from scratch but can fall back on what it has already learned. The knowledge is then said to be transferred. During pre-training, a model is often given a task such as, in the case of image data, first compressing the training data and then reconstructing it. For text data, a model could be pre-trained by receiving a sentence as input and then having to predict the next sentence from the source document. Such models accordingly consist of an encoder and a decoder. The disadvantage of this procedure is that the decoder is needed only for pre-training, while only the encoder is needed for the later use case. A central part of this thesis is therefore the investigation of the advantages and disadvantages of the siamese model architecture. This architecture consists only of an encoder, which makes pre-training cheaper because fewer weights have to be trained.
The main scientific contribution is a detailed comparison of the siamese architecture with comparable approaches. Certain disadvantages are found, for example that the choice of a similarity function or the composition of the training data has a large effect on model training. The thesis works out which similarity function is recommended in which contexts, and how other disadvantages of the siamese architecture can be offset by adapting the training targets. The corresponding experiments are carried out on data from different domains to show that the approach is universally applicable. The results from concrete use cases also show that the models developed in this thesis perform similarly well to externally available models that were trained with great resource expenditure. This shows that carefully designed architectures can reduce the resources required.
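
The encoder-only idea described in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the toy one-layer encoder, the choice of cosine similarity, and the names `make_encoder`, `cosine_similarity`, and `siamese_score` are all assumptions made for illustration, and no training step is shown. The point is only that both inputs pass through the same encoder weights and are compared by a similarity function, so there is no decoder to discard afterwards.

```python
import math
import random


def make_encoder(in_dim, out_dim, seed=0):
    """Build a toy one-layer encoder (hypothetical stand-in for a real network)."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
               for _ in range(out_dim)]

    def encode(x):
        # One tanh unit per output dimension; the weights are shared by every call.
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)))
                for row in weights]

    return encode


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def siamese_score(encode, x1, x2):
    """Run both inputs through the SAME encoder and compare the embeddings."""
    return cosine_similarity(encode(x1), encode(x2))
```

In a pre-training setup, a loss on `siamese_score` would push related pairs toward high similarity and unrelated pairs toward low similarity; the abstract notes that the choice of similarity function matters, so cosine similarity here is just one option.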

Qucosa

HSSS - Hochschulschriftenserver der SLUB

Qucosa - Publikationsserver der Universität Leipzig

Proceedings of the Fifth Italian Conference on Computational Linguistics CLiC-it 2018 : 10-12 December 2018, Torino

Author: Alessandro Mazzei
Elena Cabrio
Fabio Tamburini
Publication venue: 'OpenEdition'
Publication date: 01/01/2018

On behalf of the Program Committee, a very warm welcome to the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018). This edition of the conference is held in Torino. The conference is locally organised by the University of Torino and hosted in its prestigious main lecture hall “Cavallerizza Reale”. The CLiC-it conference series is an initiative of the Italian Association for Computational Linguistics (AILC) which, after five years of activity, has clearly established itself as the premier national forum for research and development in the fields of Computational Linguistics and Natural Language Processing, where leading researchers and practitioners from academia and industry meet to share their research results, experiences, and challenges.

Directory of Open Access Books (DOAB)

Tune your brown clustering, please

Author: Bøgh K.S.
Chester S.
Derczynski L.
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2015

Brown clustering, an unsupervised hierarchical clustering technique based on n-gram mutual information, has proven useful in many NLP applications. However, most uses of Brown clustering employ the same default configuration; the appropriateness of this configuration has gone predominantly unexplored. Accordingly, we present information for practitioners on the behaviour of Brown clustering in order to assist hyper-parameter tuning, in the form of a theoretical model of Brown clustering utility. This model is then evaluated empirically in two sequence labelling tasks over two text types. We explore the dynamic between the input corpus size, the chosen number of classes, and the quality of the resulting clusters, which has an impact on any approach using Brown clustering. In every scenario that we examine, our results reveal that the values most commonly used for the clustering are sub-optimal.
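
The mutual-information quantity that the abstract refers to can be sketched as follows. This is an illustrative computation only, not the paper's code and not a full clustering algorithm: it evaluates the average mutual information between the classes of adjacent tokens for a given word-to-class assignment, which is the quantity that Brown clustering's greedy merges try to keep as high as possible. The function name `average_mutual_information` and the toy corpus are assumptions.

```python
import math
from collections import Counter


def average_mutual_information(tokens, word_to_class):
    """Average mutual information (in bits) between classes of adjacent tokens."""
    classes = [word_to_class[t] for t in tokens]
    pair_counts = Counter(zip(classes, classes[1:]))
    total = sum(pair_counts.values())

    # Marginal counts of a class appearing as the left or right element of a bigram.
    left = Counter()
    right = Counter()
    for (c1, c2), n in pair_counts.items():
        left[c1] += n
        right[c2] += n

    ami = 0.0
    for (c1, c2), n in pair_counts.items():
        # p(c1, c2) / (p(c1) * p(c2)) simplifies to n * total / (left * right).
        ami += (n / total) * math.log2(n * total / (left[c1] * right[c2]))
    return ami


# Toy corpus (hypothetical): two assignments with different numbers of classes.
corpus = "a b a b a b".split()
two_classes = average_mutual_information(corpus, {"a": 0, "b": 1})
one_class = average_mutual_information(corpus, {"a": 0, "b": 0})
```

Merging every word into a single class drives the score to zero, while keeping the two alternating words apart preserves the bigram structure. The number of classes is exactly the hyper-parameter whose default value the abstract argues should be tuned.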

White Rose Research Online

XVIII. Magyar Számítógépes Nyelvészeti Konferencia

Author
Publication venue: Szegedi Tudományegyetem TTIK Informatikai Intézet
Publication date: 01/01/2022

University of Szeged

Proceedings of the Fifth Italian Conference on Computational Linguistics CLiC-it 2018

Author: Abramova Ekaterina
Adorni Giovanni
Agrawal Ruchit
Aina Laura
Albanese Teresa
Albanesi Davide
Alzetta Chiara
Amore Matteo
Antonelli Oronzo
Aprosio Alessio Palmero
Balaraman Vevake
Basile Pierpaolo
Basile Valerio
Basili Roberto
Bassignana Elisa
Bellandi Andrea
Bentivogli Luisa
Bernardi Raffaella
Bertoldi Nicola
Bondielli Alessandro
Bos Johan
Bosco Cristina
Bottini Roberto
Brunato Dominique
Buono Maria Pia di
Busso Lucia
Büchler Marco
Cabrio Elena
Caruso Valeria
Caselli Tommaso
Cecchini Flavio
Celli Fabio
Cervone Alessandra
Chesi Cristiano
Chingacham Anupama
Chiriatti Giulia
Cimino Andrea
Cocciu Eleonora
Colla Davide
Comandini Gloria
Cordeiro Silvio Ricardo
Crepaldi Davide
Croce Danilo
Curtoni Paolo
Cutugno Francesco
dell’Oglio Pietro
Dell’Orletta Felice
De Felice Irene
De Martino Maria
Dini Luca
Di Iorio Angelo
Di Nunzio Giorgio Maria
Draetta Lia
Ducceschi Luca
Elia Annibale
Falavigna Daniele
Federico Marcello
Feltracco Anna
Fernández Raquel
Ferro Michele
Fieromonte Martina
Franzini Greta
Gagliardi Gloria
Gala Valentina Della
Gambi Enrico
Ghezzi Ilaria
Giovannetti Emiliano
Gobbi Jacopo
Gretter Roberto
Guarasci Raffaele
Guerini Marco
Gurevych Iryna
Günther Fritz
Herzog Leonardo
Jezek Elisabetta
Koceva Forsina
Lai Mirko
Laudanna Alessandro
Lenci Alessandro
Lepri Bruno
Liano Annarita
Limpens Freddy
Louvan Samuel
Lyding Verena
Magnini Bernardo
Magnolini Simone
Mairano Paolo
Mambrini Francesco
Mana Dario
Mancuso Azzurra
Marchi Simone
Marelli Marco
Marini Costanza
Mazzei Alessandro
McGregor Stephen
Melnikova Elena
Menini Stefano
Mensa Enrico
Merenda Flavio
Mollo Eleonora
Montemagni Simonetta
Monti Johanna
Moretti Giovanni
Moritz Maria
Nadalini Andrea
Negri Matteo
Nicolas Lionel
Nissim Malvina
Novielli Nicole
Okinina Nadezda
Pannitto Ludovica
Paperno Denis
Passalacqua Samuele
Passaro Lucia C.
Passarotti Marco
Patti Viviana
Pecchioli Alessandra
Pellegrini Matteo
Petrolito Ruggero
Pettenati Maria Chiara
Piantanida Giovanni
Poggi Isabella
Porporato Aureliano
Quinci Vito
Radicioni Daniele P.
Ramisch Carlos
Rapp Amon
Riccardi Giuseppe
Rossini Daniele
Rotondi Agata
Ruffolo Paolo
Russo Irene
Sagri Maria Teresa
Sangati Federico
Sanguinetti Manuela
Savary Agata
Savy Renata
Simeoni Rossana
Simi Maria
Sorgente Antonio
Speranza Manuela
Sprugnoli Rachele
Stede Manfred
Stepanov Evgeny A.
Stingo Michele
Tamburini Fabio
Tebbifakhr Amirhossein
Tonelli Sara
Torre Ilaria
Tortoreto Giuliano
Totis Pietro
Trotta Daniela
Turchi Marco
Valeriani Martina
Venturi Giulia
Vezzani Federica
Villata Serena
Vincze Veronika
Zaghi Claudia
Zovato Enrico
Publication venue: 'OpenEdition'
Publication date: 08/04/2019

OpenEdition