
    Structured Descriptions of Roles, Activities, and Procedures in the Roman Constitution

    A highly structured description of entities and events in histories can support flexible exploration of those histories by users and, ultimately, support richly linked full-text digital libraries. Here, we apply the Basic Formal Ontology (BFO) to structure a passage about the Roman Constitution from Gibbon's Decline and Fall of the Roman Empire. Specifically, we consider the specification of Roles such as Consuls, Activities associated with those Roles, and Procedures for accomplishing those Activities. 6 pages, 2 figures. Presented at the Italian Research Conference on Digital Libraries (IRCDL 2015), Bozen-Bolzano, Italy, 29-30 January 2015.
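    As a concrete illustration of this kind of structuring (a minimal sketch, not the paper's actual ontology), the Python snippet below encodes a Consul role and an associated activity as RDF triples under the published BFO 2.0 class IRIs, using the rdflib library; the ex: entities are hypothetical.

        # Minimal sketch: modelling a Role and the Activity that realizes it
        # with BFO classes. BFO_0000023 = role, BFO_0000015 = process in the
        # BFO 2.0 OBO namespace; the ex: entities are illustrative only.
        from rdflib import Graph, Literal, Namespace, RDFS

        OBO = Namespace("http://purl.obolibrary.org/obo/")
        EX = Namespace("http://example.org/roman-constitution#")

        g = Graph()
        g.bind("obo", OBO)
        g.bind("ex", EX)

        g.add((EX.ConsulRole, RDFS.subClassOf, OBO.BFO_0000023))
        g.add((EX.ConsulRole, RDFS.label, Literal("consul role")))
        g.add((EX.ConveningTheSenate, RDFS.subClassOf, OBO.BFO_0000015))

        # BFO_0000054 ("realized in"): the Consul role is realized in the
        # activity of convening the Senate (simplified: stated directly
        # between classes here for brevity; BFO relates instances).
        g.add((EX.ConsulRole, OBO.BFO_0000054, EX.ConveningTheSenate))

        print(g.serialize(format="turtle"))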

    Advances in Automatic Keyphrase Extraction

    The main purpose of this thesis is to analyze and propose new improvements in the field of Automatic Keyphrase Extraction (AKE), i.e., the field of automatically detecting the key concepts in a document. We discuss, in particular, supervised machine learning algorithms for keyphrase extraction, first identifying their shortcomings and then proposing new techniques which exploit contextual information to overcome them. Keyphrase extraction requires that the key concepts, or "keyphrases", appear verbatim in the body of the document. We identify the fact that current algorithms do not use contextual information when detecting keyphrases as one of the main shortcomings of supervised keyphrase extraction. Instead, statistical and positional cues, like the frequency of the candidate keyphrase or the position of its first appearance in the document, are mainly used to determine whether a phrase appearing in a document is a keyphrase or not. For this reason, we show that a supervised keyphrase extraction algorithm using only statistical and positional features is able to extract good keyphrases from documents written in languages that it has never seen. The algorithm is trained on a common dataset for the English language and on a purpose-collected dataset for the Arabic language, and evaluated on the Italian, Romanian, and Portuguese languages as well. This result is then used as a starting point to develop new algorithms that use contextual information to increase the performance of automatic keyphrase extraction. The first algorithm we present uses new linguistic features based on anaphora resolution, a field of natural language processing that exploits the relations between elements of the discourse, e.g., pronouns. We evaluate several supervised AKE pipelines based on these features on the well-known SemEval 2010 dataset, and we show that performance increases when such features are added to a model that employs statistical and positional knowledge only. Finally, we investigate the possibilities offered by the field of Deep Learning, proposing six different deep neural networks that perform automatic keyphrase extraction. These networks are based on bidirectional long short-term memory (LSTM) networks, on convolutional neural networks, or on a combination of both, together with a neural language model which creates a vector representation of each word of the document. They are able to learn new features using the whole document when extracting keyphrases, and they have the advantage of not needing a reference corpus, once trained, to extract keyphrases from new documents. We show that with deep learning-based architectures we are able to outperform several other keyphrase extraction algorithms from the literature, both supervised and unsupervised, and that the best performance is obtained when we build an additional neural representation of the input document and append it to the neural language model. Both the anaphora-based and the deep learning-based approaches show that using contextual information improves the performance of supervised algorithms for automatic keyphrase extraction.
    In fact, among the methods presented in this thesis, the algorithms which obtain the best performance are the ones receiving the most contextual information, both about the relations of the potential keyphrase with other parts of the document, as in the anaphora-based approach, and in the form of a neural representation of the input document, as in the deep learning approach. In contrast, using statistical and positional knowledge only allows the building of language-agnostic keyphrase extraction algorithms, at the cost of decreased precision and recall.
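    To make the statistical and positional cues concrete, here is a minimal Python sketch that computes the two features named above, the frequency of a candidate phrase and the relative position of its first appearance; it illustrates the idea only and is not the thesis's actual pipeline (the function name is ours).

        import re

        def positional_statistical_features(document: str, candidate: str):
            """Frequency and first-appearance position of a candidate phrase."""
            tokens = re.findall(r"\w+", document.lower())
            cand = candidate.lower().split()
            n = len(cand)
            # Keyphrases must appear verbatim, so we look for exact token runs.
            hits = [i for i in range(len(tokens) - n + 1)
                    if tokens[i:i + n] == cand]
            if not hits:
                return None
            return {
                "frequency": len(hits),                   # statistical cue
                "first_position": hits[0] / len(tokens),  # positional cue, 0 = start
            }

        doc = ("Keyphrase extraction finds key concepts. "
               "Keyphrase extraction needs no language-specific resources.")
        print(positional_statistical_features(doc, "keyphrase extraction"))
        # {'frequency': 2, 'first_position': 0.0}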

    Knowledge-Based Techniques for Scholarly Data Access: Towards Automatic Curation

    Accessing up-to-date, quality scientific literature is a critical preliminary step in any research activity. Identifying the scholarly literature relevant to a given task or application is, however, a complex and time-consuming activity. Despite the large number of tools developed over the years to support scholars in surveying the literature, such as Google Scholar, Microsoft Academic Search, and others, the best way to access quality papers remains asking a domain expert who is actively involved in the field and knows its research trends and directions. State-of-the-art systems, in fact, either do not allow exploratory search, such as identifying the active research directions within a given topic, or do not offer proactive features, such as content recommendation, both of which are critical to researchers. To overcome these limitations, we strongly advocate a paradigm shift in the development of scholarly data access tools: moving from traditional information retrieval and filtering tools towards automated agents able to make sense of the textual content of published papers and thereby monitor the state of the art. Building such a system is, however, a complex task that implies tackling non-trivial problems in the fields of Natural Language Processing, Big Data Analysis, User Modelling, and Information Filtering. In this work, we introduce the concept of an Automatic Curator System and present its fundamental components.
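    As a rough illustration of one such component (a hedged sketch, not the system described in this work), proactive content recommendation can be approximated by ranking paper abstracts against a profile of the user's interests with TF-IDF and cosine similarity; the corpus and profile below are invented.

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity

        abstracts = [  # hypothetical paper abstracts
            "Supervised keyphrase extraction with positional features.",
            "Deep neural networks for sequence labelling tasks.",
            "Ontology-based modelling of historical texts.",
        ]
        user_profile = "keyphrase extraction using neural networks"

        vectorizer = TfidfVectorizer()
        doc_matrix = vectorizer.fit_transform(abstracts)
        profile_vec = vectorizer.transform([user_profile])

        # Rank papers by similarity to the user's research interests.
        scores = cosine_similarity(profile_vec, doc_matrix)[0]
        for score, text in sorted(zip(scores, abstracts), reverse=True):
            print(f"{score:.2f}  {text}")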

    Trustworthiness in Social Big Data Incorporating Semantic Analysis, Machine Learning and Distributed Data Processing

    This thesis presents several state-of-the-art approaches constructed for the purpose of (i) studying the trustworthiness of users in Online Social Network platforms, (ii) deriving concealed knowledge from their textual content, and (iii) classifying and predicting the domain knowledge of users and their content. The developed approaches are refined through proof-of-concept experiments, several benchmark comparisons, and appropriate and rigorous evaluation metrics to verify and validate their effectiveness and efficiency and, hence, those of the applied frameworks.

    Collaborative Research Practices and Shared Infrastructures for Humanities Computing

    The volume collects the proceedings of the 2nd Annual Conference of the Italian Association for Digital Humanities (AIUCD 2013), which took place at the Department of Information Engineering of the University of Padua, 11-12 December 2013. The general theme of AIUCD 2013 was “Collaborative Research Practices and Shared Infrastructures for Humanities Computing”, so we particularly welcomed submissions on interdisciplinary work and new developments in the field, encouraging proposals relating to the theme of the conference or, more specifically, to: interdisciplinarity and multidisciplinarity; legal and economic issues; tools and collaborative methodologies; measurement and impact of collaborative methodologies; methods and approaches for sharing and collaboration; cultural institutions and collaborative facilities; infrastructures and digital libraries as collaborative environments; and the sharing of data resources and technologies.

    Designing a Library of Components for Textual Scholarship

    This work addresses and describes topics related to the application of new technologies, computational methodologies, and software design aimed at developing innovative tools for the Digital Humanities (DH), an area of study characterized by strong interdisciplinarity and continuous evolution. In particular, this contribution defines specific requirements relating to the domain of Literary Computing and to the field of Digital Textual Scholarship. Consequently, the main processing context deals with documents written in Latin, Greek, and Arabic, as well as texts in modern languages dealing with historical and philological themes. The research activity focuses on the design of a modular library (TSLib) able to operate on sources of high cultural value, in order to edit, process, compare, analyze, visualize, and search them. The thesis is organized into five chapters. Chapter 1 summarizes the context of the application domain and provides an overview of the objectives and benefits of the research. Chapter 2 illustrates some important related works and initiatives, together with a brief overview of the most significant results achieved in the DH field. Chapter 3 carefully traces and motivates the design process that was developed. It begins with a description of the technical principles adopted and shows how they are applied to the domain of interest. The chapter continues by defining the requirements, the architecture, and the model of the proposed method. Aspects concerning design patterns and the design of Application Programming Interfaces (APIs) are thus highlighted and discussed. The final part of the work (Chapter 4) presents the results obtained from concrete research projects which, on the one hand, contributed to the design of the library and, on the other, were able to exploit its developments. Several topics are discussed: (a) text acquisition and encoding, (b) alignment and management of textual variants, and (c) multi-level annotations. The thesis closes with some reflections and considerations, also indicating possible directions for future investigation (Chapter 5).
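    By way of illustration only (TSLib's actual interfaces are not given in this abstract), a stand-off, multi-level annotation structure of the kind such a library must support can be sketched in a few lines of Python; every name below is hypothetical.

        from dataclasses import dataclass, field

        @dataclass(frozen=True)
        class Annotation:
            start: int   # character offsets into the base text (stand-off)
            end: int
            layer: str   # e.g. "morphology", "apparatus", "translation"
            value: str

        @dataclass
        class AnnotatedText:
            base: str
            annotations: list[Annotation] = field(default_factory=list)

            def annotate(self, start: int, end: int, layer: str, value: str) -> None:
                assert 0 <= start < end <= len(self.base), "span out of range"
                self.annotations.append(Annotation(start, end, layer, value))

            def layer_view(self, layer: str) -> list[tuple[str, str]]:
                # Pair each annotated span of the base text with its value.
                return [(self.base[a.start:a.end], a.value)
                        for a in self.annotations if a.layer == layer]

        text = AnnotatedText("arma virumque cano")
        text.annotate(0, 4, "morphology", "noun, accusative plural")
        text.annotate(14, 18, "translation", "I sing")
        print(text.layer_view("morphology"))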

    Atti del IX Convegno Annuale dell'Associazione per l'Informatica Umanistica e la Cultura Digitale (AIUCD). La svolta inevitabile: sfide e prospettive per l'Informatica Umanistica

    Proceedings of the ninth edition of the annual AIUCD conference.

    Atti del IX Convegno Annuale AIUCD. La svolta inevitabile: sfide e prospettive per l'Informatica Umanistica.

    The ninth edition of the annual conference of the Associazione per l'Informatica Umanistica e la Cultura Digitale (AIUCD 2020; Milan, 15-17 January 2020) has as its theme “The Inevitable Turn: Challenges and Perspectives for Digital Humanities” (“La svolta inevitabile: sfide e prospettive per l'Informatica Umanistica”), with the specific aim of providing an opportunity to reflect on the consequences of the increasingly widespread computational approach to the treatment of data in the humanities. This volume collects the papers presented at the conference. In different ways, they address the proposed theme from points of view that range from the theoretical-methodological to the empirical-practical, presenting the results of completed or ongoing works and projects in which the computational treatment of data plays a central role.