20 research outputs found

    Using Crowdsourcing for Fine-Grained Entity Type Completion in Knowledge Bases

    Recent years have witnessed the proliferation of large-scale Knowledge Bases (KBs). However, many entities in KBs have incomplete type information, and some are entirely untyped. Even worse, fine-grained types (e.g., BasketballPlayer), which carry rich semantic meaning, are more likely to be incomplete, as they are harder to obtain. Existing machine-based algorithms use the predicates of entities (e.g., birthPlace) to infer their missing types, but these predicates may be insufficient to infer fine-grained types. In this paper, we use crowdsourcing to solve the problem and address the challenge of controlling crowdsourcing cost. To this end, we propose a hybrid machine-crowdsourcing approach for fine-grained entity type completion. It first determines the types of some “representative” entities via crowdsourcing and then infers the types of the remaining entities from the crowdsourcing results. To support this approach, we first propose an embedding-based influence for type inference that considers not only the distance between entity embeddings but also the distances between entity and type embeddings. Second, we propose a new difficulty model for entity selection that better captures the uncertainty of the machine algorithm when identifying entity types. We demonstrate the effectiveness of our approach through experiments on real crowdsourcing platforms. The results show that our method outperforms state-of-the-art algorithms, improving the effectiveness of fine-grained type completion at affordable crowdsourcing cost.
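    The embedding-based influence the abstract describes could be pictured with a minimal sketch like the one below. All vectors, the exponential scoring formula, the alpha weighting, and the threshold are illustrative assumptions, not the authors' actual model:

```python
import math

def type_score(entity_vec, candidate_vec, type_vec, alpha=0.5):
    """Toy influence score combining the two distances the abstract
    mentions: entity-to-entity and entity-to-type. The exponential
    form and the alpha weighting are illustrative assumptions."""
    d_entity = math.dist(entity_vec, candidate_vec)  # entity embedding distance
    d_type = math.dist(candidate_vec, type_vec)      # entity-type embedding distance
    # Closer in both spaces -> higher influence
    return math.exp(-(alpha * d_entity + (1 - alpha) * d_type))

# A crowdsourced "representative" entity propagates its verified type
# to a nearby untyped entity when the score clears a threshold.
rep = [0.9, 0.1]            # entity labelled BasketballPlayer by the crowd
unknown = [0.85, 0.15]      # untyped entity close to it in embedding space
t_basketball = [1.0, 0.0]   # embedding of the BasketballPlayer type
propagate = type_score(rep, unknown, t_basketball) > 0.5  # True here
```

    In this toy setup, an entity that is close both to the crowd-labelled representative and to the type embedding receives a high score and inherits the type.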

    Data-Driven RDF Property Semantic-Equivalence Detection Using NLP Techniques

    DBpedia extracts most of its data from Wikipedia’s infoboxes. Manually created “mappings” link infobox attributes to DBpedia ontology properties (dbo properties), producing the most widely used DBpedia triples. However, infobox attributes without a mapping produce triples with properties in a different namespace (dbp properties). In this position paper we point out that (a) the number of triples containing dbp properties is significant compared to triples containing dbo properties for the DBpedia instances analyzed, (b) the SPARQL queries made by users barely use dbp and dbo properties simultaneously, and (c) as an exploitation example, we show a method to automatically enhance SPARQL queries by using syntactic and semantic similarities between dbo properties and dbp properties.
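    As a rough illustration of point (c), a purely syntactic similarity between property names might look like the sketch below. The function names, the threshold, and the use of `difflib` are assumptions; the paper also uses semantic similarity, which is omitted here:

```python
from difflib import SequenceMatcher

def property_similarity(dbp_name, dbo_name):
    """Crude syntactic similarity between a raw infobox attribute
    name (dbp) and an ontology property name (dbo)."""
    return SequenceMatcher(None, dbp_name.lower(), dbo_name.lower()).ratio()

def expand_query_properties(dbp_name, dbo_vocab, threshold=0.8):
    """Suggest dbo properties to query alongside a dbp property, so a
    SPARQL query retrieves triples from both namespaces."""
    return [p for p in dbo_vocab if property_similarity(dbp_name, p) >= threshold]

expand_query_properties("birthplace", ["birthPlace", "deathPlace", "team"])
# → ['birthPlace']
```

    A query using `dbp:birthplace` could then be rewritten to also match `dbo:birthPlace`, retrieving results from both namespaces.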

    Proceedings of the Fifth Italian Conference on Computational Linguistics CLiC-it 2018

    On behalf of the Program Committee, a very warm welcome to the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018). This edition of the conference is held in Torino. The conference is locally organised by the University of Torino and hosted in its prestigious main lecture hall “Cavallerizza Reale”. The CLiC-it conference series is an initiative of the Italian Association for Computational Linguistics (AILC) which, after five years of activity, has clearly established itself as the premier national forum for research and development in the fields of Computational Linguistics and Natural Language Processing, where leading researchers and practitioners from academia and industry meet to share their research results, experiences, and challenges.

    Neural Text Simplification in Low-Resource Conditions Using Weak Supervision

    Neural text simplification has gained increasing attention in the NLP community thanks to recent advancements in deep sequence-to-sequence learning. Most recent efforts with such a data-demanding paradigm have dealt with the English language, for which sizeable training datasets are currently available to deploy competitive models. Similar improvements on less resource-rich languages are conditioned either on intensive manual work to create training data, or on the design of effective automatic generation techniques to bypass the data acquisition bottleneck. Inspired by the machine translation field, in which synthetic parallel pairs generated from monolingual data yield significant improvements to neural models, in this paper we exploit large amounts of heterogeneous data to automatically select simple sentences, which are then used to create synthetic simplification pairs. We also evaluate other solutions, such as oversampling and the use of external word embeddings fed to the neural simplification system. Our approach is evaluated on Italian and Spanish, for which only a few thousand gold sentence pairs are available. The results show that these techniques yield performance improvements over a baseline sequence-to-sequence configuration.
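    The automatic selection of simple sentences could, for instance, rest on surface heuristics like the sketch below. The length threshold and the frequent-word ratio are illustrative assumptions, not the paper's actual criteria:

```python
def is_simple(sentence, max_words=12, common_words=None):
    """Heuristic filter for 'simple' sentences, in the spirit of the
    automatic selection step the abstract describes."""
    words = sentence.lower().rstrip(".").split()
    if len(words) > max_words:
        return False
    if common_words is not None:
        # Require most tokens to come from a frequent-word list
        known = sum(w in common_words for w in words)
        return known / len(words) >= 0.8
    return True

corpus = [
    "The cat sat on the mat.",
    "Notwithstanding the aforementioned considerations, the committee "
    "elected to defer ratification of the proposal pending further deliberation.",
]
simple = [s for s in corpus if is_simple(s)]  # keeps only the first sentence
```

    Sentences passing such a filter would then be paired with complex counterparts to form synthetic simplification pairs for training.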

    From Conditional Random Field (CRF) to Rhetorical Structure Theory (RST): incorporating context information in sentiment analysis

    This paper investigates a method based on Conditional Random Fields (CRFs) that incorporates sentence structure (syntax and semantics) and context information to identify the sentiment of sentences. It also demonstrates the usefulness of Rhetorical Structure Theory (RST) by taking into consideration the discourse role of text segments. The paper’s aim is thus to reconsider the effectiveness of CRF and RST methods in incorporating contextual information into Sentiment Analysis systems. Both methods are evaluated on two information sources that differ in size and genre: the Movie Review Dataset and the Fine-grained Sentiment Dataset (FSD). Finally, we discuss the lessons learned from these experimental settings with respect to the following key research questions: whether there is an appropriate type of social media repository for incorporating contextual information, whether extending the pool of selected features could improve context incorporation in SA systems, and which feature combination performs best in achieving such improvement.
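    The way a CRF lets neighbouring sentences influence each other's sentiment labels can be illustrated with a tiny Viterbi decoder over per-sentence scores. The labels, emission scores, and hand-set transition weights below are all made up for illustration; a real CRF would learn them from features:

```python
def viterbi(emissions, transition):
    """Decode the best label sequence given per-step emission scores
    and pairwise transition scores, as a linear-chain CRF would."""
    labels = list(transition)
    # best[i][y] = score of the best label sequence ending in y at step i
    best = [dict(emissions[0])]
    back = []
    for em in emissions[1:]:
        prev = best[-1]
        cur, ptr = {}, {}
        for y in labels:
            p = max(labels, key=lambda x: prev[x] + transition[x][y])
            cur[y] = prev[p] + transition[p][y] + em[y]
            ptr[y] = p
        best.append(cur)
        back.append(ptr)
    y = max(labels, key=lambda x: best[-1][x])
    path = [y]
    for ptr in reversed(back):
        y = ptr[y]
        path.append(y)
    return path[::-1]

# Three sentences: the middle one is ambiguous on its own, but the
# transition scores (favouring label continuity) pull it to "pos".
emissions = [
    {"pos": 2.0, "neg": 0.0},
    {"pos": 1.0, "neg": 1.0},   # ambiguous in isolation
    {"pos": 2.0, "neg": 0.0},
]
transition = {"pos": {"pos": 0.5, "neg": -0.5},
              "neg": {"pos": -0.5, "neg": 0.5}}
viterbi(emissions, transition)  # → ['pos', 'pos', 'pos']
```

    This is the core mechanism by which context (here, neighbouring labels) disambiguates a sentence that a per-sentence classifier would have to guess on.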

    Opinion Mining with a Clause-Based Approach

    With various social media and commercial platforms, users express their opinions about products in textual form. Automatically extracting the polarity of a user’s opinion (i.e., whether it is positive or negative) can be useful for both actors: the online platform, which can incorporate the feedback to improve its product, and the client, who might get recommendations according to his or her preferences. Different approaches for tackling the problem have been suggested, mainly using syntactic features. The “Challenge on Semantic Sentiment Analysis” aims to go beyond word-level analysis by using semantic information. In this paper we propose a novel approach that employs the semantic information of a grammatical unit, the proposition. We try to derive the target of the review from the summary information, which serves as input to identify the proposition containing it. Our implementation relies on the hypothesis that the proposition expressing the target of the summary usually contains the main polarity information.
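    The clause-level idea, scoring only the part of the review that mentions the target rather than the whole text, can be sketched as follows. The sentiment lexicon, the clause splitting on punctuation, and all names are illustrative assumptions, not the authors' method:

```python
import re

POS = {"great", "excellent", "good", "love"}   # toy sentiment lexicon
NEG = {"bad", "poor", "terrible", "hate"}

def clause_polarity(review, target):
    """Score only the clause that mentions the target, so mixed
    reviews do not blur the polarity of a specific aspect."""
    clauses = re.split(r"[,;.]", review.lower())
    for clause in clauses:
        if target.lower() in clause:
            words = clause.split()
            return sum(w in POS for w in words) - sum(w in NEG for w in words)
    return 0  # target not mentioned

review = "The battery is terrible, but the screen is excellent."
clause_polarity(review, "screen")   # → 1
clause_polarity(review, "battery")  # → -1
```

    A whole-review lexicon count would score this review as neutral; restricting the count to the target's clause recovers a polarity per aspect.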