
293 research outputs found

Predicting dialect variation in immigrant contexts using light verb constructions

Author: Doğruöz A. Seza
Nakov Preslav
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2014

Languages spoken by immigrants change due to contact with the local languages. Capturing these changes is problematic for current language technologies, which are typically developed for speakers of the standard dialect only. Even when dialectal variants are available for such technologies, we still need to predict which dialect is being used. In this study, we distinguish between the immigrant and the standard dialect of Turkish by focusing on Light Verb Constructions. We experiment with a number of grammatical and contextual features, achieving over 84% accuracy (56% baseline).

CiteSeerX

Crossref

Ghent University Academic Bibliography

Effect of grape pomace powder addition on chemical, nutritional and technological properties of cakes

Author: A. Brandolini
A. Hidalgo
G. Nakov
I. Dimov
N. Ivanova
V. Stamatovska
Publication venue: 'Elsevier BV'
Publication date: 01/12/2020

The aim of the research was to study the influence of grape (Vitis vinifera) pomace powder, a by-product of wine manufacturing, on the chemical composition, nutritional properties and physical characteristics of cakes prepared by replacing bread wheat flour with 4%, 6%, 8% and 10% grape pomace powder. The addition of growing quantities of grape pomace powder gradually increased ash, lipid, protein, fibre, free phenolics, anthocyanins and total polyphenol content as well as antioxidant capacity (DPPH, FRAP), while moisture and pH decreased. The main phenolics provided by grape pomace were catechin, gallic acid, quercitin, protocatechuic acid, kaempferol and apigenin. The phenolic acids and flavonoids content increased from 4.1 mg/kg DM (control) to 26.4–60.9 mg/kg DM (cake with 4%–10% grape pomace powder). The colour coordinates L* and a* diminished, while b* increased. The cake containing 4% grape pomace powder showed the best sensory quality. The addition of grape pomace powder significantly improved the content of free phenolics, which are highly bioavailable and scarce in bread wheat, and thus the nutritional value of cakes without penalising their technological and sensorial attributes. Therefore, grape pomace powder utilisation will give foods with nutritionally enhanced properties; additionally, its utilisation will alleviate the ecological problems connected to its disposal.

AIR Universita degli studi di Milano

Overview of the CLEF-2019 CheckThat! Lab: Automatic Identification and Verification of Claims. Task 2: Evidence and Factuality

Author: Barron-Cedeno A.
Elsayed T.
Hasanain M.
Nakov P.
Suwaileh R.
Publication venue: CEUR-WS
Publication date: 01/01/2019

We present an overview of Task 2 of the second edition of the CheckThat! Lab at CLEF 2019. Task 2 asked participants (A) to rank a given set of Web pages with respect to a check-worthy claim based on their usefulness for fact-checking that claim, (B) to classify these same Web pages according to their degree of usefulness for fact-checking the target claim, (C) to identify useful passages from these pages, and (D) to use the useful pages to predict the claim's factuality. Task 2 at CheckThat! provided a full evaluation framework, consisting of data in Arabic (gathered and annotated from scratch) and evaluation based on normalized discounted cumulative gain (nDCG) for ranking, and F1 for classification. Four teams submitted runs. The most successful approach to subtask A used learning-to-rank, while different classifiers were used in the other subtasks. We release to the research community all datasets from the lab as well as the evaluation scripts, which should enable further research in the important task of evidence-based automatic claim verification.

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Dense vs. Sparse representations for news stream clustering

Author: Barron-Cedeno A.
Da San Martino G.
Nakov P.
Staykovski T.
Publication venue: CEUR-WS
Publication date: 01/01/2019

The abundance of news being generated on a daily basis has made it hard, if not impossible, to monitor all news developments. Thus, there is an increasing need for accurate tools that can organize the news for easier exploration. Typically, this means clustering the news stream, and then connecting the clusters into story lines. Here, we focus on the clustering step, using a local topic graph and a community detection algorithm. Traditionally, news clustering was done using sparse vector representations with TF–IDF weighting, but more recently dense representations have emerged as a popular alternative. Here, we compare these two representations, as well as combinations thereof. The evaluation results on a standard dataset show a sizeable improvement over the state of the art both for the standard F1 as well as for a BCubed version thereof, which we argue is more suitable for the task.

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Influence of apple peel powder addition on the physico-chemical characteristics and nutritional quality of bread wheat cookies

Author: A. Brandolini
A. Hidalgo
D.K. Komlenić
G. Nakov
J. Lukinac
M. Jukić
N. Ivanova
Publication venue: 'SAGE Publications'
Publication date: 01/01/2020

Apple peel, a food industry by-product, is rich in fibre, polyphenols and minerals, and is a potentially attractive ingredient for bakery products. To evaluate the effect of enriching wheat cookies with apple peel powder, six types of cookies with increasing apple peel powder percentage (0%, 4%, 8%, 16%, 24% and 32%) were produced. The traits analysed were: pasting parameters; chemical properties (moisture, ash, lipid, protein, fibre and total polyphenols content); antioxidant capacity (2,2-diphenyl-1-picrylhydrazyl and ferric reducing antioxidant power methods); physical attributes (width, thickness, volume and CIE lab colour); and sensory characteristics (external appearance, internal structure, texture, odour, taste and aroma). Statistical analysis included analysis of variance followed by Fisher’s least significant difference test (p<0.05). The apple peel powder-enriched cookies had significantly higher moisture, ash, lipid, fibre, total polyphenols and antioxidant capacity than the control bread wheat cookies. The addition of apple peel powder did not modify the physical characteristics and improved the sensorial quality of the products. The addition of 24% apple peel powder gave the cookies the best overall quality.

AIR Universita degli studi di Milano

Prta: A System to Support the Analysis of Propaganda Techniques in the News

Author: Barron-Cedeno A
Da San Martino G
Nakov P
Shaar S
Yu SH
Zhang YF
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2020

Recent events, such as the 2016 US Presidential Campaign, Brexit and the COVID-19 "infodemic", have brought into the spotlight the dangers of online disinformation. There has been a lot of research focusing on fact-checking and disinformation detection. However, little attention has been paid to the specific rhetorical and psychological techniques used to convey propaganda messages. Revealing the use of such techniques can help promote media literacy and critical thinking, and eventually contribute to limiting the impact of "fake news" and disinformation campaigns. Prta (Propaganda Persuasion Techniques Analyzer) allows users to explore the articles crawled on a regular basis by highlighting the spans in which propaganda techniques occur and to compare them on the basis of their use of propaganda techniques. The system further reports statistics about the use of such techniques, overall and over time, or according to filtering criteria specified by the user based on time interval, keywords, and/or political orientation of the media. Moreover, it allows users to analyze any text or URL through a dedicated interface or via an API. The system is available online: https://www.tanbih.org/prta

Archivio istituzionale della ricerca - Università di Padova

Thread-level information for comment classification in community question answering

Author: Barron-Cedeno A.
Da San Martino G.
Filice S.
Joty S.
Marquez L.
Moschitti A.
Nakov P.
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2015

Community Question Answering (cQA) is a new application of QA in social contexts (e.g., fora). It presents new interesting challenges and research directions, e.g., exploiting the dependencies between the different comments of a thread to select the best answer for a given question. In this paper, we explore two ways of modeling such dependencies: (i) by designing specific features looking globally at the thread; and (ii) by applying structured prediction models. We trained and evaluated our models on data from SemEval-2015 Task 3 on Answer Selection in cQA. Our experiments show that: (i) the thread-level features consistently improve the performance for a variety of machine learning models, yielding state-of-the-art results; and (ii) sequential dependencies between the answer labels captured by structured prediction models are not enough to improve the results, indicating that more information is needed in the joint model.

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Do Peers See More in a Paper Than Its Authors?

Author: Anna Divoli
Marti A. Hearst
Preslav Nakov
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2012

Recent years have shown a gradual shift in the content of biomedical publications that is freely accessible, from titles and abstracts to full text. This has enabled new forms of automatic text analysis and has given rise to some interesting questions: How informative is the abstract compared to the full text? What important information in the full text is not present in the abstract? What should a good summary contain that is not already in the abstract? Do authors and peers see an article differently? We answer these questions by comparing the information content of the abstract to that in citances (sentences containing citations to that article). We contrast the important points of an article as judged by its authors versus as seen by peers. Focusing on the area of molecular interactions, we perform manual and automatic analysis, and we find that the set of all citances to a target article not only covers most information (entities, functions, experimental methods, and other biological concepts) found in its abstract, but also contains 20% more concepts. We further present a detailed summary of the differences across information types, and we examine the effects that other citations and time have on the content of citances.

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims. Task 2: Factuality

Author: Atanasova P.
Barron-Cedeno A.
Da San Martino G.
Elsayed T.
Kyuchukov S.
Marquez L.
Nakov P.
Suwaileh R.
Zaghouani W.
Publication venue: CEUR-WS
Publication date: 01/01/2018

We present an overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims, with focus on Task 2: Factuality. The task asked participants to assess whether a given check-worthy claim made by a politician in the context of a debate/speech is factually true, half-true, or false. In terms of data, we focused on debates from the 2016 US Presidential Campaign, as well as on some speeches during and after the campaign (we also provided translations in Arabic), and we relied on comments and factuality judgments from factcheck.org and snopes.com, which we further refined manually. A total of 30 teams registered to participate in the lab, and five of them actually submitted runs. The most successful approaches used by the participants relied on the automatic retrieval of evidence from the Web. Similarities and other relationships between the claim and the retrieved documents were used as input to classifiers in order to make a decision. The best-performing official submissions achieved a mean absolute error of 0.705 and 0.658 for the English and for the Arabic test sets, respectively. This leaves plenty of room for further improvement, and thus we release all datasets and the scoring scripts, which should enable further research in fact-checking.

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna