A Multilingual Study of Compressive Cross-Language Text Summarization
Cross-Language Text Summarization (CLTS) generates summaries in a language
different from the language of the source documents. Recent methods use
information from both languages to generate summaries with the most informative
sentences. However, these methods have performance that can vary according to
languages, which can reduce the quality of summaries. In this paper, we propose
a compressive framework to generate cross-language summaries. In order to
analyze performance and especially stability, we tested our system and
extractive baselines on a dataset available in four languages (English, French,
Portuguese, and Spanish) to generate English and French summaries. An automatic
evaluation showed that our method outperformed extractive state-of-the-art CLTS
methods with better and more stable ROUGE scores for all languages.
Use of Mass Spectrometry for the Determination of Formaldehyde in Samples Potentially Toxic to Humans: A Brief Review
The chemical characteristics of formaldehyde make it widely used and important in the global economy. It has applications in the health area and in various industrial sectors. However, formaldehyde is considered a toxic substance and is classified as a persistent organic pollutant. Direct and prolonged contact with formaldehyde can cause serious damage to the body and may even lead to death. It is classified by several agencies as a human carcinogen and may exhibit mutagenic/teratogenic effects and/or damage the endocrine system. Various matrices have been found to contain formaldehyde at concentrations higher than those permitted by global health regulatory agencies. To this end, mass spectrometry can provide a very useful tool, enabling the identification and quantification of formaldehyde. Although various analytical techniques can be used for the determination and quantification of volatile organic compounds, chromatography is one of the most widely used methods due to its precision. Coupled to a detection system such as mass spectrometry, it can be employed for the determination of compounds potentially toxic to humans, including formaldehyde. The purpose of this chapter is to summarize some recent and important studies concerning the quantification of formaldehyde using mass spectrometry as a powerful analytical tool.
Overview of CLEF HIPE 2020: Named Entity Recognition and Linking on Historical Newspapers
This paper presents an overview of the first edition of HIPE (Identifying Historical People, Places and other Entities), a pioneering shared task dedicated to the evaluation of named entity processing on historical newspapers in French, German and English. Since its introduction some twenty years ago, named entity (NE) processing has become an essential component of virtually any text mining application and has undergone major changes. Recently, two main trends characterise its developments: the adoption of deep learning architectures and the consideration of textual material originating from historical and cultural heritage collections. While the former opens up new opportunities, the latter introduces new challenges with heterogeneous, historical and noisy inputs. In this context, the objective of HIPE, run as part of the CLEF 2020 conference, is threefold: strengthening the robustness of existing approaches on non-standard inputs, enabling performance comparison of NE processing on historical texts, and, in the long run, fostering efficient semantic indexing of historical documents. Tasks, corpora, and results of 13 participating teams are presented.
Support for the Woman/Nursing Mother in the Advertising Pieces of World Breastfeeding Week
ABSTRACT Objective: to reveal the supports provided by the social network of the woman/nursing mother in the advertising pieces of World Breastfeeding Week. Method: a descriptive, exploratory, documentary, qualitative study. From the collection through the analysis of these pieces, the methodological steps of Gemma Penn, grounded in the semiology of Roland Barthes, were adopted. The results were interpreted through Sanícola's Social Network theory and the five types of support: presence, emotional, instrumental, informational, and self-support. Results: in nine advertising pieces from the 22 World Breastfeeding Weeks, one or more actors from the social network of the woman/nursing mother were identified. In five of them, the partner, grandmother, and brother showed emotional and presence support for breastfeeding. Self-support was observed in three posters; instrumental support in one poster; and informational support in no poster. Conclusion: the supports revealed in only five advertising pieces included emotional, presence, instrumental, and self-support. In the others, there was no support. In none of them was the full set of supports revealed.
Entity Linking for Historical Documents: Challenges and Solutions
Named entities (NEs) are among the most relevant types of information that can be used to efficiently index and retrieve digital documents. Furthermore, the use of Entity Linking (EL) to disambiguate and relate NEs to knowledge bases provides supplementary information which can be useful to differentiate ambiguous elements such as geographical locations and people's names. In historical documents, the detection and disambiguation of NEs is a challenge. Most historical documents are converted into plain text using an optical character recognition (OCR) system at the expense of some noise. Documents in digital libraries will, therefore, be indexed with errors that may hinder their accessibility. OCR errors affect not only document indexing but also the detection, disambiguation, and linking of NEs. This paper aims at analysing the performance of different EL approaches on two multilingual historical corpora, CLEF HIPE 2020 (English, French, German) and NewsEye (Finnish, French, German, Swedish), while proposing several techniques for alleviating the impact of historical data problems on the EL task. Our findings indicate that the proposed approaches not only outperform the baseline in both corpora but also considerably reduce the impact of historical document issues on different subjects and languages.