Advancements in eHealth Data Analytics through Natural Language Processing and Deep Learning
The healthcare environment is commonly described as "information-rich" but
"knowledge-poor". Healthcare systems collect huge amounts of data from various
sources: lab reports, medical letters, logs of medical tools or programs,
medical prescriptions, etc. These massive data sets can provide knowledge that
improves medical services and the healthcare domain overall, for example by
predicting diseases from a patient's symptoms or by preventing them through the
discovery of behavioral risk factors. Unfortunately, only a relatively small
volume of textual eHealth data is processed and interpreted, an important
factor being the difficulty of efficiently performing Big Data operations. In
the medical field, detecting domain-specific multi-word terms is a crucial
task, as they can define an entire concept in a few words. A term can be
defined as a linguistic structure or a concept; it is composed of one or more
words with a meaning specific to a domain, and all the terms of a domain make
up its terminology. This chapter offers a critical study of the current
best-performing solutions for analyzing unstructured (image and textual)
eHealth data. The study also compares current Natural Language Processing and
Deep Learning techniques in the eHealth context. Finally, we examine and
discuss some open issues and define a set of research directions in this area.
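The multi-word term detection task described above can be illustrated with a minimal frequency-based sketch. The function name, thresholds, and the tiny corpus are invented for illustration; production term extractors typically combine linguistic filters (e.g., part-of-speech patterns) with statistical ranking measures such as C-value.

```python
from collections import Counter

def extract_candidate_terms(sentences, min_count=2, max_len=3):
    """Collect frequent n-grams (n = 2..max_len) as candidate multi-word terms.

    A toy frequency-only heuristic: an n-gram repeated across documents is
    more likely to be a domain term than a chance word sequence.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for n in range(2, max_len + 1):
            for i in range(len(tokens) - n + 1):
                counts[" ".join(tokens[i:i + n])] += 1
    return [term for term, c in counts.items() if c >= min_count]

# Invented mini-corpus of medical report snippets.
corpus = [
    "patient shows signs of chronic kidney disease",
    "chronic kidney disease requires regular monitoring",
    "monitoring of chronic kidney function is advised",
]
print(extract_candidate_terms(corpus))
```

Here "chronic kidney disease" surfaces as a candidate because it recurs verbatim, which is exactly the property that lets a few words stand in for an entire medical concept.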
Modelling frequency, attestation, and corpus-based information with OntoLex-FrAC
OntoLex-Lemon has become a de facto standard for lexical resources in the web of data. This paper provides the first overall description of the emerging OntoLex module for Frequency, Attestations, and Corpus-Based Information (OntoLex-FrAC), which is intended to complement OntoLex-Lemon with the necessary vocabulary to represent major types of information found in, or automatically derived from, corpora, for applications in both language technology and the language sciences.
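The two kinds of corpus-derived information FrAC is designed to represent for a lexical entry, its frequency and the attestations backing it, can be sketched in plain Python. The record layout and the toy corpus below are invented for illustration; the actual module defines RDF vocabulary, not this data structure.

```python
def corpus_profile(lemma, corpus):
    """Gather a frequency count and attestations (witness sentences) for a lemma.

    Mirrors, as plain data, the information OntoLex-FrAC would attach in RDF
    to an ontolex:LexicalEntry: a corpus frequency plus the sentences
    attesting the entry's use.
    """
    attestations = [s for s in corpus if lemma in s.lower().split()]
    return {
        "lemma": lemma,
        "frequency": sum(s.lower().split().count(lemma) for s in corpus),
        "attestations": attestations,
    }

# Invented mini-corpus.
corpus = [
    "The bank approved the loan",
    "She sat on the river bank",
    "The loan was repaid early",
]
profile = corpus_profile("bank", corpus)
print(profile["frequency"], len(profile["attestations"]))
```

Keeping attestations alongside the bare count is the point of the module: a frequency claim stays verifiable against the corpus evidence it was derived from.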
The Forgotten Document-Oriented Database Management Systems: An Overview and Benchmark of Native XML DODBMSes in Comparison with JSON DODBMSes
In the current context of Big Data, a multitude of new NoSQL solutions for
storing, managing, and extracting information and patterns from semi-structured
data have been proposed and implemented. These solutions were developed to
relieve the rigid data structures of relational databases by introducing
semi-structured, flexible schema design. As data generated by different sources
and devices, especially IoT sensors and actuators, use either the XML or the
JSON format depending on the application, database technologies that store and
query semi-structured data in XML format are needed. Thus, Native XML
Databases, initially designed to manipulate XML data using standardized query
languages, i.e., XQuery and XPath, were rebranded as NoSQL Document-Oriented
Database Systems. Currently, the majority of these solutions have been replaced
with the more modern JSON-based Database Management Systems. However, we
believe that XML-based solutions can still deliver performance when executing
complex queries on heterogeneous collections. Unfortunately, current research
lacks a clear comparison of the scalability and performance of database
technologies that store and query documents in XML versus the more modern JSON
format. Moreover, to the best of our knowledge, there are no Big Data-compliant
benchmarks for such database technologies. In this paper, we present a
comparison of selected Document-Oriented Database Systems that use either the
XML format to encode documents, i.e., BaseX, eXist-db, and Sedna, or the JSON
format, i.e., MongoDB, CouchDB, and Couchbase. To underline the performance
differences, we also propose a benchmark that uses a heterogeneous complex
schema on a large DBLP corpus.
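The XML-versus-JSON querying contrast can be made concrete with a minimal stdlib sketch. The record schema below is invented and far simpler than the benchmark's DBLP data, and the systems compared in the paper run full XQuery/XPath engines and native JSON query languages rather than Python's limited built-ins.

```python
import json
import xml.etree.ElementTree as ET

# The same tiny bibliographic collection encoded in both formats.
XML_DOC = """<articles>
  <article year="2020"><title>XML at scale</title></article>
  <article year="2018"><title>JSON stores</title></article>
</articles>"""
JSON_DOC = '{"articles": [{"year": 2020, "title": "XML at scale"}, {"year": 2018, "title": "JSON stores"}]}'

# XPath-style query over the XML encoding (ElementTree supports only a
# subset of XPath; engines like BaseX or eXist-db support far more).
root = ET.fromstring(XML_DOC)
xml_titles = [a.find("title").text
              for a in root.findall('.//article[@year="2020"]')]

# The equivalent filter over the JSON encoding is plain data traversal.
data = json.loads(JSON_DOC)
json_titles = [a["title"] for a in data["articles"] if a["year"] == 2020]

print(xml_titles, json_titles)
```

The declarative path expression and the hand-written traversal return the same result here; the paper's benchmark measures how that equivalence holds up in cost on large, heterogeneous collections.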
Post-traumatic humerus non-union treatment using fibular bone graft in a pediatric patient – case report
Clinic of Pediatric Surgery and Orthopedics, "Sfânta Maria" Emergency Clinical Hospital for Children, Iași, România; the 13th Congress of the "Nicolae Anestiadi" Association of Surgeons and the 3rd Congress of the "V.M.Guțu" Society of Endoscopy, Minimally Invasive Surgery and Ultrasonography of the Republic of Moldova.
Introduction: Humeral non-union has an incidence of 8-12% in the pediatric population. Treatment options are numerous, the main
principles being open reduction with internal fixation or the use of a bone graft, depending on the individual case.
Material and methods: A 15-year-old female patient, victim of a road traffic accident, was admitted as an emergency with multiple
trauma. Among the established diagnoses was a fracture of the left humeral shaft, for which open reduction and internal fixation with
a plate and screws were performed. One year postoperatively, non-union was noticed on follow-up X-rays at the fracture site.
Results: Multiple methods were used to treat the non-union: an external fixator, perilesional injection of growth factors, a second
open reduction with plate-and-screw fixation, and filling of the bone defect with biphasic ceramic. As the non-union persisted, a
further intervention was performed using a fibular bone graft fixed intramedullarly in the humerus, with a favorable postoperative
course.
Conclusion: Humeral non-union remains one of the most difficult complications of humerus fractures because of its frequency and
the challenges of its therapeutic management. In the present case, surgical treatment using an autologous bone graft gave optimal
results, with good anatomical and functional healing.
Building an OWL ontology for representing, linking and querying SemAF discourse markers
Linguistic Linked Open Data (LLOD) technologies provide a powerful instrument for representing and interpreting language phenomena on a web scale. The main objective of this paper is to demonstrate how LLOD technologies can be applied to represent and annotate a corpus of multiword discourse markers, and what the effects of this are. In particular, our aim is to apply Semantic Web standards such as RDF and the Web Ontology Language (OWL) for publishing and integrating data. We present a novel scheme for discourse annotation that combines the ISO standards describing discourse relations and dialogue acts, ISO DR-Core (ISO 24617-8) and ISO Dialogue Acts (ISO 24617-2), in nine languages (cf. Silvano and Damova 2022; Silvano et al. 2022). We develop an OWL ontology to formalize that scheme, provide a newly annotated dataset, and link its RDF edition with the ontology. We then describe the conjoint querying of the ontology and the annotations by means of SPARQL, the standard query language for the web of data. The ultimate result is that we are able to perform queries over multiple, interlinked datasets with complex internal structure, without any specialized software: off-the-shelf technologies based on web standards are used, which port effortlessly across operating systems, databases, and programming languages. This is a first but essential step in developing novel, powerful, and groundbreaking means for the corpus-based study of multilingual discourse, communication analysis, and attitude discovery.
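The conjoint querying of ontology and annotations can be illustrated with a toy triple-pattern join in pure Python. All identifiers below are invented; the paper uses SPARQL over RDF datasets, for which this is only a conceptual stand-in showing how a basic graph pattern joins bindings across two linked datasets.

```python
def match(triples, pattern):
    """Return variable bindings (variables start with '?') for one pattern."""
    s, p, o = pattern
    out = []
    for ts, tp, to in triples:
        binding, ok = {}, True
        for var, val in ((s, ts), (p, tp), (o, to)):
            if var.startswith("?"):
                binding[var] = val
            elif var != val:
                ok = False
                break
        if ok:
            out.append(binding)
    return out

def join(left, right):
    """Merge compatible binding sets, as a SPARQL basic graph pattern does."""
    return [{**a, **b} for a in left for b in right
            if all(a[k] == b[k] for k in a.keys() & b.keys())]

# Two toy interlinked datasets: corpus annotations and the ontology.
annotations = [(":m1", ":text", "however"), (":m1", ":relation", ":Contrast")]
ontology = [(":Contrast", ":subClassOf", ":DiscourseRelation")]

# "Find markers whose relation is a kind of DiscourseRelation."
result = join(match(annotations, ("?m", ":relation", "?r")),
              match(ontology, ("?r", ":subClassOf", ":DiscourseRelation")))
print(result)
```

The join variable `?r` is what links the annotation dataset to the ontology, which is the mechanism that lets one SPARQL query span multiple interlinked datasets.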
An OWL ontology for ISO-based discourse marker annotation
Purpose: Discourse markers are linguistic cues that indicate how an utterance relates to the discourse context and what role it plays in conversation. The authors are preparing an annotated corpus in nine languages and specifically aim to explore the role of Linguistic Linked Open Data (LLOD) technologies in the process, i.e., the application of web standards such as RDF and the Web Ontology Language (OWL) for publishing and integrating data. The paper demonstrates the advantages of this approach.
ISO-based annotated multilingual parallel corpus for discourse markers
Discourse markers carry information about discourse structure and organization, and also signal local dependencies or
the epistemic stance of the speaker. They provide instructions on how to interpret the discourse, and their study is paramount
to understanding the mechanisms underlying discourse organization. This paper presents a new language resource, an ISO-based
annotated multilingual parallel corpus for discourse markers. The corpus comprises nine languages: Bulgarian, Lithuanian,
German, European Portuguese, Hebrew, Romanian, Polish, and Macedonian, with English as a pivot language. In order to
represent the meaning of the discourse markers, we propose an annotation scheme of discourse relations from ISO 24617-8
with a plug-in to ISO 24617-2 for communicative functions. We describe an experiment in which we applied the annotation
scheme to assess its validity. The results reveal that, although some extensions are required to cover all the multilingual data,
the scheme provides a proper representation of the value of discourse markers. Additionally, we report some relevant contrastive
phenomena concerning the interpretation and role of discourse markers in discourse. This first step will allow us to develop deep
learning methods to identify and extract discourse relations and communicative functions, and to represent that information as
Linguistic Linked Open Data (LLOD).
Validation of language agnostic models for discourse marker detection
Using language models to detect or predict the presence of language phenomena in text has become a mainstream research topic. With the rise of generative models, experiments using deep learning and transformer models attract intense interest, and aspects like prediction precision, portability to other languages or phenomena, and scale have been central to the research community. Discourse markers, as language phenomena, perform important functions, such as signposting, signalling, and rephrasing, thereby facilitating discourse organization. Our paper is about discourse marker detection, a complex task since it concerns a language phenomenon manifested by expressions that can occur as content words in some contexts and as discourse markers in others. We have adopted a language-agnostic model trained on English to predict the presence of discourse markers in texts in eight other languages unseen by the model, with the goal of evaluating how well the model performs on languages with different structural and lexical properties. We report on the process of evaluating and validating the model's performance across European Portuguese, Hebrew, German, Polish, Romanian, Bulgarian, Macedonian, and Lithuanian, and on the results of this validation. This research is a key step towards multilingual language processing.
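The cross-lingual validation described above ultimately reduces to scoring the model's predictions against gold labels per language. A minimal sketch follows; the label lists and numbers are invented toy data, and the actual study evaluates transformer predictions rather than these hand-written lists.

```python
def precision_recall(gold, predicted):
    """Token-level precision and recall for binary discourse-marker labels."""
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return prec, rec

# Hypothetical gold labels and model predictions for two of the evaluated
# languages (1 = token is a discourse marker); the values are invented.
per_language = {
    "pt": ([1, 0, 1, 0, 1], [1, 0, 1, 1, 0]),
    "de": ([1, 1, 0, 0, 1], [1, 1, 0, 0, 1]),
}
for lang, (gold, pred) in per_language.items():
    print(lang, precision_recall(gold, pred))
```

Comparing these per-language scores against the English figures is what reveals how much structural and lexical distance from the training language costs the model.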
Historiae, History of Socio-Cultural Transformation as Linguistic Data Science. A Humanities Use Case
The paper proposes an interdisciplinary approach, including methods from disciplines such as the history of
concepts, linguistics, natural language processing (NLP) and the Semantic Web, to create a comparative
framework for detecting semantic change in multilingual historical corpora and generating diachronic
ontologies as Linguistic Linked Open Data (LLOD). Initiated as a use case (UC4.2.1) within the COST
Action Nexus Linguarum, the European network for Web-centred linguistic data science, the study will
explore emerging trends in knowledge extraction, analysis and representation from linguistic data
science, and apply the devised methodology to datasets in the humanities to trace the evolution
of concepts from the domain of socio-cultural transformation. The paper describes the main
elements of the methodological framework and the preliminary planning of the intended workflow.
LLODIA: A Linguistic Linked Open Data Model for Diachronic Analysis
This article proposes a linguistic linked open data model for diachronic analysis (LLODIA) that combines data derived from the diachronic analysis of multilingual corpora with dictionary-based evidence. A humanities use case was devised as a proof of concept, with examples in five languages (French, Hebrew, Latin, Lithuanian and Romanian) related to various meanings of the term "revolution" considered at different time intervals. The examples were compiled through diachronic word embeddings and dictionary alignment.
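The diachronic word-embedding comparison underlying such an analysis can be sketched as a cosine-similarity check between a term's vectors from two time slices. The 3-dimensional vectors and their values below are invented toy data; real diachronic embeddings have hundreds of dimensions and require aligning the vector spaces of the two periods before any comparison.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

# Invented embeddings of "revolution" from two hypothetical time slices.
revolution_1800 = [0.9, 0.1, 0.2]   # usage dominated by the political sense
revolution_1900 = [0.4, 0.8, 0.3]   # broader "radical change" usage

# A large drift (1 - similarity) suggests the term's meaning has shifted,
# which dictionary-based evidence can then corroborate or refute.
drift = 1 - cosine(revolution_1800, revolution_1900)
print(round(drift, 3))
```

Pairing the embedding signal with dictionary attestations, as the model proposes, guards against reading corpus noise as genuine semantic change.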