Advancements in eHealth Data Analytics through Natural Language Processing and Deep Learning
The healthcare environment is commonly described as "information-rich" but
"knowledge-poor". Healthcare systems collect huge amounts of data from various
sources: lab reports, medical letters, logs of medical tools or programs,
medical prescriptions, etc. These massive data sets can provide knowledge that
improves medical services and the healthcare domain overall, for example by
predicting diseases from a patient's symptoms or by preventing them through the
discovery of behavioral risk factors. Unfortunately, only a relatively small
volume of textual eHealth data is processed and interpreted, an important
factor being the difficulty of efficiently performing Big Data operations. In
the medical field, detecting domain-specific multi-word terms is a crucial
task, as they can define an entire concept in a few words. A term can be
defined as a linguistic structure or a concept; it is composed of one or more
words with a meaning specific to a domain, and all the terms of a domain make
up its terminology. This chapter offers a critical study of the current
best-performing solutions for analyzing unstructured (image and textual)
eHealth data. The study also compares current Natural Language Processing and
Deep Learning techniques in the eHealth context. Finally, we examine and
discuss some open issues and define a set of research directions in this area.
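The multi-word term detection task described above can be illustrated with a minimal frequency-based sketch. The function name, thresholds, and the tiny corpus are invented for illustration; production term extractors typically combine linguistic filters (e.g., part-of-speech patterns) with statistical ranking measures such as C-value.

```python
from collections import Counter

def extract_candidate_terms(sentences, min_count=2, max_len=3):
    """Collect frequent n-grams (n = 2..max_len) as candidate multi-word terms.

    A toy frequency-only heuristic: an n-gram repeated across documents is
    more likely to be a domain term than a chance word sequence.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for n in range(2, max_len + 1):
            for i in range(len(tokens) - n + 1):
                counts[" ".join(tokens[i:i + n])] += 1
    return [term for term, c in counts.items() if c >= min_count]

# Invented mini-corpus of medical report snippets.
corpus = [
    "patient shows signs of chronic kidney disease",
    "chronic kidney disease requires regular monitoring",
    "monitoring of chronic kidney function is advised",
]
print(extract_candidate_terms(corpus))
```

Here "chronic kidney disease" surfaces as a candidate because it recurs verbatim, which is exactly the property that lets a few words stand in for an entire medical concept.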
Modelling frequency, attestation, and corpus-based information with OntoLex-FrAC
OntoLex-Lemon has become a de facto standard for lexical resources in the web of data. This paper provides the first overall description of the emerging OntoLex module for Frequency, Attestations, and Corpus-Based Information (OntoLex-FrAC), which is intended to complement OntoLex-Lemon with the necessary vocabulary to represent major types of information found in, or automatically derived from, corpora, for applications in both language technology and the language sciences.
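The two kinds of corpus-derived information FrAC is designed to represent for a lexical entry, its frequency and the attestations backing it, can be sketched in plain Python. The record layout and the toy corpus below are invented for illustration; the actual module defines RDF vocabulary, not this data structure.

```python
def corpus_profile(lemma, corpus):
    """Gather a frequency count and attestations (witness sentences) for a lemma.

    Mirrors, as plain data, the information OntoLex-FrAC would attach in RDF
    to an ontolex:LexicalEntry: a corpus frequency plus the sentences
    attesting the entry's use.
    """
    attestations = [s for s in corpus if lemma in s.lower().split()]
    return {
        "lemma": lemma,
        "frequency": sum(s.lower().split().count(lemma) for s in corpus),
        "attestations": attestations,
    }

# Invented mini-corpus.
corpus = [
    "The bank approved the loan",
    "She sat on the river bank",
    "The loan was repaid early",
]
profile = corpus_profile("bank", corpus)
print(profile["frequency"], len(profile["attestations"]))
```

Keeping attestations alongside the bare count is the point of the module: a frequency claim stays verifiable against the corpus evidence it was derived from.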
The Forgotten Document-Oriented Database Management Systems: An Overview and Benchmark of Native XML DODBMSes in Comparison with JSON DODBMSes
In the current context of Big Data, a multitude of new NoSQL solutions for
storing, managing, and extracting information and patterns from semi-structured
data have been proposed and implemented. These solutions were developed to
relieve the rigid data structures of relational databases by introducing
semi-structured, flexible schema design. As data generated by different sources
and devices, especially IoT sensors and actuators, use either the XML or the
JSON format depending on the application, database technologies that store and
query semi-structured data in XML format are needed. Thus, Native XML
Databases, initially designed to manipulate XML data using standardized query
languages, i.e., XQuery and XPath, were rebranded as NoSQL Document-Oriented
Database Systems. Currently, the majority of these solutions have been replaced
with the more modern JSON-based Database Management Systems. However, we
believe that XML-based solutions can still deliver performance when executing
complex queries on heterogeneous collections. Unfortunately, current research
lacks a clear comparison of the scalability and performance of database
technologies that store and query documents in XML versus the more modern JSON
format. Moreover, to the best of our knowledge, there are no Big Data-compliant
benchmarks for such database technologies. In this paper, we present a
comparison of selected Document-Oriented Database Systems that use either the
XML format to encode documents, i.e., BaseX, eXist-db, and Sedna, or the JSON
format, i.e., MongoDB, CouchDB, and Couchbase. To underline the performance
differences, we also propose a benchmark that uses a heterogeneous complex
schema on a large DBLP corpus.
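The XML-versus-JSON querying contrast can be made concrete with a minimal stdlib sketch. The record schema below is invented and far simpler than the benchmark's DBLP data, and the systems compared in the paper run full XQuery/XPath engines and native JSON query languages rather than Python's limited built-ins.

```python
import json
import xml.etree.ElementTree as ET

# The same tiny bibliographic collection encoded in both formats.
XML_DOC = """<articles>
  <article year="2020"><title>XML at scale</title></article>
  <article year="2018"><title>JSON stores</title></article>
</articles>"""
JSON_DOC = '{"articles": [{"year": 2020, "title": "XML at scale"}, {"year": 2018, "title": "JSON stores"}]}'

# XPath-style query over the XML encoding (ElementTree supports only a
# subset of XPath; engines like BaseX or eXist-db support far more).
root = ET.fromstring(XML_DOC)
xml_titles = [a.find("title").text
              for a in root.findall('.//article[@year="2020"]')]

# The equivalent filter over the JSON encoding is plain data traversal.
data = json.loads(JSON_DOC)
json_titles = [a["title"] for a in data["articles"] if a["year"] == 2020]

print(xml_titles, json_titles)
```

The declarative path expression and the hand-written traversal return the same result here; the paper's benchmark measures how that equivalence holds up in cost on large, heterogeneous collections.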
Post-traumatic humerus non-union treatment using fibular bone graft in a pediatric patient – case report
Clinic of Pediatric Surgery and Orthopedics, "Sfânta Maria" Emergency Clinical Hospital for Children, Iași, România; the 13th Congress of the "Nicolae Anestiadi" Association of Surgeons and the 3rd Congress of the "V.M.Guțu" Society of Endoscopy, Minimally Invasive Surgery and Ultrasonography of the Republic of Moldova.
Introduction: Humeral non-union has an incidence of 8-12% in the pediatric population. Treatment options are numerous, the main
principles being open reduction with internal fixation or the use of a bone graft, depending on the individual case.
Material and methods: A 15-year-old female patient, victim of a road traffic accident, was admitted as an emergency with multiple
trauma. Among the established diagnoses was a fracture of the left humeral shaft, for which open reduction and internal fixation with
a plate and screws were performed. One year postoperatively, non-union was noticed on follow-up X-rays at the fracture site.
Results: Multiple methods were used to treat the non-union: an external fixator, perilesional injection of growth factors, a second
open reduction with plate-and-screw fixation, and filling of the bone defect with biphasic ceramic. As the non-union persisted, a
further intervention was performed using a fibular bone graft fixed intramedullarly in the humerus, with a favorable postoperative
course.
Conclusion: Humeral non-union remains one of the most difficult complications of humerus fractures because of its frequency and
the challenges of its therapeutic management. In the present case, surgical treatment using an autologous bone graft gave optimal
results, with good anatomical and functional healing.
Building an OWL ontology for representing, linking and querying SemAF discourse markers
Linguistic Linked Open Data (LLOD) technologies provide a powerful instrument for representing and interpreting language phenomena on a web scale. The main objective of this paper is to demonstrate how LLOD technologies can be applied to represent and annotate a corpus of multiword discourse markers, and what the effects of this are. In particular, our aim is to apply Semantic Web standards such as RDF and the Web Ontology Language (OWL) for publishing and integrating data. We present a novel scheme for discourse annotation that combines the ISO standards describing discourse relations and dialogue acts, ISO DR-Core (ISO 24617-8) and ISO Dialogue Acts (ISO 24617-2), in nine languages (cf. Silvano and Damova 2022; Silvano et al. 2022). We develop an OWL ontology to formalize that scheme, provide a newly annotated dataset, and link its RDF edition with the ontology. We then describe the conjoint querying of the ontology and the annotations by means of SPARQL, the standard query language for the web of data. The ultimate result is that we are able to perform queries over multiple, interlinked datasets with complex internal structure, without any specialized software: off-the-shelf technologies based on web standards are used, which port effortlessly across operating systems, databases, and programming languages. This is a first but essential step in developing novel, powerful, and groundbreaking means for the corpus-based study of multilingual discourse, communication analysis, and attitude discovery.
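The conjoint querying of ontology and annotations can be illustrated with a toy triple-pattern join in pure Python. All identifiers below are invented; the paper uses SPARQL over RDF datasets, for which this is only a conceptual stand-in showing how a basic graph pattern joins bindings across two linked datasets.

```python
def match(triples, pattern):
    """Return variable bindings (variables start with '?') for one pattern."""
    s, p, o = pattern
    out = []
    for ts, tp, to in triples:
        binding, ok = {}, True
        for var, val in ((s, ts), (p, tp), (o, to)):
            if var.startswith("?"):
                binding[var] = val
            elif var != val:
                ok = False
                break
        if ok:
            out.append(binding)
    return out

def join(left, right):
    """Merge compatible binding sets, as a SPARQL basic graph pattern does."""
    return [{**a, **b} for a in left for b in right
            if all(a[k] == b[k] for k in a.keys() & b.keys())]

# Two toy interlinked datasets: corpus annotations and the ontology.
annotations = [(":m1", ":text", "however"), (":m1", ":relation", ":Contrast")]
ontology = [(":Contrast", ":subClassOf", ":DiscourseRelation")]

# "Find markers whose relation is a kind of DiscourseRelation."
result = join(match(annotations, ("?m", ":relation", "?r")),
              match(ontology, ("?r", ":subClassOf", ":DiscourseRelation")))
print(result)
```

The join variable `?r` is what links the annotation dataset to the ontology, which is the mechanism that lets one SPARQL query span multiple interlinked datasets.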
An OWL ontology for ISO-based discourse marker annotation
Purpose: Discourse markers are linguistic cues that indicate how an utterance relates to the discourse context and what role it plays in conversation. The authors are preparing an annotated corpus in nine languages and specifically aim to explore the role of Linguistic Linked Open Data (LLOD) technologies in the process, i.e., the application of web standards such as RDF and the Web Ontology Language (OWL) for publishing and integrating data. The paper demonstrates the advantages of this approach.
ISO-based annotated multilingual parallel corpus for discourse markers
Discourse markers carry information about discourse structure and organization, and also signal local dependencies or
the epistemic stance of the speaker. They provide instructions on how to interpret the discourse, and their study is paramount
to understanding the mechanisms underlying discourse organization. This paper presents a new language resource, an ISO-based
annotated multilingual parallel corpus for discourse markers. The corpus comprises nine languages: Bulgarian, Lithuanian,
German, European Portuguese, Hebrew, Romanian, Polish, and Macedonian, with English as a pivot language. In order to
represent the meaning of the discourse markers, we propose an annotation scheme of discourse relations from ISO 24617-8
with a plug-in to ISO 24617-2 for communicative functions. We describe an experiment in which we applied the annotation
scheme to assess its validity. The results reveal that, although some extensions are required to cover all the multilingual data,
the scheme provides a proper representation of the value of discourse markers. Additionally, we report some relevant contrastive
phenomena concerning the interpretation and role of discourse markers in discourse. This first step will allow us to develop deep
learning methods to identify and extract discourse relations and communicative functions, and to represent that information as
Linguistic Linked Open Data (LLOD).
Validation of language agnostic models for discourse marker detection
Using language models to detect or predict the presence of language phenomena in text has become a mainstream research topic. With the rise of generative models, experiments using deep learning and transformer models attract intense interest, and aspects like prediction precision, portability to other languages or phenomena, and scale have been central to the research community. Discourse markers, as language phenomena, perform important functions, such as signposting, signalling, and rephrasing, thereby facilitating discourse organization. Our paper is about discourse marker detection, a complex task since it concerns a language phenomenon manifested by expressions that can occur as content words in some contexts and as discourse markers in others. We have adopted a language-agnostic model trained on English to predict the presence of discourse markers in texts in eight other languages unseen by the model, with the goal of evaluating how well the model performs on languages with different structural and lexical properties. We report on the process of evaluating and validating the model's performance across European Portuguese, Hebrew, German, Polish, Romanian, Bulgarian, Macedonian, and Lithuanian, and on the results of this validation. This research is a key step towards multilingual language processing.
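The cross-lingual validation described above ultimately reduces to scoring the model's predictions against gold labels per language. A minimal sketch follows; the label lists and numbers are invented toy data, and the actual study evaluates transformer predictions rather than these hand-written lists.

```python
def precision_recall(gold, predicted):
    """Token-level precision and recall for binary discourse-marker labels."""
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return prec, rec

# Hypothetical gold labels and model predictions for two of the evaluated
# languages (1 = token is a discourse marker); the values are invented.
per_language = {
    "pt": ([1, 0, 1, 0, 1], [1, 0, 1, 1, 0]),
    "de": ([1, 1, 0, 0, 1], [1, 1, 0, 0, 1]),
}
for lang, (gold, pred) in per_language.items():
    print(lang, precision_recall(gold, pred))
```

Comparing these per-language scores against the English figures is what reveals how much structural and lexical distance from the training language costs the model.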
Historiae, History of Socio-Cultural Transformation as Linguistic Data Science. A Humanities Use Case
The paper proposes an interdisciplinary approach, including methods from disciplines such as the history of
concepts, linguistics, natural language processing (NLP) and the Semantic Web, to create a comparative
framework for detecting semantic change in multilingual historical corpora and generating diachronic
ontologies as Linguistic Linked Open Data (LLOD). Initiated as a use case (UC4.2.1) within the COST
Action Nexus Linguarum, the European network for Web-centred linguistic data science, the study will
explore emerging trends in knowledge extraction, analysis and representation from linguistic data
science, and apply the devised methodology to datasets in the humanities to trace the evolution
of concepts from the domain of socio-cultural transformation. The paper describes the main
elements of the methodological framework and the preliminary planning of the intended workflow.
LLODIA: A Linguistic Linked Open Data Model for Diachronic Analysis
This article proposes a linguistic linked open data model for diachronic analysis (LLODIA) that combines data derived from the diachronic analysis of multilingual corpora with dictionary-based evidence. A humanities use case was devised as a proof of concept, with examples in five languages (French, Hebrew, Latin, Lithuanian and Romanian) related to various meanings of the term "revolution" considered at different time intervals. The examples were compiled through diachronic word embeddings and dictionary alignment.
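The diachronic word-embedding comparison underlying such an analysis can be sketched as a cosine-similarity check between a term's vectors from two time slices. The 3-dimensional vectors and their values below are invented toy data; real diachronic embeddings have hundreds of dimensions and require aligning the vector spaces of the two periods before any comparison.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

# Invented embeddings of "revolution" from two hypothetical time slices.
revolution_1800 = [0.9, 0.1, 0.2]   # usage dominated by the political sense
revolution_1900 = [0.4, 0.8, 0.3]   # broader "radical change" usage

# A large drift (1 - similarity) suggests the term's meaning has shifted,
# which dictionary-based evidence can then corroborate or refute.
drift = 1 - cosine(revolution_1800, revolution_1900)
print(round(drift, 3))
```

Pairing the embedding signal with dictionary attestations, as the model proposes, guards against reading corpus noise as genuine semantic change.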