Automated Coding of Under-Studied Medical Concept Domains: Linking Physical Activity Reports to the International Classification of Functioning, Disability, and Health
Linking clinical narratives to standardized vocabularies and coding systems
is a key component of unlocking the information in medical text for analysis.
However, many domains of medical concepts lack well-developed terminologies
that can support effective coding of medical text. We present a framework for
developing natural language processing (NLP) technologies for automated coding
of under-studied types of medical information, and demonstrate its
applicability via a case study on physical mobility function. Mobility is a
component of many health measures, from post-acute care and surgical outcomes
to chronic frailty and disability, and is coded in the International
Classification of Functioning, Disability, and Health (ICF). However, mobility
and other types of functional activity remain under-studied in medical
informatics, and neither the ICF nor commonly-used medical terminologies
capture functional status terminology in practice. We investigated two
data-driven paradigms, classification and candidate selection, to link
narrative observations of mobility to standardized ICF codes, using a dataset
of clinical narratives from physical therapy encounters. Recent advances in
language modeling and word embedding were used as features for established
machine learning models and a novel deep learning approach, achieving a macro
F-1 score of 84% on linking mobility activity reports to ICF codes. Both
classification and candidate selection approaches present distinct strengths
for automated coding in under-studied domains, and we highlight that the
combination of (i) a small annotated data set; (ii) expert definitions of codes
of interest; and (iii) a representative text corpus is sufficient to produce
high-performing automated coding systems. This study has implications for the
ongoing growth of NLP tools for a variety of specialized applications in
clinical care and research.
Comment: Updated final version, published in Frontiers in Digital Health,
https://doi.org/10.3389/fdgth.2021.620828. 34 pages (23 text + 11
references); 9 figures, 2 tables
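The abstract above reports a macro F-1 score. As a reminder of what that metric measures, here is a minimal sketch in Python: F1 is computed separately for each code and the per-code scores are averaged without weighting, so rare codes count as much as frequent ones. The gold and predicted labels are hypothetical toy data, not taken from the paper, and the function is an illustration rather than the authors' evaluation code.

```python
from collections import Counter


def macro_f1(gold, predicted):
    """Macro-averaged F1: the unweighted mean of per-class F1 scores."""
    labels = sorted(set(gold) | set(predicted))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[g] += 1  # gold class gains a false negative
    scores = []
    for label in labels:
        predicted_count = tp[label] + fp[label]
        gold_count = tp[label] + fn[label]
        precision = tp[label] / predicted_count if predicted_count else 0.0
        recall = tp[label] / gold_count if gold_count else 0.0
        total = precision + recall
        scores.append(2 * precision * recall / total if total else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels for six mobility reports (toy data, assumption).
gold = ["d450", "d450", "d410", "d420", "d420", "d420"]
pred = ["d450", "d410", "d410", "d420", "d420", "d450"]
print(round(macro_f1(gold, pred), 3))
```

Because every class contributes equally, a system that ignores an infrequent code is penalized more heavily than it would be under plain accuracy.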
Archetype development and governance methodologies for the electronic health record
Semantic interoperability of health information is an essential requirement for the sustainability of healthcare, and it is fundamental to facing the new health challenges of a globalized world. This thesis provides new methodologies to tackle some of the fundamental aspects of semantic interoperability, specifically those related to the definition and governance of clinical information models expressed in the form of archetypes.
The contributions of the thesis are:
- Study of existing modeling methodologies for semantic interoperability components that will influence the definition of an archetype modeling methodology.
- Comparative analysis of existing clinical information model governance systems and initiatives.
- A proposal of a unified Archetype Modeling Methodology that formalizes the phases of archetype development, the required participants, and the good practices to be followed.
- Identification and definition of archetype governance principles and characteristics.
- Design and development of tools that provide support to archetype modeling and governance.
The contributions of this thesis have been put into practice in multiple projects and development experiences. These experiences range from a local project inside a single organization that required the reuse of clinical data based on semantic interoperability principles, to the development of national electronic health record projects.
This thesis was partially funded by the Ministerio de Economía y Competitividad, ayudas para contratos para la formación de doctores en empresas “Doctorados Industriales”, grant DI-14-06564, and by the Agencia Valenciana de la Innovación, ayudas del Programa de Promoción del Talento – Doctorados empresariales (INNODOCTO), grant INNTA3/2020/12.
Moner Cano, D. (2021). Archetype development and governance methodologies for the electronic health record [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/16491
Medical Informatics
Information technology has been revolutionizing everyday life, while medical science has been making rapid strides in understanding disease mechanisms, developing diagnostic techniques, and effecting successful treatment regimens, even for cases that would have been given a poor prognosis a decade earlier. The confluence of information technology and biomedicine has brought into its ambit additional dimensions of computerized databases for patient conditions, revolutionizing the way health care and patient information is recorded, processed, interpreted, and utilized for improving the quality of life. This book consists of seven chapters dealing with three primary issues: medical information acquisition from a patient's and health care professional's perspective, translational approaches from a researcher's point of view, and the application potential required by clinicians and physicians. The book covers modern issues in Information Technology, Bioinformatics Methods and Clinical Applications. The chapters describe the basic process of acquisition of information in a health system, recent technological developments in biomedicine, and the realistic evaluation of medical informatics.
Knowledge Organization Systems (KOS) in the Semantic Web: A Multi-Dimensional Review
Since the Simple Knowledge Organization System (SKOS) specification and its
SKOS eXtension for Labels (SKOS-XL) became formal W3C recommendations in 2009, a
significant number of conventional knowledge organization systems (KOS)
(including thesauri, classification schemes, name authorities, and lists of
codes and terms, produced before the arrival of the ontology-wave) have made
their journeys to join the Semantic Web mainstream. This paper uses "LOD KOS"
as an umbrella term to refer to all of the value vocabularies and lightweight
ontologies within the Semantic Web framework. The paper provides an overview of
what the LOD KOS movement has brought to various communities and users. These
are not limited to the colonies of the value vocabulary constructors and
providers, nor the catalogers and indexers who have a long history of applying
the vocabularies to their products. The LOD dataset producers and LOD service
providers, the information architects and interface designers, and researchers
in sciences and humanities, are also direct beneficiaries of LOD KOS. The paper
examines a set of the collected cases (experimental or in real applications)
and aims to find the usages of LOD KOS in order to share the practices and
ideas among communities and users. Through the viewpoints of a number of
different user groups, the functions of LOD KOS are examined from multiple
dimensions. This paper focuses on the LOD dataset producers, vocabulary
producers, and researchers (as end-users of KOS).
Comment: 31 pages, 12 figures, accepted paper in International Journal on
Digital Libraries
Broadening horizons: the case for capturing function and the role of health informatics in its use
Background
Human activity and the interaction between health conditions and activity is a critical part of understanding the overall function of individuals. The World Health Organization’s International Classification of Functioning, Disability and Health (ICF) models function as all aspects of an individual’s interaction with the world, including organismal concepts such as individual body structures, functions, and pathologies, as well as the outcomes of the individual’s interaction with their environment, referred to as activity and participation. Function, particularly activity and participation outcomes, is an important indicator of health at both the level of an individual and the population level, as it is highly correlated with quality of life and a critical component of identifying resource needs. Since it reflects the cumulative impact of health conditions on individuals and is not disease specific, its use as a health indicator helps to address major barriers to holistic, patient-centered care that result from multiple, and often competing, disease specific interventions. While the need for better information on function has been widely endorsed, this has not translated into its routine incorporation into modern health systems.
Purpose
We present the importance of capturing information on activity as a core component of modern health systems and identify specific steps and analytic methods that can be used to make it more available to utilize in improving patient care. We identify challenges in the use of activity and participation information, such as a lack of consistent documentation and diversity of data specificity and representation across providers, health systems, and national surveys. We describe how activity and participation information can be more effectively captured, and how health informatics methodologies, including natural language processing (NLP), can enable automatically locating, extracting, and organizing this information on a large scale, supporting standardization and utilization with minimal additional provider burden. We examine the analytic requirements and potential challenges of capturing this information with informatics, and describe how data-driven techniques can combine with common standards and documentation practices to make activity and participation information standardized and accessible for improving patient care.
Recommendations
We recommend four specific actions to improve the capture and analysis of activity and participation information throughout the continuum of care: (1) make activity and participation annotation standards and datasets available to the broader research community; (2) define common research problems in automatically processing activity and participation information; (3) develop robust, machine-readable ontologies for function that describe the components of activity and participation information and their relationships; and (4) establish standards for how and when to document activity and participation status during clinical encounters. We further provide specific short-term goals to make significant progress in each of these areas within a reasonable time frame
Automated coding of under-studied medical concept domains: linking physical activity reports to the international classification of functioning, disability, and health
Linking clinical narratives to standardized vocabularies and coding systems is a key component of unlocking the information in medical text for analysis. However, many domains of medical concepts, such as functional outcomes and social determinants of health, lack well-developed terminologies that can support effective coding of medical text. We present a framework for developing natural language processing (NLP) technologies for automated coding of medical information in under-studied domains, and demonstrate its applicability through a case study on physical mobility function. Mobility function is a component of many health measures, from post-acute care and surgical outcomes to chronic frailty and disability, and is represented as one domain of human activity in the International Classification of Functioning, Disability, and Health (ICF). However, mobility and other types of functional activity remain under-studied in the medical informatics literature, and neither the ICF nor commonly-used medical terminologies capture functional status terminology in practice. We investigated two data-driven paradigms, classification and candidate selection, to link narrative observations of mobility status to standardized ICF codes, using a dataset of clinical narratives from physical therapy encounters. Recent advances in language modeling and word embedding were used as features for established machine learning models and a novel deep learning approach, achieving a macro-averaged F-1 score of 84% on linking mobility activity reports to ICF codes. Both classification and candidate selection approaches present distinct strengths for automated coding in under-studied domains, and we highlight that the combination of (i) a small annotated data set; (ii) expert definitions of codes of interest; and (iii) a representative text corpus is sufficient to produce high-performing automated coding systems. 
This research has implications for continued development of language technologies to analyze functional status information, and the ongoing growth of NLP tools for a variety of specialized applications in clinical care and research.
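The abstract contrasts classification with candidate selection, where a mention is matched against expert definitions of the codes instead of being assigned a learned class. The sketch below illustrates the general idea only: it ranks codes by cosine similarity between a bag-of-words vector for the mention and one for each code description. The paper uses language-model and word-embedding features, which this toy example replaces with raw word counts, and the code descriptions here are hypothetical paraphrases written for illustration, not official ICF text.

```python
import math
import re
from collections import Counter

# Hypothetical, paraphrased code descriptions (assumption, not official ICF wording).
CODE_DEFINITIONS = {
    "d410": "changing basic body position such as sitting down standing up or lying down",
    "d420": "transferring oneself from one surface to another such as bed to chair",
    "d450": "walking along a surface on foot step by step",
}


def bag_of_words(text):
    """Lowercase word counts; a stand-in for the embeddings used in practice."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(mention, definitions=CODE_DEFINITIONS):
    """Return (code, score) pairs sorted from best to worst match."""
    mention_vector = bag_of_words(mention)
    scored = [
        (code, cosine(mention_vector, bag_of_words(description)))
        for code, description in definitions.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


best_code, best_score = rank_candidates(
    "Patient needs assistance transferring from bed to chair"
)[0]
print(best_code)
```

One appeal of this design is that a new code can be supported by adding a definition, without collecting labeled examples for it, which fits the under-studied domains the abstract describes.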
Semantic annotation of clinical questionnaires to support personalized medicine
Master's thesis, Bioinformatics and Computational Biology, 2022, Universidade de Lisboa, Faculdade de Ciências
We live in a global era of constant technological evolution, and medicine is one of the fields that has benefited: as technology is integrated into medical practice, it plays an increasingly important role from the point of view of both physicians and patients.
Better tools that help physicians carry out their work are creating the conditions for patients to receive better follow-up, a better understanding of their clinical condition, and real-time updates on it.
The healthcare sector produces innovations almost daily that improve the patient experience and the way physicians can exploit the information contained in data for faster and more effective validation. The sector generates an ever larger volume of data, including medical reports, inertial sensor recordings, recordings of consultations, images, videos, and medical assessments, among them the questionnaires and clinical scales that promise patients better monitoring of their health status. However, the sheer volume, distribution, and heterogeneity of these data make processing and analysis difficult.
Integrating this type of data is a challenge, since the data originate from diverse sources and show considerable semantic heterogeneity. Semantic integration of biomedical data leads to a biomedical semantic network that relates concepts across sources, which facilitates the translation of scientific discoveries and supports more complex analyses and conclusions. For this, achieving semantic interoperability of the data is crucial. It is a very important step that allows different clinical datasets to interact within the same information system or across different systems. This integration lets analysis tools and data interfaces work on an integrated, holistic view of the data, which ultimately allows clinicians to follow their patients in a more detailed and personalized way.
This dissertation was developed at LASIGE in collaboration with the Campus Neurológico Sénior (CNS) and is part of a larger project that explores providing more and better data to both clinicians and patients. The project is built on a web application, DataPark, whose platform lets the user browse clinical areas including nutrition, physiotherapy, occupational therapy, speech therapy, and neuropsychology, each of which hosts test batteries with various questionnaires and clinical assessment scales. This kind of clinical assessment greatly eases the physician's work because it can be administered remotely: the patient answers at a distance and the answers are stored in DataPark, allowing the physician to track the patient's status over time with respect to a given scale.
However, the way DataPark was developed restricts the physician to a questionnaire-oriented view. A physician who wants to see the patient as a whole finds this information scattered across the different questionnaires and has to inspect them one by one to get a sense of the patient's status. This dissertation addresses this challenge by building an algorithm that decomposes all the questions of the different questionnaires and enables their semantic integration, with the goal of giving the physician a holistic view oriented by clinical concept.
The entire DataPark database was then extracted and served as the data source for this work; note that much of the original data is in Portuguese and has to be translated automatically.
Starting from an initial high-level analysis of the questionnaires in the database, we began building a semantic model that could describe the data in the questionnaires and scales. All clinical concepts that could be identified were collected manually from a subset of 15 questionnaires: the 5 most answered for Parkinson's disease, the 5 most answered for stroke, and the 5 most answered that are not associated with a single specific pathology. The model was refined together with a team of 12 physicians and therapists from the CNS over 7 meetings, during which a validation workshop was held that gave the resulting model a high degree of reliability.
In parallel, two studies were carried out. (i) The first assessed which ontology or ontologies give the greatest coverage of the data in the subset of 15 questionnaires. It concluded that the most reliable set consists of the LOINC, NCIT, SNOMED, and OCHV ontologies, and this set was used from then on. (ii) The second sought to determine which machine translation tool (Google Translator or Microsoft Translator) is more reliable. To this end, 3 questionnaires that are stored in the database in Portuguese but whose original version is in English were translated in full from Portuguese to English, and the output of the two tools was compared. Microsoft Translator performed slightly better and was therefore chosen as the machine translation tool for our algorithm.
With these two studies complete, the dataset is standardized in a single language and the set of ontologies for semantic annotation is chosen. To understand this phase of the work, note that ontologies are powerful computational tools consisting of a set of concepts or terms that name and define the entities in a given domain of interest; in biomedicine they are called biomedical ontologies.
Biomedical ontologies are very useful for sharing, retrieving, and extracting information in biomedicine, and they play a crucial role in semantic interoperability, which is precisely our final goal.
The questions of the subset of 15 questionnaires were then semantically annotated. Semantic annotation is a process that formally associates a target text with a concept or term, and can thereby build bridges between different target documents or texts that address the same concept. In other words, a semantic annotation associates a term from a given ontology with a concept present in the target text. If the target text is a set of questions from several questionnaires, it is natural to find questions from different diagnostic areas that are connected by shared ontology terms.
Once annotation is complete, the semantic model is integrated with the developed algorithm, the set of ontologies, and the patient data. In this way we know that a given patient answered several questions that address the same concept; these questions are semantically linked because they have the same concept mapped.
In terms of overall performance, both the translation and annotation processes performed acceptably: translation reached 78% accuracy, 76% recall, and an F-measure of 0.77, and 87% of annotations were successful. Overall, the main objective was achieved: a holistic, integrated view combining the semantic model and the DataPark data (questionnaires and patients).
Healthcare is a multi-domain area, with professionals from different areas often
collaborating to provide patients with the best possible care. Neurological and
neurodegenerative diseases are especially so, with multiple areas, including neurology,
psychology, nursing, physical therapy, speech therapy and others coming together to
support these patients.
The DataPark application allows healthcare providers to store, manage and analyse
information about patients with neurological disorders from different perspectives
including evaluation scales and questionnaires. However, the application does not
provide a holistic view of the patient status because it is split across different domains
and clinical scales.
This work proposes a methodology for the semantic integration of this data. It
developed the data scaffolding to afford a holistic view of the patient status that is
concept-oriented rather than scale or test battery oriented. A semantic model was
developed in collaboration with healthcare providers from different areas, which was
subsequently aligned with existing biomedical ontologies. The questionnaire and scale
data was semantically annotated to this semantic model, with a translation step when the
original data was in Portuguese. The process was applied to a subset of 15 scales with a
manual evaluation of each process. The semantic model includes 204 concepts and 436
links to external ontologies. Translation achieved an accuracy of 78%, whereas the
semantic annotation achieved 87%. The final integrated dataset covers 443 patients.
Finally, applying the process of semantic annotation to the whole dataset
creates the conditions for semantic integration. This process
consists of crossing all questions from different questionnaires and establishing a
connection between those that contain the same annotation.
This work allows healthcare providers to assess patients in a more global fashion,
integrating data collected from different scales and test batteries that evaluate the same
or similar parameters.
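The integration step described above, crossing questions from different questionnaires and connecting those that share an annotation, can be pictured as building an inverted index from concept to questions. The following is a minimal sketch under stated assumptions: the scale names, question identifiers, and concept identifiers are hypothetical placeholders (not real LOINC, NCIT, SNOMED, or OCHV codes), and the thesis's actual algorithm and data model are not shown in the abstract.

```python
from collections import defaultdict

# Hypothetical annotations: (questionnaire, question id) -> set of concept ids.
# All names below are placeholders invented for this example.
annotations = {
    ("Scale A", "q3"): {"CONCEPT:gait"},
    ("Scale B", "q1"): {"CONCEPT:gait", "CONCEPT:balance"},
    ("Scale C", "q7"): {"CONCEPT:sleep"},
}


def index_by_concept(question_annotations):
    """Invert the mapping so that each concept lists the questions annotated with it."""
    index = defaultdict(set)
    for question, concepts in question_annotations.items():
        for concept in concepts:
            index[concept].add(question)
    return index


def linked_questions(question_annotations):
    """Keep only concepts that connect questions from at least two questionnaires."""
    links = {}
    for concept, questions in index_by_concept(question_annotations).items():
        questionnaires = {questionnaire for questionnaire, _ in questions}
        if len(questionnaires) > 1:
            links[concept] = sorted(questions)
    return links


print(linked_questions(annotations))
```

A concept-oriented view of a patient then amounts to looking up one concept in this index and collecting that patient's answers to every linked question, regardless of which scale each question came from.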
Doctor of Philosophy
dissertation
The problem of information transfer between healthcare sectors and across the continuum of care was examined using a mixed methods approach. These methods include qualitative interviews, retrospective case reviews, and an informatics gap analysis. Findings and conclusions are reported for each study. Qualitative interviews were conducted with 16 healthcare representatives from 4 disciplines (medicine, pharmacy, nursing, and social work) and 3 healthcare sectors (hospital, skilled nursing care, and community care). Three key themes from a Joint Cognitive Systems theoretical model were used to examine qualitative findings. Cross-sector care goals are neither agreed upon nor made explicit, and in some instances sectors work at cross purposes. Care goals and information paradigms change as patients move from hospital-based crisis stabilization, diagnosis, and treatment to post-discharge care at home or in skilled nursing, focused on recovery, function restoration, or end-of-life support. Control of the transfer process is variable across institutions, with little feedback and feed-forward. A lack of knowledge, competency, and information tracking undermines sector interdependencies, breeding suspicion and distrust. Sixty-three patients discharged between 2006 and 2008 from hospitals to skilled nursing facilities were randomly selected and reviewed. The documents most notably missing were discharge summaries (30%), nursing assessments or notes (17%), and social work documents (25%). Advance directives or living wills necessary for end-of-life support were present in only 6% of the cases. The presence of information on activities of daily living (ADLs), other disabling conditions, and nutrition was associated with positive outcomes at the 0.001, 0.04, and 0.08 levels. Consistent geriatric information transfer across the continuum is needed for relevant care management.
An interoperability gap analysis conducted on the LINC (Linking Information Necessary for Care) transfer form determined its interoperability to be at semantic level 0. Detailed Clinical Models representing care management processes are challenged by the lack of consensus in terminology standards across sectors. Construction of information transfer solutions compliant with the Centers for Medicare & Medicaid Services (CMS) Stage 2 meaningful use criteria must address syntactic and semantic standards, map sector terminologies within care management processes, and account for the lack of standard terminologies in allied health domains.
Proceedings
Proceedings of the Workshop
CHAT 2011: Creation, Harmonization and Application of Terminology Resources.
Editors: Tatiana Gornostay and Andrejs Vasiļjevs.
NEALT Proceedings Series, Vol. 12 (2011).
© 2011 The editors and contributors.
Published by Northern European Association for Language Technology (NEALT), http://omilia.uio.no/nealt
Electronically published at Tartu University Library (Estonia), http://hdl.handle.net/10062/16956