
    Semantic Asset Administration Shells in Industry 4.0: A Survey

    The Asset Administration Shell (AAS) is a fundamental concept in the Reference Architecture Model for Industry 4.0 (RAMI 4.0) that provides a virtual, digital representation of all information and functions of a physical asset in a manufacturing environment. Recently, Semantic AASs have emerged that add knowledge representation formalisms to enhance the digital representation of physical assets. In this paper, we provide a comprehensive survey of the scientific contributions to Semantic AASs that model the Information and Communication Layers within RAMI 4.0, and we summarise and demonstrate their structure, communication, functionalities, and use cases. We also highlight the challenges for the future development of Semantic AASs.
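    As a loose illustration of what "adding knowledge representation formalisms" to an AAS can mean in practice, the following minimal Python sketch attaches a semanticId to a submodel property so its meaning can be resolved against an external concept dictionary; the shell layout and the ECLASS-style IRDI are illustrative assumptions, not a structure taken from the surveyed papers.

```python
# Minimal sketch of a Semantic Asset Administration Shell (AAS): a plain
# data structure whose properties carry semanticId references. All ids,
# names, and values below are illustrative assumptions.

aas = {
    "idShort": "MotorAAS",
    "asset": {"idShort": "Motor42", "kind": "Instance"},
    "submodels": [
        {
            "idShort": "TechnicalData",
            "submodelElements": [
                {
                    "idShort": "MaxRotationSpeed",
                    "modelType": "Property",
                    "valueType": "xs:int",
                    "value": 5000,
                    # The semanticId ties the property to a standardized
                    # concept definition (an ECLASS-style IRDI, assumed
                    # here), which makes the representation "semantic".
                    "semanticId": "0173-1#02-BAA120#008",
                }
            ],
        }
    ],
}

def find_by_semantic_id(shell: dict, semantic_id: str) -> list[dict]:
    """Return every submodel element annotated with the given semanticId."""
    return [
        element
        for submodel in shell["submodels"]
        for element in submodel["submodelElements"]
        if element.get("semanticId") == semantic_id
    ]

print(find_by_semantic_id(aas, "0173-1#02-BAA120#008"))
```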

    FIREMAP: Cloud-based software to automate the estimation of wildfire-induced ecological impacts and recovery processes using remote sensing techniques

    The formulation and planning of integrated fire management strategies must be strengthened by decision support systems for fire-induced ecological impacts and ecosystem recovery processes, particularly in the context of extreme wildfire events that challenge land management initiatives. Wildfire data collection and analysis through remote sensing earth observations is of utmost importance for this purpose. However, the needs of land managers are not always met, because exploiting the full potential of remote sensing techniques requires a high level of technical expertise. In addition, data acquisition and storage, database management, networking, and computing requirements may present technical difficulties. Here, we present the FIREMAP software, which leverages the potential of the Google Earth Engine (GEE) cloud-based platform, an intuitive graphical user interface (GUI), and the European Forest Fire Information System (EFFIS) wildfire database for wildfire analyses through remote sensing techniques and data collections. FIREMAP allows the automatic computation of (i) machine learning-based burned area (BA) detection algorithms to facilitate the mapping of (historical) fire perimeters, (ii) fire severity spectral indices, and (iii) post-fire recovery trajectories through the inversion of physically-based radiative transfer models. We introduce (i) the FIREMAP platform architecture and the GUI, (ii) the implementation of well-established algorithms for wildfire science and management in GEE, (iii) the validation of the algorithm implementation in fifteen case-study wildfires across the western Mediterranean Basin, and (iv) the near-future and long-term planned expansion of FIREMAP features. This study was financially supported by the Spanish Ministry of Science and Innovation in the framework of the LANDSUSFIRE project (PID2022-139156OB-C21) within the National Program for the Promotion of Scientific-Technical Research (2021-2023), and with Next-Generation Funds of the European Union (EU) in the framework of the FIREMAP project (TED2021-130925B-I00); and by the Regional Government of Castile and León in the framework of the IA-FIREXTCyL project (LE081P23). Víctor Fernández-García was supported by a Margarita Salas post-doctoral fellowship from the Ministry of Universities of Spain, financed with European Union-NextGenerationEU and Ministerio de Universidades funds.
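    As a rough sketch of the kind of fire-severity computation FIREMAP automates, the snippet below derives a differenced Normalized Burn Ratio (dNBR) with the Google Earth Engine Python API; the fire perimeter, dates, collection, and cloud threshold are placeholder assumptions rather than FIREMAP's actual configuration, which takes perimeters from the EFFIS database.

```python
import ee

ee.Initialize()  # assumes GEE credentials are already configured

# Placeholder fire perimeter and pre/post-fire windows (assumptions).
fire_area = ee.Geometry.Rectangle([-6.2, 42.1, -6.0, 42.3])

def median_composite(start, end):
    """Cloud-filtered Sentinel-2 median composite over the fire area."""
    return (
        ee.ImageCollection("COPERNICUS/S2_SR_HARMONIZED")
        .filterBounds(fire_area)
        .filterDate(start, end)
        .filter(ee.Filter.lt("CLOUDY_PIXEL_PERCENTAGE", 20))
        .median()
    )

def nbr(image):
    """Normalized Burn Ratio: (NIR - SWIR2) / (NIR + SWIR2)."""
    return image.normalizedDifference(["B8", "B12"])

# dNBR = pre-fire NBR - post-fire NBR; higher values indicate higher
# burn severity, to which standard severity thresholds can be applied.
dnbr = (
    nbr(median_composite("2022-06-01", "2022-06-30"))
    .subtract(nbr(median_composite("2022-08-01", "2022-08-31")))
    .rename("dNBR")
)
```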

    Enriching information extraction pipelines in clinical decision support systems

    Programa Oficial de Doutoramento en Tecnoloxías da Información e as Comunicacións. 5032V01. Multicentre health studies are important for increasing the impact of medical research findings due to the number of subjects they are able to engage. To simplify the execution of these studies, the data-sharing process should be effortless, for instance through the use of interoperable databases. However, achieving this interoperability is still an ongoing research topic, namely due to data governance and privacy issues. In the first stage of this work, we propose several methodologies to optimise the harmonisation pipelines of health databases. This work focused on harmonising heterogeneous data sources into a standard data schema, namely the OMOP CDM, which has been developed and promoted by the OHDSI community. We validated our proposal using datasets of Alzheimer's disease patients from distinct institutions. In the following stage, aiming to enrich the information stored in OMOP CDM databases, we investigated solutions to extract clinical concepts from unstructured narratives, using information retrieval and natural language processing techniques. The validation was performed on datasets provided in scientific challenges, namely the National NLP Clinical Challenges (n2c2). In the final stage, we aimed to simplify the protocol execution of multicentre studies by proposing novel solutions for profiling, publishing, and facilitating the discovery of databases. Some of the developed solutions are currently being used in three European projects aiming to create federated networks of health databases across Europe.
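    To make the harmonisation step concrete, here is a minimal hedged sketch of mapping one source-specific record into the OMOP CDM condition_occurrence table; the source field names, the tiny code-to-concept dictionary, and the concept id are illustrative assumptions standing in for the full OHDSI vocabularies, not the thesis's actual pipeline.

```python
from dataclasses import dataclass
from datetime import date

# Tiny illustrative vocabulary: source codes -> OMOP standard concept ids.
# Real pipelines use the full OHDSI vocabularies; 378419 is assumed here
# to denote Alzheimer's disease.
SOURCE_TO_OMOP_CONCEPT = {"ICD10:G30.9": 378419}

@dataclass
class ConditionOccurrence:
    """A small subset of the OMOP CDM condition_occurrence table."""
    person_id: int
    condition_concept_id: int
    condition_start_date: date
    condition_source_value: str

def harmonise(source_row: dict) -> ConditionOccurrence:
    """Map one heterogeneous source record into the standard schema."""
    code = source_row["diagnosis_code"]  # source field names are assumptions
    return ConditionOccurrence(
        person_id=source_row["patient_id"],
        condition_concept_id=SOURCE_TO_OMOP_CONCEPT.get(code, 0),  # 0 = unmapped
        condition_start_date=source_row["visit_date"],
        condition_source_value=code,
    )

row = {"patient_id": 1, "diagnosis_code": "ICD10:G30.9", "visit_date": date(2020, 3, 2)}
print(harmonise(row))
```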

    A Study on Deforestation in North Korea Using Google Earth Engine

    Master's thesis -- Seoul National University Graduate School, College of Agriculture and Life Sciences, Department of Landscape Architecture and Rural Systems Engineering (Landscape Architecture), August 2021. Advisor: Dongkun Lee. Deforestation destroys forest ecosystems and reduces the functions forests provide, such as water storage and supply and the mitigation of air pollution. The degradation of forests harms both climate change response and air quality. North Korea is one of the world's three major deforested regions; according to research by the National Institute of Forest Science, about 28% of its forest has been degraded from the 1990s until recently. However, as there are no official statistics, the current situation must be accurately identified for future restoration. Unlike typical deforestation, North Korea's was caused by shortages of food and energy resources resulting from economic difficulties. Forests were cleared into fields for food supply, and extensive deforestation was accelerated by indiscriminate logging for fuel due to a lack of coal. Although North Korea has recognized the problem of deforestation and pursued related policies, these have not been implemented effectively due to continuing economic difficulties and the deterioration of relations with South Korea. Since deforestation in North Korea has socio-economic impacts on North Korea, the environment of the Korean Peninsula, and Northeast Asia, restoration is urgently needed, and it is important to know the exact current status and scale of deforestation to support effective restoration projects when relations with South Korea improve.
Since North Korea is currently inaccessible and the situation cannot be assessed through field surveys, remote sensing using satellite imagery is the most effective method. Moreover, since deforestation is a long-term rather than a short-term phenomenon, it must be analyzed over multiple periods. Therefore, this study identifies the status of deforestation in North Korea over the 20 years from 2000 to 2020, after deforestation began to intensify in the 1990s, and establishes and tests two research hypotheses: how far degradation has progressed, and whether restoration projects have had an effect. The goal is to provide basic data for the systematic planning of future restoration projects. To this end, land cover classification was carried out with a pixel-based supervised random forest classifier on Google Earth Engine, a cloud-based geographic information platform, and change detection was performed on the classified maps to determine where degradation occurred and how much the forest area changed. The analysis shows that from 2000 to 2010 the proportion of forest in North Korea decreased by about 11.5%, from about 72.5% of the total area to about 61%. In contrast, the proportions of cropland and bareland increased by about 7% and about 2%, respectively, indicating serious deforestation caused by indiscriminate logging and clearing. The regions with the most change were Pyeongan-do, Hamgyeong-do, and Gangwon-do; the region with the least change was Hwanghae-do. From 2010 to 2020, the proportion of forest increased by about 1%, from about 61% to about 62%, and cropland also increased by about 3%, while bareland decreased by about 4%. After the full-scale forest restoration project began in 2016, the forest proportion rose slightly and bareland declined, but the increase in cropland suggests that restoration has not been fully successful and that indiscriminate clearing continues. The regions with the largest changes in this period were Hwanghae-do, Hamgyeong-do, and Gangwon-do, while Pyeongan-do changed the least. Over the whole 20-year period, Hamgyeong-do and Gangwon-do changed the most, indicating that clearing and logging were concentrated in these regions. Table of contents: Chapter 1. Introduction (1.1. Study Background and Purpose of Research); Chapter 2. Literature Review (2.1. Deforestation of North Korea; 2.2. Random Forest using GEE; 2.3. Change Detection); Chapter 3. Materials and Methods (3.1. Study Area and Materials; 3.2. Methods: Dataset and Pre-processing, Random Forest using GEE, Change Detection); Chapter 4. Results and Discussions (4.1. Results of Random Forest; 4.2. Results of Change Detection; 4.3. Discussions); Chapter 5. Conclusion; Bibliography; Abstract in Korean.
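    The following sketch shows the pixel-based supervised random-forest classification step on Google Earth Engine's Python API, roughly as described above; the region, date range, band list, and the training-sample asset path are placeholder assumptions.

```python
import ee

ee.Initialize()  # assumes GEE credentials are already configured

# Landsat 8 surface-reflectance composite for one epoch; region and
# dates are illustrative placeholders (earlier epochs would need
# Landsat 5/7 collections).
region = ee.Geometry.Rectangle([124.0, 37.5, 131.0, 43.0])
bands = ["SR_B2", "SR_B3", "SR_B4", "SR_B5", "SR_B6", "SR_B7"]
image = (
    ee.ImageCollection("LANDSAT/LC08/C02/T1_L2")
    .filterBounds(region)
    .filterDate("2020-05-01", "2020-09-30")
    .median()
    .select(bands)
)

# Hypothetical FeatureCollection of training points labelled with a
# 'landcover' property (e.g. 0=forest, 1=cropland, 2=bareland).
training_points = ee.FeatureCollection("users/example/nk_training_samples")
training = image.sampleRegions(
    collection=training_points, properties=["landcover"], scale=30
)

# Pixel-based supervised classification with a random forest.
classifier = ee.Classifier.smileRandomForest(numberOfTrees=100).train(
    features=training, classProperty="landcover", inputProperties=bands
)
classified = image.classify(classifier)

# Change detection then reduces to comparing per-pixel class labels
# between two classified epochs (e.g. forest in 2000, cropland in 2020).
```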

    Privacy guarantees in the exploration of distributed databases

    Anonymisation is currently one of the biggest challenges when sharing sensitive personal information. Its importance depends largely on the application domain, but when dealing with health information it becomes a more serious issue. The simplest approach to avoid disclosure is to ensure that all data that can be directly associated with an individual is removed from the original dataset. However, some studies have shown that simple anonymisation procedures can sometimes be reverted using specific patients' characteristics, namely when the anonymisation is based on hidden key attributes. In this work, we propose a secure architecture to share information from distributed databases without compromising the subjects' privacy. The work initially focused on identifying techniques to link information between multiple data sources in order to revert anonymisation procedures. In a second phase, we developed a methodology to perform queries over distributed databases without breaking anonymity. The architecture was validated using a standard data schema that is widely adopted in observational research studies. Master's in Cybersecurity.
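    As a loose illustration of querying distributed databases without breaking anonymity, the sketch below has each site answer only with aggregate counts and suppress small cells before anything leaves the institution; the suppression threshold and the site interface are assumptions for illustration, not the dissertation's actual architecture.

```python
# Federated count query sketch: each site evaluates the predicate locally
# and returns only an aggregate; raw records never cross site boundaries.
# Counts below an assumed threshold (k = 5) are suppressed to reduce the
# risk of re-identification.

K_THRESHOLD = 5

def local_count(records: list[dict], predicate) -> int | None:
    """Run the query inside one site; suppress small, re-identifiable cells."""
    count = sum(1 for record in records if predicate(record))
    return count if count >= K_THRESHOLD else None  # None = suppressed

def federated_count(sites: list[list[dict]], predicate) -> int:
    """Aggregate the per-site counts that passed the suppression check."""
    return sum(
        c for site in sites if (c := local_count(site, predicate)) is not None
    )

site_a = [{"age": 70, "dx": "alzheimer"}] * 12
site_b = [{"age": 66, "dx": "alzheimer"}] * 3  # below threshold, suppressed
print(federated_count([site_a, site_b], lambda r: r["dx"] == "alzheimer"))  # 12
```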

    Using clinical text to refine unspecific condition codes in Dutch general practitioner EHR data

    Objective: Observational studies using electronic health record (EHR) databases often face challenges due to unspecific clinical codes that can obscure detailed medical information, hindering precise data analysis. In this study, we aimed to assess the feasibility of refining these unspecific condition codes into more specific codes in a Dutch general practitioner (GP) EHR database by leveraging the available clinical free text. Methods: We utilized three approaches for text classification (search queries, semi-supervised learning, and supervised learning) to improve the specificity of ten unspecific International Classification of Primary Care (ICPC-1) codes. Two text representations and three machine learning algorithms were evaluated for the (semi-)supervised models. Additionally, we measured the improvement achieved by the refinement process on all code occurrences in the database. Results: The classification models performed well for most codes. In general, no single classification approach consistently outperformed the others. However, there were variations in the relative performance of the classification approaches within each code and in the use of different text representations and machine learning algorithms. Class imbalance and limited training data affected the performance of the (semi-)supervised models, yet the simple search queries remained particularly effective. Ultimately, the developed models improved the specificity of over half of all the unspecific code occurrences in the database. Conclusions: Our findings show the feasibility of using information from clinical text to improve the specificity of unspecific condition codes in observational healthcare databases, even with a limited range of machine-learning techniques and modest annotated training sets. Future work could investigate transfer learning, integration of structured data, alternative semi-supervised methods, and validation of models across healthcare settings. The improved level of detail enriches the interpretation of medical information and can benefit observational research and patient care.
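    A hedged sketch of the supervised branch described above: a TF-IDF text representation feeding a simple linear classifier that maps a clinical note to a more specific code; the toy notes, the label names, and the scikit-learn pipeline are illustrative assumptions, not the study's actual models or codes.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy annotated notes for one unspecific condition code; the texts and
# the candidate specific labels are invented for illustration only.
notes = [
    "patient reports pain in left knee after cycling",
    "persistent pain and stiffness in right shoulder",
    "knee swollen and painful on stairs",
    "shoulder ache radiating to upper arm",
]
labels = ["knee_symptom", "shoulder_symptom", "knee_symptom", "shoulder_symptom"]

# One text-representation / algorithm combination of the kind evaluated:
# TF-IDF features with a logistic-regression classifier.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(notes, labels)

print(model.predict(["dull pain in the knee when walking"]))  # expected: ['knee_symptom']
```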