222 research outputs found

    Cross-Lingual Entity Matching for Knowledge Graphs

    Multilingual knowledge graphs (KGs), such as YAGO and DBpedia, represent entities in different languages. The task of cross-lingual entity matching is to align entities in a source language with their counterparts in target languages. In this thesis, we investigate embedding-based approaches to encode entities from multilingual KGs into the same vector space, where equivalent entities are close to each other. Specifically, we apply graph convolutional networks (GCNs) to combine multi-aspect information of entities, including topological connections, relations, and attributes of entities, to learn entity embeddings. To exploit the literal descriptions of entities expressed in different languages, we propose two uses of a pre-trained multilingual BERT model to bridge cross-lingual gaps. We further propose two strategies to integrate GCN-based and BERT-based modules to boost performance. Extensive experiments on two benchmark datasets demonstrate that our method significantly outperforms existing systems. We additionally introduce a new dataset comprising 15 low-resource languages and featuring unlinkable cases, to bring the task closer to real-world challenges.
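    To make the idea of "equivalent entities are close to each other" concrete, here is a minimal sketch, not taken from the thesis, of how aligned entities can be retrieved once source- and target-language embeddings live in a shared space: each source entity is matched to its nearest target entities by cosine similarity. All names and dimensions are illustrative.

```python
# Hypothetical sketch of embedding-based entity alignment (not the thesis code):
# given entity embeddings from two language-specific KGs in a shared space,
# rank target entities by cosine similarity to each source entity.
import numpy as np

def align_entities(src_emb: np.ndarray, tgt_emb: np.ndarray, top_k: int = 1) -> np.ndarray:
    """Return the indices of the top_k nearest target entities for each source entity."""
    # L2-normalise so the dot product equals cosine similarity.
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    sim = src @ tgt.T                      # (n_src, n_tgt) similarity matrix
    return np.argsort(-sim, axis=1)[:, :top_k]

# Toy usage: 3 source and 4 target entities with 8-dimensional embeddings.
rng = np.random.default_rng(0)
matches = align_entities(rng.normal(size=(3, 8)), rng.normal(size=(4, 8)))
print(matches)  # index of the closest target entity for each source entity
```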

    Prototype of a Conversational Assistant for Satellite Mission Operations

    The very first artificial satellite, Sputnik, was launched in 1957, marking a new era. Concurrently, satellite mission operations emerged. These start at launch and finish at the end of the mission, when the spacecraft is decommissioned. Running a satellite mission requires monitoring and controlling telemetry data to verify and maintain satellite health, reconfigure and command the spacecraft, detect, identify and resolve anomalies, and perform launch and early orbit operations. The very first chatbot, ELIZA, was created in 1966 and likewise marked a new era, that of Artificial Intelligence systems. These systems answer users' questions in the most diverse domains, interpreting human language input and responding in the same manner. Nowadays, such systems are everywhere, and the list of possible applications seems endless. The goal of the present master's dissertation is to develop a prototype of a chatbot for mission operations by implementing a Natural Language Processing (NLP) model for satellite missions allied to a dialogue flow model. The performance of the conversational assistant is evaluated through its implementation on a mission operated by the European Space Agency (ESA), which implies generating the spacecraft's database Knowledge Graph (KG). Throughout the years, many tools have been developed and added to the systems used to monitor and control spacecraft, helping Flight Control Teams (FCT) maintain a comprehensive overview of the spacecraft's status and health, speed up failure investigation, or easily correlate time series of telemetry data. However, despite all the advances that facilitate daily tasks, the teams still need to navigate through thousands of parameters and events spanning years of data, using purpose-built user interfaces and relying on filters and time series plots. The solution presented in this dissertation and proposed by VisionSpace Technologies focuses on improving operational efficiency while dealing with the mission's complex and extensive databases.
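    As a rough illustration of the pipeline described above (an NLP front end over a mission knowledge graph), the following toy sketch, which is not the VisionSpace prototype, matches a free-text question to an intent template and answers it from a small hand-written "knowledge graph"; the parameter names, intents, and values are invented.

```python
# Minimal, hypothetical sketch: match a free-text question to an intent and
# answer it from a toy spacecraft "knowledge graph".
from difflib import SequenceMatcher

KG = {  # toy graph: parameter -> properties (a real KG would be built from the mission database)
    "battery_voltage": {"subsystem": "EPS", "unit": "V", "last_value": 28.1},
    "tank_pressure": {"subsystem": "PROP", "unit": "bar", "last_value": 17.4},
}

INTENTS = {
    "what is the current value of {p}": "last_value",
    "which subsystem does {p} belong to": "subsystem",
}

def answer(question: str, parameter: str) -> str:
    # Pick the intent template whose wording is most similar to the question.
    best = max(INTENTS, key=lambda t: SequenceMatcher(
        None, question.lower(), t.format(p=parameter)).ratio())
    prop = INTENTS[best]
    return f"{parameter}.{prop} = {KG[parameter][prop]}"

print(answer("What is the current value of tank_pressure?", "tank_pressure"))
```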

    Machine learning with limited label availability: algorithms and applications

    The abstract is in the attachment.

    Efficient Neural Methods for Coreference Resolution

    Coreference resolution is a core task in natural language processing and in creating language technologies. Neural methods and models for automatically resolving references have emerged and developed over the last several years. This progress is largely marked by continuous improvements on a single dataset and metric. In this thesis, the assumptions that underlie these improvements are shown to be unrealistic for real-world use due to the computational and data tradeoffs made to achieve apparently high performance. The thesis outlines and proposes solutions to three issues. First, to address the growing memory requirements and restrictions on input document length, a novel, constant-memory neural model for coreference resolution is proposed and shown to attain performance comparable to contemporary models. Second, to address the failure of these models to generalize across datasets, continued training is evaluated and shown to be successful for transferring coreference resolution models between domains and languages. Finally, to offset the cost of the gains obtained via increasingly large pretrained language models, multitask model pruning can be applied to maintain a single (small) model for multiple datasets. These methods reduce the computational cost of running a model and the annotation cost of creating a model for any arbitrary dataset. As real-world applications continue to demand resolution of coreference, methods that reduce the technical cost of training new models and making predictions are greatly desired; this thesis addresses that need.
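    The constant-memory idea can be illustrated with a toy incremental clusterer, which is not the model proposed in the thesis: mentions are processed in reading order and assigned to a fixed number of entity slots, so memory stays bounded no matter how long the document is. The embeddings, slot count, and similarity threshold are all made up for the example.

```python
# Toy illustration of constant-memory coreference: each mention either joins an
# existing entity slot or opens a new one; when the slot budget is exhausted,
# the oldest slot is evicted, keeping memory constant in document length.
import numpy as np

def resolve(mentions: np.ndarray, max_entities: int = 4, threshold: float = 0.7):
    """mentions: (n, d) mention embeddings in reading order. Returns cluster ids."""
    slots, ids = [], []          # bounded set of entity representations and their ids
    assignments, next_id = [], 0
    for m in mentions:
        m = m / np.linalg.norm(m)
        sims = [float(m @ s) for s in slots]
        if sims and max(sims) >= threshold:            # attach to an existing entity
            k = int(np.argmax(sims))
            updated = slots[k] + m
            slots[k] = updated / np.linalg.norm(updated)   # running entity representation
        else:                                          # start a new entity
            if len(slots) == max_entities:             # evict the oldest slot: constant memory
                slots.pop(0)
                ids.pop(0)
            slots.append(m)
            ids.append(next_id)
            next_id += 1
            k = len(slots) - 1
        assignments.append(ids[k])
    return assignments

rng = np.random.default_rng(1)
print(resolve(rng.normal(size=(6, 16))))
```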

    EXplainable Artificial Intelligence: enabling AI in neurosciences and beyond

    The adoption of AI models in medicine and neurosciences has the potential to play a significant role not only in bringing scientific advancements but also in clinical decision-making. However, concerns mount over the potential biases AI could have, which could result in far-reaching consequences, particularly in a critical field like biomedicine. It is challenging to achieve usable intelligence because it is fundamental not only to learn from prior data, extract knowledge and guarantee generalization capabilities, but also to disentangle the underlying explanatory factors in order to deeply understand the variables leading to the final decisions. There has hence been a call for approaches that open the AI `black box' to increase trust in, and the reliability of, the decision-making capabilities of AI algorithms. Such approaches are commonly referred to as XAI and are starting to be applied in medical fields, even if not yet fully exploited. With this thesis we aim at contributing to enabling the use of AI in medicine and neurosciences by taking two fundamental steps: (i) practically pervading AI models with XAI and (ii) strongly validating XAI models. The first step was achieved on one hand by focusing on XAI taxonomy and proposing guidelines specific to AI and XAI applications in the neuroscience domain. On the other hand, we faced concrete issues by proposing XAI solutions to decode the brain modulations in neurodegeneration, relying on the morphological, microstructural and functional changes occurring at different disease stages as well as their connections with the genotype substrate. The second step was likewise achieved by first defining four attributes related to XAI validation, namely stability, consistency, understandability and plausibility. Each attribute refers to a different aspect of XAI, ranging from the assessment of explanation stability across different XAI methods, or highly collinear inputs, to the alignment of the obtained explanations with the state-of-the-art literature. We then proposed different validation techniques aiming at practically fulfilling such requirements. With this thesis, we contributed to the advancement of research into XAI, aiming at increasing awareness and critical use of AI methods, opening the way to real-life applications and enabling the development of personalized medicine and treatment by taking a data-driven and objective approach to healthcare.
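    As one hedged example of how the "stability" attribute could be checked in practice (the thesis does not prescribe this exact measure), the sketch below scores the agreement between per-feature attributions produced by two different XAI methods using Spearman rank correlation; the attribution values are invented.

```python
# Sketch of an explanation-stability check: compare feature attributions from
# two XAI methods (or from highly collinear inputs) for the same prediction,
# using rank correlation as an agreement score.
import numpy as np
from scipy.stats import spearmanr

def attribution_stability(attr_a: np.ndarray, attr_b: np.ndarray) -> float:
    """attr_a, attr_b: per-feature importance scores from two XAI methods."""
    rho, _ = spearmanr(attr_a, attr_b)   # 1.0 = identical feature ranking
    return float(rho)

# Toy example: two methods that largely agree on which features matter.
a = np.array([0.50, 0.30, 0.10, 0.05, 0.05])
b = np.array([0.45, 0.35, 0.08, 0.07, 0.05])
print(f"stability (Spearman rho) = {attribution_stability(a, b):.2f}")
```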

    Modelo de acesso a fontes em linguagem natural no governo electrónico (A model for access to natural language sources in e-government)

    Doctorate in Informatics Engineering. For e-government to truly exist, it is necessary and crucial to provide public information and documentation and to make access to it simple for citizens. A portion, not necessarily small, of these documents is unstructured and in natural language, and consequently beyond what current search systems are generally able to handle effectively. Thus, the thesis is that it is possible to improve access to these contents using systems that process natural language and create structured information, particularly if supported by semantics. In order to put this thesis to the test, the work was developed in three major phases: (1) design of a conceptual model integrating the creation of structured information and its availability to various actors, in line with the vision of e-government 2.0; (2) definition and development of a prototype instantiating the key modules of this conceptual model, including ontology-based information extraction supported by examples of relevant information, knowledge management, and access based on natural language; (3) assessment of the usability and acceptability of querying information as made possible by the prototype, and in consequence by the conceptual model, by users in a realistic scenario, which included comparison with existing forms of access. In addition to this evaluation, at another level more related to technology assessment than to the model, the performance of the subsystem responsible for information extraction was evaluated. The evaluation results show that the proposed model was perceived as more effective and useful than the alternatives. Together with the prototype's information extraction performance, which is comparable to the state of the art, the results demonstrate the feasibility and advantages, with current technology, of using natural language processing and the integration of semantic information to improve access to unstructured contents in natural language. The conceptual model and the prototype demonstrator are intended to contribute to the future existence of more sophisticated search systems that are also more suitable for e-government. To achieve transparency in governance, active citizenship, and greater agility in interaction with the public administration, among other goals, citizens and businesses need quick and easy access to official information, even if it was originally created in natural language.
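    A minimal, purely illustrative sketch (not the thesis prototype) of the access idea: once relevant information has been extracted from official documents into structured triples, a natural language question can be answered by a simple lookup over those triples. The triples and the question are invented.

```python
# Illustrative sketch only: answer a natural language question from
# subject-predicate-object triples extracted from documents.
TRIPLES = [  # toy structured information extracted from official documents
    ("building permit", "requires", "site plan"),
    ("building permit", "processing time", "30 days"),
]

def ask(question: str):
    q = question.lower()
    # Return any triple whose subject or predicate is mentioned in the question.
    return [(s, p, o) for s, p, o in TRIPLES if s in q or p in q] or ["no match found"]

print(ask("What is the processing time for a building permit?"))
```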

    The Analysis of Open Source Software and Data for Establishment of GIS Services Throughout the Network in a Mapping Organization at National or International Level

    Federal agencies and their partners collect and manage large amounts of geospatial data, but these data are often not easily found when needed, and sometimes data are collected or purchased multiple times. In short, the best government data is not always organized and managed efficiently enough to support decision-making in a timely and cost-effective manner. National mapping agencies and the various departments responsible for collecting different types of geospatial data cannot continue for much longer to operate as they did a few years ago, like people living on an island. Leaders need to look at what is now possible that was not possible before, considering capabilities such as cloud computing, crowd-sourced data collection, available open source remotely sensed data, and multi-source information vital to decision-making, as well as new Web-accessible services that provide, sometimes at no cost, capabilities that previously could be obtained only from local GIS experts. These authorities need to consider the available solutions and gather information about new capabilities, reconsider agency missions and goals, review and revise policies, make budget and human resource decisions, and evaluate new products, cloud services, and cloud service providers. To do so, we need to choose the right tools to reach the above-mentioned goals. Data collection is the most cost-intensive part of mapping and of establishing a Geographic Information System. This is not only because of the cost of the data collection task itself, but also because of the damage caused by delays and the time it takes to deliver, from the field to the user's hands, the information needed for decision-making. In fact, the time a project consumes for data collection, processing, and presentation of geospatial information has an even greater effect on the cost of larger projects such as disaster management, construction, city planning, environment, etc., assuming, of course, that all the necessary information from existing sources is delivered directly to the user's computer. A good description of GIS project optimization or improvement is finding a methodology that reduces time and cost and increases data and service quality (meaning accuracy, up-to-dateness, completeness, consistency, suitability, information content, integrity, integration capability, and fitness for use, as well as users' specific needs and conditions, which must be addressed with special attention). Each of the above-mentioned issues must be addressed individually and, at the same time, the whole solution must be provided in a global manner considering all the criteria. In this thesis we first discuss the problem we are facing and what needs to be done to establish a National Spatial Data Infrastructure (NSDI), including its definition and related components. We then look for available open source software solutions to cover the whole process: data collection, database management, data processing and, finally, data services and presentation. The first distinction among software is whether it is open source and free, or commercial and proprietary. It is important to note that, in order to make this distinction, it is necessary to define a clear specification for the categorization.
It is often difficult to determine, from a legal point of view, which class a given piece of software belongs to, and this makes it necessary to clarify what is meant by the various terms. With reference to this concept there are two global distinctions; within each group, we then distinguish a further classification according to the functionalities and GIScience applications the software is built for. The outcome of the second chapter is a technical process for selecting suitable and reliable software according to the characteristics of the users' needs and the required components, which leads to the next chapter. In Chapter 3, we go into the details of the GeoNode software as our best candidate tool to take on the responsibilities stated before. In Chapter 4, we discuss the globally available open source data against predefined data quality criteria (such as theme, data content, scale, licensing, and coverage), based on the metadata statements inside the datasets, by means of bibliographic review, technical documentation, and web search engines. In Chapter 5 we discuss further data quality concepts and consequently define a set of protocols for evaluating all datasets according to the tasks for which a mapping organization is, in general, responsible towards prospective users in different disciplines such as reconnaissance, city planning, topographic mapping, transportation, environmental control, disaster management, etc. In Chapter 6, all the data quality assessments and protocols are applied to the pre-filtered, proposed datasets. In the final scores and ranking, each dataset receives a value corresponding to its quality according to the rules defined in the previous chapter. In the last step, a weight vector is derived from questions that the user answers with reference to the project at hand, in order to finalize the most appropriate selection of free and open source data. This data quality preference is defined by identifying a weight vector, which is then applied to the quality matrix in order to obtain the final quality scores and ranking, as sketched below. At the end of this chapter there is a section presenting the use of the datasets in various projects, such as "Early Impact Analysis" and "Extreme Rainfall Detection System (ERDS) - version 2", carried out by ITHACA. Finally, in the conclusion, the important criteria as well as future trends in GIS software are discussed, and recommendations are presented.
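    A small sketch of the scoring step referred to above, with made-up numbers: a quality matrix (datasets by criteria) is combined with a user-derived weight vector to produce a final score and ranking per dataset.

```python
# Weighted quality scoring: quality matrix x weight vector -> scores and ranking.
import numpy as np

quality = np.array([           # rows: candidate datasets, columns: quality criteria
    [0.9, 0.6, 0.8],           # e.g. accuracy, up-to-dateness, coverage (illustrative values)
    [0.7, 0.9, 0.5],
    [0.6, 0.7, 0.9],
])
weights = np.array([0.5, 0.3, 0.2])   # derived from the user's answers; sums to 1

scores = quality @ weights             # weighted quality score per dataset
ranking = np.argsort(-scores)          # best-scoring dataset first
print(scores, ranking)
```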

    Image-Based Hybrid Scaffold Design for Multiple Tissue Regeneration Application in Periodontal Engineering.

    Periodontal disease is a common chronic inflammatory disease which, if left untreated, can cause periodontal tissue breakdown. The periodontal complex is a micron-scaled, tooth-supporting structure with a complicated topology, which makes it difficult to predict and quantify periodontal tissue destruction. Unlike conventional assessment methods, 3-D micro-computed tomography provides very accurate, precise, high-resolution images of the periodontal topology. Using natural spatiotemporal landmarks to create a region of interest from the roof of the furcation to the root apex, volumetric image analysis of the bone-tooth interface was performed. The results demonstrated excellent examiner reproducibility and reliability (ICC > 0.99 and CV < 1.5%) for both linear and volumetric bone parameters. In an orthodontic tooth movement study, micro-CT quantified the ability of osteoprotegerin stimulation to prevent bone resorption and tooth mobility. Human alveolar bone core biopsies were analyzed to obtain mineral tissue density profiles in order to predict dental implant stability. Because of this high reproducibility and reliability, other wide-reaching applications have potential for predicting periodontal therapy outcomes and orthodontic tooth movement, as well as for evaluating clinical dental implant stability. A major challenge in periodontal tissue engineering is the control of periodontal tissue neogenesis: micron-scaled, complicated multi-interface regeneration with a functional architecture. To promote this compartmentalized, multiple-tissue regeneration with perpendicularly oriented periodontal ligament fibers, a multi-layered hybrid scaffold was designed and manufactured using a rapid prototyping technique. To produce a periodontium-like environment, the polymeric hybrid scaffold was assembled with a periodontal cell/tissue-guiding micro-architecture: a highly porous bone region, a vertically oriented PDL architecture, and a human tooth dentin slice. This complex was subcutaneously transplanted with untreated human PDL cells and BMP-7-transduced human gingival fibroblast cells using the ectopic model system. In spite of the non-biomechanical loading conditions, this approach resulted in periodontal structural similarity: a perpendicular/oblique orientation of the fibrous connective PDL cells/tissues to the dentin surface, and mineralized tissue formation without any mineralized tissue forming in the PDL interface of the hybrid scaffold at both 3 and 6 weeks. This dissertation provides potential for functional restoration of tissue interface neogenesis applications and shows promise for both pre-clinical and clinical applications in translational regenerative medicine.
    Ph.D., Biomedical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/78755/1/chanho_1.pd
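    For readers unfamiliar with the reproducibility metrics quoted above, the following sketch computes a coefficient of variation and a one-way random-effects ICC(1,1) on invented repeated measurements; it is a worked example of the standard formulas, not the analysis pipeline used in the dissertation.

```python
# Worked example of the reproducibility metrics: coefficient of variation (CV)
# for repeated measurements and a one-way random-effects intraclass correlation
# coefficient, ICC(1,1) = (MSB - MSW) / (MSB + (k - 1) * MSW).
import numpy as np

def cv_percent(values: np.ndarray) -> float:
    """CV = sample standard deviation / mean, in percent."""
    return 100.0 * values.std(ddof=1) / values.mean()

def icc_1_1(data: np.ndarray) -> float:
    """data: (subjects, repeated measurements per subject)."""
    n, k = data.shape
    grand = data.mean()
    msb = k * ((data.mean(axis=1) - grand) ** 2).sum() / (n - 1)                   # between-subject mean square
    msw = ((data - data.mean(axis=1, keepdims=True)) ** 2).sum() / (n * (k - 1))   # within-subject mean square
    return (msb - msw) / (msb + (k - 1) * msw)

# Toy example: 4 specimens, each measured twice by the same examiner.
m = np.array([[1.02, 1.01], [0.85, 0.86], [1.20, 1.19], [0.95, 0.95]])
print(f"CV of first specimen = {cv_percent(m[0]):.2f}%  ICC(1,1) = {icc_1_1(m):.3f}")
```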