44 research outputs found
Contribution to the Federation of the asynchronous SmartSantander service layer within the European Fed4FIRE context
This thesis is a contribution to the federation of the asynchronous SmartSantander service layer within the European Fed4FIRE context. The thesis was developed against a Smart City background. Its main aims were to gain knowledge of how Smart Cities, Testbeds and Federations of Testbeds are structured by working on a real deployed system, i.e. the SmartSantander framework and the Fed4FIRE federation, and to contribute with some of the components required for the integratio
Resolving horizontal partitioning and schematic variances using metadatabase approach.
by Poon, Koon-hei.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2000.
Includes bibliographical references (leaves 80-83).
Abstracts in English and Chinese.
Chapter 1. Introduction
Chapter 2. Literature Review
2.1. Background
2.2. Example systems
2.2.1. Multibase
2.2.2. Mermaid
2.2.3. The Metadatabase Approach
2.3. Summary
Chapter 3. The Metadatabase Approach
3.1. Two-Stage Entity Relationship (TSER) model
3.2. The GIRD
3.3. The Metadatabase system in action
3.3. Global query formulations and processing in the metadatabase system
Chapter 4. Problem Outlines for Horizontal Partitioning and Its Variants
4.1. Horizontal partitioning
4.2. Level of abstraction
4.3. Schematic variances
4.4. Summary
4.5. The Scenario
4.6. Populating the Metadatabase
Chapter 5. The Enhancements for Global Query with Horizontal Partitioned Data Objects
5.1. Identifying partitioned data objects
5.2. Additional metadata for the horizontal partitioned data objects
5.3. Complications of horizontal partitioning problem
5.3.1. Level of abstraction
5.3.2. Schematic variances
5.4. Global query with horizontal partitioning data objects
5.5. Housing the new metadata
5.6. Example
Chapter 6. Analysis
Chapter 7. Conclusion and Future Works
References
Appendices
A. GIRD Definitions
A1. GIRD Model
A2. GIRD/SER Contents
A3. GIRD/OER Constructs
A4. Definition of Meta-attributes
B. Problems Representations in Relation Algebra
B1. Horizontal problem
B2. Level of abstraction
B3. Schematic Variance
C. Details of local systems
Enabling Complex Semantic Queries to Bioinformatics Databases through Intuitive Search Over Data
Data integration promises to be one of the main catalysts in enabling new insights to be drawn from the wealth of biological data already available publicly. However, the heterogeneity of the existing data sources still poses significant challenges for achieving interoperability among biological databases. Furthermore, merely solving the technical challenges of data integration, for example through the use of common data representation formats, leaves open the larger problem: the steep learning curve required for understanding the data models of each public source, as well as the technical language through which the sources can be queried and joined. As a consequence, most of the available biological data remain practically unexplored today.
In this thesis, we address these problems jointly, by first introducing an ontology-based data integration solution in order to mitigate the data source heterogeneity problem. We illustrate through the concrete example of Bgee, a gene expression data source, how relational databases can be exposed as virtual Resource Description Framework (RDF) graphs, through relational-to-RDF mappings. This has the important advantage that the original data source can remain unmodified, while still becoming interoperable with external RDF sources.
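The idea of exposing a relational table as a virtual RDF graph can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `expression` table, its columns, the `http://example.org/` namespace and the `MAPPING` structure are hypothetical and do not reflect the actual Bgee schema or the mapping language used in the thesis. Triples are computed from rows on demand, so the underlying table is left unmodified.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace, not from the thesis

# Hypothetical mapping: one table, a subject IRI template, and one
# predicate IRI per mapped column.
MAPPING = {
    "table": "expression",
    "subject": BASE + "expression/{id}",
    "predicates": {
        "gene": BASE + "vocab/gene",
        "tissue": BASE + "vocab/expressedIn",
    },
}


def virtual_triples(conn, mapping):
    """Yield (subject, predicate, object) triples computed from rows on demand."""
    columns = ["id"] + list(mapping["predicates"])
    query = f"SELECT {', '.join(columns)} FROM {mapping['table']} ORDER BY id"
    for row in conn.execute(query):
        record = dict(zip(columns, row))
        subject = mapping["subject"].format(id=record["id"])
        for column, predicate in mapping["predicates"].items():
            yield (subject, predicate, record[column])


def build_demo_db():
    """Create a small in-memory table standing in for a relational source."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE expression (id INTEGER PRIMARY KEY, gene TEXT, tissue TEXT)"
    )
    conn.executemany(
        "INSERT INTO expression VALUES (?, ?, ?)",
        [(1, "geneA", "brain"), (2, "geneB", "liver")],
    )
    return conn


conn = build_demo_db()
triples = list(virtual_triples(conn, MAPPING))
for triple in triples:
    print(triple)
```

In practice such mappings are usually written in a dedicated mapping language and evaluated by a query-rewriting engine; the generator above only shows the row-to-triple correspondence.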
We complement our methods with applied case studies designed to guide domain experts in formulating expressive federated queries targeting the integrated data across the domains of evolutionary relationships and gene expression. More precisely, we introduce two comparative analyses, first within the same domain (using orthology data from multiple, interoperable, data sources) and second across domains, in order to study the relation between expression change and evolution rate following a duplication event.
Finally, in order to bridge the semantic gap between users and data, we design and implement Bio-SODA, a question answering system over domain knowledge graphs, that does not require training data for translating user questions to SPARQL. Bio-SODA uses a novel ranking approach that combines syntactic and semantic similarity, while also incorporating node centrality metrics to rank candidate matches for a given user question. Our results in testing Bio-SODA across several real-world databases that span multiple domains (both within and outside bioinformatics) show that it can answer complex, multi-fact queries, beyond the current state-of-the-art in the more well-studied open-domain question answering.
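A ranking that combines syntactic similarity, semantic similarity and node centrality could look roughly like the sketch below. The abstract does not give the actual scoring function, so every detail here is an assumption: the weighted sum, the weights, the use of `difflib` for string similarity, cosine similarity over toy vectors, normalized degree centrality, and the example graph and candidates are all hypothetical.

```python
from difflib import SequenceMatcher
from math import sqrt


def syntactic_similarity(a, b):
    """String similarity in [0, 1] based on matching character blocks."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def degree_centrality(edges):
    """Degree of each node, normalized by the highest degree in the graph."""
    degree = {}
    for source, target in edges:
        degree[source] = degree.get(source, 0) + 1
        degree[target] = degree.get(target, 0) + 1
    top = max(degree.values())
    return {node: d / top for node, d in degree.items()}


def rank_candidates(term, term_vec, candidates, centrality,
                    weights=(0.4, 0.4, 0.2)):
    """Order candidate nodes by a weighted sum of the three signals.

    The weights are illustrative assumptions, not values from the thesis.
    """
    w_syn, w_sem, w_cen = weights
    scored = []
    for node, (label, vec) in candidates.items():
        score = (w_syn * syntactic_similarity(term, label)
                 + w_sem * cosine(term_vec, vec)
                 + w_cen * centrality.get(node, 0.0))
        scored.append((score, node))
    return [node for _, node in sorted(scored, reverse=True)]


# Toy knowledge graph and candidates (hypothetical labels and vectors).
edges = [("Gene", "Expression"), ("Gene", "Protein"),
         ("Gene", "Species"), ("GeneOntologyTerm", "Protein")]
candidates = {
    "Gene": ("gene", [1.0, 0.0]),
    "GeneOntologyTerm": ("gene ontology term", [0.6, 0.8]),
}
ranking = rank_candidates("gene", [1.0, 0.0], candidates,
                          degree_centrality(edges))
print(ranking)
```

With this toy input the well-connected `Gene` node is ranked first, which is the intuition behind adding a centrality signal: among similar labels, nodes that are more central in the graph are treated as more likely matches.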
Information retrieval and text mining technologies for chemistry
Efficient access to chemical information contained in scientific literature, patents, technical reports, or the web is a pressing need shared by researchers and patent attorneys from different chemical disciplines. Retrieval of important chemical information in most cases starts with finding relevant documents for a particular chemical compound or family. Targeted retrieval of chemical documents is closely connected to the automatic recognition of chemical entities in the text, which commonly involves the extraction of the entire list of chemicals mentioned in a document, including any associated information. In this Review, we provide a comprehensive and in-depth description of fundamental concepts, technical implementations, and current technologies for meeting these information demands. A strong focus is placed on community challenges addressing systems performance, more particularly CHEMDNER and CHEMDNER patents tasks of BioCreative IV and V, respectively. Considering the growing interest in the construction of automatically annotated chemical knowledge bases that integrate chemical information and biological data, cheminformatics approaches for mapping the extracted chemical names into chemical structures and their subsequent annotation together with text mining applications for linking chemistry with biological information are also presented. Finally, future trends and current challenges are highlighted as a roadmap proposal for research in this emerging field.
A.V. and M.K. acknowledge funding from the European Community's Horizon 2020 Program (project reference: 654021 - OpenMinted). M.K. additionally acknowledges the Encomienda MINETAD-CNIO as part of the Plan for the Advancement of Language Technology. O.R. and J.O. thank the Foundation for Applied Medical Research (FIMA), University of Navarra (Pamplona, Spain). This work was partially funded by Consellería de Cultura, Educación e Ordenación Universitaria (Xunta de Galicia), and FEDER (European Union), and the Portuguese Foundation for Science and Technology (FCT) under the scope of the strategic funding of UID/BIO/04469/2013 unit and COMPETE 2020 (POCI-01-0145-FEDER-006684). We thank Iñigo García-Yoldi for useful feedback and discussions during the preparation of the manuscript.
Realizing interoperability of e-learning repositories
Unpublished doctoral thesis. Universidad Autónoma de Madrid, Escuela Politécnica Superior, March 200
The Nexus Between Security Sector Governance/Reform and Sustainable Development Goal-16
This Security Sector Reform (SSR) Paper offers a universal and analytical perspective on the linkages between Security Sector Governance (SSG)/SSR (SSG/R) and Sustainable Development Goal-16 (SDG-16), focusing on conflict and post-conflict settings as well as transitional and consolidated democracies. Against the background of development and security literatures traditionally maintaining a separate and compartmentalized presence in both academic and policymaking circles, it maintains that the contemporary security- and development-related challenges are inextricably linked, requiring effective measures with an accurate understanding of the nature of these challenges. In that sense, SDG-16 is surely a good step in the right direction. After comparing and contrasting SSG/R and SDG-16, this SSR Paper argues that human security lies at the heart of the nexus between the 2030 Agenda of the United Nations (UN) and SSG/R. To do so, it first provides a brief overview of the scholarly and policymaking literature on the development-security nexus to set the background for the adoption of The Agenda 2030. Next, it reviews the literature on SSG/R and SDGs, and how each concept evolved over time. It then identifies the puzzle this study seeks to address by comparing and contrasting SSG/R with SDG-16. After making a case that human security lies at the heart of the nexus between the UN's 2030 Agenda and SSG/R, this book analyses the strengths and weaknesses of human security as a bridge between SSG/R and SDG-16 and makes policy recommendations on how SSG/R, bolstered by human security, may help achieve better results on the SDG-16 targets. It specifically emphasizes the importance of transparency, oversight, and accountability on the one hand, and a participative approach and local ownership on the other. It concludes by arguing that a simultaneous emphasis on security and development is sorely needed for addressing the issues under the purview of SDG-16
Big Data in Bioeconomy
This edited open access book presents the comprehensive outcome of The European DataBio Project, which examined new data-driven methods to shape a bioeconomy. These methods are used to develop new and sustainable ways to use forest, farm and fishery resources. As a European initiative, the goal is to use these new findings to support decision-makers and producers, meaning farmers, land and forest owners and fishermen. With their 27 pilot projects from 17 countries, the authors examine important sectors and highlight examples where modern data-driven methods were used to increase sustainability. How can farmers, foresters or fishermen use these insights in their daily lives? The authors answer this and other questions for our readers. The first four parts of this book give an overview of the big data technologies relevant for optimal raw material gathering. The next three parts put these technologies into perspective, by showing useable applications from farming, forestry and fishery. The final part of this book gives a summary and a view on the future. With its broad outlook and variety of topics, this book is an enrichment for students and scientists in bioeconomy, biodiversity and renewable resources
Steps towards interoperability in healthcare environment
Doctoral thesis - Doctoral Program in Biomedical Engineering, Medical Informatics
Healthcare units have complex Information Systems (IS) made up of heterogeneous data sources, which speak different languages and have different objectives. Nevertheless, all these sources hold important information that can contribute actively to providing a healthcare system of excellence. The evolution observed in Health IS has promoted the development of new methodologies and tools intended to solve this complicated problem. In this context, one of the main paradigms that arises is interoperability among systems and its capability to allow general and simplified access to relevant information. Another aspect that should be kept in mind, given the constraints of the global economic situation, is the reduction in investment in national healthcare systems. This thesis is based on a set of studies performed at the Centro Hospitalar do Tâmega e Sousa (CHTS), whose main goals are to improve the patient-hospital relationship, taking into consideration the reduction of implementation costs while preserving the quality of information. This information should be accessible anywhere and at any time to help with clinical decisions and, in the future, be available for clinical studies through computationally interpretable data. To do so, an Electronic Semantic Health Record was formalized and implemented with the help of the clinical staff; it collects all the information considered important and relevant. This Health Record was delivered through a platform for the distribution and archive of clinical information, named Agency for the Integration, Diffusion and Archive (AIDA), which is supported by intelligent agents that treat data in an exhaustive and structured way. To test the proposed model and system, and to strengthen the relationship between the patient and the hospital, an appointment alert system based on SMS and electronic mail was developed. It reduced unplanned missed appointments and decreased costs by allowing appointment schedules to be redistributed and human resources and physical spaces to be allocated more effectively. Finally, to reduce system downtime and to promote users' confidence in Information Systems, an open-source tool was developed that enables the scheduling of preventive actions according to a mathematical model. These tools allowed for a continuous improvement of systems and are currently well accepted by clinicians and Information Technologies (IT) specialists inside the healthcare unit, proving in real clinical situations the effectiveness and usability of the model.