24 research outputs found
Aerosols Monitored by Satellite Remote Sensing
Aerosols, small particles suspended in the atmosphere, affect the air quality and climate change. Their distributions can be monitored by satellite remote sensing. Many images of aerosol properties are available from websites as the by-products of the atmospheric correction of the satellite data. Their qualities depend on the accuracy of the atmospheric correction algorithms. The approaches of the atmospheric correction for land and ocean are different due to the large difference of the ground reflectance between land and ocean. A unified atmospheric correction (UAC) approach is developed to improve the accuracy of aerosol products over land, similar to those over ocean. This approach is developed to estimate the aerosol scattering reflectance from satellite data based on a lookup table (LUT) of in situ measured ground reflectance. The results show that the aerosol scattering reflectance can be completely separated from the satellite measured radiance over turbid waters and lands. The accuracy is validated with the mean relative errors of 22.1%. The vertical structures of the aerosols provide a new insight into the role of aerosols in regulating Earth\u27s weather, climate, and air quality
DTIPrep: quality control of diffusion-weighted images
pre-printIn the last decade, diffusion MRI (dMRI) studies of the human and animal brain have been used to investigate a multitude of pathologies and drug-related effects in neuroscience research. Study after study identifies white matter (WM) degeneration as a crucial biomaker for all these diseases. The tool of choice for studying WM is dMRI however, dMRI has inherently low signal-to-noise ratio and its acquisition requires a relatively long scan time; in fact, the high loads required occasionally stress scanner hardware past the point of physical failure
Completing and Debugging Ontologies: state of the art and challenges
As semantically-enabled applications require high-quality ontologies,
developing and maintaining ontologies that are as correct and complete as
possible is an important although difficult task in ontology engineering. A key
step is ontology debugging and completion. In general, there are two steps:
detecting defects and repairing defects. In this paper we discuss the state of
the art regarding the repairing step. We do this by formalizing the repairing
step as an abduction problem and situating the state of the art with respect to
this framework. We show that there are still many open research problems and
show opportunities for further work and advancing the field.Comment: 56 page
Algorithmes passant à l’échelle pour la gestion de données du Web sémantique sur les platformes cloud
In order to build smart systems, where machines are able to reason exactly like humans, data with semantics is a major requirement. This need led to the advent of the Semantic Web, proposing standard ways for representing and querying data with semantics. RDF is the prevalent data model used to describe web resources, and SPARQL is the query language that allows expressing queries over RDF data. Being able to store and query data with semantics triggered the development of many RDF data management systems. The rapid evolution of the Semantic Web provoked the shift from centralized data management systems to distributed ones. The first systems to appear relied on P2P and client-server architectures, while recently the focus moved to cloud computing.Cloud computing environments have strongly impacted research and development in distributed software platforms. Cloud providers offer distributed, shared-nothing infrastructures that may be used for data storage and processing. The main features of cloud computing involve scalability, fault-tolerance, and elastic allocation of computing and storage resources following the needs of the users.This thesis investigates the design and implementation of scalable algorithms and systems for cloud-based Semantic Web data management. In particular, we study the performance and cost of exploiting commercial cloud infrastructures to build Semantic Web data repositories, and the optimization of SPARQL queries for massively parallel frameworks.First, we introduce the basic concepts around Semantic Web and the main components and frameworks interacting in massively parallel cloud-based systems. In addition, we provide an extended overview of existing RDF data management systems in the centralized and distributed settings, emphasizing on the critical concepts of storage, indexing, query optimization, and infrastructure. Second, we present AMADA, an architecture for RDF data management using public cloud infrastructures. We follow the Software as a Service (SaaS) model, where the complete platform is running in the cloud and appropriate APIs are provided to the end-users for storing and retrieving RDF data. We explore various storage and querying strategies revealing pros and cons with respect to performance and also to monetary cost, which is a important new dimension to consider in public cloud services. Finally, we present CliqueSquare, a distributed RDF data management system built on top of Hadoop, incorporating a novel optimization algorithm that is able to produce massively parallel plans for SPARQL queries. We present a family of optimization algorithms, relying on n-ary (star) equality joins to build flat plans, and compare their ability to find the flattest possibles. Inspired by existing partitioning and indexing techniques we present a generic storage strategy suitable for storing RDF data in HDFS (Hadoop’s Distributed File System). Our experimental results validate the efficiency and effectiveness of the optimization algorithm demonstrating also the overall performance of the system.Afin de construire des systèmes intelligents, où les machines sont capables de raisonner exactement comme les humains, les données avec sémantique sont une exigence majeure. Ce besoin a conduit à l’apparition du Web sémantique, qui propose des technologies standards pour représenter et interroger les données avec sémantique. RDF est le modèle répandu destiné à décrire de façon formelle les ressources Web, et SPARQL est le langage de requête qui permet de rechercher, d’ajouter, de modifier ou de supprimer des données RDF. Être capable de stocker et de rechercher des données avec sémantique a engendré le développement des nombreux systèmes de gestion des données RDF.L’évolution rapide du Web sémantique a provoqué le passage de systèmes de gestion des données centralisées à ceux distribués. Les premiers systèmes étaient fondés sur les architectures pair-à-pair et client-serveur, alors que récemment l’attention se porte sur le cloud computing.Les environnements de cloud computing ont fortement impacté la recherche et développement dans les systèmes distribués. Les fournisseurs de cloud offrent des infrastructures distribuées autonomes pouvant être utilisées pour le stockage et le traitement des données. Les principales caractéristiques du cloud computing impliquent l’évolutivité́, la tolérance aux pannes et l’allocation élastique des ressources informatiques et de stockage en fonction des besoins des utilisateurs.Cette thèse étudie la conception et la mise en œuvre d’algorithmes et de systèmes passant à l’échelle pour la gestion des données du Web sémantique sur des platformes cloud. Plus particulièrement, nous étudions la performance et le coût d’exploitation des services de cloud computing pour construire des entrepôts de données du Web sémantique, ainsi que l’optimisation de requêtes SPARQL pour les cadres massivement parallèles.Tout d’abord, nous introduisons les concepts de base concernant le Web sémantique et les principaux composants des systèmes fondés sur le cloud. En outre, nous présentons un aperçu des systèmes de gestion des données RDF (centralisés et distribués), en mettant l’accent sur les concepts critiques de stockage, d’indexation, d’optimisation des requêtes et d’infrastructure.Ensuite, nous présentons AMADA, une architecture de gestion de données RDF utilisant les infrastructures de cloud public. Nous adoptons le modèle de logiciel en tant que service (software as a service - SaaS), où la plateforme réside dans le cloud et des APIs appropriées sont mises à disposition des utilisateurs, afin qu’ils soient capables de stocker et de récupérer des données RDF. Nous explorons diverses stratégies de stockage et d’interrogation, et nous étudions leurs avantages et inconvénients au regard de la performance et du coût monétaire, qui est une nouvelle dimension importante à considérer dans les services de cloud public.Enfin, nous présentons CliqueSquare, un système distribué de gestion des données RDF basé sur Hadoop. CliqueSquare intègre un nouvel algorithme d’optimisation qui est capable de produire des plans massivement parallèles pour des requêtes SPARQL. Nous présentons une famille d’algorithmes d’optimisation, s’appuyant sur les équijointures n- aires pour générer des plans plats, et nous comparons leur capacité à trouver les plans les plus plats possibles. Inspirés par des techniques de partitionnement et d’indexation existantes, nous présentons une stratégie de stockage générique appropriée au stockage de données RDF dans HDFS (Hadoop Distributed File System). Nos résultats expérimentaux valident l’effectivité et l’efficacité de l’algorithme d’optimisation démontrant également la performance globale du système
Adapting IT Governance Frameworks using Domain Specific Requirements Methods: Examples from Small & Medium Enterprises and Emergency Management
IT Governance methods and frameworks have been applied in most large for-profit organizations since these enterprisesrealize the benefits of IT Governance for their business. However, former research and our own surveys show thatframeworks such as ITIL and COBIT are not very well established in Small and Medium Enterprises (SME) as well as inEmergency Management (EM) organizations. Thus, we investigated what kind of barriers can be the cause for the lowadoption rate. These results built the basis for our Domain Specific Engineering (DSE) approach. The research is based onthe data of two research projects. The first project investigated the utilization of ITSM methods in European SMEs, and thesecond has researched different emergency management organizations. This paper defines similarities and differences of thetwo domain specific solutions, describes the engineering approach, and gives guidelines for further research in otherdomains
AspectMMKG: A Multi-modal Knowledge Graph with Aspect-aware Entities
Multi-modal knowledge graphs (MMKGs) combine different modal data (e.g., text
and image) for a comprehensive understanding of entities. Despite the recent
progress of large-scale MMKGs, existing MMKGs neglect the multi-aspect nature
of entities, limiting the ability to comprehend entities from various
perspectives. In this paper, we construct AspectMMKG, the first MMKG with
aspect-related images by matching images to different entity aspects.
Specifically, we collect aspect-related images from a knowledge base, and
further extract aspect-related sentences from the knowledge base as queries to
retrieve a large number of aspect-related images via an online image search
engine. Finally, AspectMMKG contains 2,380 entities, 18,139 entity aspects, and
645,383 aspect-related images. We demonstrate the usability of AspectMMKG in
entity aspect linking (EAL) downstream task and show that previous EAL models
achieve a new state-of-the-art performance with the help of AspectMMKG. To
facilitate the research on aspect-related MMKG, we further propose an
aspect-related image retrieval (AIR) model, that aims to correct and expand
aspect-related images in AspectMMKG. We train an AIR model to learn the
relationship between entity image and entity aspect-related images by
incorporating entity image, aspect, and aspect image information. Experimental
results indicate that the AIR model could retrieve suitable images for a given
entity w.r.t different aspects.Comment: Accepted by CIKM 202
Recommended from our members
Results of the ontology alignment evaluation initiative 2019
The Ontology Alignment Evaluation Initiative (OAEI) aims at comparing ontology matching systems on precisely defined test cases. These test cases can be based on ontologies of different levels of complexity (from simple thesauri to expressive OWL ontologies) and use different evaluation modalities (e.g., blind evaluation, open evaluation, or consensus). The OAEI 2019 campaign offered 11 tracks with 29 test cases, and was attended by 20 participants. This paper is an overall presentation of that campaign
Breaking rules: taking Complex Ontology Alignment beyond rulebased approaches
Tese de mestrado, Ciência de Dados, Universidade de Lisboa, Faculdade de Ciências, 2021As ontologies are developed in an uncoordinated manner, differences in scope and design compromise interoperability. Ontology matching is critical to address this semantic heterogeneity problem, as it finds correspondences that enable integrating data across the Semantic Web. One of the biggest challenges in this field is that ontology schemas often differ conceptually, and therefore reconciling many real¬world ontology pairs (e.g., in geography or biomedicine) involves establishing complex mappings that contain multiple entities from each ontology. Yet, for the most part, ontology matching algorithms are restricted to finding simple equivalence mappings between ontology entities. This work presents novel algorithms for Complex Ontology Alignment based on Association Rule Mining over a set of shared instances between two ontologies. Its strategy relies on a targeted search for known complex patterns in instance and schema data, reducing the search space. This allows the application of semantic¬based filtering algorithms tailored to each kind of pattern, to select and refine the most relevant mappings. The algorithms were evaluated in OAEI Complex track datasets under two automated approaches: OAEI’s entity¬based approach and a novel element¬overlap–based approach which was developed in the context of this work. The algorithms were able to find mappings spanning eight distinct complex patterns, as well as combinations of patterns through disjunction and conjunction. They were able to efficiently reduce the search space and showed competitive performance results comparing to the State of the Art of complex alignment systems. As for the comparative analysis of evaluation methodologies, the proposed element¬overlap–based evaluation strategy was shown to be more accurate and interpretable than the reference-based automatic alternative, although none of the existing strategies fully address the challenges discussed in the literature. For future work, it would be interesting to extend the algorithms to cover more complex patterns and combine them with lexical approaches