Towards Dynamic Composition of Question Answering Pipelines
Question answering (QA) over knowledge graphs has gained significant momentum over the past five years due to the increasing availability of large knowledge graphs and the rising importance of question answering for user interaction. DBpedia has been the most prominently used knowledge graph in this setting. QA systems implement a pipeline connecting a sequence of QA components for translating an input question into its corresponding formal query (e.g. SPARQL); this query is then executed over a knowledge graph in order to produce the answer to the question. Recent empirical studies have revealed that, albeit effective overall, the performance of QA systems and QA components depends heavily on the features of input questions, and not even the combination of the best-performing QA systems or individual QA components retrieves complete and correct answers. Furthermore, these QA systems cannot be easily reused or extended, and their results cannot be easily reproduced, since the systems are mostly implemented in a monolithic fashion, lack standardised interfaces, and are often not open source or available as Web services. These drawbacks of the state of the art prevent many of these approaches from being employed in real-world applications. In this thesis, we tackle the problem of QA over knowledge graphs and propose a generic approach to promote reusability and build question answering systems in a collaborative effort. Firstly, we define the qa vocabulary and the Qanary methodology to provide an abstraction layer over existing QA systems and components. Qanary relies on the qa vocabulary to establish guidelines for semantically describing the knowledge exchanged between the components of a QA system. We implement a component-based modular framework called "Qanary Ecosystem" utilising the Qanary methodology to integrate several heterogeneous QA components in a single platform.
We further present the Qaestro framework, which provides an approach to semantically describing question answering components and effectively enumerates QA pipelines based on a QA developer's requirements. Qaestro provides all valid combinations of the available QA components, respecting the input-output requirements of each component, to build QA pipelines. Finally, we address the scalability of QA components within a framework and propose a novel approach that chooses the best component per task to automatically build a QA pipeline for each input question. We implement this model within FRANKENSTEIN, a framework able to select QA components and compose pipelines. FRANKENSTEIN extends the Qanary Ecosystem and utilises the qa vocabulary for data exchange. It has 29 independent QA components implementing five QA tasks, resulting in 360 unique QA pipelines. Each approach proposed in this thesis (the Qanary methodology, Qaestro, and FRANKENSTEIN) is supported by an extensive evaluation demonstrating its effectiveness. Our contributions target the broader research agenda of offering the QA community an efficient way of applying their research to a field that is driven by many different disciplines and consequently requires a collaborative approach to achieve significant progress in the domain of question answering.
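The pipeline enumeration Qaestro performs can be sketched as a check of input-output compatibility over a component registry; the component names, tasks, and annotation types below are hypothetical placeholders for illustration, not the actual Qaestro descriptions.

```python
from itertools import product

# Hypothetical component registry: each component declares the QA task it
# implements plus its required input and produced output annotation type.
COMPONENTS = {
    "NER": [{"name": "EntityTaggerA", "in": "question", "out": "spots"},
            {"name": "EntityTaggerB", "in": "question", "out": "spots"}],
    "NED": [{"name": "DisambiguatorA", "in": "spots", "out": "entities"}],
    "QB":  [{"name": "QueryBuilderA", "in": "entities", "out": "sparql"}],
}

def enumerate_pipelines(task_order):
    """Yield every component combination whose input/output types chain up."""
    for combo in product(*(COMPONENTS[t] for t in task_order)):
        compatible = all(combo[i]["out"] == combo[i + 1]["in"]
                         for i in range(len(combo) - 1))
        if compatible:
            yield [c["name"] for c in combo]

pipelines = list(enumerate_pipelines(["NER", "NED", "QB"]))
# Two valid pipelines result: one per entity tagger.
```

Each additional interchangeable component per task multiplies the number of candidate pipelines, which is why the thesis reports 360 unique pipelines from 29 components over five tasks.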
A semantic and agent-based approach to support information retrieval, interoperability and multi-lateral viewpoints for heterogeneous environmental databases
Data stored in individual autonomous databases often needs to be combined and
interrelated. For example, in the Inland Water (IW) environment monitoring domain,
the spatial and temporal variation of measurements of different water quality indicators
stored in different databases are of interest. Data from multiple data sources is more
complex to combine when there is a lack of metadata in a computational form and when
the syntax and semantics of the stored data models are heterogeneous. The main types
of information retrieval (IR) requirements are query transparency and data
harmonisation for data interoperability and support for multiple user views. A
combined Semantic Web based and Agent based distributed system framework has
been developed to support the above IR requirements. It has been implemented using
the Jena ontology and JADE agent toolkits. The semantic part supports the
interoperability of autonomous data sources by merging their intensional data, using a
Global-As-View or GAV approach, into a global semantic model, represented in
DAML+OIL and in OWL. This is used to mediate between different local database
views. The agent part provides the semantic services to import, align and parse
semantic metadata instances, to support data mediation and to reason about data
mappings during alignment. The framework has been applied to support information
retrieval, interoperability and multi-lateral viewpoints for four European environmental
agency databases.
An extended GAV approach has been developed and applied to handle queries that can
be reformulated over multiple user views of the stored data. This allows users to
retrieve data in a conceptualisation that is better suited to them rather than to have to
understand the entire detailed global view conceptualisation. User viewpoints are
derived from the global ontology or existing viewpoints of it. This has the advantage
that it reduces the number of potential conceptualisations and their associated
mappings to be more computationally manageable. Whereas an ad hoc framework
based upon a conventional distributed programming language and a rule framework
could be used to support user views and adaptation to user views, a more formal
framework has the benefit that it can support reasoning about the consistency,
equivalence, containment and conflict resolution when traversing data models. A
preliminary formulation of the formal model has been undertaken and is based upon
extending a Datalog type algebra with hierarchical, attribute and instance value
operators. These operators can be applied to support compositional mapping and
consistency checking of data views. The multiple viewpoint system was implemented
as a Java-based application consisting of two sub-systems, one for viewpoint
adaptation and management, the other for query processing and query result
adjustment.
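The extended GAV query reformulation described above can be sketched as query unfolding over view-to-global mappings; the view terms and global-model expressions below are invented for illustration and do not come from the actual framework or the European agency databases.

```python
# Hypothetical GAV-style mappings: each term of a user viewpoint is defined
# as an expression over the global ontology (Global-As-View), so a query
# posed in view terms can be unfolded into global-model terms.
VIEW_TO_GLOBAL = {
    "river_station": "MonitoringStation[medium=river]",
    "nitrate_level": "Measurement[determinand=NO3]",
}

def reformulate(view_query_terms):
    """Unfold a user-view query into global-model terms (GAV unfolding)."""
    missing = [t for t in view_query_terms if t not in VIEW_TO_GLOBAL]
    if missing:
        # A formal framework would instead reason about containment/conflict.
        raise KeyError(f"no mapping for view terms: {missing}")
    return [VIEW_TO_GLOBAL[t] for t in view_query_terms]
```

Because each viewpoint is derived from the global ontology (or from an existing viewpoint), the unfolded query can then be answered by the mediation layer over the local database views.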
Why reinvent the wheel: Let's build question answering systems together
Modern question answering (QA) systems need to flexibly integrate a number of components specialised to fulfil specific tasks in a QA pipeline. Key QA tasks include Named Entity Recognition and Disambiguation, Relation Extraction, and Query Building. Since a number of different software components exist that implement different strategies for each of these tasks, it is a major challenge to select and combine the most suitable components into a QA system, given the characteristics of a question. We study this optimisation problem and train classifiers, which take features of a question as input and have the goal of optimising the selection of QA components based on those features. We then devise a greedy algorithm to identify the pipelines that include the suitable components and can effectively answer the given question. We implement this model within Frankenstein, a QA framework able to select QA components and compose QA pipelines. We evaluate the effectiveness of the pipelines generated by Frankenstein using the QALD and LC-QuAD benchmarks. These results suggest that Frankenstein not only precisely solves the QA optimisation problem but also enables the automatic composition of optimised QA pipelines, which outperform the static baseline QA pipeline. Thanks to this flexible and fully automated pipeline generation process, new QA components can be easily included in Frankenstein, thus improving the performance of the generated pipelines.
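The per-task selection step can be sketched as a greedy argmax over the scores the trained classifiers predict for each component on a given question; the component names and scores below are hypothetical stand-ins for those classifier outputs, not Frankenstein's actual components.

```python
# Hypothetical per-component performance predictions, as trained classifiers
# might emit for the features of one input question.
PREDICTED_F1 = {
    "NER": {"TaggerA": 0.71, "TaggerB": 0.64},
    "NED": {"LinkerA": 0.58, "LinkerB": 0.69},
    "QB":  {"BuilderA": 0.66},
}

def greedy_pipeline(predictions):
    """Greedily pick, per task, the component with the highest predicted score."""
    return {task: max(scores, key=scores.get)
            for task, scores in predictions.items()}

best = greedy_pipeline(PREDICTED_F1)
# {'NER': 'TaggerA', 'NED': 'LinkerB', 'QB': 'BuilderA'}
```

Because the predictions are recomputed from each question's features, different questions can yield different pipelines, which is what lets the composed pipelines outperform a single static baseline.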
Coordination of DWH Long-Term Data Management: The Path Forward Workshop Report
Following the 2010 DWH Oil Spill, a vast amount of environmental data was collected (e.g., 100,000+ environmental samples, 15 million+ publicly available records). The volume of data collected introduced a number of challenges, including data quality assurance, data storage, data integration, and the long-term preservation and availability of the data. An effort to tackle these challenges began in June 2014, at a workshop focused on environmental disaster data management (EDDM) with respect to response and subsequent restoration. The previous EDDM collaboration improved communication and collaboration among a range of government, industry and NGO entities involved in disaster management. In June 2017, the first DWH Long-Term Data Management (LTDM) workshop focused on reviewing existing data management systems, opportunities to advance the integration of these systems, the availability of data for restoration planning, project implementation and restoration monitoring efforts, and providing a platform for increased communication among the various GOM data entities. The June 2017 workshop resulted in the formation of three working groups: Data Management Standards, Interoperability, and Discovery/Searchability. These working groups spent 2018 coordinating and addressing various complex topics related to DWH LTDM. On December 4th and 5th, 2018, the Coastal Response Research Center (CRRC), NOAA Office of Response and Restoration (ORR) and NOAA National Marine Fisheries Service (NMFS) Restoration Center (RC) co-sponsored a workshop entitled Deepwater Horizon Oil Spill (DWH) Long-Term Data Management (LTDM): The Path Forward at the NOAA Gulf of Mexico (GOM) Disaster Response Center (DRC) in Mobile, AL.
Implementing OBDA for an end-user query answering service on an educational ontology
In an age where the productivity of society is no longer defined by the amount of
information generated, but by the quality and assertiveness that a set of data may
potentially hold, asking the right questions depends on the semantic-awareness
capability that an information system can evolve into. To address this challenge,
exhaustive research has been done over the last decade in the Ontology Based Data
Access (OBDA) paradigm.
A conspectus of the most promising technologies with data integration capabilities,
and the foundations on which they rely, is documented in this report as a point of
reference for choosing tools that support the incorporation of a conceptual model
under an OBDA method. The present study provides a practical approach for
implementing an ontology-based data access service for educational-context users of
a Learning Analytics initiative, by allowing them to formulate intuitive enquiries
with familiar domain terminology on top of a Learning Management System. The
ontology used was completely transformed to semantic linked data standards, and some
data mappings for testing were included. The Semantic Linked Data technologies
exposed in this document may modernise environments in which the object-oriented and
relational paradigms propagate heterogeneous and contradictory requirements.
Finally, to validate the implementation, a set of queries was constructed emulating
the most relevant dynamics of the model with regard to the nature of the dataset.
Ontologies for the Interoperability of Heterogeneous Multi-Agent Systems in the scope of Energy and Power Systems
Thesis by compendium of publications. The electricity sector, traditionally run by
monopolies and powerful utility companies, has undergone significant changes in
recent decades. The most notable advances are the increased penetration of renewable
energy sources (RES) and distributed generation, which have led to the adoption of
the smart grid (SG) paradigm and the introduction of competitive approaches in
wholesale and some retail electricity markets (EMs). SGs quickly evolved from a
widely accepted concept into reality. The intermittency of renewable energy sources
and their large-scale integration pose new constraints and challenges that strongly
affect EM operations. The challenging environment of power and energy systems (PES)
reinforces the need to study, experiment with, and validate competitive, dynamic,
and complex operations and interactions. In this context, simulation, decision
support, and intelligent management tools become essential for studying the
different market mechanisms and the relationships between the actors involved. To
this end, the new generation of tools must be able to cope with the rapid evolution
of PES, providing participants with adequate means to adapt, addressing new models
and constraints and their complex relationship with technological and business
developments.
Multi-agent platforms are particularly well suited to analysing complex
interactions in dynamic systems such as PES, owing to their distributed and
independent nature. The decomposition of complex tasks into simple assignments, and
the easy inclusion of new data, business models, constraints, types of actors and
operators, and their interactions, are some of the main advantages of agent-based
approaches. In this domain, several modelling tools have emerged to simulate,
study, and solve problems in specific PES subdomains. However, there is a
widespread limitation: a significant lack of interoperability between heterogeneous
systems, which prevents the problem from being addressed globally, considering all
the relevant existing interrelations. This is essential for players to be able to
take full advantage of evolving opportunities. Therefore, achieving such a
comprehensive framework by leveraging existing tools that enable the study of
specific parts of the global problem requires interoperability between these
systems.
Ontologies facilitate interoperability between heterogeneous systems by giving
semantic meaning to the information exchanged between the various parties. Their
advantage lies in the fact that everyone involved in a particular domain knows,
understands, and agrees with the conceptualisation defined there. Several proposals
exist in the literature for the use of ontologies within PES, encouraging their
reuse and extension. However, most of these ontologies focus on a specific
application scenario or on a high-level abstraction of a PES subdomain. Moreover,
there is considerable heterogeneity among these models, which complicates their
integration and adoption. It is essential to develop ontologies that represent
different knowledge sources in order to facilitate interactions between entities of
different natures, promoting interoperability between heterogeneous agent-based
systems that can solve specific PES problems.
These gaps motivate the research work of this PhD, which arises to provide a
solution to the interoperability of heterogeneous systems within PES. The various
contributions of this work result in a society of multi-agent systems (MAS) for the
simulation, study, decision support, operation, and intelligent management of PES.
This MAS society addresses PES from the wholesale EM down to the SG and consumer
energy efficiency, leveraging existing simulation and decision-support tools,
complemented by recently developed ones, and ensuring interoperability between
them. It uses ontologies for knowledge representation in a common vocabulary, which
facilitates interoperability between the various systems. Furthermore, the use of
ontologies and semantic web technologies enables the development of model-agnostic
tools for flexible adaptation to new rules and constraints, promoting semantic
reasoning for context-aware systems.