Reasonable Goals
Assume that a number of autonomous agents are going to act in such a way that their respective goal states constitute a global plan. A main question that arises in this situation is whether such a plan exists at all, i.e. whether a solvable conflict prevails. In some sense, this means that the set of common goals is non-empty. Furthermore, if the agents are allowed to act in accordance with the result of some decision process, a situation may occur where subsets of their possible goal sets are consistent, but in actual fact the individual agents may nevertheless always terminate in states that are in conflict. We present a formal framework for the analysis of conflicts in sets of autonomous agents, restricted in the sense that they can be described in a (first-order) language and by a transaction mechanism. This is also enriched by processes for evaluating decision situations given imprecise background information. The agent specifications are analysed with respect to a concept of consistency that requires the formulae of one specification, together with a set of correspondence assertions, not to restrict the models of another specification, i.e. the agent system does not essentially restrict the individual agents. The main emphasis is on the specifications being compatible with respect to reasonably probable states, i.e. states that it is reasonable to assume will eventually be reached.
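The non-emptiness condition at the heart of this question can be sketched in a few lines (a deliberate simplification of the idea, not the paper's first-order framework; the agents and state names are invented for illustration):

```python
# Illustrative sketch: each agent's possible goal states are modelled as a
# set; a global plan can exist only if the set of common goals is non-empty.
# This ignores the decision processes and transaction mechanism the paper
# actually formalises.
def common_goals(goal_sets):
    """Intersection of all agents' goal-state sets."""
    return set.intersection(*goal_sets)

a1 = {"s1", "s2", "s3"}
a2 = {"s2", "s3", "s4"}
a3 = {"s3", "s5"}

# Non-empty intersection: a conflict-free global plan is at least possible.
print(common_goals([a1, a2, a3]))  # {'s3'}
```

Note that, as the abstract points out, a non-empty intersection is only necessary: the agents' actual decision processes may still drive them into mutually conflicting terminal states.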
A semantic and agent-based approach to support information retrieval, interoperability and multi-lateral viewpoints for heterogeneous environmental databases
Data stored in individual autonomous databases often needs to be combined and
interrelated. For example, in the Inland Water (IW) environment monitoring domain,
the spatial and temporal variation of measurements of different water quality indicators
stored in different databases is of interest. Data from multiple data sources is more
complex to combine when there is a lack of metadata in a computational form and when
the syntax and semantics of the stored data models are heterogeneous. The main types
of information retrieval (IR) requirements are query transparency and data
harmonisation for data interoperability and support for multiple user views. A
combined Semantic Web based and Agent based distributed system framework has
been developed to support the above IR requirements. It has been implemented using
the Jena ontology and JADE agent toolkits. The semantic part supports the
interoperability of autonomous data sources by merging their intensional data, using a
Global-As-View or GAV approach, into a global semantic model, represented in
DAML+OIL and in OWL. This is used to mediate between different local database
views. The agent part provides the semantic services to import, align and parse
semantic metadata instances, to support data mediation and to reason about data
mappings during alignment. The framework has been applied to support information
retrieval, interoperability and multi-lateral viewpoints for four European environmental
agency databases.
An extended GAV approach has been developed and applied to handle queries that can
be reformulated over multiple user views of the stored data. This allows users to
retrieve data in a conceptualisation that is better suited to them rather than to have to
understand the entire detailed global view conceptualisation. User viewpoints are
derived from the global ontology or existing viewpoints of it. This has the advantage
that it reduces the number of potential conceptualisations and their associated
mappings, making them more computationally manageable. Whereas an ad hoc framework
based upon a conventional distributed programming language and a rule framework
could be used to support user views and adaptation to them, a more formal
framework has the benefit that it can support reasoning about consistency,
equivalence, containment and conflict resolution when traversing data models. A
preliminary formulation of the formal model has been undertaken and is based upon
extending a Datalog type algebra with hierarchical, attribute and instance value
operators. These operators can be applied to support compositional mapping and
consistency checking of data views. The multiple viewpoint system was implemented
as a Java-based application consisting of two sub-systems, one for viewpoint
adaptation and management, the other for query processing and query result
adjustment.
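The Global-As-View idea described above, in which each global term is defined as a view over the local sources so that a global query is answered by unfolding those views, can be sketched roughly as follows (a hypothetical miniature, not the thesis implementation; the source names, attributes and rows are invented):

```python
# Hedged GAV sketch: two "autonomous" sources with heterogeneous schemas,
# and a global relation "measurement" defined as the union of per-source
# view queries. Querying the global relation unfolds the views, giving the
# user query transparency over the sources.
SOURCES = {
    "agency_a": [{"site": "r1", "indicator": "nitrate", "value": 2.4}],
    "agency_b": [{"station": "r9", "param": "nitrate", "reading": 3.1}],
}

GAV_VIEWS = {
    "measurement": [
        lambda db: [{"location": r["site"], "indicator": r["indicator"],
                     "value": r["value"]} for r in db["agency_a"]],
        lambda db: [{"location": r["station"], "indicator": r["param"],
                     "value": r["reading"]} for r in db["agency_b"]],
    ]
}

def query_global(relation, predicate):
    """Unfold the GAV view definitions, then filter the merged rows."""
    rows = [row for view in GAV_VIEWS[relation] for row in view(SOURCES)]
    return [row for row in rows if predicate(row)]

nitrate = query_global("measurement", lambda r: r["indicator"] == "nitrate")
```

The extended GAV approach in the thesis goes further, reformulating queries over derived user viewpoints rather than only over the single global model sketched here.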
Formal design of data warehouse and OLAP systems : a dissertation presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Information Systems at Massey University, Palmerston North, New Zealand
A data warehouse is a single data store where data from multiple data sources is integrated for online analytical processing (OLAP) across an entire organisation. The rationale for being single and integrated is to ensure a consistent view of organisational business performance, independent of the different angles of business perspectives. Due to its wide coverage of subjects, data warehouse design is a highly complex, lengthy and error-prone process. Furthermore, business analytical tasks change over time, which results in changing requirements for the OLAP systems. Thus, data warehouse and OLAP systems are rather dynamic, and the design process is continuous. In this thesis, we propose a method that is integrated, formal and application-tailored, in order to overcome the complexity problem, deal with the system dynamics, improve the quality of the system and increase the chance of success.
Our method comprises three important parts: the general ASM method with types, the application-tailored design framework for data warehouses and OLAP, and the schema integration method with a set of provably correct refinement rules.
By using the ASM method, we are able to model both data and operations in a uniform conceptual framework, which enables us to design an integrated approach to data warehouse and OLAP design. The freedom given by the ASM method allows us to model the system at an abstract level that is easy to understand for both users and designers. More specifically, the language allows us to use terms from the user domain, not biased by the terms used in computer systems. The pseudo-code-like transition rules, which give the simplest form of operational semantics in ASMs, are close enough to programming languages for designers to understand. Furthermore, these rules are rooted in mathematics, which assists in improving the quality of the system design.
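The parallel, pre-state-reading semantics of ASM transition rules mentioned above can be sketched as follows (a minimal illustration under invented names, not the thesis's typed ASM language):

```python
# Minimal ASM-style step: a rule inspects the current state and yields an
# update set, which is then applied in parallel -- every read in the rule
# sees the old state, never a partially updated one.
def step(state, rule):
    updates = rule(state)       # compute the update set from the pre-state
    new_state = dict(state)
    new_state.update(updates)   # apply all updates simultaneously
    return new_state

def swap_rule(state):
    # A "par" block: x := y, y := x -- both right-hand sides read the
    # pre-state, so the values really swap instead of collapsing.
    return {"x": state["y"], "y": state["x"]}

s = step({"x": 1, "y": 2}, swap_rule)
print(s)  # {'x': 2, 'y': 1}
```

The same simultaneous-update discipline is what lets ASM rules read like pseudo-code while still having a precise mathematical semantics.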
By extending the ASMs with types, the modelling language is tailored for data warehousing, with terms that are well developed for data-intensive applications; this makes it easy to model schema evolution as refinements in dynamic data warehouse design.
By providing the application-tailored design framework, we break down the design complexity by business processes (also called subjects in data warehousing) and by design concerns. By designing the data warehouse subject by subject, our method resembles Kimball's "bottom-up" approach. However, with the schema integration method, our method resolves the stovepipe issue of that approach. By building up a data warehouse iteratively in an integrated framework, our method not only results in an integrated data warehouse, but also resolves the issues of complexity and delayed ROI (Return On Investment) in Inmon's "top-down" approach. By dealing with user change requests in the same way as new subjects, and by modelling data and operations explicitly in a three-tier architecture, namely the data sources, the data warehouse and the OLAP (Online Analytical Processing) tier, our method facilitates dynamic design with system integrity.
By introducing a notion of refinement specific to schema evolution, namely schema refinement, which captures the notion of schema dominance in schema integration, we are able to build a set of correctness-proven refinement rules. By providing this set of refinement rules, we simplify the designers' work in verifying the correctness of a design. Nevertheless, we do not aim for a complete set, since there are many different ways to perform schema integration, nor do we prescribe a single way of integration, so as to allow designer-favoured designs.
Furthermore, given its flexibility in the process, our method can easily be extended to new emerging design issues.
Model-driven development of content-based image retrieval systems based on object-relational database management systems
In this thesis, the model-driven software development paradigm is employed in order to support the development of Content-based Image Retrieval Systems (CBIRS) for different application domains.
Modeling techniques, based on an adaptable conceptual framework model, are proposed for deriving the components of a concrete CBIRS. Transformation techniques are defined to automatically implement the derived application-specific models in an object-relational database management system. A set of criteria assuring the quality of the transformation is derived from the theory of information-capacity preservation applied in database design.
Developing Collaborative XML Editing Systems
In many areas the eXtensible Mark-up Language (XML) is becoming the standard exchange and data format. More and more applications not only support XML as an exchange format but also use it as their data model or default file format for graphic, text and database (such as spreadsheet) applications. Computer Supported Cooperative Work is an interdisciplinary field of research dealing with group work, cooperation and their supporting information and communication technologies. One part of it is Real-Time Collaborative Editing, which investigates the design of systems which allow several persons to work simultaneously in real-time on the same document, without the risk of inconsistencies.
Existing collaborative editing research applications specialize in one or, at best, a small number of document types, for example graphic, text or spreadsheet documents. This research investigates the development of a software framework which allows collaborative editing of any XML document type in real-time. This presents a more versatile solution to the problems of real-time collaborative editing.
This research contributes a new software framework model which will assist software engineers in the development of new collaborative XML editing applications. The devised framework is flexible in the sense that it is easily adaptable to different workflow requirements covering concurrency control, awareness mechanisms and optional locking of document parts. Additionally this thesis contributes a new framework integration strategy that enables enhancements of existing single-user editing
applications with real-time collaborative editing features without changing their source code.
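The consistency problem such a framework must solve, namely concurrent edits on a shared document that must converge on every replica, can be illustrated with a classic operational-transformation step for plain text (a simplified sketch of the general technique; the thesis targets arbitrary XML documents, and its actual concurrency-control mechanisms may differ):

```python
# Two users edit concurrently; each replica applies the remote operation
# only after transforming it against the local one, so both converge.
# Tie-breaking for inserts at the same position (by site id) is omitted.
def transform_insert(op, against):
    """Shift op's position if a concurrent insert landed at or before it."""
    pos, text = op
    a_pos, a_text = against
    if a_pos <= pos:
        return (pos + len(a_text), text)
    return op

def apply_insert(doc, op):
    pos, text = op
    return doc[:pos] + text + doc[pos:]

doc = "XML"
op_a = (0, "Collaborative ")   # user A inserts at the front
op_b = (3, " editing")         # user B appends, concurrently

# Replica 1: A's op first, then B's op transformed against A's.
r1 = apply_insert(apply_insert(doc, op_a), transform_insert(op_b, op_a))

# Replica 2: B's op first, then A's op transformed against B's.
r2 = apply_insert(apply_insert(doc, op_b), transform_insert(op_a, op_b))

assert r1 == r2 == "Collaborative XML editing"
```

For XML rather than flat text, the operations and transformations act on tree paths instead of character offsets, but the convergence requirement is the same.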
A relational algebra approach to ETL modeling
The MAP-i Doctoral Programme in Informatics of the Universities of Minho, Aveiro and Porto.
Information Technology has been one of the drivers of the revolution that is currently happening in
today’s management decisions in most organizations. The amount of data gathered and processed
through the use of computing devices has been growing every day, providing a valuable source of
information for decision makers that are managing every type of organization, public or private.
Gathering the right amount of data in a centralized and unified repository like a data warehouse is
similar to building the foundations of a system that will act as a base to support decision-making
processes requiring factual information. Nevertheless, the complexity of building such a repository
is very challenging, as well as developing all the components of a data warehousing system. One
of the most critical components of a data warehousing system is the Extract-Transform-Load
component, ETL for short, which is responsible for gathering data from information sources, clean,
transform and conform it in order to store it in a data warehouse. Several design methodologies
for ETL components have been presented in the last few years, with very little impact on ETL
commercial tools. Basically, this is due to an existing gap between the conceptual design of an
ETL system and its corresponding physical implementation. The proposed methodologies ranged
from new approaches, with novel notations and diagrams, to the adoption and extension of current
standard modeling notations, like UML or BPMN. However, none of these proposals contains
enough detail to be translated automatically into a specific execution platform. The use of a
standard well-known notation like Relational Algebra might bridge the gap between the conceptual
design and the physical design of an ETL component, mainly due to its formal approach that is
based on a limited set of operators, and also due to its functional characteristics: it is a
procedural language operating over data stored in relational format. The abstraction that Relational
Algebra provides over the technological infrastructure might also be an advantage for uncommon execution platforms, like computing grids that provide an exceptional amount of processing power
that is very critical for ETL systems. Additionally, partitioning data and task distribution over
computing nodes works quite well with a Relational Algebra approach. Extensive research on
the use of Relational Algebra in the ETL context was conducted to validate its usage. To
complement this, a set of Relational Algebra patterns was also developed to support the most
common ETL tasks, like change data capture, data quality enforcement, data conciliation and
integration, slowly changing dimensions and surrogate key pipelining. All these patterns provide a
formal approach to the referred ETL tasks by specifying all the operations needed to accomplish
them in a series of Relational Algebra operations. To evaluate the feasibility of the work done in
this thesis, we used a real ETL application scenario for the extraction of data from the operational
systems of two different social networks, storing hashtag usage information in a specific data mart. The
ability to analyze trends in social network usage is a hot topic in today’s media and information
coverage. A complete design of the ETL component using the patterns developed previously is also
provided, as well as a critical evaluation of its usage.
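One of the patterns named above, change data capture, reduces naturally to relational-algebra set operations. A minimal sketch (with invented table contents and our own encoding of relations as sets of tuples, not the thesis's notation):

```python
# Relations as sets of hashable tuples (frozensets of attribute/value
# pairs), so relational difference is plain Python set difference.
def relation(rows):
    return {frozenset(r.items()) for r in rows}

# Current source extract vs. what the data mart already holds.
source = relation([
    {"hashtag": "#etl", "count": 10},
    {"hashtag": "#olap", "count": 4},
])
target = relation([
    {"hashtag": "#etl", "count": 10},
])

# Change data capture: new or changed rows = source - target (relational
# difference); only this delta flows on to transformation and load.
delta = source - target
```

Expressing the pattern as operators over relations, rather than as tool-specific steps, is what lets the same specification be mapped onto different execution platforms, including partitioned execution across computing nodes.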