210 research outputs found
Breadth-first serialisation of trees and rational languages
We present here the notion of breadth-first signature and its relationship
with numeration system theory. It is the serialisation into an infinite word of
an ordered infinite tree of finite degree. We study which class of languages
corresponds to which class of words and,more specifically, using a known
construction from numeration system theory, we prove that the signature of
rational languages are substitutive sequences.Comment: 15 page
Regular Rooted Graph Grammars
In dieser Arbeit wir ein pragmatischer Ansatz zur Typisierung, statischen Analyse und Optimierung von Web-Anfragespachen, speziell Xcerpt, untersucht. Pragmatisch ist der Ansatz in dem Sinne, dass dem Benutzer keinerlei Einschränkungen aus Entscheidbarkeits- oder Effizienzgründen auf modellierbare Typen gestellt werden. Effizienz und Entscheidbarkeit werden stattdessen, falls nötig, durch Vergröberungen bei der Typprüfung erkauft.
Eine Typsprache zur Typisierung von Graph-strukturierten Daten im Web wird eingeführt. Modellierbare Graphen sind so genannte gewurzelte Graphen, welche aus einem Spannbaum und Querreferenzen aufgebaut sind. Die Typsprache basiert auf
reguläre Baum Grammatiken, welche um typisierte Referenzen erweitert wurde. Neben wie im Web mit XML üblichen geordneten strukturierten Daten, sind auch ungeordnete Daten, wie etwa in Xcerpt oder RDF üblich, modellierbar. Der dazu verwendete Ansatz---ungeordnete Interpretation Regulärer Ausdrücke---ist neu. Eine operationale Semantik für geordnete wie ungeordnete Typen wird auf Basis spezialisierter Baumautomaten und sog. Counting Constraints (welche wiederum auf presburgerarithmetische Ausdrücke) basieren. Es wird ferner statische Typ-Prüfung und -Inferenz von Xcerpt Anfrage- und Konstrukttermen, wie auch Optimierung von Xcerpt Anfragen auf Basis von Typinformation eingeführt.This thesis investigates a pragmatic approach to typing, static analysis and static
optimization of Web query languages, in special the Web query language Xcerpt. The
approach is pragmatic in the sense, that no restriction on the types are made for
decidability or efficiency reasons, instead precision is given up if necessary.
Pragmatics on the dynamic side means to use types not only to ensure validity of objects
operating on, but also influencing query selection based on types.
A typing language for typing of graph structured data on the Web is introduced.
The Graphs in mind are based on spanning trees with references, the typing languages
is based on regular tree grammars with typed reference extensions. Beside ordered data
in the spirit of XML, unordered data (i.e. in the spirit of the Xcerpt data model or
RDF) can be modelled using regular expressions under unordered interpretation – this
approach is new. An operational semantics for ordered and unordered types is given
based on specialized regular tree automata and counting constraints (them again based
on Presburger arithmetic formulae). Static type checking of Xcerpt query and construct
terms is introduced, as well as optimization of Xcerpt query terms based on schema
information
Automatic sequences: from rational bases to trees
The th term of an automatic sequence is the output of a deterministic
finite automaton fed with the representation of in a suitable numeration
system. In this paper, instead of considering automatic sequences built on a
numeration system with a regular numeration language, we consider these built
on languages associated with trees having periodic labeled signatures and, in
particular, rational base numeration systems. We obtain two main
characterizations of these sequences. The first one is concerned with -block
substitutions where morphisms are applied periodically. In particular, we
provide examples of such sequences that are not morphic. The second
characterization involves the factors, or subtrees of finite height, of the
tree associated with the numeration system and decorated by the terms of the
sequence.Comment: 25 pages, 15 figure
Automatic sequences in rational base numeration systems (and even more)
The nth term of an automatic sequence is the output of a deterministic finite automaton fed with the representation of n in a suitable numeration system. Here, instead of considering automatic sequences built on a numeration system with a regular numeration language, we consider these built on languages associated with trees having periodic labeled signatures and, in particular, rational base numeration systems. We obtain two main characterizations of these sequences. The first one is concerned with r-block substitutions where r morphisms are applied periodically. In particular, we provide examples of such sequences that are not morphic. The second characterization involves the factors, or subtrees of finite height, of the tree associated with the numeration system and decorated by the terms of the sequence
Trust on the semantic web
The Semantic Web is a vision to create a “web of knowledge”; an extension of the Web as we know it which will create an information space which will be usable by machines in very rich ways. The technologies which make up the Semantic Web allow machines to reason across information gathered from the Web, presenting only relevant results and inferences to the user. Users of the Web in its current form assess the credibility of the information they gather in a number of different ways. If processing happens without the user being able to check the source and credibility of each piece of information used in the processing, the user must be able to trust that the machine has used trustworthy information at each step of the processing. The machine should therefore be able to automatically assess the credibility of each piece of information it gathers from the Web. A case study on advanced checks for website credibility is presented, and the site presented in the case presented is found to be credible, despite failing many of the checks which are presented. A website with a backend based on RDF technologies is constructed. A better understanding of RDF technologies and good knowledge of the RAP and Redland RDF application frameworks is gained. The second aim of constructing the website was to gather information to be used for testing various trust metrics. The website did not gain widespread support, and therefore not enough data was gathered for this. Techniques for presenting RDF data to users were also developed during website development, and these are discussed. Experiences in gathering RDF data are presented next. A scutter was successfully developed, and the data smushed to create a database where uniquely identifiable objects were linked, even where gathered from different sources. Finally, the use of digital signature as a means of linking an author and content produced by that author is presented. RDF/XML canonicalisation is discussed in the provision of ideal cryptographic checking of RDF graphs, rather than simply checking at the document level. The notion of canonicalisation on the semantic, structural and syntactic levels is proposed. A combination of an existing canonicalisation algorithm and a restricted RDF/XML dialect is presented as a solution to the RDF/XML canonicalisation problem. We conclude that a trusted Semantic Web is possible, with buy in from publishing and consuming parties
Robert Louis Stevenson’s South Seas Writing: Its Production and Context within the Victorian Study of Culture
The thesis is split into two parts. The first part investigates the production, reception, and reconstruction of Robert Louis Stevenson's South Seas writing during his travels there in the years 1888-91, and its subsequent publication as In the South Seas in 1896. The second part concentrates on the findings that Stevenson makes during his Pacific travels in respect of the discourses of culture that shape the academic sciences of his day.
The writing that Stevenson produces for his 'Big Book' on the cultures of the Pacific Islands is among the least-examined of all his works. His book of travel, In the South Seas, is published posthumously and contains material that is presented in a way that is not intended by the author. In the present study the reasons for this situation are investigated, and the work that the author intends to produce is recovered on the basis of the plans, notes, and photographs which remain from the period of his travels. The reconstructed work is then compared with the published volume of In the South Seas to show the extent to which textual mutilation and re-editing has significantly altered the meaning of Stevenson's original writing. In the South Seas is a text that is re-shaped by his editor in order to satisfy what is seen as a Victorian readership's desire for sentimental voyaging.
Stevenson's writing is then read within the context of the Victorian study of culture as represented by Edward Burnett Tylor, to show his engagement with and criticism of some of Tylor's theories. Finally, the South Seas writing is framed within the perspective of Stevenson's reading of GWF Hegel, showing the extent to which his observations on Pacific landscapes and cultures are informed by Hegel's discussion of the antinomies of Immanuel Kant
Recommended from our members
Semantic-based framework for the generation of travel demand
Traffic and transportation have a wide-ranging impact on the daily lives of the human population and society. Activity-based travel demand generation models and traffic simulators are tools that have been developed to investigate traffic and transport problems and assist in developing solutions.
The closer modelling of human behaviour, the emergence of new technologies and the availability of more detailed datasets is leading to greater modelling complexity. The robustness of conclusions in investigations is supported by comparison of multiple techniques and models yet variations in the platform, data requirements and dataset availability present barriers to their breadth. This thesis investigates the development of a Semantic Web framework for activity-based travel demand generation.
It is proposed that the application of a knowledge-based approach and development of an orchestrating framework will enable a loosely coupled modular architecture. This approach will reduce the burden in preparing and accessing datasets through the construction of a platform-independent knowledge-base and facilitate switching between modules and datasets.
The principal contributions of this work are the application of a knowledge-based approach to travel demand generation; the development of a Semantic-based framework to control the configuration of the process and the design; and demonstration of the Semantic based framework through the implementation and evaluation of the modular travel demand generation process, including integration with two third-party traffic simulators.
The investigation found that the proposed approach can be successfully applied to model and control the travel demand generation process. Multiple configurations were explored, including utilising network communications, and found that this had a noticeable impact on execution duration but also the potential for mitigation through distributed computing.
This presents the opportunity for an online infrastructure of datasets and module implementations for travel demand generation that users can select and access through the framework. This infrastructure would remove the need for ad hoc interfaces; data format conversion or platform dependence to facilitate the process of traffic modelling becoming quicker and more robust
Engineering a semantic web trust infrastructure
The ability to judge the trustworthiness of information is an important and challenging problem in the field of Semantic Web research. In this thesis, we take an end-to-end look at the challenges posed by trust on the Semantic Web, and present contributions in three areas: a Semantic Web identity vocabulary, a system for bootstrapping trust environments, and a framework for trust aware information management. Typically Semantic Web agents, which consume and produce information, are not described with sufficient information to permit those interacting with them to make good judgements of trustworthiness. A descriptive vocabulary for agent identity is required to enable effective inter agent discourse, and the growth of trust and reputation within the Semantic Web; we therefore present such a foundational identity ontology for describing web-based agents.It is anticipated that the Semantic Web will suffer from a trust network bootstrapping problem. In this thesis, we propose a novel approach which harnesses open data to bootstrap trust in new trust environments. This approach brings together public records published by a range of trusted institutions in order to encourage trust in identities within new environments. Information integrity and provenance are both critical prerequisites for well-founded judgements of information trustworthiness. We propose a modification to the RDF Named Graph data model in order to address serious representational limitations with the named graph proposal, which affect the ability to cleanly represent claims and provenance records. Next, we propose a novel graph based approach for recording the provenance of derived information. This approach offers computational and memory savings while maintaining the ability to answer graph-level provenance questions. In addition, it allows new optimisations such as strategies to avoid needless repeat computation, and a delta-based storage strategy which avoids data duplication.<br/
Interactive global illumination on the CPU
Computing realistic physically-based global illumination in real-time remains one
of the major goals in the fields of rendering and visualisation; one that has not
yet been achieved due to its inherent computational complexity. This thesis focuses
on CPU-based interactive global illumination approaches with an aim to
develop generalisable hardware-agnostic algorithms. Interactive ray tracing is reliant
on spatial and cache coherency to achieve interactive rates which conflicts
with needs of global illumination solutions which require a large number of incoherent
secondary rays to be computed. Methods that reduce the total number of
rays that need to be processed, such as Selective rendering, were investigated to
determine how best they can be utilised.
The impact that selective rendering has on interactive ray tracing was analysed
and quantified and two novel global illumination algorithms were developed,
with the structured methodology used presented as a framework. Adaptive Inter-
leaved Sampling, is a generalisable approach that combines interleaved sampling
with an adaptive approach, which uses efficient component-specific adaptive guidance
methods to drive the computation. Results of up to 11 frames per second
were demonstrated for multiple components including participating media. Temporal Instant Caching, is a caching scheme for accelerating the computation of
diffuse interreflections to interactive rates. This approach achieved frame rates
exceeding 9 frames per second for the majority of scenes. Validation of the results
for both approaches showed little perceptual difference when comparing
against a gold-standard path-traced image. Further research into caching led to
the development of a new wait-free data access control mechanism for sharing the
irradiance cache among multiple rendering threads on a shared memory parallel
system. By not serialising accesses to the shared data structure the irradiance
values were shared among all the threads without any overhead or contention,
when reading and writing simultaneously. This new approach achieved efficiencies
between 77% and 92% for 8 threads when calculating static images and animations.
This work demonstrates that, due to the
flexibility of the CPU, CPU-based
algorithms remain a valid and competitive choice for achieving global illumination
interactively, and an alternative to the generally brute-force GPU-centric
algorithms
Terminological Methods in Lexicography: Conceptualising, Organising, and Encoding Terms in General Language Dictionaries
Os dicionários de língua geral apresentam inconsistências de uniformização e cientificidade no tratamento do conteúdo lexicográfico especializado. Analisando a presença e o tratamento de termos em dicionários de língua geral, propomos um tratamento mais uniforme e cientificamente rigoroso desse conteúdo, considerando também a necessidade de compilar e alinhar futuros recursos lexicais em consonância com padrões interoperáveis. Partimos da premissa de que o tratamento dos itens lexicais, sejam unidades lexicais (palavras em geral) ou unidades terminológicas (termos ou palavras pertencentes a determinados domínios), deve ser diferenciado, e recorremos a métodos terminológicos para tratar os termos dicionarizados. A nossa abordagem assume que a terminologia – na sua dupla dimensão linguística e conceptual – e a lexicografia, como domínios interdisciplinares, podem ser complementares. Assim, apresentamos objetivos teóricos (aperfeiçoamento da metalinguagem e descrição lexicográfica a partir de pressupostos terminológicos) e práticos (representação consistente de dados lexicográficos), que visam facilitar a organização, descrição e modelização consistente de componentes lexicográficos, nomeadamente a hierarquização das etiquetas de domínio, que são marcadores de identificação de léxico especializados. Queremos ainda facilitar a redação de definições, as quais podem ser otimizadas e elaboradas com maior precisão científica ao seguir uma abordagem terminológica no tratamento dos termos. Analisámos os dicionários desenvolvidos por três instituições académicas distintas: a Academia das Ciências de Lisboa, a Real Academia Española e a Académie Française, que representam um valioso legado da tradição lexicográfica académica europeia. A análise inicial inclui um levantamento exaustivo e a comparação das etiquetas de domínio usadas, bem como um debate sobre as opções escolhidas e um estudo comparativo do tratamento dos termos. Elaborámos, depois, uma proposta metodológica para o tratamento de termos em dicionários de língua geral, tomando como exemplo dois domínios, GEOLOGIA e FUTEBOL, extraídos da edição de 2001 do dicionário da Academia das Ciências de Lisboa. Revimos os termos selecionados de acordo com os princípios terminológicos defendidos, dando origem a sentidos especializados revistos/novos para a primeira edição digital deste dicionário. Representamos e anotamos os dados usando as especificações da TEI Lex-0, uma extensão da TEI (Text Encoding Initiative), dedicada à codificação de dados lexicográficos. Destacamos também a importância de ter etiquetas de domínio hierárquicas em vez de uma lista simples de domínios, vantajosas para a organização dos dados, correspondência e possíveis futuros alinhamentos entre diferentes recursos lexicográficos. A investigação revelou que a) os modelos estruturais dos recursos lexicais são complexos e contêm informação de natureza diversa; b) as etiquetas de domínio nos dicionários gerais da língua são planas, desequilibradas, inconsistentes e, muitas vezes, estão desatualizadas, havendo necessidade de as hierarquizar para organizar o conhecimento especializado; c) os critérios adotados para a marcação dos termos e as fórmulas utilizadas na definição são díspares; d) o tratamento dos termos é heterogéneo e formulado de diferentes formas, pelo que o recurso a métodos terminológicos podem ajudar os lexicógrafos a redigir definições; e) a aplicação de métodos terminológicos e lexicográficos interdisciplinares, e também de padrões, é vantajosa porque permite a construção de bases de dados lexicais estruturadas, concetualmente organizadas, apuradas do ponto de vista linguístico e interoperáveis. Em suma, procuramos contribuir para a questão urgente de resolver problemas que afetam a partilha, o alinhamento e vinculação de dados lexicográficos.General language dictionaries show inconsistencies in terms of uniformity and scientificity in the treatment of specialised lexicographic content. By analysing the presence and treatment of terms in general language dictionaries, we propose a more uniform and scientifically rigorous treatment of this content, considering the necessity of compiling and aligning future lexical resources according to interoperable standards. We begin from the premise that the treatment of lexical items, whether lexical units (words in general) or terminological units (terms or words belonging to particular subject fields), must be differentiated, and resort to terminological methods to treat dictionary terms. Our approach assumes that terminology – in its dual dimension, both linguistic and conceptual – and lexicography, as interdisciplinary domains, can be complementary. Thus, we present theoretical (improvement of metalanguage and lexicographic description based on terminological assumptions) and practical (consistent representation of lexicographic data) objectives that aim to facilitate the organisation, description and consistent modelling of lexicographic components, namely the hierarchy of domain labels, as they are specialised lexicon identification markers. We also want to facilitate the drafting of definitions, which can be optimised and elaborated with greater scientific precision by following a terminological approach for the treatment of terms. We analysed the dictionaries developed by three different academic institutions: the Academia das Ciências de Lisboa, the Real Academia Española and the Académie Française, which represent a valuable legacy of the European academic lexicographic tradition. The initial analysis includes an exhaustive survey and comparison of the domain labels used, as well as a debate on the chosen options and a comparative study of the treatment of the terms. We then developed a methodological proposal for the treatment of terms in general language dictionaries, exemplified using terms from two domains, GEOLOGY and FOOTBALL, taken from the 2001 edition of the dictionary of the Academia das Ciências de Lisboa. We revised the selected terms according to the defended terminological principles, giving rise to revised/new specialised meanings for the first digital edition of this dictionary. We represent and annotate the data using the TEI Lex-0 specifications, a TEI (Text Encoding Initiative) subset for encoding lexicographic data. We also highlight the importance of having hierarchical domain labels instead of a simple list of domains, which are beneficial to the data organisation itself, correspondence and possible future alignments between different lexicographic resources. Our investigation revealed the following: a) structural models of lexical resources are complex and contain information of a different nature; b) domain labels in general language dictionaries are flat, unbalanced, inconsistent and often outdated, requiring the need to hierarchise them for organising specialised knowledge; c) the criteria adopted for marking terms and the formulae used in the definition are disparate; d) the treatment of terms is heterogeneous and formulated differently, whereby terminological methods can help lexicographers to draft definitions; e) the application of interdisciplinary terminological and lexicographic methods, and of standards, is advantageous because it allows the construction of structured, conceptually organised, linguistically accurate and interoperable lexical databases. In short, we seek to contribute to the urgent issue of solving problems that affect the sharing, alignment and linking of lexicographic data
- …