Enabling scientific data on the web
Scientific data does not exist on the Web in the same way as the written
word; reviews, media, wikis, social networks, and blogs all contribute to
the interconnected nature of ordinary language on the Web. Network effects
create additional value from seemingly minor contributions to the Web. But
nothing comparable exists for scientific data. Simply put, within the Open
Web Platform, we cannot currently apply similar mechanisms to
scientific work without great effort. Thus, the Web has not so far enabled
Science as well as it has enabled dissemination and interconnection for the
written word: to truly enable Science on the Web, we must endeavor to make
data and its semantics first-class Web constituents.
This thesis focuses on solving this problem by enabling scientific data to exist
on the Web in such a way that it can be processed both as viewable content
and as data to be consumed. Starting from the principles on which the Web has so
far thrived, we propose solutions to enable complex data exchanges while
preserving the Web as it stands. We introduce the Partition Annotate Name
(PAN) methodology, which relies upon embracing the core architectural
principles of the Web: name things with URIs; process common data formats;
use common rules under a shared contract between publisher, developer, and
consumer.
Documents as functions
Treating variable data documents as functions over their data bindings opens opportunities for building more powerful, robust and flexible document architectures to meet the needs arising from the confluence of developments in document engineering, digital printing technologies and marketing analysis.
This thesis describes a combination of several XML-based technologies both to represent and to process variable documents and their data, leading to extensible, high-quality and 'higher-order' document generation solutions. The architecture (DDF) uses XML uniformly throughout the documents and their processing tools; different semantic spaces are interspersed through namespacing.
An XML-based functional programming language (XSLT) is used to describe all intra-document variability and to implement most of the tools. Document layout intent is declared within a document as a hierarchical set of combinators attached to a tree-based graphical presentation. Evaluating a document bound to an instance of data involves three steps: a compiler creates an executable from the document; the executable is run with the data instance as its argument, producing a new document in which layout intent is described; and an extensible layout processor then resolves that layout.
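The evaluation steps described above (compile, run against a data instance, resolve layout) can be sketched in a few lines. This is only an illustrative Python analogue, not the XSLT-based DDF tooling itself: the names `compile_document` and `resolve_layout`, the `vertical-stack` combinator, and the dictionary representation of layout intent are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Data = Dict[str, str]


@dataclass
class Box:
    """A resolved piece of layout: text placed at a vertical offset."""
    text: str
    y: int


def compile_document(template: List[str]) -> Callable[[Data], dict]:
    """Step 1 (hypothetical): turn a variable document into an executable."""
    def executable(data: Data) -> dict:
        # Step 2: running the executable binds the data instance.
        lines = [line.format(**data) for line in template]
        # Layout intent is only declared here, not yet resolved.
        return {"combinator": "vertical-stack", "children": lines}
    return executable


def resolve_layout(doc: dict, line_height: int = 12) -> List[Box]:
    """Step 3 (hypothetical): resolve declared layout intent into positions."""
    if doc["combinator"] != "vertical-stack":
        raise ValueError("unknown combinator: " + doc["combinator"])
    return [Box(text, i * line_height) for i, text in enumerate(doc["children"])]


template = ["Dear {name},", "Your offer: {offer}"]
run = compile_document(template)
intent = run({"name": "Ada", "offer": "10% off"})
boxes = resolve_layout(intent)
```

Keeping layout resolution as a separate step means that a different layout processor could be substituted without touching the compiled document function.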
The use of these technologies, together with design paradigms and coding protocols, makes it possible to construct documents that not only have high flexibility and quality, but also behave in higher-order ways. A document can be partially bound to data and evaluated, which modifies its presentation while leaving it responsive to future data. Layout intent can be re-satisfied as presentation trees are modified by programmatic sections embedded within them. The key enablers are described and illustrated through example.
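Partial binding can be illustrated with a small sketch in which evaluating a document against incomplete data yields another document that is still a function of the remaining fields. This is a simplified Python analogy using the standard library's `string.Template`, not the thesis's XML/XSLT mechanism; the helper names `bind` and `unbound_fields` and the sample text are hypothetical.

```python
from string import Template
from typing import Dict, List


def bind(document: str, data: Dict[str, str]) -> str:
    """Bind the fields present in `data`; leave all other fields variable."""
    return Template(document).safe_substitute(data)


def unbound_fields(document: str) -> List[str]:
    """List the placeholders a document still depends on."""
    names = []
    for match in Template.pattern.finditer(document):
        name = match.group("named") or match.group("braced")
        if name:
            names.append(name)
    return names


doc = "Dear $name, your $product ships on $date."

# Partial evaluation: the result is still a variable document.
partial = bind(doc, {"name": "Ada"})
remaining = unbound_fields(partial)

# Later data completes the binding.
full = bind(partial, {"product": "printer", "date": "Friday"})

# Binding in two stages gives the same result as binding all at once.
direct = bind(doc, {"name": "Ada", "product": "printer", "date": "Friday"})
```

In this analogy the partially bound document plays the role of a partially applied function: it has already changed its presentation, yet it can still be evaluated against data that arrives later.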