8 research outputs found
From Text to Knowledge with Graphs: modelling, querying and exploiting textual content
This paper highlights the challenges, current trends, and open issues related
to the representation, querying and analytics of content extracted from texts.
The internet contains vast amounts of text-based information on a wide range of subjects,
including commercial documents, medical records, scientific experiments,
engineering tests, and events that impact urban and natural environments.
Extracting knowledge from this text involves understanding the nuances of
natural language and accurately representing the content without losing
information. This allows knowledge to be accessed, inferred, or discovered. To
achieve this, combining results from various fields, such as linguistics,
natural language processing, knowledge representation, data storage, querying,
and analytics, is necessary. The vision in this paper is that graphs can be a
well-suited representation of text content, provided the content is annotated
and the right querying and analytics techniques are applied. This paper
discusses this hypothesis from the perspectives of linguistics, natural
language processing, graph models and databases, and artificial intelligence,
as contributed by the panellists of the DOING session at the MADICS Symposium 2022.
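The graph-based representation the panellists advocate can be illustrated with a minimal sketch: facts extracted from text are stored as labelled (subject, predicate, object) triples and queried by edge label. All entity and relation names below are invented for illustration and are not taken from the paper.

```python
# Illustrative sketch only: a tiny labelled graph of extracted triples.
# The entities and predicates are hypothetical examples.
triples = [
    ("aspirin", "treats", "headache"),
    ("aspirin", "type", "drug"),
    ("headache", "type", "symptom"),
]

# Adjacency index: subject -> list of (predicate, object) edges.
graph = {}
for s, p, o in triples:
    graph.setdefault(s, []).append((p, o))

def query(subject, predicate):
    """Return all objects linked to `subject` via `predicate`."""
    return [o for p, o in graph.get(subject, []) if p == predicate]

print(query("aspirin", "treats"))  # -> ['headache']
```

Real systems layer annotation schemas, inference rules and graph query languages on top of this basic shape.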
StaTIX - statistical type inference on linked data
Large knowledge bases typically contain data adhering to various schemas with incomplete and/or noisy type information. This seriously complicates further integration and post-processing efforts, as type information is crucial in correctly handling the data. In this paper, we introduce a novel statistical type inference method, called StaTIX, to effectively infer instance types in Linked Data sets in a fully unsupervised manner. Our inference technique leverages a new hierarchical clustering algorithm that is robust, highly effective, and scalable. We introduce a novel approach to reduce the processing complexity of the similarity matrix specifying the relations between various instances in the knowledge base. This approach speeds up the inference process while also improving the correctness of the inferred types due to the noise attenuation in the input data. We further optimize the clustering process by introducing a dedicated hash function that speeds up the inference process by orders of magnitude without negatively affecting its accuracy. Finally, we describe a new technique to identify representative clusters from the multi-scale output of our clustering algorithm to further improve the accuracy of the inferred types. We empirically evaluate our approach on several real-world datasets and compare it to the state of the art. Our results show that StaTIX is more efficient than existing methods (both in terms of speed and memory consumption) as well as more effective. StaTIX reduces the F1-score error of the predicted types by about 40% on average compared to the state of the art and improves the execution time by orders of magnitude.
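The core intuition behind unsupervised type inference of this kind is that instances with similar property sets likely share a type, so clustering instances by property similarity recovers type groups. The toy sketch below uses Jaccard similarity and single-linkage agglomerative merging; it illustrates that general idea only and is not StaTIX's actual similarity matrix, hash function or multi-scale clustering. All instance and property names are invented.

```python
# Toy sketch of unsupervised type inference via clustering: instances
# with similar property sets are grouped, and each group is read as an
# inferred type. NOT the StaTIX algorithm; illustration only.
instances = {
    "e1": {"name", "birthDate", "deathDate"},   # person-like
    "e2": {"name", "birthDate"},                # person-like
    "e3": {"title", "isbn", "author"},          # book-like
    "e4": {"title", "isbn"},                    # book-like
}

def jaccard(a, b):
    return len(a & b) / len(a | b)

# Start with singleton clusters; repeatedly merge the most similar pair
# (single linkage) while the best similarity stays above a threshold.
clusters = [[k] for k in instances]
THRESHOLD = 0.4
while True:
    best, pair = 0.0, None
    for i in range(len(clusters)):
        for j in range(i + 1, len(clusters)):
            sim = max(jaccard(instances[a], instances[b])
                      for a in clusters[i] for b in clusters[j])
            if sim > best:
                best, pair = sim, (i, j)
    if pair is None or best < THRESHOLD:
        break
    i, j = pair
    clusters[i] += clusters.pop(j)

print(sorted(sorted(c) for c in clusters))  # -> [['e1', 'e2'], ['e3', 'e4']]
```

The quadratic pair scan above is exactly the cost that StaTIX's similarity-matrix reduction and hashing are designed to avoid at scale.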
Superhuman, Transhuman, Post/Human: Mapping the Production and Reception of the Posthuman Body
The figure of the cyborg, or more latterly, the posthuman body has been an increasingly familiar presence in a number of academic disciplines. The majority of such studies have focused on popular culture, particularly the depiction of the posthuman in science-fiction, fantasy and horror. To date, however, few studies have focused on the posthuman and the comic book superhero, despite their evident corporeality, and none have questioned comics’ readers about their responses to the posthuman body. This thesis presents a cultural history of the posthuman body in superhero comics along with the findings from twenty-five two-hour interviews with readers.
By way of literature reviews, this thesis first provides a new typology of the posthuman, presenting it not as a stable bounded subject but as what Deleuze and Guattari (1987) describe as a ‘rhizome’. Within the rhizome of the posthuman body are several discursive plateaus that this thesis names Superhumanism (the representation of posthuman bodies in popular culture), Post/Humanism (a critical-theoretical stance that questions the assumptions of Humanism) and Transhumanism (the philosophy and practice of human enhancement with technology). With these categories in mind, the thesis explores the development of the posthuman body in the Superhuman realm of comic books. Exploring the body-types most prominent during the Golden (1938-1945), Silver (1958-1974) and contemporary Ages of superheroes, it presents three explorations of what I term the Perfect Body, Cosmic Body and Military-Industrial Body respectively. These body types are presented as ‘assemblages’ (Deleuze and Guattari, 1987) that display rhizomatic connections to the other discursive realms of the Post/Human and Transhuman. This investigation reveals how the depiction of the Superhuman body developed and diverged from, and sometimes back into, these realms as each attempted to territorialise the meaning and function of the posthuman body. Ultimately it describes how, in spite of attempts by nationalistic or economic interests to control Transhuman enhancement in real-world practices, the realms of Post/Humanism and Superhumanism share a more critical approach.
The final section builds upon this cultural history of the posthuman body by addressing readers’ relationships with these images. This begins by refuting some of the common assumptions in comics studies about superheroes and bodily representations. Readers stated that they viewed such imagery as iconographic rather than representational, whether it was the depiction of bodies or technology. Moreover, regular or committed readers of superhero comics were generally suspicious of the notion of human enhancement, displaying a belief in the same binary categories (artificial/natural, human/non-human) that critical Post/Humanism seeks to problematize.
The thesis concludes that while superhero comics remain ultimately too human to be truly Post/Humanist texts, it is nevertheless possible to conceptualise the relationship between reader, text, producer and so on in Post/Humanist terms as a reading-assemblage, and that such a cyborgian fusing of human and comic book allows both bodies to ‘become other’, to move in new directions and form new assemblages not otherwise possible when considered separately.
Exploring a striped XML world
The eXtensible Markup Language (XML) was designed as a markup language for structuring,
storing and transporting data on the World Wide Web. The focus of XML is on
data content; arbitrary markup is used to describe data. This versatile, self-describing
data representation has established XML as the universal data format and the de facto
standard for information exchange on the Web. This has gradually given rise to the
need for efficient storage and querying of large XML repositories. To that end, we
propose a new model for building a native XML store which is based on a generalisation
of vertical decomposition. Nodes of a document satisfying the same label-path
are extracted and stored together in a single container, a Stripe. Stripes make use of
a labelling scheme allowing us to maintain full structural information. Over this new
representation, we introduce various evaluation techniques, which allow us to handle
a large fragment of XPath 2.0. We also focus on the optimisation opportunities that
arise from our decomposition model during any query evaluation phase. During query
validation, we present an input minimisation process that exploits the proposed model
for identifying input that is only relevant to the given query, in terms of Stripes. We
also define query equivalence rules for query rewriting over our proposed model. Finally,
during query optimisation, we examine whether and under which circumstances
certain evaluation algorithms can be replaced by others having lower I/O and/or CPU
cost. We propose three storage schemes under our general decomposition technique.
The schemes differ in the compression method imposed on the structural part of the
XML document. The first storage scheme imposes no compression. The second storage
scheme exploits structural regularities of the document to minimise storage and, thus,
I/O cost during query evaluation. Finally, the third storage scheme performs
structure-agnostic compression of the document structure, which results in
minimised storage regardless of the actual XML structure. We experiment on XML repositories of varying
size, recursion and structural regularity. We consider query input size, execution plan
size and query response time as metrics for our experimental results. We process query
workloads by applying each of the proposed optimisations in isolation and then all of
their combinations. In addition, we apply the same execution pipeline for all proposed
storage schemes. As a reference to our proposed query evaluation pipeline, we use
the current state-of-the-art system for XML query processing. Our results demonstrate
that:
• Our proposed data model provides the infrastructure for efficiently selecting the parts of the document that are relevant to a given query.
• The application of query rewriting, combined with input minimisation, reduces
query input size as well as the number of physical operators used. In addition,
when evaluation algorithms are specialised to the decomposition method, query
response time is further reduced.
• Query evaluation performance is largely affected by the storage schemes, which
are closely related to the structural properties of the data. The achieved compression
ratio greatly affects storage size and, therefore, query response times.
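The label-path decomposition at the heart of the thesis can be illustrated with a toy sketch: every node is filed under its root-to-node label path, so a simple path query touches only the matching container. This is an illustration of the general idea under stated assumptions, not the thesis's actual storage format; the preorder identifier stands in for the real labelling scheme, which encodes full structural information.

```python
# Toy sketch of vertical decomposition by label-path ("Stripes" idea):
# each node goes into the container for its root-to-node label path.
# The labelling here is a bare preorder id; illustration only.
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<lib><book><title>A</title></book><book><title>B</title></book></lib>"
)

stripes = {}          # label-path -> list of (preorder_id, text)
counter = 0

def strip_node(node, path):
    global counter
    counter += 1
    p = path + "/" + node.tag
    stripes.setdefault(p, []).append((counter, (node.text or "").strip()))
    for child in node:
        strip_node(child, p)

strip_node(doc, "")
# A query like /lib/book/title only reads one container:
print([t for _, t in stripes["/lib/book/title"]])  # -> ['A', 'B']
```

Input minimisation then amounts to selecting only the Stripes whose label-paths can contribute to a given query.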
An Algebraic Approach to XQuery Optimization
As more data is stored in XML and more applications need to process this data, XML query optimization becomes performance critical. While optimization techniques for relational databases have been developed over the last thirty years, the optimization of XML queries poses new challenges. Query optimizers for XQuery, the standard query language for XML data, need to consider both document order and sequence order. Nevertheless, algebraic optimization has proved powerful in query optimizers for relational and object-oriented databases. Thus, this dissertation presents an algebraic approach to XQuery optimization. In this thesis, an algebra over sequences is presented that allows for a simple translation of XQuery into this algebra. The formal definitions of the operators in this algebra allow us to reason formally about algebraic optimizations. This thesis leverages the power of this formalism when unnesting nested XQuery expressions. In almost all cases, unnesting nested queries in XQuery reduces query execution times from hours to seconds or milliseconds. Moreover, this dissertation presents three basic algebraic patterns of nested queries. For every basic pattern, a decision tree is developed to select the most effective unnesting equivalence for a given query. Query unnesting extends the search space that can be considered during cost-based optimization of XQuery. As a result, substantially more efficient query execution plans may be detected. This thesis presents two more important cases where the number of plan alternatives leads to substantially shorter query execution times: join ordering and reordering location steps in path expressions. Our algebraic framework detects cases where document order or sequence order is destroyed. However, state-of-the-art techniques for order optimization in cost-based query optimizers have efficient mechanisms to repair order in these cases.
The results obtained for query unnesting and cost-based optimization of XQuery underline the need for an algebraic approach to XQuery optimization for efficient XML query processing. Moreover, they are applicable to optimization in relational databases where order semantics are considered.
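The payoff of unnesting can be illustrated outside XQuery itself: a correlated nested query re-scans the inner sequence for every outer item (O(n·m)), while the equivalent unnested, join-style plan groups the inner sequence once (O(n+m)). The data and plan shapes below are invented for illustration and do not reproduce the dissertation's algebra or equivalences.

```python
# Illustration of why unnesting pays off: two equivalent plans for the
# same correlated query, one nested (repeated scans) and one unnested
# (single grouping pass). Data and names are hypothetical.
orders = [{"id": 1, "cust": "a"}, {"id": 2, "cust": "b"}]
items = [{"order": 1, "sku": "x"}, {"order": 1, "sku": "y"},
         {"order": 2, "sku": "z"}]

# Nested plan: for each order, rescan all items -- the shape a direct
# translation of a nested FLWOR-style expression produces.
nested = [(o["id"], [i["sku"] for i in items if i["order"] == o["id"]])
          for o in orders]

# Unnested plan: one pass to group items, then a merge with orders.
by_order = {}
for i in items:
    by_order.setdefault(i["order"], []).append(i["sku"])
unnested = [(o["id"], by_order.get(o["id"], [])) for o in orders]

assert nested == unnested  # same result, far fewer scans of `items`
print(nested)  # -> [(1, ['x', 'y']), (2, ['z'])]
```

An XQuery optimizer performs the analogous rewrite on the algebraic plan, which is where the reported hours-to-seconds improvements come from.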
Bowdoin Orient v.116, no.1-27 (1986-1987)
https://digitalcommons.bowdoin.edu/bowdoinorient-1980s/1007/thumbnail.jp