Active document enrichment using adaptive information extraction from text
The traditional process of document annotation for knowledge identification and extraction in the Semantic Web (SW) is complex and time-consuming, as it requires manual annotation by domain experts. There is currently strong interest in Text Mining technologies (and in particular in Human Language-based Technologies) for reducing the burden of text annotation for Knowledge Management (KM). In this poster we present Melita, an annotation interface that uses Adaptive Information Extraction from texts to reduce the burden of text annotation.
Timely and nonintrusive active document annotation via adaptive information extraction
The current work has been carried out in the framework of the AKT project (Advanced Knowledge Technologies, http://www.aktors.org), an Interdisciplinary Research Collaboration (IRC) sponsored by the UK Engineering and Physical Sciences Research Council (grant GR/N15764/01). AKT involves the Universities of Aberdeen, Edinburgh, Sheffield, Southampton and the Open University (www.aktors.org). AKT is a multimillion pound six year research project that started in 2000.
The process of document annotation for the Semantic Web is complex and time-consuming, as it requires a great deal of manual annotation. Information extraction from texts (IE) is a technology used by some of the most recent systems to actively support users in the process and reduce the burden of annotation. The integration of IE systems into annotation tools is a fairly new development, and in our opinion the impact of the IE system on the annotation process still needs to be thought through. In this paper we discuss two main requirements for active annotation: timeliness and tuning of intrusiveness. We then present and discuss a model of interaction that addresses both issues, and Melita, an annotation framework that implements this methodology.
An Expressive Model for the Web Infrastructure: Definition and Application to the BrowserID SSO System
The web constitutes a complex infrastructure and as demonstrated by numerous
attacks, rigorous analysis of standards and web applications is indispensable.
Inspired by successful prior work, in particular the work by Akhawe et al. as
well as Bansal et al., in this work we propose a formal model for the web
infrastructure. Unlike prior works, which aim at automatic analysis, our
model is not yet directly amenable to automation, but it is much more
comprehensive and accurate with respect to the standards and specifications. As
such, it can serve as a solid basis for the analysis of a broad range of
standards and applications.
As a case study and another important contribution of our work, we use our
model to carry out the first rigorous analysis of the BrowserID system (a.k.a.
Mozilla Persona), a recently developed complex real-world single sign-on system
that employs technologies such as AJAX, cross-document messaging, and HTML5 web
storage. Our analysis revealed a number of very critical flaws that could not
have been captured in prior models. We propose fixes for the flaws, formally
state relevant security properties, and prove that the fixed system in a
setting with a so-called secondary identity provider satisfies these security
properties in our model. The fixes for the most critical flaws have already
been adopted by Mozilla and our findings have been rewarded by the Mozilla
Security Bug Bounty Program.
Comment: An abridged version appears in S&P 201
Analyzing the BrowserID SSO System with Primary Identity Providers Using an Expressive Model of the Web
BrowserID is a complex, real-world Single Sign-On (SSO) System for web
applications recently developed by Mozilla. It employs new HTML5 features (such
as web messaging and web storage) and cryptographic assertions to provide
decentralized login, with the intent to respect users' privacy. It can operate
in a primary and a secondary identity provider mode. While in the primary mode
BrowserID runs with arbitrary identity providers (IdPs), in the secondary mode
there is one IdP only, namely Mozilla's default IdP.
We recently proposed an expressive general model for the web infrastructure
and, based on this web model, analyzed the security of the secondary IdP mode
of BrowserID. The analysis revealed several severe vulnerabilities.
In this paper, we complement our prior work by analyzing the even more
complex primary IdP mode of BrowserID. We study not only authentication
properties, as before, but also privacy properties. During our analysis we
discovered new and practical attacks that do not apply to the secondary mode:
an identity injection attack, which violates a central authentication property
of SSO systems, and attacks that break an important privacy promise of
BrowserID and which do not seem to be fixable without a major redesign of the
system. Some of our attacks on privacy make use of a browser side channel that
has not gained a lot of attention so far.
For the authentication bug, we propose a fix and formally prove in a slight
extension of our general web model that the fixed system satisfies all the
requirements we consider. This constitutes the most complex formal analysis of
a web application based on an expressive model of the web infrastructure so
far.
As another contribution, we identify and prove important security properties
of generic web features in the extended web model to facilitate future analysis
efforts of web standards and web applications.
Comment: arXiv admin note: substantial text overlap with arXiv:1403.186
The Web SSO Standard OpenID Connect: In-Depth Formal Security Analysis and Security Guidelines
Web-based single sign-on (SSO) services such as Google Sign-In and Log In
with Paypal are based on the OpenID Connect protocol. This protocol enables
so-called relying parties to delegate user authentication to so-called identity
providers. OpenID Connect is one of the newest and most widely deployed single
sign-on protocols on the web. Despite its importance, it has not received much
attention from security researchers so far, and in particular, has not
undergone any rigorous security analysis.
In this paper, we carry out the first in-depth security analysis of OpenID
Connect. To this end, we use a comprehensive generic model of the web to
develop a detailed formal model of OpenID Connect. Based on this model, we then
precisely formalize and prove central security properties for OpenID Connect,
including authentication, authorization, and session integrity properties.
In our modeling of OpenID Connect, we employ security measures in order to
avoid attacks on OpenID Connect that have been discovered previously and new
attack variants that we document for the first time in this paper. Based on
these security measures, we propose security guidelines for implementors of
OpenID Connect. Our formal analysis demonstrates that these guidelines are in
fact effective and sufficient.
Comment: An abridged version appears in CSF 2017. Parts of this work extend
the web model presented in arXiv:1411.7210, arXiv:1403.1866,
arXiv:1508.01719, and arXiv:1601.0122
Mapping and Displaying Structural Transformations between XML and PDF
Documents are often marked up in XML-based tagsets to delineate major structural components such as headings, paragraphs, figure captions and so on, without much regard to their eventual displayed appearance. And yet these same abstract documents, after many transformations and 'typesetting' processes, often emerge in the popular format of Adobe PDF, either for dissemination or archiving.
Until recently PDF has been a totally display-based document representation, relying on the underlying PostScript semantics of PDF. Early versions of PDF had no mechanism for retaining any form of abstract document structure, but recent releases have introduced an internal structure tree to create the so-called 'Tagged PDF'.
This paper describes the development of a plugin for Adobe Acrobat which creates a two-window display. In one window is shown an XML document original and in the other its Tagged PDF counterpart is seen, with an internal structure tree that, in some sense, matches the one seen in XML. If a component is highlighted in either window then the corresponding structured item, with any attendant text, is also highlighted in the other window.
Important applications of correctly Tagged PDF include making PDF documents reflow intelligently on small-screen devices and enabling them to be read out in correct reading order, via speech synthesiser software, for the visually impaired. By tracing structure transformation from source document to destination one can implement the repair of damaged PDF structure or the adaptation of an existing structure tree to an incrementally updated document.
Study of Tools Interoperability
Interoperability of tools usually refers to a combination of methods and techniques that address the problem of making a collection of tools work together. In this study we survey different notions that are used in this context: interoperability, interaction and integration. We point out the relations between these notions and how they map to the interoperability problem.
We narrow the problem area to tool development in academia. Tools developed in such an environment have a small basis for development, documentation and maintenance. We scrutinise some of the problems and potential solutions related to tool interoperability in such an environment. Moreover, we look at two tools developed in the Formal Methods and Tools group, and analyse the use of different integration techniques.