1,196 research outputs found

    KeyForge: Mitigating Email Breaches with Forward-Forgeable Signatures

    Email breaches are commonplace, and they expose a wealth of personal, business, and political data that may have devastating consequences. The current email system allows any attacker who gains access to your email to prove the authenticity of the stolen messages to third parties -- a property arising from a necessary anti-spam / anti-spoofing protocol called DKIM. This exacerbates the problem of email breaches by greatly increasing the potential for attackers to damage users' reputations, blackmail them, or sell the stolen information to third parties. In this paper, we introduce "non-attributable email", which guarantees that a wide class of adversaries are unable to convince any third party of the authenticity of stolen emails. We formally define non-attributability, and present two practical system proposals -- KeyForge and TimeForge -- that provably achieve non-attributability while maintaining the important protection against spam and spoofing that is currently provided by DKIM. Moreover, we implement KeyForge and demonstrate that the scheme is practical, achieving competitive verification and signing speed while requiring 42% less bandwidth per email than RSA-2048.
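    The core idea can be illustrated with a toy delayed-key-disclosure scheme. This is a deliberate simplification for intuition only, not the actual KeyForge or TimeForge construction (which use forward-forgeable public-key signatures): the mail server authenticates each message under a short-lived key and publishes that key once its epoch ends, so afterwards anyone can forge valid tags and leaked messages lose their attributability.

```python
import hashlib
import hmac

def sign(key: bytes, message: bytes) -> bytes:
    """Tag a message under the current epoch's secret key."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Meaningful only while the key is secret."""
    return hmac.compare_digest(sign(key, message), tag)

# Hypothetical rotation: the server uses one key per short epoch.
epoch_key = b"epoch-42-secret"
msg = b"From: alice@example.com\r\n\r\nhello"
tag = sign(epoch_key, msg)
assert verify(epoch_key, msg, tag)

# After the epoch ends, the server publishes epoch_key. An attacker can
# now produce an equally valid tag for a fabricated message, so a stolen
# (msg, tag) pair proves nothing to a third party.
forged_msg = b"From: alice@example.com\r\n\r\nfabricated"
forged_tag = sign(epoch_key, forged_msg)
assert verify(epoch_key, forged_msg, forged_tag)
```

    Per the abstract, the real schemes achieve the same end -- bounded-time attributability -- while still giving receiving servers DKIM-style spam and spoofing protection at delivery time.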

    A Multitude of Linguistically-rich Features for Authorship Attribution

    This paper reports on the procedure and learning models we adopted for the 'PAN 2011 Author Identification' challenge targeting real-world email messages. The novelty of our approach lies in a design which combines shallow characteristics of the emails (word and trigram frequencies) with a large number of ad hoc linguistically-rich features addressing different language levels. For the author attribution tasks, all these features were used to train a maximum entropy model, which gave very good results. For the single-author verification tasks, a set of features exclusively based on the linguistic description of the emails' messages was used as input for symbolic learning techniques (rules and decision trees), and gave weak results. This paper presents in detail the features extracted from the corpus, the learning models, and the results obtained.
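    The "shallow characteristics" mentioned above can be sketched as a simple feature extractor (an illustrative Python sketch, not the authors' code; the feature keys are hypothetical). In the paper, features like these are combined with the linguistically-rich ones and fed to a maximum entropy classifier.

```python
from collections import Counter

def shallow_features(text: str) -> Counter:
    """Word and character-trigram frequencies -- the two 'shallow'
    feature families the paper combines with linguistic features."""
    text = text.lower()
    feats = Counter(("word", w) for w in text.split())
    padded = f"  {text}  "  # pad so word edges also yield trigrams
    feats.update(("tri", padded[i:i + 3]) for i in range(len(padded) - 2))
    return feats

feats = shallow_features("hello world hello")
```

    Each message becomes a sparse count vector keyed by feature; a maximum entropy model (multinomial logistic regression) can then be trained over these counts per candidate author.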

    Development of an iPhone business application

    The smartphones of today increasingly have all the abilities of mobile computers. In fact they are small computers with powerful processors, abundant memory and ubiquitous internet access. Nonetheless they are restricted in terms of display resolution and size, battery life and input options. This raises the question whether smartphones could be used for business applications in the real world. The advantages and possibilities seem obvious. Smartphones with their ubiquitous network connections can easily be carried around and could be used as clients in distributed systems, or might even run their own applications independently anywhere. Nowadays the iPhone is one of the most advanced smartphones on the market. It is equipped with a 600 MHz processor, up to 512 MB of embedded RAM and a flash drive with a maximum volume of 32 GB to maintain applications and data. In this diploma thesis we want to analyse whether the iPhone with its abilities and restrictions can provide enough resources and input options for business applications. Therefore we will port Harzing's Publish or Perish, a desktop application which parses Google Scholar data for author impact analysis, to the iPhone. Different issues like parsing techniques, data processing and storing shall be compared and discussed. Furthermore the multi-touch screen as an input device and the restricted screen size shall be studied. The aim of this diploma thesis is to gain knowledge about how to design iPhone applications for business use in terms of architecture and user interface design.

    Meaning-based machine learning for information assurance

    This paper presents meaning-based machine learning: the use of semantically meaningful input data in machine learning systems in order to produce output that is meaningful to a human user, where the semantic input comes from the Ontological Semantics Technology theory of natural language processing. We describe how to bridge from knowledge-based natural language processing architectures to traditional machine learning systems, including high-level descriptions of the steps taken. These meaning-based machine learning systems are then applied to problems in information assurance and security that remain unsolved and feature large amounts of natural language text.

    Web Mail Information Extraction

    This project is conducted to deliver the background of study, problem statements, objectives, scope, literature review, methodology chosen for the development process, results and discussion, conclusion, recommendations, and references used throughout its completion. The objective of this project is to extract relevant and useful information from Google Mail (GMail) by performing Information Extraction (IE) using the Java programming language. After several rounds of testing, the system developed is able to successfully extract relevant and useful information from a GMail account, with emails coming from different folders such as All Mail, Inbox, Drafts, Starred, Sent Mail, Spam and Trash. The focus is to extract email information such as the sender, recipient, subject and content. The extracted information is presented in two mediums, as a text file or stored inside a database, in order to better suit users who come from different backgrounds and needs.
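    The extraction step described above -- pulling sender, recipient, subject and content out of a message -- can be sketched as follows. The project implemented this in Java against GMail; this is an illustrative Python sketch using the standard `email` module on a raw RFC 822 message string, with a hypothetical sample message.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical raw message, as it might be fetched from a mailbox folder.
raw = (
    "From: Alice <alice@example.com>\r\n"
    "To: Bob <bob@example.com>\r\n"
    "Subject: Meeting notes\r\n"
    "\r\n"
    "See attached agenda.\r\n"
)

msg = message_from_string(raw)
record = {
    "sender": parseaddr(msg["From"])[1],      # bare address, name stripped
    "recipient": parseaddr(msg["To"])[1],
    "subject": msg["Subject"],
    "content": msg.get_payload().strip(),
}
```

    A record like this can then be serialized to a text file or inserted into a database table, matching the two output mediums the project supports.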

    Protecting Systems From Exploits Using Language-Theoretic Security

    Any computer program processing input from the user or network must validate the input. Input-handling vulnerabilities occur in programs when the software component responsible for filtering malicious input---the parser---does not perform validation adequately. Consequently, parsers are among the most targeted components, since they defend the rest of the program from malicious input. This thesis adopts the Language-Theoretic Security (LangSec) principle to understand what tools and research are needed to prevent exploits that target parsers. LangSec proposes specifying the syntactic structure of the input format as a formal grammar. We then build a recognizer for this formal grammar to validate any input before the rest of the program acts on it. To ensure that these recognizers represent the data format, programmers often rely on parser generator or parser combinator tools to build the parsers. This thesis propels several sub-fields in LangSec by proposing new techniques to find bugs in implementations, novel categorizations of vulnerabilities, and new parsing algorithms and tools to handle practical data formats. To this end, this thesis comprises five parts that tackle various tenets of LangSec. First, I categorize various input-handling vulnerabilities and exploits using two frameworks: the mismorphisms framework, which helps us reason about the root causes leading to various vulnerabilities, and a categorization framework built from various LangSec anti-patterns, such as parser differentials and insufficient input validation. We then built a catalog of more than 30 popular vulnerabilities to demonstrate the categorization frameworks. Second, I built parsers for various Internet of Things and power grid network protocols and the iccMAX file format using parser combinator libraries. The parsers I built for power grid protocols were deployed and tested on power grid substation networks as an intrusion detection tool. The parser I built for the iccMAX file format led to several corrections and modifications to the iccMAX specifications and reference implementations. Third, I present SPARTA, a novel tool I built that generates Rust code that type-checks Portable Document Format (PDF) files. The type checker I helped build strictly enforces the constraints in the PDF specification to find deviations. Our checker has contributed to at least four significant clarifications and corrections to the PDF 2.0 specification and various open-source PDF tools. In addition to our checker, we also built a practical tool, PDFFixer, to dynamically patch type errors in PDF files. Fourth, I present ParseSmith, a tool to build verified parsers for real-world data formats. Most parsing tools available for data formats are insufficient to handle practical formats or have not been verified for their correctness. I built a verified parsing tool in Dafny that builds on ideas from attribute grammars, data-dependent grammars, and parsing expression grammars to tackle various constructs commonly seen in network formats. I prove that our parsers run in linear time and always terminate for well-formed grammars. Finally, I provide the earliest systematic comparison of various data description languages (DDLs) and their parser generation tools. DDLs are used to describe and parse commonly used data formats, such as image formats. I conducted an expert elicitation qualitative study to derive various metrics that I use to compare the DDLs. I also systematically compare these DDLs based on the sample data descriptions available with the DDLs---checking for correctness and resilience.
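    The recognize-then-process pattern that LangSec prescribes can be sketched with a minimal parser-combinator recognizer (illustrative only, not one of the thesis tools): each combinator either consumes a prefix of the input and returns the remainder, or rejects, and the input is accepted only if the grammar consumes it entirely.

```python
# Each parser: str -> remaining str, or None on rejection.

def char(pred):
    """Recognize a single character satisfying pred."""
    def p(s):
        return s[1:] if s and pred(s[0]) else None
    return p

def many1(p):
    """Recognize one or more repetitions of p."""
    def q(s):
        rest = p(s)
        if rest is None:
            return None
        while (nxt := p(rest)) is not None:
            rest = nxt
        return rest
    return q

def seq(*ps):
    """Recognize the parsers in sequence."""
    def q(s):
        for p in ps:
            s = p(s)
            if s is None:
                return None
        return s
    return q

# Hypothetical toy grammar: line := key "=" value
key = many1(char(str.isalpha))
value = many1(char(str.isdigit))
line = seq(key, char(lambda c: c == "="), value)

def recognize(s: str) -> bool:
    # Accept only if the grammar consumes the whole input -- trailing
    # bytes the grammar does not cover are exactly the unvalidated
    # input LangSec warns about.
    return line(s) == ""

assert recognize("port=8080")
assert not recognize("port=8080; rm -rf /")  # trailing junk rejected
```

    Only after `recognize` accepts does the rest of the program act on the input; rejecting early keeps malformed or maliciously crafted bytes away from downstream logic.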