211 research outputs found

    A template-based graph transformation system for the PROV data model

    As data provenance becomes a significant form of metadata for validating the origin of information and asserting its quality, it is crucial to hide the sensitive information in provenance data, and so enable trustworthiness, before sharing provenance in open environments such as the Web. In this paper, a graph rewriting system is constructed from the PROV data model to hide restricted provenance information while preserving the integrity and connectivity of the provenance graph. The system is formally established as a template-based framework and formalised using category theory concepts, such as functors, diagrams, and natural transformations.
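
As a rough illustration of the idea of hiding a restricted node while keeping a provenance graph connected, here is a minimal sketch. It is not the paper's template-based or category-theoretic system: the function `hide_node`, the plain edge-set representation, and the node names are all hypothetical, and PROV relation types (such as used or wasGeneratedBy) are ignored.

```python
def hide_node(edges, hidden):
    """Return a new edge set with `hidden` removed and its neighbours bridged.

    `edges` is a set of (source, target) pairs in a directed graph.
    Every predecessor of the hidden node is linked directly to every
    successor, so paths that ran through the hidden node still exist.
    """
    preds = {src for src, dst in edges if dst == hidden and src != hidden}
    succs = {dst for src, dst in edges if src == hidden and dst != hidden}
    # Edges that never touched the hidden node are kept unchanged.
    kept = {(src, dst) for src, dst in edges if hidden not in (src, dst)}
    # Bridge each predecessor to each successor, skipping self-loops.
    bridged = {(p, s) for p in preds for s in succs if p != s}
    return kept | bridged


# Hypothetical graph: edges point from an effect to what it depends on.
edges = {
    ("report", "analysis"),
    ("analysis", "patient_records"),
    ("analysis", "public_stats"),
}
public_view = hide_node(edges, "analysis")
print(sorted(public_view))
```

In this sketch the restricted `analysis` step disappears from the shared graph, while `report` remains linked to both of its original inputs.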

    Information Provenance for Mobile Health Data

    Mobile health (mHealth) apps and devices are increasingly popular for health research, clinical treatment and personal wellness, as they offer the ability to continuously monitor aspects of individuals' health as they go about their everyday activities. Many believe that combining the data produced by these mHealth apps and devices may give healthcare-related service providers and researchers a more holistic view of an individual's health, increase the quality of service, and reduce operating costs. For such mHealth data to be considered useful, though, data consumers need to be assured that the authenticity and the integrity of the data have remained intact, especially for data that may have been created through a series of aggregations and transformations on many input data sets. In other words, information provenance should be one of the main focuses for any system that wishes to facilitate the sharing of sensitive mHealth data. Creating such a trusted and secure data sharing ecosystem for mHealth apps and devices is difficult, however, as they are implemented with different technologies and managed by different organizations. Furthermore, many mHealth devices use ultra-low-power micro-controllers, which lack the kinds of sophisticated Memory Management Units (MMUs) required to sufficiently isolate sensitive application code and data. In this thesis, we present an end-to-end solution for providing information provenance for mHealth data, which begins by securing mHealth data at its source: the mHealth device. To this end, we devise a memory-isolation method that combines compiler-inserted code and Memory Protection Unit (MPU) hardware to protect application code and data on ultra-low-power micro-controllers.
Then we address the security of mHealth data outside of the source (e.g., data that has been uploaded to a smartphone or remote server) with our health-data system, Amanuensis, which uses blockchain and Trusted Execution Environment (TEE) technologies to provide confidential, yet verifiable, data storage and computation for mHealth data. Finally, we look at identity privacy and data freshness issues introduced by the use of blockchain and TEEs. Namely, we present a privacy-preserving solution for blockchain transactions, and a freshness solution for data access-control lists retrieved from the blockchain.

    Surveillance Graphs: Vulgarity and Cloud Orthodoxy in Linked Data Infrastructures

    Information is power, and that power has been largely enclosed by a handful of information conglomerates. The logic of the surveillance-driven information economy demands systems for handling mass quantities of heterogeneous data, increasingly in the form of knowledge graphs. An archaeology of knowledge graphs and their mutation from the liberatory aspirations of the semantic web gives us an underexplored lens to understand contemporary information systems. I explore how the ideology of cloud systems steers two projects from the NIH and NSF intended to build information infrastructures for the public good to inevitable corporate capture, facilitating the development of a new kind of multilayered public/private surveillance system in the process. I argue that understanding technologies like large language models as interfaces to knowledge graphs is critical to understand their role in a larger project of informational enclosure and concentration of power. I draw from multiple histories of liberatory information technologies to develop Vulgar Linked Data as an alternative to the Cloud Orthodoxy, resisting the colonial urge for universality in favor of vernacular expression in peer to peer systems

    Securing Virtualized System via Active Protection

    Virtualization is the predominant enabling technology of current cloud infrastructure

    Language-based Enforcement of User-defined Security Policies (As Applied to Multi-tier Web Programs)

    Over the last 35 years, researchers have proposed many different forms of security policies to control how information is managed by software, e.g., multi-level information flow policies, role-based or history-based access control, data provenance management, etc. A large body of work in programming language design and analysis has aimed to ensure that particular kinds of security policies are properly enforced by an application. However, these approaches typically fix the style of security policy and overall security goal, e.g., information flow policies with a goal of noninterference. This limits the programmer's ability to combine policy styles and to apply customized enforcement techniques while still being assured the system is secure. This dissertation presents a series of programming-language calculi each intended to verify the enforcement of a range of user-defined security policies. Rather than "bake in" the semantics of a particular model of security policy, our languages are parameterized by a programmer-provided specification of the policy and enforcement mechanism (in the form of code). Our approach relies on a novel combination of dependent types to correctly associate security policies with the objects they govern, and affine types to account for policy or program operations that include side effects. We have shown that our type systems are expressive enough to verify the enforcement of various forms of access control, provenance, information flow, and automata-based policies. Additionally, our approach facilitates straightforward proofs that programs implementing a particular policy achieve their high-level security goals. We have proved our languages sound and we have proved relevant security properties for each of the policies we have explored. To our knowledge, no prior framework enables the enforcement of such a wide variety of security policies with an equally high level of assurance.
To evaluate the practicality of our solution, we have implemented one of our type systems as part of the Links web-programming language; we call the resulting language SELinks. We report on our experience using SELinks to build two substantial applications, a wiki and an on-line store, equipped with a combination of access control and provenance policies. In general, we have found the mechanisms SELinks provides to be both sufficient and relatively easy to use for many common policies, and that the modular separation of user-defined policy code permitted some reuse between the two applications

    Dictionary of privacy, data protection and information security

    The Dictionary of Privacy, Data Protection and Information Security explains the complex technical terms, legal concepts, privacy management techniques, conceptual matters and vocabulary that inform public debate about privacy. The revolutionary and pervasive influence of digital technology affects numerous disciplines and sectors of society, and concerns about its potential threats to privacy are growing. With over a thousand terms meticulously set out, described and cross-referenced, this Dictionary enables productive discussion by covering the full range of fields accessibly and comprehensively. In the ever-evolving debate surrounding privacy, this Dictionary takes a longer view, transcending the details of today's problems, technology, and the law to examine the wider principles that underlie privacy discourse. Interdisciplinary in scope, this Dictionary is invaluable to students, scholars and researchers in law, technology and computing, cybersecurity, sociology, public policy and administration, and regulation. It is also a vital reference for diverse practitioners including data scientists, lawyers, policymakers and regulators.

    How to tell stories using visualization: strategies towards narrative visualization

    The benefits of storytelling have long been known, and its potential to simplify concepts, convey cultural values and experiences, create emotional connection, and help retain information has been explored in different areas, such as journalism, education, marketing, and others. Narratives have not only been the main way people make sense of the world, but also the easiest way humans have found to share complex information. Due to their potential, narratives have also recently been taken up in the area of Information and Knowledge Visualization, where the approach is often referred to as Narrative Visualization. This matter is particularly important for news media, one of the areas that has been pushing the research on Narrative Visualization. The necessity to incorporate storytelling in visualizations arises from the need to share complex data in a way that is engaging. Nowadays we also face the challenge of the large amount of information available, which can be hard to cope with. Advances in technology have enabled us to go beyond the traditional forms of storytelling and representing data, giving us more attractive and sophisticated means to tell stories. In this dissertation, I explore the benefits of infusing visualizations with narratives. In addition, I present ways of combining storytelling with visualization and efficient methods to represent and make sense of data in a way that allows people to relate to the information. This research is closely related to journalism, but these techniques can be applied to completely different areas (education, scientific visualization, etc.).
To further explore this topic, a mixed-method evaluation was chosen, consisting of a typology, several case studies, and a focus group study, as well as design studies and a review of techniques. This dissertation is intended to contribute to the evolving understanding of the field of narrative visualization.

    The occurrence and preservation of diatoms in the Palaeogene of the North Sea Basin

    The often widespread occurrence of diatoms in the marine sediments of the North Sea Palaeogene has long been recognised. They occur in abundance through a number of intervals where calcareous microfossils are absent (due to palaeoenvironmental conditions and subsequent dissolution). However, poor preservation has previously impeded the taxonomic identification of these diatom assemblages, with most specimens occurring as pyritised inner moulds (steinkerns). This study has involved the first detailed description of these assemblages, which was achieved through the use of electron microscopy combined with comparisons with well-preserved specimens, and a survey of original species descriptions held in the Natural History Museum. These techniques have enabled the identification of a total of 79 species, 40 of which have not previously been formally described in pyritised form. Material analysed in this study (including samples from exploration wells and coeval onshore sections around the North Sea Basin) has led to the recognition of a number of diatom events which broadly form three major assemblages through the North Sea Palaeocene sequence. The lowermost is the most diverse, occurring within the volcaniclastic Sele and Balder formations and their onshore equivalents around the Paleocene/Eocene boundary interval. The relationship of abundant diatomaceous deposits to vulcanicity during this interval is discussed, together with other factors (including increased nutrient levels) encouraging the proliferation of diatoms. A later, less diverse assemblage in the mid Eocene includes more cosmopolitan species; above this is a distinctive Oligocene to mid Miocene assemblage. The state of preservation of diatom assemblages varies markedly around the North Sea Basin; this has been discussed and microprobe analyses conducted. 
A number of taxonomic revisions of previously published species (both pyritised and non-pyritised) have also been carried out, including translations of descriptions into English (and their emendation where necessary). A new genus, Cylindrospira (consisting of two species, C. simsi and C. homanni), is described, which has no living representatives but has features found in both extinct and extant genera. It is palaeoenvironmentally significant, occurring in a brackish facies of the Fur Formation diatomite, age-equivalent to one of the main diatomaceous intervals in the North Sea. Prior to this study, only fully marine diatoms had been documented from the Paleocene. Existing microfossil zonation schemes for the North Sea Palaeogene have been refined by integrating diatom events with those of stratigraphically well-defined fossil groups such as foraminifera and silicoflagellates. This has enabled their correlation with other sections, and an improved understanding of palaeocirculation changes through the North Sea Palaeogene.

    Evidence-based Cybersecurity: Data-driven and Abstract Models

    Achieving computer security requires both rigorous empirical measurement and models to understand cybersecurity phenomena and the effectiveness of defenses and interventions. To address the growing scale of cyber-insecurity, my approach to protecting users employs principled and rigorous measurements and models. In this dissertation, I examine four cybersecurity phenomena. I show that data-driven and abstract modeling can reveal surprising conclusions about long-term, persistent problems, like spam and malware, and growing threats like data breaches and cyber conflict. I present two data-driven statistical models and two abstract models. Both of the data-driven models show that the presence of heavy-tailed distributions can make naive analysis of trends and interventions misleading. First, I examine ten years of publicly reported data breaches and find that there has been no increase in size or frequency. I also find that reported and perceived increases can be explained by the heavy-tailed nature of breaches. In the second data-driven model, I examine a large spam dataset, analyzing spam concentrations across Internet Service Providers. Again, I find that the heavy-tailed nature of spam concentrations complicates analysis. Using appropriate statistical methods, I identify unique risk factors with significant impact on local spam levels. I then use the model to estimate the effect of historical botnet takedowns and find they are frequently ineffective at reducing global spam concentrations and have highly variable local effects. Abstract models are an important tool when data are unavailable. Even without data, I evaluate both known and hypothesized interventions used by search providers to protect users from malicious websites. I present a Markov model of malware spread and study the effect of two potential interventions: blacklisting and depreferencing.
I find that heavy-tailed traffic distributions obscure the effects of interventions, but with my abstract model, I show that lowering search rankings is a viable alternative to blacklisting infected pages. Finally, I study how game-theoretic models can help clarify strategic decisions in cyber conflict. I find that, in some circumstances, improving the attribution ability of adversaries may decrease the likelihood of escalating cyber conflict.
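
To make the point about heavy tails concrete, the following minimal sketch shows how a few extreme values can dominate naive summary statistics. It does not use the dissertation's data or models: the Pareto shape parameter `alpha=1.0`, the sample size, and the helper name `pareto_quantile_sample` are assumptions chosen purely for illustration, and the sample is built deterministically from evenly spaced quantiles rather than drawn at random.

```python
from statistics import mean, median


def pareto_quantile_sample(n, alpha=1.0):
    """Deterministic heavy-tailed sample from evenly spaced Pareto quantiles.

    Uses the inverse CDF x = (1 - u) ** (-1 / alpha) with minimum value 1,
    evaluated at u = (i + 0.5) / n for i = 0 .. n - 1.
    """
    return [(1.0 - (i + 0.5) / n) ** (-1.0 / alpha) for i in range(n)]


sizes = pareto_quantile_sample(1000)

# Share of the total contributed by the largest 1% of observations.
top_share = sum(sorted(sizes)[-10:]) / sum(sizes)

print(f"median = {median(sizes):.2f}")
print(f"mean   = {mean(sizes):.2f}")
print(f"top 1% share of total = {top_share:.0%}")
```

In this hypothetical sample the mean sits several times above the median and the largest 1% of values contribute a large share of the total, so a year-to-year comparison of means or totals can swing on one or two extreme events even when the underlying distribution has not changed.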