
    In silico optimization of mass spectrometry fragmentation strategies in metabolomics

    Liquid chromatography (LC) coupled to tandem mass spectrometry (MS/MS) is widely used to identify small molecules in untargeted metabolomics. Various strategies exist to acquire MS/MS fragmentation spectra; however, the development of new acquisition strategies is hampered by the lack of simulators that let researchers prototype, compare, and optimize strategies before validation on real instruments. We introduce the Virtual Metabolomics Mass Spectrometer (ViMMS), a metabolomics LC-MS/MS simulator framework that allows scan-level control of the MS2 acquisition process in silico. ViMMS can generate new LC-MS/MS data based on empirical data or virtually re-run a previous LC-MS/MS analysis using pre-existing data, allowing different fragmentation strategies to be tested. To demonstrate its utility, we show how ViMMS can be used to optimize N for Top-N data-dependent acquisition (DDA), giving results comparable to modifying N on the mass spectrometer. We expect that ViMMS will save method-development time by allowing offline evaluation of novel fragmentation strategies and optimization of the fragmentation strategy for a particular experiment.
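    The abstract does not reproduce the ViMMS API, so the sketch below is only a minimal, self-contained toy model of the quantity being optimized: for each candidate N, it simulates Top-N precursor selection with dynamic exclusion over synthetic chromatographic peaks and counts how many distinct precursors get fragmented. All names (Precursor, simulate_top_n) and parameters are illustrative assumptions, not part of ViMMS.

```python
import random
from dataclasses import dataclass

# Toy model of Top-N data-dependent acquisition (DDA): at each cycle we
# fragment the N most intense co-eluting precursors, subject to a dynamic
# exclusion window, and count how many distinct precursors were fragmented.
# All names and parameters here are illustrative, not the ViMMS API.

@dataclass
class Precursor:
    rt_start: float    # start of elution (s)
    rt_end: float      # end of elution (s)
    intensity: float

def simulate_top_n(precursors, n, cycle_time=1.0, exclusion=30.0, run_length=600.0):
    fragmented = set()
    excluded_until = {}          # precursor index -> time it becomes eligible again
    t = 0.0
    while t < run_length:
        eluting = [i for i, p in enumerate(precursors)
                   if p.rt_start <= t <= p.rt_end and excluded_until.get(i, 0.0) <= t]
        # pick the N most intense eligible precursors for MS2
        for i in sorted(eluting, key=lambda i: precursors[i].intensity, reverse=True)[:n]:
            fragmented.add(i)
            excluded_until[i] = t + exclusion
        # each MS2 scan costs time, so a larger N lengthens the duty cycle
        t += cycle_time + 0.2 * n
    return len(fragmented)

random.seed(0)
peaks = []
for _ in range(2000):
    start = random.uniform(0, 550)
    peaks.append(Precursor(rt_start=start,
                           rt_end=start + random.uniform(10, 60),
                           intensity=random.lognormvariate(10, 1)))

for n in (1, 3, 5, 10, 20):
    print(f"Top-{n}: {simulate_top_n(peaks, n)} precursors fragmented")
```

    Sweeping N in such a simulation exposes the trade-off the abstract alludes to: a larger N fragments more co-eluting precursors per cycle but lengthens the duty cycle, so coverage does not grow indefinitely.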

    Advanced Techniques for Improving the Efficacy of Digital Forensics Investigations

    Digital forensics is the science concerned with discovering, preserving, and analyzing evidence on digital devices. The intent is to be able to determine what events have taken place, when they occurred, who performed them, and how they were performed. In order for an investigation to be effective, it must exhibit several characteristics. The results produced must be reliable, or else the theory of events based on the results will be flawed. The investigation must be comprehensive, meaning that it must analyze all targets which may contain evidence of forensic interest. Since any investigation must be performed within the constraints of available time, storage, manpower, and computation, investigative techniques must be efficient. Finally, an investigation must provide a coherent view of the events under question using the evidence gathered. Unfortunately, the set of currently available tools and techniques used in digital forensic investigations does a poor job of supporting these characteristics. Many tools used contain bugs which generate inaccurate results; there are many types of devices and data for which no analysis techniques exist; most existing tools are woefully inefficient, failing to take advantage of modern hardware; and the task of aggregating data into a coherent picture of events is largely left to the investigator to perform manually. To remedy this situation, we developed a set of techniques to facilitate more effective investigations. To improve reliability, we developed the Forensic Discovery Auditing Module, a mechanism for auditing and enforcing controls on accesses to evidence. To improve comprehensiveness, we developed ramparser, a tool for deep parsing of Linux RAM images, which provides previously inaccessible data on the live state of a machine. To improve efficiency, we developed a set of performance optimizations and applied them to the Scalpel file carver, yielding order-of-magnitude improvements in processing speed and storage requirements. Lastly, to facilitate more coherent investigations, we developed the Forensic Automated Coherence Engine, which generates a high-level view of a system from the data generated by low-level forensics tools. Together, these techniques significantly improve the effectiveness of digital forensic investigations conducted using them.
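    For readers unfamiliar with file carving, the Scalpel tool mentioned above belongs to the family of signature-based carvers, which scan a raw image for known header and footer byte patterns. The sketch below is a minimal, illustrative carver for JPEG data, not Scalpel's implementation; the signatures, size limit, and output names are assumptions for the example.

```python
# Minimal illustration of signature-based file carving: scan a raw disk image
# for JPEG header/footer byte patterns and write each candidate region out.
# This sketches the general principle behind carvers such as Scalpel; it is
# not their implementation and ignores fragmentation, overlaps, and streaming.

JPEG_HEADER = b"\xff\xd8\xff"
JPEG_FOOTER = b"\xff\xd9"
MAX_SIZE = 10 * 1024 * 1024  # give up on a candidate after 10 MB

def carve_jpegs(image_path, out_prefix="carved"):
    with open(image_path, "rb") as f:
        data = f.read()          # a real carver would stream the image in chunks
    count = 0
    pos = data.find(JPEG_HEADER)
    while pos != -1:
        end = data.find(JPEG_FOOTER, pos + len(JPEG_HEADER), pos + MAX_SIZE)
        if end != -1:
            with open(f"{out_prefix}_{count}.jpg", "wb") as out:
                out.write(data[pos:end + len(JPEG_FOOTER)])
            count += 1
        pos = data.find(JPEG_HEADER, pos + 1)
    return count

# Example (hypothetical path): carve_jpegs("disk.img") writes carved_0.jpg, carved_1.jpg, ...
```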

    A practical guide to interpreting and generating bottom-up proteomics data visualizations

    Mass spectrometry-based bottom-up proteomics is the main method to analyze proteomes comprehensively, and the rapid evolution of instrumentation and data analysis has made the technology widely available. Data visualization is an integral part of the analysis process and is crucial for the communication of results. This is a major challenge due to the immense complexity of MS data. In this review, we provide an overview of commonly used visualizations, starting with raw data of traditional and novel MS technologies, then basic peptide and protein level analyses, and finally visualization of highly complex datasets and networks. We specifically provide guidance on how to critically interpret and discuss the multitude of different proteomics data visualizations. Furthermore, we highlight Python-based libraries and other open science tools that can be applied for independent and transparent generation of customized visualizations. To further encourage programmatic data visualization, we provide the Python code used to generate all data figures in this review on GitHub (). Data availability statement: Proteomics data from the following ProteomeExchange repositories were reused to generate figures in this study: PXD012867, PXD017703, PXD010697, PXD010103.
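    The Python code for the review's figures is linked from its GitHub repository (URL omitted above). As a small example in the same spirit, the following sketch uses pandas and matplotlib to draw a volcano plot, one of the most common bottom-up proteomics visualizations; the synthetic data, column names, and significance cut-offs are assumptions made purely for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Illustrative volcano plot for a differential protein abundance comparison.
# The dataframe columns and thresholds are assumptions for this sketch; in
# practice they would come from the upstream quantification/statistics step.
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "protein": [f"P{i:04d}" for i in range(1500)],
    "log2_fc": rng.normal(0, 1.2, 1500),
    "p_value": rng.uniform(1e-6, 1.0, 1500),
})
df["neg_log10_p"] = -np.log10(df["p_value"])
significant = (df["p_value"] < 0.01) & (df["log2_fc"].abs() > 1)

fig, ax = plt.subplots(figsize=(5, 4))
ax.scatter(df.loc[~significant, "log2_fc"], df.loc[~significant, "neg_log10_p"],
           s=8, c="lightgrey", label="not significant")
ax.scatter(df.loc[significant, "log2_fc"], df.loc[significant, "neg_log10_p"],
           s=8, c="firebrick", label="|log2 FC| > 1, p < 0.01")
ax.axhline(-np.log10(0.01), ls="--", lw=0.8, c="grey")
ax.axvline(-1, ls="--", lw=0.8, c="grey")
ax.axvline(1, ls="--", lw=0.8, c="grey")
ax.set_xlabel("log2 fold change")
ax.set_ylabel("-log10 p-value")
ax.legend(frameon=False)
fig.tight_layout()
fig.savefig("volcano.png", dpi=150)
```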

    Advanced Cryptographic Techniques for Protecting Log Data

    This thesis examines cryptographic techniques providing security for computer log files. It focuses on ensuring authenticity and integrity, i.e. the properties of having been created by a specific entity and being unmodified. Confidentiality, the property of being unknown to unauthorized entities, will be considered, too, but with less emphasis. Computer log files are recordings of actions performed and events encountered in computer systems. While the complexity of computer systems is steadily growing, it is increasingly difficult to predict how a given system will behave under certain conditions, or to retrospectively reconstruct and explain which events and conditions led to a specific behavior. Computer log files help to mitigate the problem of retracing a system’s behavior retrospectively by providing a (usually chronological) view of events and actions encountered in a system. Authenticity and integrity of computer log files are widely recognized security requirements; see e.g. [Latham, ed., "Department of Defense Trusted Computer System Evaluation Criteria", 1985, p. 10], [Kent and Souppaya, "Guide to Computer Security Log Management", NIST Special Publication 800-92, 2006, Section 2.3.2], [Guttman and Roback, "An Introduction to Computer Security: The NIST Handbook", superseded NIST Special Publication 800-12, 1995, Section 18.3.1], [Nieles et al., "An Introduction to Information Security", NIST Special Publication 800-12, 2017, Section 9.3], [Common Criteria Editorial Board, ed., "Common Criteria for Information Technology Security Evaluation", Part 2, Section 8.6]. Two commonly cited ways to ensure integrity of log files are to store log data on so-called write-once-read-many-times (WORM) drives and to immediately print log records on a continuous-feed printer. This guarantees that log data cannot be retroactively modified by an attacker without physical access to the storage medium. However, such special-purpose hardware may not always be a viable option for the application at hand, for example because it may be too costly. In such cases, the integrity and authenticity of log records must be ensured via other means, e.g. with cryptographic techniques. Although these techniques cannot prevent the modification of log data, they can offer strong guarantees that modifications will be detectable, while being implementable in software. Furthermore, cryptography can be used to achieve public verifiability of log files, which may be needed in applications that have strong transparency requirements. Cryptographic techniques can even be used in addition to hardware solutions, providing protection against attackers who do have physical access to the logging hardware, such as insiders. Cryptographic schemes for protecting stored log data need to be resilient against attackers who obtain control over the computer storing the log data. If this computer operates in a standalone fashion, it is an absolute requirement for the cryptographic schemes to offer security even in the event of a key compromise. As this is impossible with standard cryptographic tools, cryptographic solutions for protecting log data typically make use of forward-secure schemes, guaranteeing that changes to log data recorded in the past can be detected. Such schemes use a sequence of authentication keys instead of a single one, where previous keys cannot be computed efficiently from later ones.
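    As a minimal sketch of the forward-secure key evolution described above, and not of any specific scheme from this thesis, the following toy code derives each authentication key from its predecessor with a one-way hash and discards the old key, so that compromising the current key does not allow already-written entries to be re-authenticated after modification. All names and parameters are illustrative.

```python
import hmac
import hashlib

# Toy forward-secure log authentication: key_{i+1} = H(key_i), and key_i is
# erased after use, so tags on old entries cannot be recomputed from later keys.
# This illustrates the key-evolution idea only; it is not a complete scheme
# (in particular it does not, by itself, detect truncation of the log tail).

def evolve(key: bytes) -> bytes:
    return hashlib.sha256(b"key-evolution|" + key).digest()

def tag(key: bytes, index: int, entry: bytes) -> bytes:
    return hmac.new(key, index.to_bytes(8, "big") + entry, hashlib.sha256).digest()

class ForwardSecureLog:
    def __init__(self, initial_key: bytes):
        self._key = initial_key          # only the *current* key is kept
        self.entries = []                # list of (entry, tag)

    def append(self, entry: bytes) -> None:
        self.entries.append((entry, tag(self._key, len(self.entries), entry)))
        self._key = evolve(self._key)    # the old key is discarded ("erased")

def verify(initial_key: bytes, entries) -> bool:
    # A verifier holding the initial key replays the key chain.
    key = initial_key
    for i, (entry, t) in enumerate(entries):
        if not hmac.compare_digest(t, tag(key, i, entry)):
            return False
        key = evolve(key)
    return True

k0 = hashlib.sha256(b"initial secret").digest()
log = ForwardSecureLog(k0)
for line in [b"login alice", b"sudo by bob", b"service restart"]:
    log.append(line)
print(verify(k0, log.entries))          # True
log.entries[1] = (b"tampered", log.entries[1][1])
print(verify(k0, log.entries))          # False
```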
This thesis considers the following requirements for, and desirable features of, cryptographic logging schemes: 1) security, i.e. the ability to reliably detect violations of integrity and authenticity, including detection of log truncations, 2) efficiency regarding both computational and storage overhead, 3) robustness, i.e. the ability to verify unmodified log entries even if others have been illicitly changed, and 4) verifiability of excerpts, including checking an excerpt for omissions. The goals of this thesis are to devise new techniques for the construction of cryptographic schemes that provide security for computer log files, to give concrete constructions of such schemes, to develop new models that can accurately capture the security guarantees offered by the new schemes, as well as to examine the security of previously published schemes. This thesis demands that cryptographic schemes for securely storing log data must be able to detect if log entries have been deleted from a log file. A special case of deletion is log truncation, where a continuous subsequence of log records from the end of the log file is deleted. Obtaining truncation resistance, i.e. the ability to detect truncations, is one of the major difficulties when designing cryptographic logging schemes. This thesis alleviates this problem by introducing a novel technique to detect log truncations without the help of third parties or designated logging hardware. Moreover, this work presents new formal security notions capturing truncation resistance. The technique mentioned above is applied to obtain cryptographic logging schemes which can be shown to satisfy these notions under mild assumptions, making them the first schemes with formally proven truncation security. Furthermore, this thesis develops a cryptographic scheme for the protection of log files which can support the creation of excerpts. For this thesis, an excerpt is a (not necessarily contiguous) subsequence of records from a log file. Excerpts created with the scheme presented in this thesis can be publicly checked for integrity and authenticity (as explained above) as well as for completeness, i.e. the property that no relevant log entry has been omitted from the excerpt. Excerpts provide a natural way to preserve the confidentiality of information that is contained in a log file, but not of interest for a specific public analysis of the log file, enabling the owner of the log file to meet confidentiality and transparency requirements at the same time. The scheme demonstrates and exemplifies the technique for obtaining truncation security mentioned above. Since cryptographic techniques to safeguard log files usually require authenticating log entries individually, some researchers [Ma and Tsudik, "A New Approach to Secure Logging", LNCS 5094, 2008; Ma and Tsudik, "A New Approach to Secure Logging", ACM TOS 2009; Yavuz and Peng, "BAF: An Efficient Publicly Verifiable Secure Audit Logging Scheme for Distributed Systems", ACSAC 2009] have proposed using aggregatable signatures [Boneh et al., "Aggregate and Verifiably Encrypted Signatures from Bilinear Maps", EUROCRYPT 2003] in order to reduce the overhead in storage space incurred by using such a cryptographic scheme. Aggregation of signatures refers to some “combination” of any number of signatures (for distinct or equal messages, by distinct or identical signers) into an “aggregate” signature. 
The size of the aggregate signature should be less than the total of the sizes of the original signatures, ideally the size of one of the original signatures. Using aggregation of signatures in applications that require storing or transmitting a large number of signatures (such as the storage of log records) can lead to significant reductions in the use of storage space and bandwidth. However, aggregating the signatures for all log records into a single signature will cause some fragility: the modification of a single log entry will render the aggregate signature invalid, preventing the cryptographic verification of any part of the log file. Yet being able to distinguish manipulated log entries from non-manipulated ones may be important for after-the-fact investigations. This thesis addresses this issue by presenting a new technique providing a trade-off between storage overhead and robustness, i.e. the ability to tolerate some modifications to the log file while preserving the cryptographic verifiability of unmodified log entries. This robustness is achieved by the use of a special kind of aggregate signatures (called fault-tolerant aggregate signatures), which contain some redundancy. The construction makes use of combinatorial methods guaranteeing that if the number of errors is below a certain threshold, then there will be enough redundancy to identify and verify the non-modified log entries. Finally, this thesis presents a total of four attacks on three different schemes intended for securely storing log files presented in the literature [Yavuz et al., "Efficient, Compromise Resilient and Append-Only Cryptographic Schemes for Secure Audit Logging", Financial Cryptography 2012; Ma, "Practical Forward Secure Sequential Aggregate Signatures", ASIACCS 2008]. The attacks allow for virtually arbitrary log file forgeries or even recovery of the secret key used for authenticating the log file, which could then be used for mostly arbitrary log file forgeries, too. All of these attacks exploit weaknesses of the specific schemes. Three of the attacks presented here contradict the security properties claimed and supposedly proven by the schemes' respective authors. This thesis briefly discusses these proofs and points out their flaws. The fourth attack presented here is outside of the security model considered by the scheme’s authors, but nonetheless presents a realistic threat. In summary, this thesis advances the scientific state of the art with regard to providing security for computer log files in a number of ways: by introducing a new technique for obtaining security against log truncations, by providing the first scheme where excerpts from log files can be verified for completeness, by describing the first scheme that can achieve some notion of robustness while being able to aggregate log record signatures, and by analyzing the security of previously proposed schemes.
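    The storage/robustness trade-off can be illustrated with a toy example: authenticating each entry individually is robust but costly, a single aggregate over all tags is compact but fragile, and grouping entries between these extremes confines the damage of a tampered entry to its group. The sketch below implements only this crude grouping idea; the fault-tolerant aggregate signatures used in the thesis achieve the trade-off with proper cryptographic aggregation and combinatorial designs, which are not reproduced here.

```python
import hmac
import hashlib

# Toy illustration of the storage/robustness trade-off when aggregating
# per-entry authentication tags: one aggregate per group of entries, so a
# tampered entry only invalidates its own group. Not a secure scheme.

KEY = b"demo-key"

def entry_tag(i: int, entry: bytes) -> bytes:
    return hmac.new(KEY, i.to_bytes(8, "big") + entry, hashlib.sha256).digest()

def aggregate(tags) -> bytes:
    h = hashlib.sha256()
    for t in tags:
        h.update(t)
    return h.digest()

def verify_groups(entries, group_aggs, group_size):
    ok = []
    for g, agg in enumerate(group_aggs):
        chunk = entries[g * group_size:(g + 1) * group_size]
        recomputed = aggregate(entry_tag(g * group_size + i, e)
                               for i, e in enumerate(chunk))
        ok.extend([hmac.compare_digest(agg, recomputed)] * len(chunk))
    return ok   # per-entry verdict: True if the entry's group still verifies

entries = [f"event {i}".encode() for i in range(8)]
group_size = 4
aggs = [aggregate(entry_tag(g * group_size + i, e)
                  for i, e in enumerate(entries[g * group_size:(g + 1) * group_size]))
        for g in range(len(entries) // group_size)]

entries[6] = b"tampered"
print(verify_groups(entries, aggs, group_size))
# -> [True, True, True, True, False, False, False, False]: only the second
#    group becomes unverifiable; with a single aggregate, every entry would.
```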

    Intelligent detectors

    This thesis provides a basis for the development of on-board software for astronomical satellites. It serves as a guide and reference work and shows, using the Herschel/PACS and SPICA/SAFARI projects as examples, how space-qualified flight software is developed from the fundamentals. This requires understanding the scientific purpose (what is to be measured, how, and why), knowledge of the physical properties of the detector, mastery of the mathematical operations used to process the data, and, of course, consideration of the circumstances under which the detector is operated. The thesis thus contains the knowledge and a good deal of the experience necessary for the development of such astronomical on-board software for satellites; the key elements in the development are the understanding of the scientific purpose, knowledge of the physical properties of the detector, the comprehension of the mathematical operations involved in data processing, and the consideration of the technical and observational circumstances.

    An Evaluation of Forensic Tools for Linux: Emphasizing EnCase and PyFlag

    This master's thesis provides an evaluation and comparison of several digital forensics tools, with a particular focus on two specific tools. The first, EnCase Forensics, is a commercially available tool used by police and government agencies in several parts of the world. The second, PyFlag, is an open-source alternative that was used in the winning submission to the Digital Forensics Research Workshop (DFRWS) in 2008. Although the tools are evaluated as a whole, the main focus is on key search functionality. Considering that most research in this area is based on the Microsoft Windows platform, while less work has been done on the analysis of Linux systems, we examine these tools primarily in a Linux environment. With these tools we perform forensic acquisition and analysis of realistic data. In addition, the tool dd is used to acquire data from Linux. This thesis includes the detailed test procedures, the problems we encountered during testing, and the final results.

    MediaSync: Handbook on Multimedia Synchronization

    This book provides an approachable overview of the most recent advances in the fascinating field of media synchronization (mediasync), gathering contributions from the most representative and influential experts. Understanding the challenges of this field in the current multi-sensory, multi-device, and multi-protocol world is not an easy task. The book revisits the foundations of mediasync, including theoretical frameworks and models, highlights ongoing research efforts, like hybrid broadband broadcast (HBB) delivery and users' perception modeling (i.e., Quality of Experience or QoE), and paves the way for the future (e.g., towards the deployment of multi-sensory and ultra-realistic experiences). Although many advances around mediasync have been devised and deployed, this area of research is getting renewed attention to overcome remaining challenges in the next-generation (heterogeneous and ubiquitous) media ecosystem. Given the significant advances in this research area, its current relevance, and the multiple disciplines it involves, a reference book on mediasync has become necessary; this book fills that gap. In particular, it addresses key aspects and reviews the most relevant contributions within the mediasync research space, from different perspectives. MediaSync: Handbook on Multimedia Synchronization is the perfect companion for scholars and practitioners who want to acquire strong knowledge about this research area, and also to approach the challenges behind ensuring the best mediated experiences by providing adequate synchronization between the media elements that constitute those experiences.

    Prefetching and Caching Techniques in File Systems for MIMD Multiprocessors

    The increasing speed of the most powerful computers, especially multiprocessors, makes it difficult to provide sufficient I/O bandwidth to keep them running at full speed for the largest problems. Trends show that the difference in the speed of disk hardware and the speed of processors is increasing, with I/O severely limiting the performance of otherwise fast machines. This widening access-time gap is known as the “I/O bottleneck crisis.” One solution to the crisis, suggested by many researchers, is to use many disks in parallel to increase the overall bandwidth.

    This dissertation studies some of the file system issues needed to get high performance from parallel disk systems, since parallel hardware alone cannot guarantee good performance. The target systems are large MIMD multiprocessors used for scientific applications, with large files spread over multiple disks attached in parallel. The focus is on automatic caching and prefetching techniques. We show that caching and prefetching can transparently provide the power of parallel disk hardware to both sequential and parallel applications using a conventional file system interface. We also propose a new file system interface (compatible with the conventional interface) that could make it easier to use parallel disks effectively.

    Our methodology is a mixture of implementation and simulation, using a software testbed that we built to run on a BBN GP1000 multiprocessor. The testbed simulates the disks and fully implements the caching and prefetching policies. Using a synthetic workload as input, we use the testbed in an extensive set of experiments. The results show that prefetching and caching improved the performance of parallel file systems, often dramatically.
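    As a minimal illustration of the policy family studied (not of the dissertation's testbed), the sketch below combines an LRU block cache with sequential read-ahead: when a request continues a sequential run, the next few blocks are fetched before they are asked for. All names and parameters are assumptions for the example.

```python
from collections import OrderedDict

# Toy block cache with LRU replacement and sequential read-ahead prefetching.
# When the current access continues a sequential run, the following blocks are
# loaded ahead of the request. This sketches the general policy family studied
# in the dissertation; the actual testbed, workloads, and parameters differ.

class PrefetchingCache:
    def __init__(self, backing_store, capacity=64, prefetch_depth=4):
        self.store = backing_store          # dict-like: block number -> data
        self.capacity = capacity
        self.prefetch_depth = prefetch_depth
        self.cache = OrderedDict()          # block number -> data (LRU order)
        self.last_block = None
        self.hits = self.misses = 0

    def _install(self, block):
        if block not in self.cache and block in self.store:
            self.cache[block] = self.store[block]
            if len(self.cache) > self.capacity:
                self.cache.popitem(last=False)      # evict least recently used

    def read(self, block):
        if block in self.cache:
            self.hits += 1
            self.cache.move_to_end(block)
        else:
            self.misses += 1
            self._install(block)
        # detect a sequential pattern and prefetch the following blocks
        if self.last_block is not None and block == self.last_block + 1:
            for b in range(block + 1, block + 1 + self.prefetch_depth):
                self._install(b)
        self.last_block = block
        return self.cache.get(block)

store = {b: f"data{b}".encode() for b in range(1000)}
cache = PrefetchingCache(store)
for b in range(200):                        # a purely sequential scan
    cache.read(b)
print(f"hits={cache.hits} misses={cache.misses}")
```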

    Exploring the untargeted synthesis of prebiotically-plausible molecules

    One of the biggest challenges we face when studying the Origins of Life (OoL) is that, in the absence of a time machine, it is not possible to make direct observations of what actually happened on the early Earth. Recently, a more ‘systems’ approach has been adopted, which looks for new phenomena and is not constrained by the search for particular products. Investigations of prebiotic complex chemical networks are increasingly tailored towards elucidating which environmental conditions are capable of ‘tuning’ the product distribution towards a greater degree of complexity. For this reason, a series of classic Miller-Urey experiments was conducted alongside all-deuterated Miller-Urey experiments to explore the effect of a ‘heavier’ isotope on the resulting chemical space of the complex mixture. Previous work in prebiotic chemistry has demonstrated that the inclusion of mineral surfaces in complex reaction networks can effectively steer the product distribution towards particular products. To address this, we carried out the formose reaction in a mixture of water and formamide (50:50 v/v) and investigated how different environmental inputs (such as mineral surfaces and reaction cycling) can affect the reaction by steering it towards a particular outcome. In addition, inspired by the metabolomics workflows designed for metabolite discovery, we conducted UPLC-MS/MS in a data-dependent fashion, which allows features to be generated confidently, with each one representing a product within the complex product distribution, and the resulting chemical space of the products to be mapped. Finally, in the case of the Miller-Urey experiment, few variations have been carried out so far (i.e. beyond variations in the energy source used in the experiment or the gas mixture employed). This prompted us to investigate the effect of reaction cycling on the Miller-Urey reaction. The effect of natural processes such as atmospheric cycling is an important but not yet addressed variable within the prebiotic-broth framework. We therefore decided to investigate what effect this could have on the overall product distribution of the famous experiment.

    Metabolomics Data Processing and Data Analysis—Current Best Practices

    Metabolomics data analysis strategies are central to transforming raw metabolomics data files into meaningful biochemical interpretations that answer biological questions or generate novel hypotheses. This book contains a variety of papers from a Special Issue around the theme “Best Practices in Metabolomics Data Analysis”. Reviews and strategies for the whole metabolomics pipeline are included, while key areas such as metabolite annotation and identification, compound and spectral databases and repositories, and statistical analysis are highlighted in various papers. Altogether, this book contains valuable information for researchers just starting in their metabolomics careers as well as for more experienced researchers looking for additional knowledge and best practices to complement key parts of their metabolomics workflows.