404 research outputs found

    Synthetic Data Generation for the Internet of Things

    The concept of the Internet of Things (IoT) is rapidly moving from a vision to being pervasive in our everyday lives. This can be observed in the integration of connected sensors from a multitude of devices such as mobile phones, healthcare equipment, and vehicles. There is a need for the development of infrastructure support and analytical tools to handle IoT data, which are naturally big and complex. However, research on IoT data can be constrained by concerns about the release of privately owned data. In this paper, we present the design and implementation results of a synthetic IoT data generation framework. The framework enables research on synthetic data that exhibit the complex characteristics of original data without compromising proprietary information and personal privacy.

    Symbolic Side-Channel Analysis for Probabilistic Programs

    In this paper we describe symbolic side-channel analysis techniques for detecting and quantifying information leakage, given in terms of Shannon entropy and min-entropy. Measuring the precise leakage is challenging due to the randomness and noise often present in program executions and side-channel observations. We account for this noise by introducing additional (symbolic) program inputs which are interpreted probabilistically, using symbolic execution with parameterized model counting. We also explore an approximate sampling approach for increased scalability. In contrast to typical Monte Carlo techniques, our approach works by sampling symbolic paths, representing multiple concrete paths, and uses pruning to accelerate computation and guarantee convergence to the optimal results. The key novelty of our approach is to provide bounds on the leakage that are provably under- and over-approximating the real leakage. We implemented the techniques in the Symbolic PathFinder tool and we demonstrate them on Java programs.
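
To make the two leakage measures named in the abstract concrete, the following is a minimal sketch that computes them from an explicit joint distribution over secrets and observations. It assumes the standard quantitative-information-flow definitions (Shannon leakage as mutual information, min-entropy leakage as the log-ratio of posterior to prior vulnerability); the paper's symbolic, model-counting method is not reproduced here, and `noisy_channel` is a hypothetical toy example rather than a program from the paper.

```python
import math
from collections import defaultdict


def shannon_leakage(joint):
    """Mutual information I(S; O) in bits, given a joint distribution {(s, o): p}."""
    p_s = defaultdict(float)
    p_o = defaultdict(float)
    for (s, o), p in joint.items():
        p_s[s] += p
        p_o[o] += p
    # Zero-probability entries contribute nothing and are skipped to avoid log(0).
    return sum(
        p * math.log2(p / (p_s[s] * p_o[o]))
        for (s, o), p in joint.items()
        if p > 0
    )


def min_entropy_leakage(joint):
    """log2(posterior vulnerability / prior vulnerability), in bits."""
    p_s = defaultdict(float)
    best_per_obs = defaultdict(float)
    for (s, o), p in joint.items():
        p_s[s] += p
        best_per_obs[o] = max(best_per_obs[o], p)
    prior_vulnerability = max(p_s.values())
    posterior_vulnerability = sum(best_per_obs.values())
    return math.log2(posterior_vulnerability / prior_vulnerability)


def noisy_channel(n_secrets, flip):
    """Hypothetical toy side channel: the observation is the parity of a
    uniformly chosen secret, flipped with probability `flip` (the noise)."""
    joint = {}
    for s in range(n_secrets):
        for o in (0, 1):
            correct = (s % 2 == o)
            joint[(s, o)] = (1 / n_secrets) * ((1 - flip) if correct else flip)
    return joint
```

With no noise, observing the parity of a uniform 2-bit secret leaks one bit under both measures; with `flip = 0.5` the observation is independent of the secret and both measures drop to zero.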

    Stratified Abstraction of Access Control Policies

    The shift to cloud-based APIs has made application security critically depend on understanding and reasoning about policies that regulate access to cloud resources. We present stratified predicate abstraction, a new approach that summarizes complex security policies into a compact set of positive and declarative statements that precisely state who has access to a resource. We have implemented stratified abstraction and deployed it as the engine powering AWS’s IAM Access Analyzer service, thereby demonstrating how formal methods and SMT can be used for security policy explanation.

    Hybrid capture of 964 nuclear genes resolves evolutionary relationships in the mimosoid legumes and reveals the polytomous origins of a large pantropical radiation

    PREMISE: Targeted enrichment methods facilitate sequencing of hundreds of nuclear loci to enhance phylogenetic resolution and elucidate why some parts of the “tree of life” are difficult (if not impossible) to resolve. The mimosoid legumes are a prominent pantropical clade of ~3300 species of woody angiosperms for which previous phylogenies have shown extensive lack of resolution, especially among the species‐rich and taxonomically challenging ingoids. METHODS: We generated transcriptomes to select low‐copy nuclear genes, enrich these via hybrid capture for representative species of most mimosoid genera, and analyze the resulting data using de novo assembly and various phylogenomic tools for species tree inference. We also evaluate gene tree support and conflict for key internodes and use phylogenetic network analysis to investigate phylogenetic signal across the ingoids. RESULTS: Our selection of 964 nuclear genes greatly improves phylogenetic resolution across the mimosoid phylogeny and shows that the ingoid clade can be resolved into several well‐supported clades. However, nearly all loci show lack of phylogenetic signal for some of the deeper internodes within the ingoids. CONCLUSIONS: Lack of resolution in the ingoid clade is most likely the result of hyperfast diversification, potentially causing a hard polytomy of six or seven lineages. The gene set for targeted sequencing presented here offers great potential to further enhance the phylogeny of mimosoids and the wider Caesalpinioideae with denser taxon sampling, to provide a framework for taxonomic reclassification, and to study the ingoid radiation.

    Improved Offshore Wind Resource Assessment in Global Climate Stabilization Scenarios

    This paper introduces a technique for digesting geospatial wind-speed data into areally defined -- country-level, in this case -- wind resource supply curves. We combined gridded wind-vector data for ocean areas with bathymetry maps, country exclusive economic zones, wind turbine power curves, and other datasets and relevant parameters to build supply curves that estimate a country's offshore wind resource defined by resource quality, depth, and distance-from-shore. We include a single set of supply curves -- for a particular assumption set -- and study some implications of including it in a global energy model. We also discuss the importance of downscaling gridded wind-vector data for capturing the full resource potential, especially over land areas with complex terrain. This paper includes motivation and background for a statistical downscaling methodology to account for terrain effects with a low computational burden. Finally, we use this forum to sketch a framework for building synthetic electric networks to estimate transmission accessibility of renewable resource sites in remote areas.

    A lightweight, flow-based toolkit for parallel and distributed bioinformatics pipelines

    Background: Bioinformatic analyses typically proceed as chains of data-processing tasks. A pipeline, or 'workflow', is a well-defined protocol, with a specific structure defined by the topology of data-flow interdependencies, and a particular functionality arising from the data transformations applied at each step. In computer science, the dataflow programming (DFP) paradigm defines software systems constructed in this manner, as networks of message-passing components. Thus, bioinformatic workflows can be naturally mapped onto DFP concepts. Results: To enable the flexible creation and execution of bioinformatics dataflows, we have written a modular framework for parallel pipelines in Python ('PaPy'). A PaPy workflow is created from re-usable components connected by data-pipes into a directed acyclic graph, which together define nested higher-order map functions. The successive functional transformations of input data are evaluated on flexibly pooled compute resources, either local or remote. Input items are processed in batches of adjustable size, allowing one to tune the trade-off between parallelism and lazy-evaluation (memory consumption). An add-on module ('NuBio') facilitates the creation of bioinformatics workflows by providing domain specific data-containers (e.g., for biomolecular sequences, alignments, structures) and functionality (e.g., to parse/write standard file formats). Conclusions: PaPy offers a modular framework for the creation and deployment of parallel and distributed data-processing workflows. Pipelines derive their functionality from user-written, data-coupled components, so PaPy also can be viewed as a lightweight toolkit for extensible, flow-based bioinformatics data-processing. The simplicity and flexibility of distributed PaPy pipelines may help users bridge the gap between traditional desktop/workstation and grid computing. PaPy is freely distributed as open-source Python code at http://muralab.org/PaPy, and includes extensive documentation and annotated usage examples.
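
The batch-size trade-off described in the abstract can be illustrated with a short sketch using only the Python standard library. This is not PaPy's API: the names `batched`, `parallel_stage`, and `run_pipeline` are hypothetical, the pool is a local thread pool, and the pipeline is a simple linear chain rather than a general directed acyclic graph. The point is only that each stage pulls at most one batch from upstream at a time, so a larger batch gives more parallelism while a smaller one keeps fewer items in memory.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batched(items, size):
    """Yield successive lists of up to `size` items, consuming `items` lazily."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def parallel_stage(func, items, batch_size, pool):
    """Apply `func` to `items`, evaluating one batch at a time on the pool.

    Only `batch_size` inputs are pulled from upstream before results are
    yielded, so memory use grows with the batch size, as does parallelism.
    """
    for batch in batched(items, batch_size):
        # Executor.map submits the whole batch and returns results in input order.
        yield from pool.map(func, batch)


def run_pipeline(stages, source, batch_size=4, workers=4):
    """Chain `stages` (a list of one-argument functions) over `source`."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        stream = source
        for func in stages:
            stream = parallel_stage(func, stream, batch_size, pool)
        yield from stream
```

For example, `list(run_pipeline([str.strip, str.upper], lines, batch_size=8))` would process `lines` eight at a time through both stages, preserving input order.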

    Silkworm expression system as a platform technology in life science

    Many recombinant proteins have been successfully produced in silkworm larvae or pupae and used for academic and industrial purposes. Several recombinant proteins produced by silkworms have already been commercialized. However, construction of a recombinant baculovirus containing a gene of interest requires tedious and troublesome steps and takes a long time (3–6 months). The recent development of a bacmid, an Escherichia coli and Bombyx mori shuttle vector, has eliminated the conventional tedious procedures required to identify and isolate recombinant viruses. Several technical improvements, including a cysteine protease or chitinase deletion bacmid and chaperone-assisted expression and coexpression, have led to significantly increased protein yields and reduced costs for large-scale production. Terminal N-acetyl glucosamine and galactose residues were found in the N-glycan structures produced by silkworms, which are different from those generated by insect cells. Elucidation of the silkworm genome has opened a new chapter in the utilization of silkworms. Transgenic silkworm technology provides stable production of recombinant protein. Baculovirus surface display expression is one of the low-cost approaches toward silkworm larvae-derived recombinant subunit vaccines. The expression of pharmaceutically relevant proteins, including cell/viral surface proteins, membrane proteins, and guanine nucleotide-binding protein (G protein) coupled receptors, using silkworm larvae or cocoons has become very attractive. Silkworm biotechnology is an innovative and easy approach to achieve high protein expression levels and is a very promising platform technology in the field of life science. Like the “Silkroad,” we expect that the “Bioroad” from Asia to Europe will be established by the silkworm expression system.

    Cooperative Regulation of the Activity of Factor Xa within Prothrombinase by Discrete Amino Acid Regions from Factor Va Heavy Chain

    The prothrombinase complex catalyzes the activation of prothrombin to α-thrombin. We have repeatedly shown that amino acid region 695DYDY698 from the COOH terminus of the heavy chain of factor Va regulates the rate of cleavage of prothrombin at Arg271 by prothrombinase. We have also recently demonstrated that amino acid region 334DY335 is required for the optimal activity of prothrombinase. To assess the effect of these six amino acid residues on cofactor activity, we created recombinant factor Va molecules combining mutations at amino acid regions 334–335 an