    On the sensitivity of some APN permutations to swapping points

    We define a set called the pAPN-spectrum of an $(n,n)$-function $F$, which measures how close $F$ is to being an APN function, and investigate how the size of the pAPN-spectrum changes when two of the outputs of a given $F$ are swapped. We completely characterize the behavior of the pAPN-spectrum under swapping outputs when $F(x) = x^{2^n-2}$ is the inverse function over $\mathbb{F}_{2^n}$. We also investigate this behavior for functions from the Gold and Welch monomial APN families, and experimentally determine the size of the pAPN-spectrum after swapping outputs for representatives from all infinite monomial APN families up to dimension $n = 10$.
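    The abstract above concerns how swapping two outputs of the inverse function affects its closeness to being APN. As a rough illustration only, the sketch below builds the inverse function over a small field, swaps two outputs, and computes the differential uniformity before and after (a function is APN exactly when this value is 2). It does not compute the pAPN-spectrum, whose precise definition is not given in the abstract; the field size `N = 3`, the irreducible polynomial `POLY`, and all function names are assumptions chosen for the example.

```python
def gf_mul(a, b, n, poly):
    """Multiply a and b in GF(2^n), reducing by the irreducible polynomial `poly`."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << n):  # degree reached n: reduce modulo poly
            a ^= poly
    return result


def gf_pow(x, e, n, poly):
    """Raise x to the power e in GF(2^n) by square-and-multiply."""
    result = 1
    while e:
        if e & 1:
            result = gf_mul(result, x, n, poly)
        x = gf_mul(x, x, n, poly)
        e >>= 1
    return result


def differential_uniformity(table):
    """Largest number of solutions x of F(x ^ a) ^ F(x) = b over all a != 0 and all b."""
    size = len(table)
    worst = 0
    for a in range(1, size):
        counts = [0] * size
        for x in range(size):
            counts[table[x ^ a] ^ table[x]] += 1
        worst = max(worst, max(counts))
    return worst


# Assumed small example: GF(2^3) defined by x^3 + x + 1.
N = 3
POLY = 0b1011

# Inverse function F(x) = x^(2^n - 2), with F(0) = 0.
inverse = [gf_pow(x, 2**N - 2, N, POLY) for x in range(2**N)]

# Swap the outputs at two (arbitrarily chosen) points.
swapped = list(inverse)
swapped[0], swapped[1] = swapped[1], swapped[0]

print("inverse function:", differential_uniformity(inverse))
print("after swapping outputs at 0 and 1:", differential_uniformity(swapped))
```

    The exhaustive count is quadratic in the field size, so this approach is practical only for small dimensions such as those mentioned in the abstract.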

    On the behavior of some APN permutations under swapping points

    The article of record as published may be found at https://doi.org/10.1007/s12095-021-00520-z
    We define the pAPN-spectrum (which is a measure of how close a function is to being APN) of an (n,n)-function F and investigate how its size changes when two of the outputs of a given function F are swapped. We completely characterize the behavior of the pAPN-spectrum under swapping outputs when F is the inverse function over $\mathbb{F}_{2^n}$. We further theoretically investigate this behavior for functions from the Gold and Welch monomial APN families, and experimentally determine the size of the pAPN-spectrum after swapping outputs for representatives from all infinite monomial APN families up to dimension n = 10; based on our computation results, we conjecture that the inverse function is the only monomial APN function for which swapping two of its outputs can leave an empty pAPN-spectrum.

    On the behavior of some APN permutations under swapping points

    Under embargo until: 2022-08-06
    We define the pAPN-spectrum (which is a measure of how close a function is to being APN) of an (n, n)-function F and investigate how its size changes when two of the outputs of a given function F are swapped. We completely characterize the behavior of the pAPN-spectrum under swapping outputs when F is the inverse function over $\mathbb{F}_{2^n}$. We further theoretically investigate this behavior for functions from the Gold and Welch monomial APN families, and experimentally determine the size of the pAPN-spectrum after swapping outputs for representatives from all infinite monomial APN families up to dimension n = 10; based on our computation results, we conjecture that the inverse function is the only monomial APN function for which swapping two of its outputs can leave an empty pAPN-spectrum.
    Accepted version

    Protecting Micro-Data Privacy: The Moment-Based Density Estimation Method and its Application

    Privacy concerns pertaining to the release of confidential micro-level information are increasingly relevant to organisations and institutions. Controlling the dissemination of disclosure-prone micro-data by means of suppression, aggregation and perturbation techniques often entails different levels of effectiveness and different drawbacks, depending on the context and properties of the data. In this dissertation, we briefly review existing disclosure control methods for micro-data and undertake a study demonstrating the applicability of micro-data methods to proportion data. This is achieved by using the sample size efficiency related to a simple hypothesis test for a fixed significance level and power as a measure of statistical utility. We compare a query-based differential privacy mechanism to the multiplicative noise method for disclosure control and demonstrate that, with the correct specification of noise parameters, the multiplicative noise method, which is a micro-data based method, achieves similar disclosure protection properties with reduced statistical efficiency costs.
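    To make the comparison in the abstract above concrete, here is a minimal sketch of the two kinds of perturbation it names: a query-based release that adds Laplace noise to a summary statistic, and a record-level release that multiplies each value by random noise with mean 1. The abstract does not specify the mechanisms or noise distributions used in the dissertation, so the Laplace mechanism, the lognormal noise, the parameter values, and all function names here are assumptions made purely for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mean_release(values, epsilon, rng):
    """Query-based release: the mean of values in [0, 1] plus Laplace noise.

    Replacing one record changes the mean by at most 1/n, so the noise
    scale is (1/n) / epsilon.
    """
    n = len(values)
    sensitivity = 1.0 / n
    return sum(values) / n + laplace_noise(sensitivity / epsilon, rng)


def multiplicative_noise_release(values, sigma, rng):
    """Record-level release: multiply each value by lognormal noise with mean 1."""
    mu = -0.5 * sigma**2  # chosen so that the expected noise factor is 1
    return [v * rng.lognormvariate(mu, sigma) for v in values]


rng = random.Random(0)

# Hypothetical confidential records, each a proportion in [0.1, 0.9].
records = [rng.uniform(0.1, 0.9) for _ in range(1000)]

true_mean = sum(records) / len(records)
dp_mean = laplace_mean_release(records, epsilon=0.5, rng=rng)
masked = multiplicative_noise_release(records, sigma=0.1, rng=rng)
masked_mean = sum(masked) / len(masked)

print("true mean:", true_mean)
print("Laplace-perturbed mean:", dp_mean)
print("mean of multiplicatively masked records:", masked_mean)
```

    The first release protects a single query answer, while the second produces a masked data set on which any statistic can then be computed; this is the query-based versus micro-data distinction the abstract draws.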

    Patterns and Signals of Biology: An Emphasis On The Role of Post Translational Modifications in Proteomes for Function and Evolutionary Progression

    After synthesis, a protein is still immature until it has been customized for a specific task. Post-translational modifications (PTMs) are steps in biosynthesis that customize a protein for unique functionalities. PTMs are also important to protein survival because they rapidly enable protein adaptation to environmental stress factors by conformation change. The overarching contribution of this thesis is the construction of a computational profiling framework for the study of biological signals stemming from PTMs associated with stressed proteins. In particular, this work has been developed to predict and detect the biological mechanisms involved in types of stress response with PTMs in mitochondrial (Mt) and non-Mt proteins. Before any mechanism can be studied, there must first be some evidence of its existence. This evidence takes the form of signals such as biases of biological actors and types of protein interaction. Our framework has been developed to locate these signals, distilled from “Big Data” resources such as public databases and the entire PubMed literature corpus. We apply this framework to study the signals and learn about protein stress responses involving PTMs and modification sites (MSs). We developed this framework, and its approach to analysis, according to three main facets: (1) statistical evaluation to determine patterns of signal dominance throughout large volumes of data, (2) signal location to track down the regions where the mechanisms must be found according to the types and numbers of associated actors at relevant regions in a protein, and (3) text mining to determine how these signals have been previously investigated by researchers. The results gained from our framework enable us to uncover the PTM actors, MSs and protein domains which are the major components of particular stress response mechanisms and may play roles in protein malfunction and disease.

    Development and application of renormalised perturbation theory to models with strongly correlated electrons

    The subject of this thesis is the application of the Renormalised Perturbation Theory (RPT) to models of magnetic impurities embedded in a non-magnetic host metal. The theoretical description of such models is particularly challenging, for they present strong correlations that render the usual perturbation theory around the non-interacting limit inapplicable. The RPT addresses this difficulty by incorporating the concept of a quasi-particle into a perturbative framework, and organising the expansion in terms of the quasi-particle parameters of the model rather than the bare parameters; it can thus be carried out regardless of the strength of the interactions. In the present work we give an introduction to the theory and discuss in detail the calculation of the renormalised self-energy expansions for the Anderson impurity model. To cope with the complexity of high-order calculations we develop and implement a computer algorithm to automatically compute the diagrammatic expansion in the renormalised theory to any order. As a demonstration of the usefulness of the theory, we use it to calculate the conductance of a single quantum dot, and of two quantum dots with an inter-dot coupling, to leading order in the quasi-particle interaction. To perform calculations in the renormalised theory it is essential that the values of the renormalised parameters describing the quasi-particles are known. Here we develop a general method for determining them entirely within the RPT framework, which relies on constructing renormalisation flow equations relating the renormalised parameters of two models whose bare parameters differ infinitesimally.
By determining the renormalised parameters for a model with bare parameters that render it amenable to ordinary perturbation theory, and solving the flow equations to relate them to the renormalised parameters of models with progressively stronger correlations, we succeed in deducing the renormalised parameters for models with strong correlations.
Open Access

    Learning by Fusing Heterogeneous Data

    It has become increasingly common in science and technology to gather data about systems at different levels of granularity or from different perspectives. This often gives rise to data that are represented in totally different input spaces. A basic premise behind the study of learning from heterogeneous data is that in many such cases, there exists some correspondence among certain input dimensions of different input spaces. In our work we found that a key bottleneck that prevents us from better understanding and truly fusing heterogeneous data at large scales is identifying the kind of knowledge that can be transferred between related data views, entities and tasks. We develop interesting and accurate data fusion methods for predictive modeling, which reduce or entirely eliminate some of the basic feature engineering steps that were needed in the past when inferring prediction models from disparate data. In addition, our work has a wide range of applications of which we focus on those from molecular and systems biology: it can help us predict gene functions, forecast pharmacological actions of small chemicals, prioritize genes for further studies, mine disease associations, detect drug toxicity and regress cancer patient survival data. Another important aspect of our research is the study of latent factor models. We aim to design latent models with factorized parameters that simultaneously tackle multiple types of data heterogeneity, where data diversity spans across heterogeneous input spaces, multiple types of features, and a variety of related prediction tasks. Our algorithms are capable of retaining the relational structure of a data system during model inference, which turns out to be vital for good performance of data fusion in certain applications. Our recent work included the study of network inference from many potentially nonidentical data distributions and its application to cancer genomic data. 
We also model epistasis, an important concept from genetics, and propose algorithms to efficiently find the ordering of genes in cellular pathways. A central topic of our thesis is also the analysis of large data compendia, as predictions about certain phenomena, such as associations between diseases and involvement of genes in a certain phenotype, are only possible when dealing with large amounts of data. Among others, we analyze 30 heterogeneous data sets to assess drug toxicity and over 40 human gene association data collections, the largest number of data sets considered by a collective latent factor model to date. We also make interesting observations about deciding which data should be considered for fusion and develop a generic approach that can estimate the sensitivities between different data sets.

    Characterizing the Biochemical Determinants Governing MERS-Coronavirus Host Range

    Coronaviruses are a diverse family of viruses that infect a wide range of hosts, including both mammalian and avian species. Within recent history, coronaviruses have expanded their host range into humans, with four emergence events resulting in infections that cause only mild disease. However, two additional emergence events resulted in outbreaks of severe disease, causing heightened concern for public health. The 2003 severe acute respiratory syndrome coronavirus (SARS-CoV) emerged in Southeast Asia and rapidly spread around the world with a 9 percent mortality rate before being controlled by public health intervention strategies. In 2012, Middle East respiratory syndrome coronavirus (MERS-CoV) emerged from its zoonotic reservoir. To date, it has infected over 1800 people with a 36 percent mortality rate and is still circulating in the population. Due to the emergence of coronaviruses with pandemic potential, it is important to understand how these lineages have been able to expand their host range to infect new species. One key determinant of viral host range is the interaction between the virus spike protein and the host cell receptor. For MERS-CoV specifically, the virus can infect bats, camels (the putative intermediate host species), and humans, but is unable to infect mice or other traditional small animal models due to receptor incompatibilities. The inability of MERS-CoV to infect any small animal model species leaves us unable to study pathogenesis or begin to develop potential vaccines or therapeutics. Here, I present work on the biochemical determinants that govern MERS-CoV host range. Specifically, I 1) characterize the interactions between the MERS-CoV receptor binding domain and the mouse cell receptor; 2) investigate biochemical determinants that govern infection for other species; 3) attempt to generate a mouse-adapted MERS-CoV; and 4) present an approach to investigate potential evolutionary mechanisms of coronavirus host range expansion. 
This work has contributed to the development of a small animal model, allowing us to begin pathogenesis studies. Additionally, understanding the biochemical determinants and evolutionary mechanisms of coronavirus host range expansion can help evaluate the pandemic potential of currently circulating zoonotic strains and better prepare us for future pathogenic coronaviruses that may emerge.
Doctor of Philosophy

    Continuous Medial Models in Two-Sample Statistics of Shape

    In statistical shape analysis, the foremost question is how shapes should be represented. The number of parameters required for a given accuracy and the types of deformation they can express directly influence the quality and type of statistical inferences one can make. One example is a medial model, which represents a solid object using a skeleton of a lower dimension and naturally expresses intuitive changes such as "bending", "twisting", and "thickening". In this dissertation I develop a new three-dimensional medial model that allows continuous interpolation of the medial surface and provides a map back and forth between the boundary and its medial axis. It is the first such model to support branching, allowing the representation of a much wider class of objects than previously possible using continuous medial methods. A measure defined on the medial surface then allows one to write integrals over the boundary and the object interior in medial coordinates, enabling the expression of important object properties in an object-relative coordinate system. I show how these properties can be used to optimize correspondence during model construction. This improved correspondence reduces variability due to how the model is parameterized, which could otherwise mask a true shape change effect. Finally, I develop a method for performing global and local hypothesis testing between two groups of shapes. This method is capable of handling the nonlinear spaces the shapes live in and is well defined even in the high-dimension, low-sample-size case. It naturally reduces to several well-known statistical tests in the linear and univariate cases.