151 research outputs found

    Mutual learning in a tree parity machine and its application to cryptography

    Mutual learning of a pair of tree parity machines with continuous and discrete weight vectors is studied analytically. The analysis is based on a procedure that maps mutual learning in tree parity machines onto mutual learning in noisy perceptrons. In the case of continuous tree parity machines, the stationary solution of the mutual learning depends on the learning rate, and a phase transition from partial to full synchronization is observed. In the discrete case the learning process is based on a finite increment, and a fully synchronized state is achieved in a finite number of steps. The synchronization of discrete parity machines is introduced in order to construct an ephemeral key-exchange protocol. The dynamic learning of a third tree parity machine (an attacker) that tries to imitate one of the two machines while the two still update their weight vectors is also analyzed. In particular, the synchronization times of the naive attacker and the flipping attacker recently introduced in [1] are analyzed. All analytical results are found to be in good agreement with simulation results.
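The discrete-weight synchronization described in this abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' code: it assumes the commonly used Hebbian update (a hidden unit is updated only when both machines agree on the output and the unit's sign matches that output), a convention that a zero local field maps to -1, and illustrative values for K, N and L. The names `TPM` and `synchronize` are hypothetical.

```python
import random

def sign(h):
    # Assumed convention: a zero local field is mapped to -1.
    return 1 if h > 0 else -1

class TPM:
    """Tree parity machine: K hidden units, N inputs each, weights in [-L, L]."""

    def __init__(self, K, N, L, rng):
        self.K, self.N, self.L = K, N, L
        self.w = [[rng.randint(-L, L) for _ in range(N)] for _ in range(K)]
        self.sigma = [1] * K

    def output(self, x):
        # Hidden unit signs, then their product (the parity output).
        self.sigma = [sign(sum(wi * xi for wi, xi in zip(wk, xk)))
                      for wk, xk in zip(self.w, x)]
        tau = 1
        for s in self.sigma:
            tau *= s
        return tau

    def update(self, x, tau):
        # Assumed Hebbian rule: only units that agree with tau move,
        # by a finite increment, and weights are clipped to [-L, L].
        for k in range(self.K):
            if self.sigma[k] == tau:
                self.w[k] = [max(-self.L, min(self.L, wi + xi * tau))
                             for wi, xi in zip(self.w[k], x[k])]

def synchronize(K=3, N=10, L=3, seed=0, max_steps=200000):
    """Return the number of steps until both weight sets are identical, or None."""
    rng = random.Random(seed)
    a, b = TPM(K, N, L, rng), TPM(K, N, L, rng)
    for step in range(max_steps):
        if a.w == b.w:
            return step
        x = [[rng.choice((-1, 1)) for _ in range(N)] for _ in range(K)]
        ta, tb = a.output(x), b.output(x)
        if ta == tb:  # both machines learn only when their outputs agree
            a.update(x, ta)
            b.update(x, tb)
    return None
```

Calling `synchronize()` with different seeds gives a rough feel for how the synchronization time fluctuates from run to run; an attacker model is deliberately left out of this sketch.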

    Cryptography based on neural networks - analytical results

    The mutual learning process between two parity feed-forward networks with discrete and continuous weights is studied analytically, and we find that the number of steps required to achieve full synchronization between the two networks in the case of discrete weights is finite. The synchronization process is shown to be non-self-averaging, and the analytical solution is based on random auxiliary variables. The learning time of an attacker that is trying to imitate one of the networks is examined analytically and is found to be much longer than the synchronization time. Analytical results are found to be in agreement with simulations.

    Training a perceptron in a discrete weight space

    On-line and batch learning of a perceptron in a discrete weight space, where each weight can take $2L+1$ different values, are examined analytically and numerically. The learning algorithm is based on the training of the continuous perceptron and prediction following the clipped weights. The learning is described by a new set of order parameters, composed of the overlaps between the teacher and the continuous/clipped students. Different scenarios are examined, among them on-line learning with discrete/continuous transfer functions and off-line Hebb learning. The generalization error of the clipped weights decays asymptotically as $\exp(-K\alpha^2)$ / $\exp(-e^{|\lambda|\alpha})$ in the case of on-line learning with binary/continuous activation functions, respectively, where $\alpha$ is the number of examples divided by $N$, the size of the input vector, and $K$ is a positive constant that decays linearly with $1/L$. For finite $N$ and $L$, a perfect agreement between the discrete student and the teacher is obtained for $\alpha \propto \sqrt{L \ln(NL)}$. A crossover to the generalization error $\propto 1/\alpha$, characteristic of continuous weights with binary output, is obtained for synaptic depth $L > O(\sqrt{N})$. Comment: 10 pages, 5 figs., submitted to PR

    Synchronization of random walks with reflecting boundaries

    Reflecting boundary conditions cause two one-dimensional random walks to synchronize if a common direction is chosen in each step. The mean synchronization time and its standard deviation are calculated analytically. Both quantities are found to increase in proportion to the square of the system size. Additionally, the probability of synchronization in a given step is analyzed, which converges to a geometric distribution for long synchronization times. From this asymptotic behavior the number of steps required to synchronize an ensemble of independent random walk pairs is deduced. Here the synchronization time increases with the logarithm of the ensemble size. The results of this model are compared to those observed in neural synchronization. Comment: 10 pages, 7 figures; introduction changed, typos corrected
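The mechanism in this abstract is easy to simulate. The sketch below is an illustration under stated assumptions rather than the paper's model definition: the walkers live on sites 0..m-1, both receive the same random direction in every step, and a move that would leave the interval is assumed to leave the walker where it is. The function names `sync_time` and `mean_sync_time` are hypothetical.

```python
import random

def sync_time(m, rng, x=None, y=None):
    """Number of steps until two walkers on sites 0..m-1 occupy the same site."""
    if x is None:
        x = rng.randrange(m)
    if y is None:
        y = rng.randrange(m)
    steps = 0
    while x != y:
        d = rng.choice((-1, 1))        # common direction for both walkers
        x = min(max(x + d, 0), m - 1)  # assumed boundary rule: blocked moves stay put
        y = min(max(y + d, 0), m - 1)
        steps += 1
    return steps

def mean_sync_time(m, trials, seed=0):
    """Average synchronization time when the walkers start at opposite ends."""
    rng = random.Random(seed)
    return sum(sync_time(m, rng, 0, m - 1) for _ in range(trials)) / trials
```

The distance between the walkers changes only when exactly one of them is blocked at a boundary, so it can shrink by at most one site per step. Comparing `mean_sync_time(m, trials)` for a few system sizes m gives a quick numerical check of how the average grows with m.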

    Multilayer neural networks with extensively many hidden units

    The information processing abilities of a multilayer neural network with a number of hidden units scaling as the input dimension are studied using statistical mechanics methods. The mapping from the input layer to the hidden units is performed by general symmetric Boolean functions, whereas the hidden layer is connected to the output by either discrete or continuous couplings. Introducing an overlap in the space of Boolean functions as an order parameter, the storage capacity is found to scale with the logarithm of the number of implementable Boolean functions. The generalization behaviour is smooth for continuous couplings and shows a discontinuous transition to perfect generalization for discrete ones. Comment: 4 pages, 2 figures

    Bayesian nonparametric models for name disambiguation and supervised learning

    This thesis presents new Bayesian nonparametric models and approaches for their development, for the problems of name disambiguation and supervised learning. Bayesian nonparametric methods form an increasingly popular approach for solving problems that demand a high amount of model flexibility. However, this field is relatively new, and there are many areas that need further investigation. Previous work on Bayesian nonparametrics has fully explored neither the problems of entity disambiguation and supervised learning nor the advantages of nested hierarchical models. Entity disambiguation is a widely encountered problem where different references need to be linked to a real underlying entity. This problem is often unsupervised, as there is no previously known information about the entities. Further to this, effective use of Bayesian nonparametrics offers a new approach to tackling supervised problems, which are frequently encountered. The main original contribution of this thesis is a set of new structured Dirichlet process mixture models for name disambiguation and supervised learning that can also have a wide range of applications. These models use techniques from Bayesian statistics, including hierarchical and nested Dirichlet processes, generalised linear models, Markov chain Monte Carlo methods and optimisation techniques such as BFGS. The new models have tangible advantages over existing methods in the field, as shown with experiments on real-world datasets including citation databases and classification and regression datasets. I develop the unsupervised author-topic space model for author disambiguation, which uses free text to perform disambiguation, unlike traditional author disambiguation approaches. The model incorporates a name variant model that is based on a nonparametric Dirichlet language model. The model handles novel, unseen name variants and can also model the unknown authors of the text of the documents. Through this, the model can disambiguate authors with no prior knowledge of the number of true authors in the dataset. In addition, it can do this when the authors have identical names. I use a model for nesting Dirichlet processes named the hybrid NDP-HDP. This model allows Dirichlet processes to be clustered together and adds an additional level of structure to the hierarchical Dirichlet process. I also develop a new hierarchical extension to the hybrid NDP-HDP. I develop this model into the grouped author-topic model for the entity disambiguation task. The grouped author-topic model uses clusters to model the co-occurrence of entities in documents, which can be interpreted as research groups. Since this model does not require entities to be linked to specific words in a document, it overcomes the problems of some existing author-topic models. The model incorporates a new method for modelling name variants, so that domain-specific name variant models can be used. Lastly, I develop extensions to supervised latent Dirichlet allocation, a type of supervised topic model. The keyword-supervised LDA model predicts document responses more accurately by modelling the effect of individual words and their contexts directly. The supervised HDP model has more model flexibility by using Bayesian nonparametrics for supervised learning. These models are evaluated on a number of classification and regression problems, and the results show that they outperform existing supervised topic modelling approaches. The models can also be extended to use similar information to the previous models, incorporating additional information such as entities and document titles to improve prediction.

    Processing Spatial Keyword Query as a Top-k Aggregation Query

    We examine the spatial keyword search problem to retrieve objects of interest that are ranked based on both their spatial proximity to the query location and the textual relevance of the object's keywords. Existing solutions for the problem are based on either using a combination of textual and spatial indexes or using specialized hybrid indexes that integrate the indexing of both textual and spatial attribute values. In this paper, we propose a new approach that is based on modeling the problem as a top-k aggregation problem, which enables the design of a scalable and efficient solution that is based on the ubiquitous inverted list index. Our performance study demonstrates that our approach outperforms the state-of-the-art hybrid methods by a wide margin.
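To make the idea of top-k aggregation concrete, here is a minimal sketch in the style of the classic threshold algorithm over two sorted score lists. It is not the paper's method: the function name `top_k`, the weighted-sum scoring with a hypothetical parameter `alpha`, and the assumption that both score maps cover the same object ids with scores in [0, 1] are all introduced here for illustration.

```python
import heapq

def top_k(text_scores, spatial_scores, k, alpha=0.5):
    """Return the k best (object id, combined score) pairs, best first.

    Assumes both dicts map the same object ids to scores in [0, 1],
    where a higher spatial score means closer to the query location.
    """
    def combine(t, s):
        return alpha * t + (1 - alpha) * s

    # Two sorted lists, one per attribute, scanned in parallel.
    by_text = sorted(text_scores.items(), key=lambda kv: -kv[1])
    by_space = sorted(spatial_scores.items(), key=lambda kv: -kv[1])

    seen = {}
    for (obj_t, t), (obj_s, s) in zip(by_text, by_space):
        for obj in (obj_t, obj_s):
            if obj not in seen:
                # Random access to the other list to get the full score.
                seen[obj] = combine(text_scores[obj], spatial_scores[obj])
        # No unseen object can score higher than this bound.
        threshold = combine(t, s)
        best = heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])
        if len(best) == k and best[-1][1] >= threshold:
            return best  # early termination
    return heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])
```

The scan can stop early because every object not yet seen has a text score and a spatial score no larger than the last values read from each list, so its combined score cannot exceed the threshold.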

    Antiretroviral Therapy Optimisation without Genotype Resistance Testing: A Perspective on Treatment History Based Models

    BACKGROUND: Although genotypic resistance testing (GRT) is recommended to guide combination antiretroviral therapy (cART), funding and/or facilities to perform GRT may not be available in low to middle income countries. Since treatment history (TH) impacts response to subsequent therapy, we investigated a set of statistical learning models to optimise cART in the absence of GRT information. METHODS AND FINDINGS: The EuResist database was used to extract 8-week and 24-week treatment change episodes (TCE) with GRT and additional clinical, demographic and TH information. Random Forest (RF) classification was used to predict 8- and 24-week success, defined as undetectable HIV-1 RNA, comparing nested models including (i) GRT+TH and (ii) TH without GRT, using multiple cross-validation and area under the receiver operating characteristic curve (AUC). Virological success was achieved in 68.2% and 68.0% of TCE at 8 and 24 weeks (n = 2,831 and 2,579), respectively. RF (i) and (ii) showed comparable performances, with an average (st.dev.) AUC of 0.77 (0.031) vs. 0.757 (0.035) at 8 weeks and 0.834 (0.027) vs. 0.821 (0.025) at 24 weeks. Sensitivity analyses, carried out on a data subset that included antiretroviral regimens commonly used in low to middle income countries, confirmed our findings. Training on subtype B and validation on non-B isolates resulted in a decline of performance for models (i) and (ii). CONCLUSIONS: Treatment history-based RF prediction models are comparable to GRT-based models for classification of virological outcome. These results may be relevant for therapy optimisation in areas where availability of GRT is limited. Further investigations are required in order to account for different demographics, subtypes and different therapy switching strategies.