101 research outputs found

Investigating Randomised Sphere Covers in Supervised Learning

Author: Younsi Reda
Publication date: 01/01/2011

© This copy of the thesis has been supplied on condition that anyone who consults it is understood to recognise that its copyright rests with the author and that no quotation from the thesis, nor any information derived therefrom, may be published without the author’s prior, written consent. In this thesis, we thoroughly investigate a simple Instance Based Learning (IBL) classifier known as Sphere Cover. We propose a simple Randomized Sphere Cover Classifier (αRSC) and evaluate its classification performance on several datasets. In addition, we analyse the generalization error of the proposed classifier using bias/variance decomposition. A Sphere Cover Classifier can be described in terms of a compression scheme, which posits data compression as the reason for high generalization performance. We investigate the compression capacity of αRSC using a sample compression bound. The compression scheme prompted us to search for new compression methods for αRSC. To that end, we used a Gaussian kernel to investigate further data compression

CiteSeerX

University of East Anglia digital repository

LIPIcs, Volume 274, ESA 2023, Complete Volume

Authors: Farach-Colton Martin; Herman Grzegorz; Puglisi Simon J.
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 31st Annual European Symposium on Algorithms (ESA 2023)
Publication date: 01/01/2023

LIPIcs, Volume 274, ESA 2023, Complete Volume

Dagstuhl Research Online Publication Server

Statistical Tools for Directed and Bipartite Networks

Author: Yoo Hyesun
Publication date: 01/01/2020

Directed networks and bipartite networks, which exhibit unique asymmetric connectivity structures, are commonly observed in a variety of scientific and engineering fields. Despite their abundance and utility, most network analysis methods only consider symmetric networks. In this thesis, we develop statistical methods and theory for directed and bipartite networks. The first chapter focuses on matched community detection in a bipartite network. The detection of matched communities, i.e. communities that consist of nodes of two types that are closely connected with one another, is a fundamental and challenging problem. Most widely used approaches for matched community detection are either computationally inefficient or prone to non-ideal performance. We propose a new two-stage algorithm that uses fast spectral methods to recover matched communities. We show that, for bipartite networks, it is critical to adjust for the community size in matched community detection, which had not been considered before. We also provide theoretical error bounds for the proposed algorithm on the number of mis-clustered nodes under a variant of the stochastic block model. Numerical studies indicate that the proposed method outperforms existing spectral algorithms, especially when the sizes of the matched communities are proportionally different between the two types. The second chapter of the thesis introduces a new preference-based block model for community detection in a directed network. Unlike existing models, the proposed model allows different sender nodes to have different preferences to communities in the network. We argue that the right singular vectors of a graph Laplacian matrix contain community structures under the model. Further, we propose a spectral clustering algorithm to detect communities and estimate parameters of the model. 
Theoretical results offer insight into how the heterogeneity of preferences and out-degrees contributes to an upper bound on the number of mis-clustered nodes. Numerical studies support the theoretical results and illustrate the outstanding performance of the proposed method. The model can also be naturally extended to bipartite networks. In the third chapter, we propose a dyadic latent space model that accommodates the reciprocity between a pair of nodes in directed networks. Nodes in a pair in directed networks often exhibit strong dependencies on each other, yet most widely used approaches account for this phenomenon with limited flexibility. We propose a new latent space model for directed networks that incorporates reciprocity in a flexible way, allowing for important characteristics such as homophily and heterogeneity of the nodes. A fast and scalable algorithm based on projected gradient descent has been developed to fit the model by maximizing the likelihood. Both simulation studies and real-world data examples illustrate that the proposed model is effective in various network analysis tasks, including link prediction and community detection.

PhD, Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/163156/1/yoohs_1.pd

Deep Blue Documents at the University of Michigan

A Bayesian Approach to Learning Hidden Markov Model Topology with Applications to Biological Sequence Analysis

Author: Schliep Alexander
Publication date: 01/01/2001

Hidden Markov models (HMMs) are a widely and successfully used tool in statistical modeling and statistical pattern recognition. One fundamental problem in the application of HMMs is finding the underlying architecture or topology, particularly when there is no strong evidence from the application domain, e.g., when doing black-box modeling. Topology matters both for obtaining good parameter estimates and for performance: a model with “too many” states, and hence too many parameters, requires too much training data, while a model with “not enough” states prevents the HMM from capturing subtle statistical patterns. We have developed a novel algorithm that, given sequence data originating from an ergodic process, infers an HMM, its topology and its parameters. We introduce a Bayesian approach

CiteSeerX

computer science publication server

Kölner UniversitätsPublikationsServer

Machine Learning

Publication venue: IntechOpen
Publication date: 20/04/2021

Machine learning can be defined in various ways, all relating to a scientific domain concerned with the design and development of theoretical and implementation tools for building systems with some human-like intelligent behavior. More specifically, machine learning addresses the ability to improve automatically through experience

Directory of Open Access Books (DOAB)

Distributed optimal and predictive control methods for networks of dynamic systems

Author: Vlahakis E. E.

Several recent approaches to distributed control design over networks of interconnected dynamic systems rely on certain assumptions, such as identical subsystem dynamics, absence of dynamical couplings, linear dynamics and undirected interaction schemes. In this thesis, we investigate systematic methods for relaxing a number of simplifying factors leading to a unifying approach for solving general distributed-control stabilization problems of networks of dynamic agents. We show that the gain-margin property of LQR control holds for complex multiplicative input perturbations and a generic symmetric positive definite input weighting matrix. Proving also that the potentially non-simple structure of the Laplacian matrix can be neglected for stability analysis and control design, we extend two well-known distributed LQR-based control methods originally established for undirected networks of identical linear systems, to the directed case. We then propose a distributed feedback method for tackling large-scale regulation problems of a general class of interconnected non-identical dynamic agents with undirected and directed topology. In particular, we assume that local agents share a minimal set of structural properties, such as input dimension, state dimension and controllability indices. Our approach relies on the solution of certain model matching type problems using local linear state-feedback and input matrix transformations which map the agent dynamics to a target system, selected to minimize the joint control effort of the local feedback-control schemes. By adapting well-established distributed LQR control design methodologies to our framework, the stabilization problem of a network of non-identical dynamical agents is solved. We thereafter consider a networked scheme synthesized by multiple agents with nonlinear dynamics. 
Assuming that agents are feedback linearizable in a neighborhood of their equilibrium points, we propose a nonlinear model matching control design for stabilizing networks of multiple heterogeneous nonlinear agents. Motivated by the structure of a large-scale LQR optimal problem, we propose a stabilizing distributed state-feedback controller for networks of identical dynamically coupled linear agents. First, a fully centralized controller is designed, which is subsequently replaced by a distributed state-feedback gain with sparse structure. The control scheme is obtained by optimizing an LQR performance index with a tuning parameter used to emphasize or de-emphasize the relative state difference between coupled systems. Sufficient conditions for stability of the proposed scheme are derived based on the inertia of a convex combination of two Hurwitz matrices. An extended simulation study involving distributed load frequency control design of a multi-area power network illustrates the applicability of the proposed method. Finally, we propose a fully distributed consensus-based model matching scheme adapted to a model predictive control setting for tackling a structured receding horizon regulation problem

City Research Online

Statistical Inference for Propagation Processes on Complex Networks

Author: Manitz Juliane
Publication date: 12/06/2014

The methods of network theory are enjoying growing popularity because they allow complex systems to be represented as networks. A network is described simply by a set of nodes connected by edges. Currently available methods are mainly restricted to descriptive analysis of the network structure. This thesis presents several approaches for inference about processes on complex networks. These processes influence measurable quantities at the network nodes and are described by a set of random variables. All methods presented are motivated by practical applications, such as the transmission of food-borne infections, the propagation of train delays, or the regulation of genetic effects. First, a general dynamic metapopulation model for the spread of food-borne infections is presented, which combines local infection dynamics with the network-based transport routes of contaminated food. This model enables efficient simulation of various realistic food-borne disease epidemics. Second, an exploratory approach for determining the origin of propagation processes is developed. Based on a network-based redefinition of geodesic distance, complex spreading patterns can be projected onto a systematic, circular propagation scheme. This holds exactly when the origin node of the network is chosen as the reference point. The method is successfully applied to the 2011 EHEC/HUS epidemic in Germany. The results suggest that the method can usefully complement the laborious standard investigations carried out during food-borne disease outbreaks. In addition, this exploratory approach can be applied to identify the source of delays in transport networks. 
The results of extensive simulation studies with a wide variety of transmission mechanisms give hope that the approach is generally applicable to determining the origin of propagation processes in diverse fields. Finally, it is shown that kernel-based methods can offer an alternative for the statistical analysis of processes on networks. A network-based kernel was developed for the logistic kernel machine test, which allows the seamless integration of biological knowledge into the analysis of data from genome-wide association studies. The method is successfully tested in the analysis of genetic causes of rheumatoid arthritis and lung cancer. In summary, the results of the methods presented make clear that the network-theoretic analysis of propagation processes can contribute substantially to answering a wide range of questions in different applications

Georg-August-University Göttingen

Statistical physics approaches to large-scale socio-economic networks

Author: Szell Michael
Publication date: 01/01/2011

In the past decade a variety of fields has been explored by statistical physicists, leading to an increase in our quantitative understanding of various systems composed of many interacting elements, such as social systems. However, an empirical quantification of human behavior on a societal level has so far proved to be tremendously difficult due to problems in data availability, quality and ways of acquisition. In this doctoral thesis we compile for the first time a large-scale data set consisting of practically all actions and properties of some 350,000 participants of an entire human society interacting in a self-developed Massive Multiplayer Online Game, over a period of five years. We describe this social system, composed of strongly interacting players in the game, on three consecutive levels. In a first step, we examine the individuals and their behavioral properties over time. A scaling and fluctuation analysis of action-reaction time series reveals persistence of the possible actions and qualitative differences between "good" and "bad" players. We then study and model the diffusion process of human mobility occurring within the "game universe". 
We find subdiffusion and a power-law distributed preference to return to more recently visited locations. Second, on a higher level, we use network theory to quantify the topological structure of interactions between the individuals. We focus on six network types defined by direct interactions, three of them with a positive connotation (trade, friendship, communication), three with a negative one (enmity, attack, punishment). These networks exhibit non-trivial statistical properties, e.g. scale-free topology, and evolve over time, allowing us to test a series of long-standing social-dynamics hypotheses. We find qualitative differences in evolution and topological structure between positive and negative tie networks. Finally, on a yet higher level, we consider the multiplex network of the player society, constituted by the coupling of the single network layers. We quantify interactions between different networks and detect the non-trivial organizational principles which lead to the observed structure of the system and which have been observed in real human societies as well. Our findings with the multiplex framework provide evidence for the half-century-old hypothesis of structural balance, where certain frustrated states on a microscopic level tend to be avoided. Within this setup we demonstrate the feasibility of generating novel scientific insights into the nature of collective human behavior in large-scale social systems

OTHES