28 research outputs found

    A new distance for data sets (and probability measures) in a RKHS context

    In this paper we define distance functions for data sets (and distributions) in a RKHS context. To this aim we introduce kernels for data sets that provide a metrization of the set of point sets (the power set). An interesting point in the proposed kernel distance is that it takes into account the underlying (data) generating probability distributions. In particular, we propose kernel distances that rely on the estimation of density level sets of the underlying distribution, and can be extended from data sets to probability measures. The performance of the proposed distances is tested on a variety of simulated distributions plus a couple of real pattern recognition problems. This work was partially supported by projects DGUCM 2008/00058/002, MEC 2007/04438/001 and MIC 2012/00084/0

    Unsupervised Structural Embedding Methods for Efficient Collective Network Mining

    How can we align accounts of the same user across social networks? Can we identify the professional role of an email user from their patterns of communication? Can we predict the medical effects of chemical compounds from their atomic network structure? Many problems in graph data mining, including all of the above, are defined on multiple networks. The central element to all of these problems is cross-network comparison, whether at the level of individual nodes or entities in the network or at the level of entire networks themselves. To perform this comparison meaningfully, we must describe the entities in each network expressively in terms of patterns that generalize across the networks. Moreover, because the networks in question are often very large, our techniques must be computationally efficient. In this thesis, we propose scalable unsupervised methods that embed nodes in vector space by mapping nodes with similar structural roles in their respective networks, even if they come from different networks, to similar parts of the embedding space. We perform network alignment by matching nodes across two or more networks based on the similarity of their embeddings, and refine this process by reinforcing the consistency of each node’s alignment with those of its neighbors. By characterizing the distribution of node embeddings in a graph, we develop graph-level feature vectors that are highly effective for graph classification. With principled sparsification and randomized approximation techniques, we make all our methods computationally efficient and able to scale to graphs with millions of nodes or edges. We demonstrate the effectiveness of structural node embeddings on industry-scale applications, and propose an extensive set of embedding evaluation techniques that lay the groundwork for further methodological development and application. PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/162895/1/mheimann_1.pd

    Constrained Delaunay triangulation for diagnosis and grading of colon cancer

    Ankara: The Department of Computer Engineering and the Institute of Engineering and Science of Bilkent University, 2009. Thesis (Master's) -- Bilkent University, 2009. Includes bibliographical references (leaves 93-107). In our century, the increasing rate of cancer incidence makes it inevitable to employ computerized tools that aim to help pathologists more accurately diagnose and grade cancerous tissues. These mathematical tools offer more stable and objective frameworks, which reduce intra- and inter-observer variability. There has been a large set of studies on the subject of automated cancer diagnosis/grading, especially based on textural and/or structural tissue analysis. Although the previous structural approaches show promising results for different types of tissues, they are still unable to make use of the potential information that is provided by tissue components other than cell nuclei. However, this additional information is one of the major information sources for tissue types with differentiated components, such as the luminal regions that are useful for describing glands in a colon tissue. This thesis introduces a novel structural approach, a new type of constrained Delaunay triangulation, for the utilization of non-nuclei tissue components. This structural approach first defines two sets of nodes on cell nuclei and luminal regions. It then constructs a constrained Delaunay triangulation on the nucleus nodes with the lumen nodes forming its constraints. Finally, it classifies the tissue samples using the features extracted from this newly introduced constrained Delaunay triangulation. Working with 213 colon tissues taken from 58 patients, our experiments demonstrate that the constrained Delaunay triangulation approach leads to higher accuracies of 87.83 percent and 85.71 percent for the training and test sets, respectively.
The experiments also show that the introduction of this new structural representation, which allows definition of new features, provides a more robust graph-based methodology for the examination of cancerous tissues and better performance than its predecessors. Erdoğan, Süleyman Tuncer. M.S

    Analysis and resynthesis of polyphonic music

    This thesis examines applications of Digital Signal Processing to the analysis, transformation, and resynthesis of musical audio. First I give an overview of the human perception of music. I then examine in detail the requirements for a system that can analyse, transcribe, process, and resynthesise monaural polyphonic music. I then describe and compare the possible hardware and software platforms. After this I describe a prototype hybrid system that attempts to carry out these tasks using a method based on additive synthesis. Next I present results from its application to a variety of musical examples, and critically assess its performance and limitations. I then address these issues in the design of a second system based on Gabor wavelets. I conclude by summarising the research and outlining suggestions for future developments.
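
For readers unfamiliar with the term, additive synthesis builds a sound by summing sinusoidal partials. The following is a minimal sketch of that general idea only, not the thesis's system; the function name `additive_synth`, the `(frequency, amplitude)` representation of a partial, and the 8000 Hz sample rate are all assumptions made for illustration.

```python
import math

def additive_synth(partials, num_samples, sample_rate=8000):
    """Sum sinusoidal partials into a list of samples.

    partials: list of (frequency_hz, amplitude) pairs (hypothetical format).
    num_samples: number of output samples to generate.
    """
    samples = []
    for t in range(num_samples):
        # Each partial contributes one sine wave at its own frequency.
        value = sum(
            amp * math.sin(2.0 * math.pi * freq * t / sample_rate)
            for freq, amp in partials
        )
        samples.append(value)
    return samples

# A 220 Hz fundamental with two weaker harmonics.
tone = additive_synth([(220.0, 1.0), (440.0, 0.5), (660.0, 0.25)], 80)
```

In a resynthesis setting the partial frequencies and amplitudes would come from an analysis stage and vary over time; this sketch keeps them constant for brevity.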

    Vulnerability Assessment and Privacy-preserving Computations in Smart Grid

    Modern advances in sensor, computing, and communication technologies enable various smart grid applications, which also expose vulnerabilities that require novel approaches to cybersecurity. While a substantial number of technologies have been adopted to protect against cyber attacks in the smart grid, there is no comprehensive review of the implementations, impacts, and solutions of cyber attacks specific to the smart grid. In this dissertation, we are motivated to evaluate the security requirements for the smart grid, which include three main properties: confidentiality, integrity, and availability. First, we review the cyber-physical security of the synchrophasor network, which highlights all three aspects of security issues. Taking the synchrophasor network as an example, we give an overview of how to attack a smart grid network. We test three types of attacks (denial-of-service, sniffing, and false data injection) and show the impact of each. Next, we discuss how to protect against each attack. For protecting availability, we examine possible defense strategies for the associated vulnerabilities. For protecting data integrity, a small-scale prototype of a secure synchrophasor network is presented with different cryptosystems. In addition, a deep learning based time-series anomaly detector is proposed to detect injected measurements. Our approach observes both data measurements and network traffic features to jointly learn system states and can detect attacks when the state vector estimator fails. For protecting data confidentiality, we propose privacy-preserving algorithms for two important smart grid applications. 1) A distributed privacy-preserving quadratic optimization algorithm to solve the Security Constrained Optimal Power Flow (SCOPF) problem. The SCOPF problem is decomposed into small subproblems using the Alternating Direction Method of Multipliers (ADMM) and gradient projection algorithms.
2) We use the Paillier cryptosystem to secure the computation of the power system dynamic simulation. The IEEE 3-Machine 9-Bus System is used to implement and demonstrate the proposed scheme. The security and performance analysis of our implementations demonstrates that our algorithms can prevent chosen-ciphertext attacks at a reasonable cost.
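
The property that makes the Paillier cryptosystem useful for this kind of secure computation is that it is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. The sketch below illustrates only that textbook property and is not the dissertation's implementation. The function names are hypothetical, the generator is fixed to n + 1 (a common simplification), and the primes are toy-sized and completely insecure.

```python
import math
import random

def keygen(p, q):
    """Toy Paillier key pair from two small primes (assumed, insecure sizes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse; valid because the generator is n + 1
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) under public modulus n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:  # r must be a unit modulo n
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n, priv, c):
    """Recover the plaintext from ciphertext c."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def add_encrypted(n, c1, c2):
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    return (c1 * c2) % (n * n)

n, priv = keygen(1009, 1013)
total = add_encrypted(n, encrypt(n, 12), encrypt(n, 30))
print(decrypt(n, priv, total))  # prints 42
```

A party holding only the public modulus can therefore add encrypted measurements without ever seeing them; a real deployment would use a vetted library and primes of cryptographic size.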

    Causal Discovery for Relational Domains: Representation, Reasoning, and Learning

    Many domains are currently experiencing the growing trend to record and analyze massive, observational data sets with increasing complexity. A commonly made claim is that these data sets hold potential to transform their corresponding domains by providing previously unknown or unexpected explanations and enabling informed decision-making. However, only knowledge of the underlying causal generative process, as opposed to knowledge of associational patterns, can support such tasks. Most methods for traditional causal discovery—the development of algorithms that learn causal structure from observational data—are restricted to representations that require limiting assumptions on the form of the data. Causal discovery has almost exclusively been applied to directed graphical models of propositional data that assume a single type of entity with independence among instances. However, most real-world domains are characterized by systems that involve complex interactions among multiple types of entities. Many state-of-the-art methods in statistics and machine learning that address such complex systems focus on learning associational models, and they are oftentimes mistakenly interpreted as causal. The intersection between causal discovery and machine learning in complex systems is small. The primary objective of this thesis is to extend causal discovery to such complex systems. Specifically, I formalize a relational representation and model that can express the causal and probabilistic dependencies among the attributes of interacting, heterogeneous entities. I show that the traditional method for reasoning about statistical independence from model structure fails to accurately derive conditional independence facts from relational models. 
I introduce a new theory—relational d-separation—and a novel, lifted representation—the abstract ground graph—that supports a sound, complete, and computationally efficient method for algorithmically deriving conditional independencies from probabilistic models of relational data. The abstract ground graph representation also presents causal implications that enable the detection of causal direction for bivariate relational dependencies without parametric assumptions. I leverage these implications and the theoretical framework of relational d-separation to develop a sound and complete algorithm—the relational causal discovery (RCD) algorithm—that learns causal structure from relational data

    Advances in molecular quantum chemistry contained in the Q-Chem 4 program package

    A summary of the technical advances that are incorporated in the fourth major release of the Q-CHEM quantum chemistry program is provided, covering approximately the last seven years. These include developments in density functional theory methods and algorithms, nuclear magnetic resonance (NMR) property evaluation, coupled cluster and perturbation theories, methods for electronically excited and open-shell species, tools for treating extended environments, algorithms for walking on potential surfaces, analysis tools, energy and electron transfer modelling, parallel computing capabilities, and graphical user interfaces. In addition, a selection of example case studies that illustrate these capabilities is given. These include extensive benchmarks of the comparative accuracy of modern density functionals for bonded and non-bonded interactions, tests of attenuated second order Møller–Plesset (MP2) methods for intermolecular interactions, a variety of parallel performance benchmarks, and tests of the accuracy of implicit solvation models. Some specific chemical examples include calculations on the strongly correlated Cr2 dimer, exploring zeolite-catalysed ethane dehydrogenation, energy decomposition analysis of a charged ter-molecular complex arising from glycerol photoionisation, and natural transition orbitals for a Frenkel exciton state in a nine-unit model of a self-assembling nanotube

    Semantic and effective communications

    Shannon and Weaver categorized communications into three levels of problems: the technical problem, which tries to answer the question "how accurately can the symbols of communication be transmitted?"; the semantic problem, which asks the question "how precisely do the transmitted symbols convey the desired meaning?"; the effectiveness problem, which strives to answer the question "how effectively does the received meaning affect conduct in the desired way?". Traditionally, communication technologies mainly addressed the technical problem, ignoring the semantics or the effectiveness problems. Recently, there has been increasing interest to address the higher level semantic and effectiveness problems, with proposals ranging from semantic to goal oriented communications. In this thesis, we propose to formulate the semantic problem as a joint source-channel coding (JSCC) problem and the effectiveness problem as a multi-agent partially observable Markov decision process (MA-POMDP). As such, for the semantic problem, we propose DeepWiVe, the first-ever end-to-end JSCC video transmission scheme that leverages the power of deep neural networks (DNNs) to directly map video signals to channel symbols, combining video compression, channel coding, and modulation steps into a single neural transform. We also further show that it is possible to use predefined constellation designs as well as secure the physical layer communication against eavesdroppers for deep learning (DL) driven JSCC schemes, making such schemes much more viable for deployment in the real world. For the effectiveness problem, we propose a novel formulation by considering multiple agents communicating over a noisy channel in order to achieve better coordination and cooperation in a multi-agent reinforcement learning (MARL) framework. 
Specifically, we consider a MA-POMDP, in which the agents, in addition to interacting with the environment, can also communicate with each other over a noisy communication channel. The noisy communication channel is considered explicitly as part of the dynamics of the environment, and the message each agent sends is part of the action that the agent can take. As a result, the agents learn not only to collaborate with each other but also to communicate "effectively" over a noisy channel. Moreover, we show that this framework generalizes both the semantic and technical problems. In both instances, we show that the resultant communication scheme is superior to one where the communication is considered separately from the underlying semantic or goal of the problem.

    Secure fingerprinting on sound foundations

    The rapid development and advancement of digital technologies open a variety of opportunities to consumers and content providers for using and trading digital goods. In this context, the Internet in particular has gained major ground as a worldwide platform for exchanging and distributing digital goods. Besides all its possibilities and advantages, digital technology can be misused to breach copyright regulations: unauthorized use and illegal distribution of intellectual property cause authors and content providers considerable loss. Protection of intellectual property has therefore become one of the major challenges of our information society. Fingerprinting is a key technology in copyright protection of intellectual property. Its goal is to deter people from copyright violation by making it possible to provably identify the source of illegally copied and redistributed content. As one of its focuses, this thesis considers the design and construction of various fingerprinting schemes and presents the first explicit, secure and reasonably efficient construction for a fingerprinting scheme which fulfills advanced security requirements such as collusion-tolerance, asymmetry, anonymity and direct non-repudiation. Crucial for the security of such schemes is a careful study of the underlying cryptographic assumptions. In the case of the fingerprinting scheme presented here, these are mainly assumptions related to discrete logarithms. The study and analysis of these assumptions is a further focus of this thesis. Based on the first thorough classification of assumptions related to discrete logarithms, this thesis gives novel insights into the relations between these assumptions.
In particular, depending on the underlying probability space, we present new results on the reducibility between some of these assumptions as well as on their reduction efficiency. Advances in digital technologies offer consumers, authors, and providers great potential for innovative business models for trading and using digital goods. The Internet is an attractive means of exchanging and distributing digital goods. Alongside its many advantages, however, digital technology can also be misused, for example to infringe copyright through the illegal use and distribution of content, which can cause considerable damage to the parties involved. The protection of intellectual property has therefore become one of the particular challenges of our digital age. Fingerprinting is a key technology for copyright protection. Its goal is to deter illegal copying and distribution of digital works by making it possible to identify a cheater and prove their misconduct. As one of its results, this dissertation provides the first explicit, secure, and efficient construction that accommodates particularly advanced security properties such as collusion tolerance, asymmetry, anonymity, and direct non-repudiation. Decisive for the security of cryptographic systems is the precise analysis of the cryptographic assumptions underlying them. The fingerprinting systems constructed in this dissertation rest mainly on cryptographic assumptions based on discrete logarithms. The investigation of these assumptions is a further focus of this dissertation. Based on a classification of these assumptions, undertaken here for the first time in the literature, new and far-reaching insights into their relationships are gained.
In particular, depending on the underlying probability space, new results are obtained on the reducibility of these assumptions and on their reduction efficiency.

    Embryonic Stem Cells

    Embryonic stem cells are one of the key building blocks of the emerging multidisciplinary field of regenerative medicine, and discoveries and new technology related to embryonic stem cells are being made at an ever increasing rate. This book provides a snapshot of some of the research occurring across a wide range of areas related to embryonic stem cells, including new methods, tools and technologies; new understandings about the molecular biology and pluripotency of these cells; as well as new uses for and sources of embryonic stem cells. The book will serve as a valuable resource for engineers, scientists, and clinicians as well as students in a wide range of disciplines