28 research outputs found

    A Bayesian Method for Matching Two Similar Graphs without Seeds

    Approximate graph matching (AGM) refers to the problem of mapping the vertices of two structurally similar graphs, which has applications in social networks, computer vision, chemistry, and biology. Given its computational cost, AGM has mostly been limited to either small graphs (e.g., tens or hundreds of nodes), or to large graphs in combination with side information beyond the graph structure (e.g., a seed set of pre-mapped node pairs). In this paper, we cast AGM in a Bayesian framework based on a clean definition of the probability of correctly mapping two nodes, which leads to a polynomial time algorithm that does not require side information. Node features such as degree and distances to other nodes are used as fingerprints. The algorithm proceeds in rounds, such that the most likely pairs are mapped first; these pairs subsequently generate additional features in the fingerprints of other nodes. We evaluate our method on real social networks and show that it achieves a very low matching error provided the two graphs are sufficiently similar. We also evaluate our method on random graph models to characterize its behavior under various levels of node clustering.
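The round-based matching described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation; the fingerprint definition, the scoring by absolute feature difference, and all function names here are invented for illustration:

```python
from collections import deque

def bfs_dists(adj, src):
    # Unweighted shortest-path distances from src (BFS).
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def fingerprint(adj, node, anchors):
    # Node features: degree plus distances to already-matched anchors.
    d = bfs_dists(adj, node)
    return [len(adj[node])] + [d.get(a, 10**6) for a in anchors]

def match(adj1, adj2, rounds):
    # Each round maps the best-scoring unmatched pair; matched pairs
    # become anchors that enrich the fingerprints of later rounds.
    mapping = {}
    for _ in range(rounds):
        a1 = list(mapping)
        a2 = [mapping[a] for a in a1]
        best = None
        for u in adj1:
            if u in mapping:
                continue
            fu = fingerprint(adj1, u, a1)
            for v in adj2:
                if v in mapping.values():
                    continue
                score = sum(abs(x - y)
                            for x, y in zip(fu, fingerprint(adj2, v, a2)))
                if best is None or score < best[0]:
                    best = (score, u, v)
        if best:
            mapping[best[1]] = best[2]
    return mapping

# Matching a 4-node path graph against an identical copy.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
mapping = match(adj, adj, rounds=4)
```

On identical graphs the greedy rounds recover a consistent vertex correspondence; real instances would need noise-tolerant scoring and probabilistic acceptance, which is where the paper's Bayesian machinery comes in.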

    Novel approaches to anonymity and privacy in decentralized, open settings

    The Internet has undergone dramatic changes in the last two decades, evolving from a mere communication network into a global multimedia platform on which billions of users actively exchange information. While this transformation has brought tremendous benefits to society, it has also created new threats to online privacy with which existing technology is failing to keep pace. In this dissertation, we present the results of two lines of research that developed two novel approaches to anonymity and privacy in decentralized, open settings. First, we examine the issue of attribute and identity disclosure in open settings and develop the novel notion of (k,d)-anonymity for open settings, which we extensively study and validate experimentally. Furthermore, we investigate the relationship between anonymity and linkability using the notion of (k,d)-anonymity and show that, in contrast to the traditional closed setting, anonymity within one online community does not necessarily imply unlinkability across different online communities in the decentralized, open setting. Second, we consider the transitive diffusion of information that is shared in social networks and spread through pairwise interactions of users connected in this social network. We develop the novel approach of exposure minimization to control the diffusion of information within an open network, allowing the owner to minimize its exposure by suitably choosing whom they share their information with. We implement our algorithms and investigate the practical limitations of user-side exposure minimization in large social networks.
At their core, both of these approaches present a departure from the provable privacy guarantees that we can achieve in closed settings and a step towards sound assessments of privacy risks in decentralized, open settings.
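The exposure-minimization idea can be made concrete with a toy diffusion model. This is an illustrative sketch, not the dissertation's algorithm: the independent-cascade forwarding model, the uniform forwarding probability, and all names are assumptions:

```python
import random

def expected_exposure(adj, p, seeds, trials=2000):
    # Monte-Carlo estimate of the expected number of users an item
    # eventually reaches when `seeds` receive it and every recipient
    # forwards it along each social link independently with
    # probability p (an independent-cascade style diffusion).
    rng = random.Random(0)
    total = 0
    for _ in range(trials):
        reached = set(seeds)
        frontier = list(seeds)
        while frontier:
            u = frontier.pop()
            for v in adj[u]:
                if v not in reached and rng.random() < p:
                    reached.add(v)
                    frontier.append(v)
        total += len(reached)
    return total / trials

# Star network: user 0 is a hub connected to users 1..5.
adj = {0: [1, 2, 3, 4, 5], **{i: [0] for i in range(1, 6)}}
exposure_hub = expected_exposure(adj, 0.5, [0])   # share with the hub
exposure_leaf = expected_exposure(adj, 0.5, [1])  # share with one leaf
```

Sharing with the hub yields a higher expected exposure than sharing with a leaf, so an exposure-minimizing owner would prefer the leaf; choosing the initial share set to minimize this quantity is the optimization problem the dissertation studies.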

    Privacy and Dynamics of Social Networks (PhD Thesis: pre-print)

    Over the past decade, investigations in different fields have focused on studying and understanding real networks, ranging from biological to social to technological. These networks, called complex networks, exhibit common topological features, such as a heavy-tailed degree distribution and the small world effect. In this thesis we address two interesting aspects of complex, and more specifically, social networks: (1) users' privacy, and the vulnerability of a network to user identification, and (2) dynamics, or the evolution of the network over time. For this purpose, we base our contributions on a central tool in the study of graphs and complex networks: graph sampling. We conjecture that each network snapshot can be treated as a sample from an underlying network. Using this, a sampling process can be viewed as a way to observe dynamic networks, and to model the similarity of two correlated graphs by assuming that the graphs are samples from an underlying generator graph. We take the thesis in two directions. For the first, we focus on the privacy problem in social networks. There have been hot debates on the extent to which the release of anonymized information to the public can leak personally identifiable information (PII). Recent works have shown methods that are able to infer true user identities, under certain conditions and by relying on side information. Our approach to this problem relies on the graph structure, where we investigate the feasibility of de-anonymizing an unlabeled social network by using the structural similarity to an auxiliary network. We propose a model where the two partially overlapping networks of interest are considered samples of an underlying graph. Using such a model, first, we propose a theoretical framework for the de-anonymization problem, we obtain minimal conditions under which de-anonymization is feasible, and we establish a threshold on the similarity of the two networks above which anonymity could be lost. 
Then, we propose a novel algorithm based on a Bayesian framework, which is capable of matching two graphs of thousands of nodes with no side information other than the network structures. Our method has several potential applications, e.g., inferring user identities in an anonymized network by using a similar public network, cross-referencing dictionaries of different languages, correlating data from different domains, etc. We also introduce a novel privacy-preserving mechanism for social recommender systems, where users can receive accurate recommendations while hiding their profiles from an untrusted recommender server. For the second direction of this work, we focus on models for network growth, more specifically for network densification, by using a sampling process. The densification phenomenon has been recently observed in various real networks, and we argue that it can be explained simply through the way we observe (sample) the networks. We introduce a process of sampling the edges of a fixed graph, which results in the super-linear growth of edges versus nodes and the increase of the average degree as the network evolves.
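The edge-sampling explanation of densification is easy to demonstrate in a toy experiment. This is an illustrative sketch rather than the thesis's model; the choice of a complete underlying graph and the function names are assumptions:

```python
import random

def observation_curve(edges, step):
    # Observe the edges of a FIXED underlying graph in random order
    # and record (distinct nodes seen, edges seen) at checkpoints.
    rng = random.Random(1)
    edges = edges[:]
    rng.shuffle(edges)
    seen, curve = set(), []
    for i, (u, v) in enumerate(edges, 1):
        seen.update((u, v))
        if i % step == 0:
            curve.append((len(seen), i))
    return curve

# Fixed underlying graph: the complete graph on 40 nodes (780 edges).
n = 40
all_edges = [(u, v) for u in range(n) for v in range(u + 1, n)]
curve = observation_curve(all_edges, step=100)
avg_degree = [2 * e / v for v, e in curve]
# avg_degree rises at every checkpoint: the observed network appears
# to densify even though the underlying graph never changes.
```

The node count saturates quickly while observed edges keep accumulating, so edges grow super-linearly in nodes: apparent densification produced purely by the sampling process.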

    LIPIcs, Volume 248, ISAAC 2022, Complete Volume

    LIPIcs, Volume 248, ISAAC 2022, Complete Volume.

    SWKM 2008: Social Web and Knowledge Management, Proceedings: CEUR Workshop Proceedings


    A treatment of stereochemistry in computer aided organic synthesis

    This thesis describes the author’s contributions to a new stereochemical processing module constructed for the ARChem retrosynthesis program. The purpose of the module is to add the ability to perform enantioselective and diastereoselective retrosynthetic disconnections and generate appropriate precursor molecules. The module uses evidence based rules generated from a large database of literature reactions. Chapter 1 provides an introduction and critical review of the published body of work for computer aided synthesis design. The role of computer perception of key structural features (rings, functional groups, etc.) and the construction and use of reaction transforms for generating precursors is discussed. Emphasis is also given to the application of strategies in retrosynthetic analysis. The availability of large reaction databases has enabled a new generation of retrosynthesis design programs to be developed that use automatically generated transforms assembled from published reactions. A brief description of the transform generation method employed by ARChem is given. Chapter 2 describes the algorithms devised by the author for handling the computer recognition and representation of the stereochemical features found in molecule and reaction scheme diagrams. The approach is generalised and uses flexible recognition patterns to transform information found in chemical diagrams into concise stereo descriptors for computer processing. An algorithm for efficiently comparing and classifying pairs of stereo descriptors is described. This algorithm is central for solving the stereochemical constraints in a variety of substructure matching problems addressed in chapter 3. The concise representation of reactions and transform rules as hyperstructure graphs is described. Chapter 3 is concerned with the efficient and reliable detection of stereochemical symmetry in molecules, reactions, and rules.
A novel symmetry perception algorithm, based on a constraint satisfaction problem (CSP) solver, is described. The use of a CSP solver to implement an isomorph‐free matching algorithm for stereochemical substructure matching is detailed. The prime function of this algorithm is to seek out unique retron locations in target molecules and then to generate precursor molecules without duplications due to symmetry. Novel algorithms for classifying asymmetric, pseudo‐asymmetric and symmetric stereocentres; meso, centro, and C2 symmetric molecules; and the stereotopicity of trigonal (sp2) centres are described. Chapter 4 introduces and formalises the annotated structural language used to create both retrosynthetic rules and the patterns used for functional group recognition. A novel functional group recognition package is described along with its use to detect important electronic features such as electron‐withdrawing or donating groups and leaving groups. The functional groups and electronic features are used as constraints in retron rules to improve transform relevance. Chapter 5 details the approach taken to design detailed stereoselective and substrate controlled transforms from organised hierarchies of rules. The rules employ a rich set of constraint annotations that concisely describe the keying retrons. The application of the transforms for collating evidence based scoring parameters from published reaction examples is described. A survey of available reaction databases and the techniques for mining stereoselective reactions is demonstrated. A data mining tool was developed for finding the best reputable stereoselective reaction types for coding as transforms. For various reasons it was not possible during the research period to fully integrate this work with the ARChem program. Instead, Chapter 6 introduces a novel one‐step retrosynthesis module to test the developed transforms.
The retrosynthesis algorithms use the organisation of the transform rule hierarchy to efficiently locate the best retron matches using all applicable stereoselective transforms. This module was tested using a small set of selected target molecules, and the generated routes were ranked using a series of measured parameters including: stereocentre clearance and bond cleavage; example reputation; estimated stereoselectivity with reliability; and evidence of tolerated functional groups. In addition, a method for detecting regioselectivity issues is presented. This work presents a number of algorithms using common set and graph theory operations and notations. Appendix A lists the set theory symbols and meanings. Appendix B summarises and defines the common graph theory terminology used throughout this thesis.
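The role of graph automorphisms in symmetry perception can be sketched with a brute-force toy. In the actual module a CSP solver replaces this exhaustive enumeration; the function names, graph encoding, and example molecule here are invented for illustration:

```python
from itertools import permutations

def atom_orbits(adj, labels):
    # Exhaustive symmetry perception for a tiny molecular graph:
    # enumerate atom permutations, keep those that preserve both
    # bonds and atom labels (the automorphisms), and collect each
    # atom's orbit. Symmetry-equivalent atoms share an orbit.
    nodes = sorted(adj)
    bonds = {frozenset((u, v)) for u in adj for v in adj[u]}
    orbit = {u: {u} for u in nodes}
    for perm in permutations(nodes):
        m = dict(zip(nodes, perm))
        if any(labels[u] != labels[m[u]] for u in nodes):
            continue
        if {frozenset(m[x] for x in b) for b in bonds} != bonds:
            continue
        for u in nodes:
            orbit[u].add(m[u])
    return {frozenset(s) for s in orbit.values()}

# Propane skeleton (C-C-C): the two terminal carbons are equivalent.
adj = {0: [1], 1: [0, 2], 2: [1]}
classes = atom_orbits(adj, {0: "C", 1: "C", 2: "C"})
```

Knowing which atoms are equivalent is what lets a retrosynthesis engine avoid generating duplicate precursors from symmetric retron placements; relabelling one terminal atom (say, to "O") breaks the symmetry and every atom falls into its own orbit.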

    Computational Approaches to Drug Profiling and Drug-Protein Interactions

    Despite substantial increases in R&D spending within the pharmaceutical industry, de novo drug design has become a time-consuming endeavour. High attrition rates led to a long period of stagnation in drug approvals. Due to the extreme costs associated with introducing a drug to the market, locating and understanding the reasons for clinical failure is key to future productivity. As part of this PhD, three main contributions were made in this respect. First, the web platform LigNFam enables users to interactively explore similarity relationships between ‘drug like’ molecules and the proteins they bind. Secondly, two deep-learning-based binding site comparison tools were developed, competing with the state-of-the-art over benchmark datasets. The models have the ability to predict off-target interactions and potential candidates for target-based drug repurposing. Finally, the open-source ScaffoldGraph software was presented for the analysis of hierarchical scaffold relationships and has already been used in multiple projects, including integration into a virtual screening pipeline to increase the tractability of ultra-large screening experiments. Together, and with existing tools, the contributions made will aid in the understanding of drug-protein relationships, particularly in the fields of off-target prediction and drug repurposing, helping to design better drugs faster.

    Topology Reconstruction of Dynamical Networks via Constrained Lyapunov Equations

    The network structure (or topology) of a dynamical network is often unavailable or uncertain. Hence, we consider the problem of network reconstruction. Network reconstruction aims at inferring the topology of a dynamical network using measurements obtained from the network. In this technical note we define the notion of solvability of the network reconstruction problem. Subsequently, we provide necessary and sufficient conditions under which the network reconstruction problem is solvable. Finally, using constrained Lyapunov equations, we establish novel network reconstruction algorithms, applicable to general dynamical networks. We also provide specialized algorithms for specific network dynamics, such as the well-known consensus and adjacency dynamics.
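To make the reconstruction problem concrete, here is a toy sketch that recovers the state matrix of a simulated consensus network from trajectory data by plain least squares. It illustrates the problem only, not the constrained-Lyapunov-equation method of the note; the graph, step size, and all variable names are assumptions:

```python
import numpy as np

# Ground-truth consensus dynamics xdot = -L x on a 3-node path graph.
L = np.array([[ 1., -1.,  0.],
              [-1.,  2., -1.],
              [ 0., -1.,  1.]])
A_true = -L

dt, steps = 1e-3, 400
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 5))      # 5 runs with random initial states
traj = [X]
for _ in range(steps):
    X = X + dt * (A_true @ X)        # forward-Euler simulation
    traj.append(X)

S = np.hstack(traj[:-1])             # stacked state samples
D = np.hstack([(b - a) / dt          # finite-difference derivatives
               for a, b in zip(traj, traj[1:])])
A_est = D @ np.linalg.pinv(S)        # least-squares fit of xdot = A x
```

With rich enough excitation the fit recovers -L, and the recovered off-diagonal pattern reveals the topology. The paper's contribution is precisely the harder setting where such direct fits are not available and the reconstruction must be posed via constrained Lyapunov equations.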