On Submodularity and Controllability in Complex Dynamical Networks
Controllability and observability have long been recognized as fundamental
structural properties of dynamical systems, but have recently seen renewed
interest in the context of large, complex networks of dynamical systems. A
basic problem is sensor and actuator placement: choose a subset from a finite
set of possible placements to optimize some real-valued controllability and
observability metrics of the network. Surprisingly little is known about the
structure of such combinatorial optimization problems. In this paper, we show
that several important classes of metrics based on the controllability and
observability Gramians have a strong structural property that allows for either
efficient global optimization or an approximation guarantee by using a simple
greedy heuristic for their maximization. In particular, the mapping from
possible placements to several scalar functions of the associated Gramian is
either a modular or submodular set function. The results are illustrated on
randomly generated systems and on a problem of power electronic actuator
placement in a model of the European power grid.

Comment: Original arXiv version of the IEEE Transactions on Control of Network
Systems paper (Volume 3, Issue 1), with an addendum (located in the ancillary
documents) that explains an error in a proof of the original paper and
provides a counterexample to the corresponding result.
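The greedy heuristic referenced in the abstract can be sketched on a generic monotone submodular objective. The coverage function, candidate set, and budget below are illustrative stand-ins for the paper's Gramian-based metrics, not part of the original work:

```python
# Greedy maximization of a monotone submodular set function. For such
# functions the greedy rule enjoys the classical (1 - 1/e) guarantee.

def greedy_max(ground_set, f, k):
    """Pick k elements greedily, each time adding the largest marginal gain."""
    chosen = set()
    for _ in range(k):
        best = max((e for e in ground_set if e not in chosen),
                   key=lambda e: f(chosen | {e}) - f(chosen))
        chosen.add(best)
    return chosen

# Illustrative instance: each candidate "actuator" covers a set of nodes,
# and f(S) counts how many nodes the selection covers (monotone submodular).
covers = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6}, "d": {1, 6}}

def coverage(S):
    return len(set().union(*(covers[e] for e in S))) if S else 0

picked = greedy_max(covers.keys(), coverage, k=2)
print(sorted(picked), coverage(picked))  # ['a', 'c'] 6
```

For modular metrics this loop is exactly optimal; for merely submodular ones it retains the (1 - 1/e) approximation guarantee the abstract refers to.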
Data Stream Algorithms for Large Graphs and High Dimensional Data
In contrast to the traditional random-access-memory computational model, where the entire input is available in working memory, the data stream model provides only sequential access to the input. The data stream model is a natural framework for handling large and dynamic data. In this model, we focus on designing algorithms that use sublinear memory and a small number of passes over the stream. Other desirable properties include fast update time, query time, and post-processing time.
In this dissertation, we consider different problems in graph theory, combinatorial optimization, and high dimensional data processing.
The first part of this dissertation focuses on algorithms for graph theory and combinatorial optimization. We present new results for the problems of finding the densest subgraph, counting the number of triangles, finding max cut with bounded components, and finding the maximum set coverage.
The second part of this dissertation considers problems in high dimensional data streams. In this setting, each stream item consists of multiple coordinates corresponding to different attributes. We consider the problem of testing or learning the relationships among the attributes, and the problem of finding heavy hitters in subsets of attributes.
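As a concrete example of a one-pass, small-memory streaming primitive in this spirit (a classical algorithm, not necessarily one of the dissertation's own), the Misra-Gries summary finds every item occurring more than n/k times using only k-1 counters:

```python
# Misra-Gries summary: any item occurring more than n/k times in a stream of
# length n is guaranteed to survive among the k-1 counters after one pass.

def misra_gries(stream, k):
    counters = {}
    for x in stream:
        if x in counters:
            counters[x] += 1
        elif len(counters) < k - 1:
            counters[x] = 1
        else:
            # Decrement every counter; drop counters that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters

stream = ["a", "b", "a", "c", "a", "b", "a", "d", "a"]
print(misra_gries(stream, k=3))  # {'a': 3}: only "a" occurs > 9/3 times
```

A second pass over the stream can then verify the exact counts of the surviving candidates.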
Optimization and revenue management in complex networks
This thesis consists of three papers in optimization and revenue management over complex networks: Robust Linear Control in Transmission Systems, Online Learning and Optimization Under a New Linear-Threshold Model with Negative Influence, and Revenue Management with Complementarity Products. This thesis contributes analytical methods for optimization problems in complex networks, namely power networks, social networks, and product networks.
In Chapter 2, we describe a robust multiperiod transmission planning model including renewables and batteries, where battery output is used to partly offset renewable output deviations from the forecast. A central element is a nonconvex battery operation model, combined with a robust model of forecast errors and a linear control scheme. Even though the problem is nonconvex, we provide an efficient and theoretically valid algorithm that effectively solves cases on large transmission systems.
In Chapter 3, we propose a new class of Linear Threshold Model-based information-diffusion models that incorporate the formation and spread of negative attitudes. We call such models negativity-aware. We show that in these models, the expected positive influence is a monotone submodular function of the seed set. Thus we can use a greedy algorithm to construct a solution with a constant approximation guarantee when the objective is to select a seed set of fixed size to maximize positive influence. Our models are flexible enough to account for both the features of local users and the features of the information being propagated in the diffusion. We analyze an online-learning setting for a multi-round influence-maximization problem, where an agent actively learns the diffusion parameters over time while trying to maximize total cumulative positive influence. We develop a class of online-learning algorithms and provide theoretical upper bounds on the regret.
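For intuition, the classical Linear Threshold diffusion that Chapter 3 builds on can be simulated in a few lines. The graph, edge weights, and thresholds below are illustrative, and the negativity-aware extension is not modeled:

```python
# Minimal Linear Threshold diffusion: a node activates once the total weight
# of its active in-neighbors reaches its threshold; repeat until stable.

def lt_spread(in_edges, thresholds, seeds):
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v, nbrs in in_edges.items():
            if v not in active:
                if sum(w for u, w in nbrs.items() if u in active) >= thresholds[v]:
                    active.add(v)
                    changed = True
    return active

in_edges = {                      # in_edges[v][u] = weight of edge u -> v
    "b": {"a": 0.6},
    "c": {"a": 0.3, "b": 0.4},
    "d": {"c": 0.2},
}
thresholds = {"a": 0.5, "b": 0.5, "c": 0.5, "d": 0.5}

print(sorted(lt_spread(in_edges, thresholds, seeds={"a"})))  # ['a', 'b', 'c']
```

In the probabilistic LT model the thresholds are drawn uniformly at random, which is what makes the expected spread a submodular function of the seed set.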
In Chapter 4, we propose a tractable information-diffusion-based framework to capture complementary relationships among products. Using this framework, we investigate how various revenue-management decisions can be optimized. In particular, we prove that several fundamental problems involving complementary products, such as promotional pricing, product recommendation, and category planning, can be formulated as submodular maximization problems and solved by tractable greedy algorithms with guarantees on the quality of the solutions. We validate our model using a dataset that contains product reviews and metadata from Amazon from May 1996 to July 2014.
We also analyze an online-learning setting for revenue maximization with complementary products. In this setting, we assume that the retailer has access only to sales observations; that is, she can only observe whether a product is purchased from her. This assumption leads to diffusion models with novel node-level feedback, in contrast to classical models with edge-level feedback. We conduct a confidence-region analysis of the maximum likelihood estimator for our models, develop online-learning algorithms, and analyze their performance from both theoretical and practical perspectives.
Privacy Preserving Data Publishing
Recent years have witnessed increasing interest among researchers in protecting individual privacy in the big data era, involving social media, genomics, and the Internet of Things. Recent studies have revealed numerous privacy threats and privacy protection methodologies that vary across a broad range of applications. To date, however, no powerful methodologies exist for addressing the challenges posed by high-dimensional data, highly correlated data, and powerful attackers.
In this dissertation, two critical problems are investigated: the prospects and challenges of elucidating the capabilities of attackers in mining individuals’ private information; and methodologies that can be used to protect against such inference attacks while guaranteeing significant data utility.
First, this dissertation proposes a series of works on inference attacks, with emphasis on protecting against powerful adversaries holding auxiliary information. In the context of genomic data, the high dimensionality of the data makes analysis computationally challenging. This dissertation proves that the proposed attack can effectively infer the values of unknown SNPs and traits in linear time, which dramatically improves on traditional methods with exponential computation cost.
Second, providing differential privacy guarantees for high-dimensional, highly correlated data remains a challenging problem, due to high sensitivity, output scalability, and signal-to-noise ratio. Considering that a human genome contains tens of millions of genetic variants, it is infeasible for traditional methods to introduce noise to sanitize genomic data. This dissertation proposes a series of works and demonstrates that the proposed method satisfies differential privacy; moreover, data utility is improved compared with the state of the art by substantially lowering data sensitivity.
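For background, the textbook Laplace mechanism shows how calibrated noise yields epsilon-differential privacy; the count query and parameters below are illustrative, not the dissertation's genomic mechanism:

```python
import math
import random

# Laplace mechanism: releasing f(D) + Lap(sensitivity / epsilon) satisfies
# epsilon-differential privacy for a query f with the given L1 sensitivity.

def laplace_noise(scale, rng):
    # Inverse-CDF sampling of a Laplace(0, scale) variate.
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def laplace_mechanism(true_answer, sensitivity, epsilon, rng):
    return true_answer + laplace_noise(sensitivity / epsilon, rng)

rng = random.Random(0)            # fixed seed, for reproducibility only
count = 42                        # e.g. a counting query, sensitivity 1
noisy = laplace_mechanism(count, sensitivity=1.0, epsilon=0.5, rng=rng)
print(noisy)
```

The tension the dissertation targets is visible here: the noise scale grows with sensitivity, so high-sensitivity, high-dimensional queries drown the signal unless sensitivity is lowered first.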
Third, providing privacy guarantees in social data publishing remains a challenging problem, due to the required tradeoff between data privacy and utility. This dissertation proposes a series of works and demonstrates that the proposed methods can effectively realize the privacy-utility tradeoff in data publishing.
Finally, two future research topics are proposed. The first topic is Privacy Preserving Data Collection and Processing for the Internet of Things. The second topic is Privacy Preserving Big Data Aggregation. Both are motivated by recently proposed data mining, artificial intelligence, and cybersecurity methods.
A Survey on Influence Maximization: From an ML-Based Combinatorial Optimization
Influence Maximization (IM) is a classical combinatorial optimization
problem, which can be widely used in mobile networks, social computing, and
recommendation systems. It aims at selecting a small number of users so as to
maximize the influence spread across an online social network. Because of
its potential commercial and academic value, many researchers have studied
the IM problem from different perspectives. The main
challenge comes from the NP-hardness of the IM problem and \#P-hardness of
estimating the influence spread, thus traditional algorithms for overcoming
them can be categorized into two classes: heuristic algorithms and
approximation algorithms. However, heuristic algorithms come with no
theoretical guarantee, and the design of approximation algorithms is close to
its theoretical limits, so it is difficult to further improve their
performance. With the rapid development of artificial intelligence,
technology based on Machine Learning (ML) has achieved remarkable results in
many fields. In view of this, in recent years, a number of new methods have
emerged to solve combinatorial optimization problems by using ML-based
techniques. These methods have the advantages of fast solving speed and strong
generalization ability to unknown graphs, which provide a brand-new direction
for solving combinatorial optimization problems. Therefore, we abandon the
traditional algorithms based on iterative search and review the recent
development of ML-based methods, especially Deep Reinforcement Learning, to
solve the IM problem and other variants in social networks. We focus on
summarizing the relevant background knowledge, basic principles, common
methods, and applied research. Finally, the challenges that need to be solved
urgently in future IM research are pointed out.

Comment: 45 pages
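The #P-hardness of spread estimation noted above is usually sidestepped with Monte Carlo simulation. A minimal sketch under the Independent Cascade model, on an illustrative toy graph, might look like:

```python
import random

# Monte Carlo estimate of the expected influence spread under the Independent
# Cascade model: each newly activated node gets one chance to activate each
# out-neighbor, independently with the edge probability.

def ic_spread(graph, seeds, rng):
    active, frontier = set(seeds), list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v, p in graph.get(u, {}).items():
                if v not in active and rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return len(active)

def estimate_spread(graph, seeds, runs=10000, seed=0):
    rng = random.Random(seed)
    return sum(ic_spread(graph, seeds, rng) for _ in range(runs)) / runs

graph = {0: {1: 0.5, 2: 0.5}, 1: {3: 0.5}, 2: {3: 0.5}}
print(estimate_spread(graph, seeds=[0]))  # exact expectation is 2.4375
```

The ML-based approaches surveyed in the paper aim to replace exactly this kind of expensive repeated simulation with a learned estimator that generalizes to unseen graphs.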
High Dimensional Learning with Structure Inducing Constraints and Regularizers
University of Minnesota Ph.D. dissertation. August 2017. Major: Computer Science. Advisor: Arindam Banerjee. 1 computer file (PDF); ix, 127 pages.

Explosive growth in data generation through science and technology calls for new computational and analytical tools. To the statistical machine learning community, one major challenge is data sets with dimensions larger than the number of samples. The low sample-high dimension regime violates the core assumption of most traditional learning methods. To address this new challenge, many high-dimensional learning algorithms have been developed over the past decade. One of the significant high-dimensional problems in machine learning is linear regression where the number of features is greater than the number of samples. Initially, the primary focus of the high-dimensional linear regression literature was on estimating sparse coefficients through $\ell_1$-norm regularization. In a more general framework, one can assume that the underlying parameter has an intrinsic ``low dimensional complexity'' or \emph{structure}. Recently, researchers have looked at structures beyond sparsity that are induced by \emph{any norm} as the regularizer or constraint.

In this thesis, we focus on two variants of the high-dimensional linear model, i.e., data sharing and errors-in-variables, where the structure of the parameter is captured with a suitable norm. We introduce estimators for these models and study their theoretical properties. We characterize the sample complexity of our estimators and establish non-asymptotic high-probability error bounds for them. Finally, we utilize dictionary learning and sparse coding to perform Twitter sentiment analysis as an application of high-dimensional learning. Some discrete machine learning problems can also be posed as constrained set function optimization, where the constraints induce a structure over the solution set.
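The sparsity-inducing effect of the sparse-regression regularizer mentioned above comes from its proximal operator, soft-thresholding, which is the core step of solvers such as ISTA. A generic illustration (not the thesis's estimators):

```python
# Soft-thresholding, the proximal operator of t * ||.||_1:
#   prox(x)_i = sign(x_i) * max(|x_i| - t, 0)
# It shrinks every coefficient toward zero and sets small ones exactly to
# zero, which is how l1 regularization produces sparse solutions.

def soft_threshold(x, t):
    return [max(abs(xi) - t, 0.0) * (1.0 if xi > 0 else -1.0) for xi in x]

coeffs = [0.9, -0.2, 0.05, -1.4]
print(soft_threshold(coeffs, t=0.3))  # small entries are zeroed out
```

General norm regularizers of the kind studied in the thesis play the same structure-inducing role through their own proximal operators or constraint sets.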
In the second part of the thesis, we investigate a prominent set function optimization problem, social influence maximization, under the novel ``heat conduction'' influence propagation model. We formulate the problem as submodular maximization with a cardinality constraint and provide an efficient algorithm for it. Through extensive experiments on several large real and synthetic networks, we show that our algorithm outperforms well-studied methods from the influence maximization literature.