
    Learning Set Functions that are Sparse in Non-Orthogonal Fourier Bases

    Many applications of machine learning on discrete domains, such as learning preference functions in recommender systems or auctions, can be reduced to estimating a set function that is sparse in the Fourier domain. In this work, we present a new family of algorithms for learning Fourier-sparse set functions. They require at most $nk - k \log_2 k + k$ queries (set function evaluations), under mild conditions on the Fourier coefficients, where $n$ is the size of the ground set and $k$ the number of non-zero Fourier coefficients. In contrast to other work that focused on the orthogonal Walsh-Hadamard transform, our novel algorithms operate with recently introduced non-orthogonal Fourier transforms that offer different notions of Fourier-sparsity. These naturally arise when modeling, e.g., sets of items forming substitutes and complements. We demonstrate effectiveness on several real-world applications.
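    A minimal illustration of the setting, not the paper's algorithm: the sketch below evaluates a hypothetical toy valuation on all $2^n$ subsets and applies the classical orthogonal Walsh-Hadamard transform, showing that complement-style item interactions make the function Fourier-sparse. The paper's algorithms instead use non-orthogonal transforms and need only about $nk - k \log_2 k + k$ queries rather than all $2^n$.

```python
import numpy as np

n = 4  # toy ground set {0, 1, 2, 3}

def set_function(S):
    """Hypothetical valuation: items 0 and 1 are complements, item 2 is additive."""
    return 3.0 * (0 in S and 1 in S) + 1.0 * (2 in S)

# Evaluate f on every subset, indexing subsets by characteristic bit vectors.
f = np.array([set_function({i for i in range(n) if (b >> i) & 1})
              for b in range(2 ** n)])

# Classical orthogonal Walsh-Hadamard transform via the Kronecker construction.
H = np.array([[1.0]])
for _ in range(n):
    H = np.kron(np.array([[1.0, 1.0], [1.0, -1.0]]), H)
coeffs = H @ f / (2 ** n)

k = int(np.sum(~np.isclose(coeffs, 0)))
print(f"{k} of {2 ** n} Walsh-Hadamard coefficients are non-zero")
```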

    Fourier Analysis-based Iterative Combinatorial Auctions

    Recent advances in Fourier analysis have brought new tools to efficiently represent and learn set functions. In this paper, we bring the power of Fourier analysis to the design of combinatorial auctions (CAs). The key idea is to approximate bidders' value functions using Fourier-sparse set functions, which can be computed using a relatively small number of queries. Since this number is still too large for real-world CAs, we propose a new hybrid design: we first use neural networks to learn bidders' values and then apply Fourier analysis to the learned representations. On a technical level, we formulate a Fourier transform-based winner determination problem and derive its mixed integer program formulation. Based on this, we devise an iterative CA that asks Fourier-based queries. We experimentally show that our hybrid ICA achieves higher efficiency than prior auction designs, leads to a fairer distribution of social welfare, and significantly reduces runtime. With this paper, we are the first to leverage Fourier analysis in CA design and lay the foundation for future work in this area.
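    To make the winner determination step concrete, here is a hedged, brute-force sketch, not the paper's mixed integer program: it enumerates all allocations of a tiny item set to two bidders and picks the welfare-maximizing one. The toy valuations below are hypothetical stand-ins for the learned Fourier-sparse approximations.

```python
import itertools

items = (0, 1, 2)
bidders = ("alice", "bob")

def value(bidder, bundle):
    """Hypothetical toy valuations standing in for learned Fourier-sparse ones."""
    bundle = set(bundle)
    if bidder == "alice":                  # items 0 and 1 are complements
        return 5.0 if {0, 1} <= bundle else 0.0
    return 2.0 * len(bundle & {1, 2})      # bob values items 1 and 2 additively

best_welfare, best_allocation = -1.0, None
# Winner determination: every item goes to exactly one bidder.
for assignment in itertools.product(bidders, repeat=len(items)):
    bundles = {b: {i for i, who in zip(items, assignment) if who == b}
               for b in bidders}
    welfare = sum(value(b, s) for b, s in bundles.items())
    if welfare > best_welfare:
        best_welfare, best_allocation = welfare, bundles

print(best_allocation, "welfare:", best_welfare)
```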

    Ensemble Analysis of Adaptive Compressed Genome Sequencing Strategies

    Acquiring genomes at single-cell resolution has many applications, such as in the study of microbiota. However, deep sequencing and assembly of all of the millions of cells in a sample is prohibitively costly. A property that can come to the rescue is that deep sequencing of every cell should not be necessary to capture all distinct genomes, as the majority of cells are biological replicates. Biologically important samples are often sparse in that sense. In this paper, we propose an adaptive compressed method, also known as distilled sensing, to capture all distinct genomes in a sparse microbial community with reduced sequencing effort. As opposed to group testing, in which the number of distinct events is often constant and sparsity is equivalent to rarity of an event, sparsity in our case means scarcity of distinct events in comparison to the data size. Previously, we introduced the problem and proposed a distilled sensing solution based on the breadth-first search strategy. We simulated the whole process, which constrained our ability to study the behavior of the algorithm for the entire ensemble due to its computational intensity. In this paper, we modify our previous breadth-first search strategy and introduce a depth-first search strategy. Instead of simulating the entire process, which is intractable for a large number of experiments, we provide a dynamic programming algorithm to analyze the behavior of the method for the entire ensemble. The ensemble analysis algorithm recursively calculates the probability of capturing every distinct genome as well as the expected total number of sequenced nucleotides for a given population profile. Our results suggest that the expected total number of sequenced nucleotides grows proportionally to the logarithm of the number of cells and linearly with the number of distinct genomes.
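    The following is a hedged illustration of the sparsity notion only, not the paper's dynamic program: a Monte Carlo estimate of the probability that sequencing a given number of uniformly sampled cells captures every distinct genome, for a hypothetical population profile in which most cells are replicates.

```python
import random

# Hypothetical population profile: number of cells carrying each distinct genome.
profile = {"genome_A": 9000, "genome_B": 900, "genome_C": 100}
cells = [g for g, count in profile.items() for _ in range(count)]

def capture_probability(cells_sequenced, trials=2000):
    """Estimate P(every distinct genome appears among the sequenced cells)."""
    hits = sum(set(random.sample(cells, cells_sequenced)) == set(profile)
               for _ in range(trials))
    return hits / trials

for m in (10, 50, 200):
    print(f"{m:4d} cells sequenced -> capture probability {capture_probability(m):.3f}")
```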

    Curvature and Optimal Algorithms for Learning and Minimizing Submodular Functions

    We investigate three related and important problems connected to machine learning: approximating a submodular function everywhere, learning a submodular function (in a PAC-like setting [53]), and constrained minimization of submodular functions. We show that the complexity of all three problems depends on the 'curvature' of the submodular function, and provide lower and upper bounds that refine and improve previous results [3, 16, 18, 52]. Our proof techniques are fairly generic. We either use a black-box transformation of the function (for approximation and learning), or a transformation of algorithms to use an appropriate surrogate function (for minimization). Curiously, curvature has been known to influence approximations for submodular maximization [7, 55], but its effect on minimization, approximation and learning has hitherto been open. We complete this picture, and also support our theoretical claims by empirical results. Comment: 21 pages. A shorter version appeared in Advances of NIPS 2013.
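    As a concrete anchor for the curvature notion, the sketch below computes the standard total curvature $\kappa_f = 1 - \min_{j \in V} \,[f(V) - f(V \setminus \{j\})] / f(\{j\})$ of a toy monotone submodular coverage function. The coverage sets are hypothetical, and this is only a definition check, not any of the paper's algorithms.

```python
# Hypothetical coverage sets; f(S) = number of elements covered is
# monotone submodular, so the total curvature below is well defined.
V = {0, 1, 2, 3}
covers = {0: {"a", "x"}, 1: {"b", "x"}, 2: {"c"}, 3: {"d"}}

def f(S):
    return len(set().union(*(covers[j] for j in S))) if S else 0

# Total curvature: kappa_f = 1 - min_j [f(V) - f(V \ {j})] / f({j}).
kappa = 1 - min((f(V) - f(V - {j})) / f({j}) for j in V)
print(f"total curvature kappa_f = {kappa:.2f}")  # 0 = modular, 1 = fully curved
```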

    Targeted Undersmoothing

    This paper proposes a post-model selection inference procedure, called targeted undersmoothing, designed to construct uniformly valid confidence sets for a broad class of functionals of sparse high-dimensional statistical models. These include dense functionals, which may potentially depend on all elements of an unknown high-dimensional parameter. The proposed confidence sets are based on an initially selected model and two additionally selected models, an upper model and a lower model, which enlarge the initially selected model. We illustrate application of the procedure in two empirical examples. The first example considers estimation of heterogeneous treatment effects using data from the Job Training Partnership Act of 1982, and the second example looks at estimating profitability from a mailing strategy based on estimated heterogeneous treatment effects in a direct mail marketing campaign. We also provide evidence on the finite sample performance of the proposed targeted undersmoothing procedure through a series of simulation experiments.
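    Below is a schematic sketch, not the paper's procedure: in a sparse linear model, a crude marginal-correlation screen plays the role of the initial model selector, and refitting ordinary least squares on an enlarged model shows how the confidence interval for a target coefficient changes. The selection and enlargement rules here are hypothetical stand-ins for the paper's upper- and lower-model construction.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
beta = np.zeros(p); beta[:2] = [1.0, 0.5]          # sparse ground truth
y = X @ beta + rng.standard_normal(n)

# Stand-in model selection: keep variables with large marginal correlation
# (a formal sparse estimator such as the lasso would be used in practice).
corr = np.abs(X.T @ y) / n
selected = np.flatnonzero(corr > 0.3)
enlarged = np.union1d(selected, np.argsort(corr)[-8:])  # crude enlarged model

def ols_ci(idx, target=0):
    """95% OLS confidence interval for the target coefficient on model idx."""
    Xs = X[:, idx]
    bhat, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    resid = y - Xs @ bhat
    sigma2 = resid @ resid / (n - len(idx))
    cov = sigma2 * np.linalg.inv(Xs.T @ Xs)
    j = int(np.flatnonzero(idx == target)[0])
    half = 1.96 * np.sqrt(cov[j, j])
    return bhat[j] - half, bhat[j] + half

print("initial model :", ols_ci(selected))
print("enlarged model:", ols_ci(enlarged))
```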