236 research outputs found
Lower bounds to randomized algorithms for graph properties
Abstract: For any property P on n-vertex graphs, let C(P) be the minimum number of edges that must be examined by any decision tree algorithm for determining P. In 1975 Rivest and Vuillemin settled the Aanderaa-Rosenberg Conjecture, proving that C(P) = Ω(n^2) for every nontrivial monotone graph property P. An intriguing open question is whether the theorem remains true when randomized algorithms are allowed. In this paper we show that Ω(n(log n)^{1/12}) edges need to be examined by any randomized algorithm for determining any nontrivial monotone graph property.
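The deterministic quantity C(P) in the abstract can be computed by brute force on tiny graphs via a minimax recursion: the algorithm picks the best edge to query, and an adversary picks the worst answer. The sketch below (illustrative, not from the paper) does this for the monotone property "the graph contains at least one edge" on 3 vertices, where every edge must be examined in the worst case.

```python
from itertools import combinations

n = 3
edges = list(combinations(range(n), 2))  # all possible edges of K_3

def has_edge(assignment):
    # Property P: "the graph contains at least one edge".
    return any(assignment[e] == 1 for e in edges)

def determined(known):
    # P is decided once every completion of the unknown edges agrees.
    unknown = [e for e in edges if e not in known]
    vals = set()
    for bits in range(2 ** len(unknown)):
        full = dict(known)
        for i, e in enumerate(unknown):
            full[e] = (bits >> i) & 1
        vals.add(has_edge(full))
    return len(vals) == 1

def cost(known):
    # Minimax: best next query versus the worst (adversarial) answer.
    if determined(known):
        return 0
    return min(
        1 + max(cost({**known, e: b}) for b in (0, 1))
        for e in edges if e not in known
    )

print(cost({}))  # 3: all edges must be examined, so P is evasive on K_3
```

The adversary simply answers "no edge" as long as possible, forcing all three queries; this is the n = 3 instance of the Ω(n^2) worst-case behavior for deterministic algorithms.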
Quantum replication at the Heisenberg limit
No process in nature can perfectly clone an arbitrary quantum state. But is
it possible to engineer processes that replicate quantum information with
vanishingly small error? Here we demonstrate the possibility of probabilistic
super-replication phenomena where N equally prepared quantum clocks are
transformed into a much larger number of M nearly perfect replicas, with an
error that rapidly vanishes whenever M is small compared to the square of N.
The quadratic replication rate is the ultimate limit imposed by Quantum
Mechanics to the proliferation of information and is fundamentally linked with
the Heisenberg limit of quantum metrology.
Comment: 9 + 16 pages, 2 figures, published version
Credible, Truthful, and Two-Round (Optimal) Auctions via Cryptographic Commitments
We consider the sale of a single item to multiple buyers by a
revenue-maximizing seller. Recent work of Akbarpour and Li formalizes
\emph{credibility} as an auction desideratum, and proves that the only optimal,
credible, strategyproof auction is the ascending price auction with reserves
(Akbarpour and Li, 2019).
In contrast, when buyers' valuations are MHR, we show that the mild
additional assumption of a cryptographically secure commitment scheme suffices
for a simple \emph{two-round} auction which is optimal, strategyproof, and
credible (even when the number of bidders is only known by the auctioneer).
We extend our analysis to the case when buyer valuations are
α-strongly regular for any α > 0, up to an arbitrarily small ε loss
in credibility. Interestingly, we also prove that this construction cannot be
extended to regular distributions, nor can the ε be removed with
multiple bidders.
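The cryptographic primitive the abstract relies on can be illustrated with a simple hash-based commit-then-reveal round, sketched below. This is only a toy model of the two-round idea, not the paper's actual construction: bidders first bind themselves to a bid with a commitment, then reveal; a second-price payment keeps the toy auction strategyproof. All names here are illustrative.

```python
import hashlib
import secrets

def commit(bid: int):
    # Round 1: hash the bid with fresh randomness. The auctioneer learns
    # nothing yet (hiding), but the bidder can no longer change the bid
    # (binding) -- assuming SHA-256 behaves as a secure commitment.
    nonce = secrets.token_bytes(32)
    digest = hashlib.sha256(nonce + bid.to_bytes(8, "big")).digest()
    return digest, nonce

def reveal_ok(digest: bytes, nonce: bytes, bid: int) -> bool:
    # Round 2: check a revealed (bid, nonce) pair against the commitment.
    return hashlib.sha256(nonce + bid.to_bytes(8, "big")).digest() == digest

def run_auction(bids, reserve=0):
    committed = [(bid, *commit(bid)) for bid in bids]          # round 1
    valid = [bid for bid, digest, nonce in committed
             if reveal_ok(digest, nonce, bid) and bid >= reserve]  # round 2
    if not valid:
        return None
    winner = max(valid)
    rest = sorted(valid, reverse=True)
    price = rest[1] if len(rest) > 1 else reserve  # second price
    return winner, price

print(run_auction([3, 7, 5], reserve=2))  # (7, 5)
```

The point of the commitment is that the auctioneer cannot adaptively fabricate bids after seeing others, which is the kind of deviation the credibility desideratum rules out.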
Meta Prompting for AI Systems
In this work, we present a comprehensive study of Meta Prompting (MP), an
innovative technique reshaping the utilization of language models (LMs) and AI
systems in problem-solving and data interaction. Grounded in type theory and
category theory, Meta Prompting emphasizes the structure and syntax of
information over traditional content-centric methods. The paper explores the
formal definitions of Meta Prompting, sets it apart from few-shot prompting,
and underlines its effectiveness in various AI applications. A key focus is
applying Meta Prompting for complex reasoning tasks, showing how it effectively
deconstructs intricate problems into simpler sub-problems, enhancing token
efficiency, and enabling more equitable problem-solving comparisons, especially
against few-shot prompting methods. Additionally, the paper introduces Meta
Prompting for prompting tasks, allowing LLMs to self-generate new prompts in a
recursive, metaprogramming-like manner. In empirical experiments, a Qwen-72B
base language model equipped with a meta prompt, without instruction tuning,
solves MATH problems with 46.3% accuracy, surpassing a supervised fine-tuned
counterpart trained on extensive mathematical QA instruction pairs and even the
initial version of GPT-4; the zero-shot meta-prompted Qwen-72B base model
solves GSM8K problems with 83.5% accuracy; and GPT-4 solves the Game of 24
tasks with a 100% success rate. These results demonstrate Meta Prompting's
efficacy in achieving high accuracy and efficiency, showcasing its
transformative impact on AI problem-solving. The code is available at
https://github.com/meta-prompting/meta-prompting
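The structure-over-content idea can be made concrete with a minimal sketch: a meta prompt supplies a fixed syntactic scaffold describing *how* to solve problems, rather than the worked examples that few-shot prompting would inject. The scaffold text and the `call_llm` parameter below are hypothetical stand-ins, not the repository's actual prompts.

```python
# Hypothetical meta prompt: a content-free scaffold reused across problems.
META_PROMPT = """You are solving a math problem.
1. Restate the problem in your own words.
2. Break it into sub-problems.
3. Solve each sub-problem step by step.
4. Combine the results and end with: Answer: <value>
"""

def meta_prompt(problem: str) -> str:
    # No worked examples are included: the scaffold describes the *shape*
    # of a solution, which is what distinguishes this from few-shot prompting.
    return f"{META_PROMPT}\nProblem: {problem}"

def solve(problem: str, call_llm) -> str:
    # `call_llm` is any callable mapping a prompt string to a completion.
    return call_llm(meta_prompt(problem))

# Usage with a stub in place of a real model:
print(solve("What is 2 + 3?", lambda p: "Answer: 5"))  # Answer: 5
```

Because the scaffold carries no problem-specific content, it costs a fixed number of tokens regardless of task, which is the token-efficiency point the abstract makes against few-shot baselines.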
Autonomous Data Selection with Language Models for Mathematical Texts
To improve language models' proficiency in mathematical reasoning via
continual pretraining, we introduce a novel strategy that leverages base
language models for autonomous data selection. Departing from conventional
supervised fine-tuning or trained classifiers with human-annotated data, our
approach Autonomous Data Selection (AutoDS) utilizes meta-prompted language
models as zero-shot verifiers to evaluate and select high-quality mathematical
content autonomously. To demonstrate the efficacy of our method, we
continuously pretrained a 7B-parameter language model on our curated dataset,
achieving substantial improvements in downstream performance on the MATH,
GSM8K, and BIG-Bench Hard (BBH) tasks with a token amount reduced by orders of
magnitude compared to previous continual pretraining works. Our method
achieves a twofold increase in pretraining token efficiency compared to
state-of-the-art baselines, underscoring the potential of our approach in
enhancing models' mathematical reasoning capabilities. The AutoMathText dataset
is available at https://huggingface.co/datasets/math-ai/AutoMathText. The code
is available at https://github.com/yifanzhang-pro/AutoMathText
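The zero-shot verifier idea can be sketched as follows: the language model is asked whether a passage is high-quality mathematical content, and the relative weight of its "YES" versus "NO" answer tokens becomes a selection score. The `yes_no_logits` callable below is a hypothetical stand-in for a real model call; the thresholding scheme is illustrative, not necessarily AutoDS's exact scoring rule.

```python
import math

def lm_score(logit_yes: float, logit_no: float) -> float:
    # Softmax over just the two answer tokens gives P("YES").
    m = max(logit_yes, logit_no)
    ey = math.exp(logit_yes - m)
    en = math.exp(logit_no - m)
    return ey / (ey + en)

def select(docs, yes_no_logits, threshold=0.5):
    # Keep documents the verifier judges to be quality math content.
    kept = []
    for doc in docs:
        ly, ln = yes_no_logits(doc)
        if lm_score(ly, ln) >= threshold:
            kept.append(doc)
    return kept

# Stub verifier: pretend passages containing an equation score high.
stub = lambda d: (2.0, -1.0) if "=" in d else (-1.0, 2.0)
print(select(["x^2 + y^2 = 1", "cat pictures"], stub))  # ['x^2 + y^2 = 1']
```

Because the verifier is zero-shot, no human-annotated training set or trained classifier is needed, which is the departure from conventional data-selection pipelines that the abstract highlights.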
Cumulative Reasoning with Large Language Models
Despite the recent advancements in language models (LMs), their ability to
solve complex problems remains limited. This paper introduces Cumulative
Reasoning (CR), a novel approach that utilizes LMs cumulatively and
iteratively, mirroring human thought processes for problem-solving. CR
decomposes tasks into smaller, manageable components and leverages previous
propositions for effective composition, significantly enhancing problem-solving
capabilities. We demonstrate CR's superiority through several complex reasoning
tasks: it outperforms existing methods in logical inference tasks with up to a
9.3% improvement, achieving 98.04% accuracy on the curated FOLIO wiki dataset.
In the Game of 24, it achieves 98% accuracy, marking a 24% improvement over the
prior state-of-the-art. Additionally, CR sets new state-of-the-art on the MATH
dataset, achieving a 4.2% increase from previous methods and a 43% relative
improvement in the most challenging problems. By extending CR to incorporate a
code environment without external aids like retrieval or web browsing, we
further harness the computational and logical reasoning capabilities of LMs,
achieving a remarkable 72.2% accuracy on the MATH dataset and outperforming the
PAL/PoT method by 38.8%. Our work not only sets new state-of-the-art but also
paves the way toward more sophisticated AI reasoning methods. The code is
available at https://github.com/iiis-ai/cumulative-reasoning
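The cumulative, iterative loop the abstract describes can be sketched generically: a proposer suggests a new proposition from the accumulated context, a verifier accepts or rejects it, and accepted propositions feed later steps until a reporter decides the question is answered. The three callables below are stubs standing in for LM calls; the loop shape is an assumption about the method, not code from the repository.

```python
def cumulative_reasoning(question, propose, verify, report, max_steps=10):
    # Accumulate verified intermediate propositions, reusing earlier ones
    # when composing later ones -- the "cumulative" part of CR.
    propositions = []
    for _ in range(max_steps):
        candidate = propose(question, propositions)
        if candidate is None:
            break  # proposer has nothing new to add
        if verify(question, propositions, candidate):
            propositions.append(candidate)
        if report(question, propositions):
            break  # enough has been established to answer
    return propositions

# Toy instance: derive "a > c" from premises "a > b" and "b > c".
propose = lambda q, ps: "a > c" if "a > c" not in ps else None
verify = lambda q, ps, c: True            # stub for an LM verifier
report = lambda q, ps: "a > c" in ps      # stop once the goal is derived
print(cumulative_reasoning("Is a > c?", propose, verify, report))
# ['a > c']
```

Keeping a growing pool of verified propositions, rather than a single linear chain, is what lets the decomposition recombine partial results across branches.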
PrivacyFL: A simulator for privacy-preserving and secure federated learning
Federated learning is a technique that enables distributed clients to
collaboratively learn a shared machine learning model while keeping their
training data localized. This reduces data privacy risks, however, privacy
concerns still exist since it is possible to leak information about the
training dataset from the trained model's weights or parameters. Setting up a
federated learning environment, especially with security and privacy
guarantees, is a time-consuming process with numerous configurations and
parameters that can be manipulated. In order to help clients ensure that
collaboration is feasible and to check that it improves their model accuracy, a
real-world simulator for privacy-preserving and secure federated learning is
required. In this paper, we introduce PrivacyFL, which is an extensible, easily
configurable and scalable simulator for federated learning environments. Its
key features include latency simulation, robustness to client departure,
support for both centralized and decentralized learning, and configurable
privacy and security mechanisms based on differential privacy and secure
multiparty computation. In this paper, we motivate our research, describe the
architecture of the simulator and associated protocols, and discuss its
evaluation in numerous scenarios that highlight its wide range of functionality
and its advantages. Our paper addresses a significant real-world problem:
checking the feasibility of participating in a federated learning environment
under a variety of circumstances. It also has a strong practical impact because
organizations such as hospitals, banks, and research institutes, which have
large amounts of sensitive data and would like to collaborate, would greatly
benefit from having a system that enables them to do so in a privacy-preserving
and secure manner.