GradientCoin: A Peer-to-Peer Decentralized Large Language Models
Since the proposal of the Bitcoin electronic cash system in 2008, Bitcoin has
fundamentally changed the economic system. Since 2022, large language models
(LLMs) such as GPT have outperformed humans in many real-life tasks. However,
these large language models have several practical issues. For example, each
model is centralized and controlled by a specific unit. One weakness is that if
that unit decides to shut down the model, it can no longer be used. A second
weakness is the lack of guaranteed integrity behind such a model: a dishonest
unit may design its own model and feed it unhealthy training data.
In this work, we propose a purely theoretical design of a decentralized LLM
that operates similarly to a Bitcoin cash system. However, implementing such a
system might encounter various practical difficulties. Furthermore, this new
system is unlikely to perform better than the standard Bitcoin system in
economics. Therefore, the motivation for designing such a system is limited. It
is likely that only two types of people would be interested in setting up a
practical system for it:
Those who prefer to use decentralized ChatGPT-like software.
Those who believe that the purpose of carbon-based life is to
create silicon-based life, such as Optimus Prime in Transformers.
The second type may be interested because it is possible that one day an AI
system like this will awaken and become the next level of intelligence on this
planet.
Open Problems in (Hyper)Graph Decomposition
Large networks are useful in a wide range of applications. Sometimes problem
instances are composed of billions of entities. Decomposing and analyzing these
structures helps us gain new insights about our surroundings. Even if the final
application concerns a different problem (such as traversal, finding paths,
trees, and flows), decomposing large graphs is often an important subproblem
for complexity reduction or parallelization. This report is a summary of
discussions that happened at Dagstuhl seminar 23331 on "Recent Trends in Graph
Decomposition" and presents currently open problems and future directions in
the area of (hyper)graph decomposition.
On Expressiveness, Inference, and Parameter Estimation of Discrete Sequence Models
Huge neural autoregressive sequence models have achieved impressive performance across different applications, such as NLP, reinforcement learning, and bioinformatics. However, some lingering problems (e.g., consistency and coherency of generated texts) continue to exist, regardless of the parameter count. In the first part of this thesis, we chart a taxonomy of the expressiveness of various sequence model families (Ch 3). In particular, we put forth complexity-theoretic proofs that string latent-variable sequence models are strictly more expressive than energy-based sequence models, which in turn are more expressive than autoregressive sequence models. Based on these findings, we introduce residual energy-based sequence models, a family of energy-based sequence models (Ch 4) whose sequence weights can be evaluated efficiently, and also perform competitively against autoregressive models. However, we show how unrestricted energy-based sequence models can suffer from uncomputability; and how such a problem is generally unfixable without knowledge of the true sequence distribution (Ch 5).
In the second part of the thesis, we study practical sequence model families and algorithms based on the theoretical findings of the first part. We introduce neural particle smoothing (Ch 6), a family of approximate sampling methods that work with conditional latent-variable models. We also introduce neural finite-state transducers (Ch 7), which extend weighted finite-state transducers with mark strings, allowing transduction paths in a finite-state transducer to be scored with a neural network. Finally, we propose neural regular expressions (Ch 8), a family of neural sequence models that are easy to engineer, allowing a user to design flexible weighted relations using Marked FSTs and combine these weighted relations with various operations.
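The residual energy-based construction described in the first part — reweighting an autoregressive model's sequence probabilities by an energy term, so that sequence weights remain cheap to evaluate — can be illustrated with a toy sketch. The specific autoregressive probabilities and energy function below are invented for illustration; the thesis's actual models are neural.

```python
import math

# Toy residual energy-based sequence model over binary strings of length 3.
# Unnormalized weight of a string x: w(x) = p_AR(x) * exp(-E(x)), where p_AR
# is a simple autoregressive model and E is a residual energy correcting it.

def p_ar(x):
    # autoregressive model: P(x_t = 1 | prefix) depends only on the previous bit
    p, prev = 1.0, 0
    for b in x:
        p1 = 0.7 if prev == 1 else 0.4
        p *= p1 if b == 1 else (1.0 - p1)
        prev = b
    return p

def energy(x):
    # toy residual energy: penalize strings containing two consecutive 1s
    return 1.0 if any(a == 1 and b == 1 for a, b in zip(x, x[1:])) else 0.0

def weight(x):
    # unnormalized sequence weight; efficient to evaluate for any given x
    return p_ar(x) * math.exp(-energy(x))

# exact normalization is feasible only for toy spaces like this one
strings = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
Z = sum(weight(x) for x in strings)
probs = {x: weight(x) / Z for x in strings}
```

Evaluating `weight(x)` never requires the normalizer `Z`, which is the sense in which residual energy-based weights are efficient; computing `Z` itself is intractable for realistic sequence spaces.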
LIPIcs, Volume 274, ESA 2023, Complete Volume
LIPIcs, Volume 244, ESA 2022, Complete Volume
Local treewidth of random and noisy graphs with applications to stopping contagion in networks
We study the notion of local treewidth in sparse random graphs: the maximum
treewidth over all k-vertex subgraphs of an n-vertex graph. When k is not
too large, we give nearly tight bounds for this local treewidth parameter; we
also derive tight bounds for the local treewidth of noisy trees, trees where
every non-edge is added independently with small probability. We apply our
upper bounds on the local treewidth to obtain fixed parameter tractable
algorithms (on random graphs and noisy trees) for edge-removal problems
centered around containing a contagious process evolving over a network. In
these problems, our main parameter of study is k, the number of "infected"
vertices in the network. For a certain range of parameters the running time of
our algorithms on n-vertex graphs is 2^o(k) poly(n), improving
upon the 2^Ω(k) poly(n) performance of the best-known
algorithms designed for worst-case instances of these edge deletion problems.
LIPIcs, Volume 248, ISAAC 2022, Complete Volume
Local Treewidth of Random and Noisy Graphs with Applications to Stopping Contagion in Networks
We study the notion of local treewidth in sparse random graphs: the maximum treewidth over all k-vertex subgraphs of an n-vertex graph. When k is not too large, we give nearly tight bounds for this local treewidth parameter; we also derive nearly tight bounds for the local treewidth of noisy trees, trees where every non-edge is added independently with small probability. We apply our upper bounds on the local treewidth to obtain fixed parameter tractable algorithms (on random graphs and noisy trees) for edge-removal problems centered around containing a contagious process evolving over a network. In these problems, our main parameter of study is k, the number of initially "infected" vertices in the network. For the random graph models we consider and a certain range of parameters the running time of our algorithms on n-vertex graphs is 2^o(k) poly(n), improving upon the 2^Ω(k) poly(n) performance of the best-known algorithms designed for worst-case instances of these edge deletion problems.
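The noisy-tree model in the abstract above — a tree whose non-edges are each added independently with small probability — can be sketched as a toy generator. This is a minimal illustration only; the random-attachment tree construction and the helper name `noisy_tree` are my own, not the paper's.

```python
import itertools
import random

def noisy_tree(n, p, seed=0):
    """Toy generator: a random tree on n vertices, after which each
    non-edge is added independently with probability p."""
    rng = random.Random(seed)
    edges = set()
    # random attachment tree: each vertex v > 0 picks an earlier neighbor u < v
    for v in range(1, n):
        edges.add((rng.randrange(v), v))
    # noise: add each non-edge (u, v), u < v, independently with probability p
    for u, v in itertools.combinations(range(n), 2):
        if (u, v) not in edges and rng.random() < p:
            edges.add((u, v))
    return edges

# p = 0 recovers the underlying tree (n - 1 edges); larger p adds noise edges
tree_edges = noisy_tree(10, 0.0)
noisy_edges = noisy_tree(10, 0.05)
```

The paper's results concern the regime where p is small (so the noisy tree stays sparse), which is what keeps the local treewidth of k-vertex subgraphs low enough for the fixed parameter tractable edge-removal algorithms.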