4 research outputs found
Dynamic Large Language Models on Blockchains
Training and deploying large language models requires a large amount of computational resources because the models contain billions of parameters and the input texts contain thousands of tokens. Another problem is that large language models are static: they are fixed after the training process. To tackle these issues, in this paper we propose to train and deploy dynamic large language models on blockchains, which offer high computational performance and are distributed across a network of computers. A blockchain is a secure, decentralized, and transparent system that allows for the creation of a tamper-proof ledger of transactions without the need for intermediaries. The dynamic large language models can continuously learn from user input after the training process. Our method provides a new way to develop large language models and also sheds light on the next generation of artificial intelligence systems.
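The abstract does not detail the on-chain mechanism, but the tamper-proof ledger it invokes can be illustrated with a minimal hash chain. The Python sketch below is a hypothetical illustration only: the UpdateLedger class, append_update, and the grad_digest payload are assumed names, and a real system would record updates on an actual blockchain with consensus rather than in a local list.

```python
# Minimal sketch of a tamper-proof ledger for model-update records.
# Hypothetical illustration: a local hash chain stands in for a real
# blockchain; there is no consensus or networking here.
import hashlib
import json
import time


def block_hash(block: dict) -> str:
    """Hash the block's canonical JSON form."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class UpdateLedger:
    """Hash-chained ledger; each block stores one model-update record."""

    def __init__(self):
        genesis = {"index": 0, "prev": "0" * 64, "update": None,
                   "timestamp": time.time()}
        self.chain = [genesis]

    def append_update(self, update: dict) -> dict:
        """Append one update record (e.g. a gradient digest) as a new block."""
        prev = self.chain[-1]
        block = {"index": prev["index"] + 1, "prev": block_hash(prev),
                 "update": update, "timestamp": time.time()}
        self.chain.append(block)
        return block

    def verify(self) -> bool:
        """Recompute the hash chain; any tampering breaks a link."""
        return all(cur["prev"] == block_hash(prev)
                   for prev, cur in zip(self.chain, self.chain[1:]))


ledger = UpdateLedger()
ledger.append_update({"layer": "lm_head", "grad_digest": "abc123"})
assert ledger.verify()
```

Because each block commits to the hash of its predecessor, altering any recorded update invalidates every later block, which is the property the abstract relies on for trustworthy continuous learning.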
PLMM: Personal Large Models on Mobile Devices
Inspired by federated learning, in this paper we propose personal large models that are distilled from traditional large language models but are more adaptive to local users' personal information, such as education background and hobbies. We classify large language models into three levels: the personal level, the expert level, and the traditional level. The personal-level models adapt to users' personal information; they encrypt the users' input and protect their privacy. The expert-level models focus on merging specific knowledge such as finance, IT, and art. The traditional-level models focus on universal knowledge discovery and on upgrading the expert-level models. In this classification, the personal models interact directly with the user; within the whole system, they hold the users' (encrypted) personal information. Moreover, such models must be small enough to run on personal computers or mobile devices. Finally, they also have to respond in real time for a better user experience and produce high-quality results. The proposed personal large models can be applied to a wide range of applications such as
language and vision tasks.
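As a concrete sketch of the distillation step that would produce a small personal-level model from a larger teacher, the PyTorch snippet below shows a standard soft-target distillation loss. The temperature and the mixing weight alpha are illustrative hyperparameters, not values from the paper.

```python
# Standard knowledge-distillation loss (Hinton-style soft targets);
# illustrative hyperparameters, not the paper's training recipe.
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend soft-target KL loss (teacher knowledge) with hard-label CE."""
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard


# Toy usage with random logits for a 100-class vocabulary.
student_logits = torch.randn(8, 100)
teacher_logits = torch.randn(8, 100)
labels = torch.randint(0, 100, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
```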
Multilevel Large Language Models for Everyone
Large language models have made significant progress in the past few years. However, they are either generic or field-specific, splitting the community into different groups. In this paper, we unify these large language models into a larger map in which the generic and the specific models are linked together and can improve each other, based on the user's personal input and information from the internet. The idea of linking several large language models together is inspired by the functionality of the human brain: specific regions of the brain cortex handle certain low-level functions, and these regions can jointly work together to achieve more complex high-level functions. This behavior of the human brain cortex sheds light on the design of multilevel large language models that contain global-level, field-level, and user-level models. The user-level models run on local machines to achieve efficient response and protect the user's privacy. Such multilevel models reduce redundancy and perform better than single-level models. The proposed multilevel idea can be applied in various applications, such as natural language processing, computer vision tasks, professional assistants, business, and healthcare.
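One way to picture how the three levels cooperate is a confidence-based cascade: a query is answered by the local user-level model when possible and escalated otherwise. The Python sketch below is an assumption-laden illustration; the Model signature, the confidence scores, and the 0.8 threshold are hypothetical, not the paper's mechanism.

```python
# Hypothetical cascade over user-, field-, and global-level models.
from typing import Callable, Tuple

# A model maps a query to an (answer, confidence) pair; assumed interface.
Model = Callable[[str], Tuple[str, float]]


def answer(query: str, user_model: Model, field_model: Model,
           global_model: Model, threshold: float = 0.8) -> str:
    """Try the cheap local model first; escalate while confidence is low."""
    for model in (user_model, field_model, global_model):
        response, confidence = model(query)
        if confidence >= threshold:
            return response
    return response  # fall back to the global model's best answer
```

Keeping the first stage on the user's machine is what lets such a scheme serve private, low-latency answers while still deferring hard queries upward.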
Gradient Domain Diffusion Models for Image Synthesis
Diffusion models have become popular in generative image and video synthesis. However, due to the diffusion process, they require a large number of steps to converge. To tackle this issue, in this paper we propose to perform the diffusion process in the gradient domain, where convergence is faster, for two reasons. First, thanks to the Poisson equation, the gradient domain is mathematically equivalent to the original image domain, so each diffusion step in the image domain has a unique corresponding gradient-domain representation. Second, the gradient domain is much sparser than the image domain. As a result, gradient domain diffusion models converge faster. Several numerical experiments confirm that gradient domain diffusion models are more efficient than the original diffusion models. The proposed method can be applied in a wide range of applications such as image processing, computer vision, and machine learning tasks.
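The stated equivalence rests on a standard fact: an image can be recovered, up to its mean value, from its gradients by solving a Poisson equation. The NumPy sketch below demonstrates this round trip with a periodic FFT-based Poisson solver; it illustrates only the underlying identity and is not the paper's diffusion model.

```python
# Image <-> gradient-domain round trip via the Poisson equation.
# Illustrates the equivalence only; not the paper's diffusion model.
import numpy as np


def poisson_reconstruct(gx, gy):
    """Recover an image (up to its mean) from its gradients by solving
    laplacian(u) = div(g) with a periodic FFT-based solver."""
    h, w = gx.shape
    # Divergence via backward differences (periodic boundary).
    div = (gx - np.roll(gx, 1, axis=1)) + (gy - np.roll(gy, 1, axis=0))
    fx = np.fft.fftfreq(w)
    fy = np.fft.fftfreq(h)
    # Eigenvalues of the discrete Laplacian in the Fourier basis.
    denom = (2 * np.cos(2 * np.pi * fx)[None, :]
             + 2 * np.cos(2 * np.pi * fy)[:, None] - 4)
    denom[0, 0] = 1.0           # avoid division by zero at the DC term
    u_hat = np.fft.fft2(div) / denom
    u_hat[0, 0] = 0.0           # the mean is unrecoverable from gradients
    return np.real(np.fft.ifft2(u_hat))


rng = np.random.default_rng(0)
img = rng.standard_normal((64, 64))
gx = np.roll(img, -1, axis=1) - img  # forward differences, periodic
gy = np.roll(img, -1, axis=0) - img
rec = poisson_reconstruct(gx, gy)
assert np.allclose(rec, img - img.mean(), atol=1e-8)
```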