The Research on Consumer Preferences of Dairy Products in China -The comparison between inside and outside Guangdong Province-
Master's thesis abstract, academic year Heisei 29 (2017)
Solving Regularized Exp, Cosh and Sinh Regression Problems
In modern machine learning, attention computation is a fundamental task for
training large language models such as the Transformer, GPT-4 and ChatGPT. In this
work, we study the exponential regression problem, which is inspired by the
softmax/exp unit in the attention mechanism of large language models. The
standard exponential regression is non-convex; we therefore study the regularized
version of the exponential regression problem, which is convex, and use an
approximate Newton method to solve it in input-sparsity time.
Formally, in this problem, one is given a matrix $A \in \mathbb{R}^{n \times d}$, a vector $b \in \mathbb{R}^n$, a weight vector $w \in \mathbb{R}^n$, and any of the functions $\exp$, $\cosh$ and $\sinh$, denoted as $f$. The goal is to find the optimal $x$ that
minimizes $0.5 \| f(Ax) - b \|_2^2 + 0.5 \| \mathrm{diag}(w) A x \|_2^2$. The
straightforward method is to use the naive Newton's method. Let $\mathrm{nnz}(A)$
denote the number of non-zero entries in matrix $A$. Let $\omega$
denote the exponent of matrix multiplication; currently $\omega \approx 2.373$. Let $\epsilon$ denote the accuracy error. In this paper, we
make use of the input sparsity and propose an algorithm that uses $\log(\|x_0 - x^*\|_2 / \epsilon)$ iterations and $\widetilde{O}(\mathrm{nnz}(A) + d^{\omega})$ time per iteration to solve the problem.
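To make the objective above concrete, the following is a minimal NumPy sketch of the naive Newton baseline for the $f = \exp$ case. It is illustrative only: variable names are ours, and the paper's input-sparsity Hessian approximation is not reproduced.

    import numpy as np

    def newton_exp_regression(A, b, w, max_iter=50, tol=1e-8):
        """Minimize 0.5*||exp(Ax) - b||_2^2 + 0.5*||diag(w) A x||_2^2
        with exact Newton steps (dense baseline, not the paper's algorithm)."""
        n, d = A.shape
        x = np.zeros(d)
        for _ in range(max_iter):
            u = np.exp(A @ x)                      # f(Ax) with f = exp
            r = u - b                              # residual
            grad = A.T @ (r * u + (w ** 2) * (A @ x))
            # Hessian = A^T diag(u^2 + r*u + w^2) A; sufficiently large
            # regularization weights w keep it positive definite.
            H = A.T @ ((u ** 2 + r * u + w ** 2)[:, None] * A)
            step = np.linalg.solve(H, grad)
            x -= step
            if np.linalg.norm(step) < tol:
                break
        return x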
Domain-decomposed Bayesian inversion based on local Karhunen-Loève expansions
In many Bayesian inverse problems the goal is to recover a spatially varying random field. Such problems are often computationally challenging, especially when the forward model is governed by complex partial differential equations (PDEs). The challenge is particularly severe when the spatial domain is large and the unknown random field needs to be represented by a high-dimensional parameter. In this paper, we present a domain-decomposed method to address the dimensionality issue; the method decomposes the spatial domain and the parameter domain simultaneously. On each subdomain, a local Karhunen-Loève (KL) expansion is constructed, and a local inversion problem is solved independently in a parallel manner and, more importantly, in a lower-dimensional space. After local posterior samples are generated by conducting Markov chain Monte Carlo (MCMC) simulations on subdomains, a novel projection procedure is developed to effectively reconstruct the global field. In addition, the domain decomposition interface conditions are dealt with via an adaptive Gaussian process-based fitting strategy. Numerical examples are provided to demonstrate the performance of the proposed method.
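As a concrete illustration of the local KL construction, here is a small Python sketch that builds a truncated KL expansion of a Gaussian field on one subdomain. The squared-exponential kernel, length scale and truncation level are our assumptions for illustration, not the paper's choices.

    import numpy as np

    def local_kl_expansion(pts, m, length_scale=0.2, var=1.0):
        """Truncated KL expansion of a Gaussian random field discretized
        at the subdomain points `pts` (shape (n, dim))."""
        d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
        C = var * np.exp(-0.5 * d2 / length_scale ** 2)   # covariance matrix
        vals, vecs = np.linalg.eigh(C)
        idx = np.argsort(vals)[::-1][:m]                  # m largest eigenpairs
        lam, phi = vals[idx], vecs[:, idx]
        def sample(xi):
            """Field realization from m local KL coefficients xi ~ N(0, I)."""
            return phi @ (np.sqrt(lam) * xi)
        return lam, phi, sample

    # Usage: a 1D subdomain with 200 grid points and 10 local KL modes.
    pts = np.linspace(0.0, 0.5, 200)[:, None]
    lam, phi, sample = local_kl_expansion(pts, m=10)
    field = sample(np.random.default_rng(0).standard_normal(10))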
Attention Scheme Inspired Softmax Regression
Large language models (LLMs) have brought transformative changes to human society.
One of the key computations in LLMs is the softmax unit. This operation is
important in LLMs because it allows the model to generate a distribution over
possible next words or phrases, given a sequence of input words. This
distribution is then used to select the most likely next word or phrase, based
on the probabilities assigned by the model. The softmax unit plays a crucial
role in training LLMs, as it allows the model to learn from the data by
adjusting the weights and biases of the neural network.
In the area of convex optimization, for example when using the central path
method to solve linear programming, the softmax function has been used as a
crucial tool for controlling the progress and stability of the potential
function [Cohen, Lee and Song STOC 2019, Brand SODA 2020].
In this work, inspired by the softmax unit, we define a softmax regression
problem. Formally speaking, given a matrix $A \in \mathbb{R}^{n \times d}$ and
a vector $b \in \mathbb{R}^n$, the goal is to use a greedy-type algorithm to
solve \begin{align*} \min_{x} \| \langle \exp(Ax), {\bf 1}_n \rangle^{-1}
\exp(Ax) - b \|_2^2. \end{align*} In a certain sense, our provable convergence
result provides theoretical support for why greedy algorithms can be used to
train the softmax function in practice.
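A minimal sketch of a first-order method for the regression above, assuming plain gradient descent as the greedy-type algorithm; the step size and iteration count are illustrative only.

    import numpy as np

    def softmax(z):
        z = z - z.max()                 # shift for numerical stability
        e = np.exp(z)
        return e / e.sum()

    def softmax_regression_gd(A, b, lr=0.1, steps=2000):
        """Gradient descent on L(x) = 0.5 * ||softmax(Ax) - b||_2^2
        (the 0.5 factor only rescales the gradient)."""
        n, d = A.shape
        x = np.zeros(d)
        for _ in range(steps):
            s = softmax(A @ x)
            # Jacobian of softmax at z = Ax is diag(s) - s s^T
            grad = A.T @ ((np.diag(s) - np.outer(s, s)) @ (s - b))
            x -= lr * grad
        return x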
Local Convergence of Approximate Newton Method for Two Layer Nonlinear Regression
There have been significant advancements made by large language models (LLMs)
in various aspects of our daily lives. LLMs serve as a transformative force in
natural language processing, finding applications in text generation,
translation, sentiment analysis, and question-answering. The accomplishments of
LLMs have led to a substantial increase in research efforts in this domain. One
specific two-layer regression problem has been well-studied in prior works,
where the first layer is activated by a ReLU unit, and the second layer is
activated by a softmax unit. While previous works provide a solid analysis of
building a two-layer regression, there is still a gap in the analysis of
constructing regression problems with more than two layers.
In this paper, we take a crucial step toward addressing this problem: we
provide an analysis of a two-layer regression problem. In contrast to previous
works, our first layer is activated by a softmax unit. This sets the stage for
future analyses of creating more activation functions based on the softmax
function. Rearranging the softmax function leads to significantly different
analyses. Our main results involve analyzing the convergence properties of an
approximate Newton method used to minimize the regularized training loss. We
prove that the Hessian of the loss function is positive definite and
Lipschitz continuous under certain assumptions. This enables us to establish
local convergence guarantees for the proposed training algorithm. Specifically,
with an appropriate initialization and after $O(\log(1/\epsilon))$ iterations,
our algorithm can find an $\epsilon$-approximate minimizer of the training loss
with high probability. Each iteration requires approximately $\widetilde{O}(\mathrm{nnz}(A) + d^{\omega})$ time, where $d$ is the model size, $A$ is the input matrix, and
$\omega$ is the matrix multiplication exponent.
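The following Python sketch shows the generic shape of such an approximate Newton iteration: factorize the Hessian occasionally and reuse the factor between refreshes, which is sound near the minimizer when the Hessian is SPD and Lipschitz. The callables and refresh schedule are our assumptions; the paper's sparsity-aware Hessian approximation is not reproduced.

    import numpy as np
    from scipy.linalg import cho_factor, cho_solve

    def approximate_newton(grad, hess, x0, steps=20, refresh=5, tol=1e-10):
        """Local approximate Newton iteration. `grad`/`hess` are callables
        returning the gradient and (SPD) Hessian of the training loss."""
        x = x0.copy()
        factor = None
        for t in range(steps):
            if t % refresh == 0:
                factor = cho_factor(hess(x))   # Cholesky of the SPD Hessian
            step = cho_solve(factor, grad(x))  # reuse the stale factor
            x -= step
            if np.linalg.norm(step) < tol:
                break
        return x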
Fooling Polarization-based Vision using Locally Controllable Polarizing Projection
Polarization is a fundamental property of light that encodes abundant
information regarding surface shape, material, illumination and viewing
geometry. The computer vision community has witnessed a blossom of
polarization-based vision applications, such as reflection removal,
shape-from-polarization, transparent object segmentation and color constancy,
partially due to the emergence of single-chip mono/color polarization sensors
that make polarization data acquisition easier than ever. However, is
polarization-based vision vulnerable to adversarial attacks? If so, is it
possible to realize these adversarial attacks in the physical world, without
being perceived by human eyes? In this paper, we warn the community of the
vulnerability of polarization-based vision, which can be more serious than
RGB-based vision. By adapting a commercial LCD projector, we achieve locally
controllable polarizing projection, which is successfully utilized to fool
state-of-the-art polarization-based vision algorithms for glass segmentation
and color constancy. Compared with existing physical attacks on RGB-based
vision, which always suffer from a trade-off between attack efficacy and
visual conspicuousness, adversarial attacks based on polarizing projection are
contact-free and visually imperceptible, since the naked human eye can rarely
perceive the difference between viciously manipulated polarizing light and
ordinary illumination. This poses unprecedented risks to polarization-based
vision, in both the monochromatic and trichromatic domains, to which due
attention should be paid and for which countermeasures should be considered.
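For background on the data these algorithms consume, here is a short Python sketch computing per-pixel linear Stokes parameters, DoLP and AoLP from the four polarizer-angle images of a single-chip polarization sensor; these are standard formulas, and the array names are ours. The attack surface exists because human eyes are largely insensitive to exactly these polarization quantities.

    import numpy as np

    def stokes_from_polarizer_images(i0, i45, i90, i135):
        """Linear Stokes parameters, degree and angle of linear polarization
        from intensity images at polarizer angles 0/45/90/135 degrees."""
        s0 = 0.5 * (i0 + i45 + i90 + i135)          # total intensity
        s1 = i0 - i90
        s2 = i45 - i135
        dolp = np.sqrt(s1 ** 2 + s2 ** 2) / np.maximum(s0, 1e-8)
        aolp = 0.5 * np.arctan2(s2, s1)             # radians, in (-pi/2, pi/2]
        return s0, s1, s2, dolp, aolp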
Serial Dependence in Dermatological Judgments
This research was funded by the National Institutes of Health (NIH) grant number R01CA236793.