232 research outputs found

    The Research on Consumer Preferences of Dairy Products in China -The comparison between inside and outside Guangdong Province-

    Get PDF
    Master's thesis summary, academic year 2017 (Heisei 29)

    Solving Regularized Exp, Cosh and Sinh Regression Problems

    Full text link
    In modern machine learning, attention computation is a fundamental task for training large language models such as Transformer, GPT-4 and ChatGPT. In this work, we study an exponential regression problem inspired by the softmax/exp unit in the attention mechanism of large language models. The standard exponential regression problem is non-convex; we study a regularized version, which is convex, and use an approximate Newton method to solve it in input-sparsity time. Formally, one is given a matrix $A \in \mathbb{R}^{n \times d}$, vectors $b \in \mathbb{R}^n$ and $w \in \mathbb{R}^n$, and any of the functions $\exp$, $\cosh$ and $\sinh$, denoted $f$. The goal is to find the optimal $x$ that minimizes $0.5 \| f(Ax) - b \|_2^2 + 0.5 \| \mathrm{diag}(w) A x \|_2^2$. The straightforward approach is the naive Newton's method. Let $\mathrm{nnz}(A)$ denote the number of non-zero entries in the matrix $A$, let $\omega$ denote the exponent of matrix multiplication (currently $\omega \approx 2.373$), and let $\epsilon$ denote the accuracy error. In this paper, we make use of the input sparsity and propose an algorithm that uses $\log(\|x_0 - x^*\|_2 / \epsilon)$ iterations and $\widetilde{O}(\mathrm{nnz}(A) + d^{\omega})$ time per iteration to solve the problem.
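
    For illustration, here is a minimal sketch, in Python with NumPy, of the plain (non-fast) Newton iteration for this regularized loss with $f = \exp$; the function name and parameters are illustrative, and the paper's speedup comes from replacing the exact Hessian solve below with a sketched approximation, which is omitted here.

    import numpy as np

    def newton_exp_regression(A, b, w, x0, tol=1e-8, max_iter=50):
        # Minimize 0.5*||exp(Ax) - b||_2^2 + 0.5*||diag(w) A x||_2^2
        x = x0.copy()
        for _ in range(max_iter):
            u = np.exp(A @ x)                        # f(Ax) with f = exp
            grad = A.T @ (u * (u - b)) + A.T @ (w**2 * (A @ x))
            # Hessian = A^T diag(u^2 + u*(u - b) + w^2) A, since f' = f'' = exp
            D = u**2 + u * (u - b) + w**2
            H = A.T @ (D[:, None] * A)
            step = np.linalg.solve(H, grad)          # exact solve; the fast algorithm sketches this
            x -= step
            if np.linalg.norm(step) < tol:
                break
        return x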

    Domain-decomposed Bayesian inversion based on local Karhunen-Loève expansions

    Get PDF
    In many Bayesian inverse problems the goal is to recover a spatially varying random field. Such problems are often computationally challenging, especially when the forward model is governed by complex partial differential equations (PDEs). The challenge is particularly severe when the spatial domain is large and the unknown random field must be represented by a high-dimensional parameter. In this paper, we present a domain-decomposed method that attacks the dimensionality issue by decomposing the spatial domain and the parameter domain simultaneously. On each subdomain, a local Karhunen-Loève (KL) expansion is constructed, and a local inversion problem is solved independently in a parallel manner and, more importantly, in a lower-dimensional space. After local posterior samples are generated by running Markov chain Monte Carlo (MCMC) simulations on the subdomains, a novel projection procedure is developed to effectively reconstruct the global field. In addition, the domain decomposition interface conditions are handled with an adaptive Gaussian process-based fitting strategy. Numerical examples are provided to demonstrate the performance of the proposed method.
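
    For intuition, here is a minimal sketch (assumed setup, not the paper's implementation) of building a local KL expansion on one subdomain: discretize a covariance kernel on the local grid, eigendecompose it, and keep the leading modes as the lower-dimensional local parameterization. The exponential kernel and all parameter names below are illustrative assumptions.

    import numpy as np

    def local_kl_modes(points, corr_len=0.2, n_modes=10):
        # points: (m, 2) coordinates of the subdomain grid nodes
        dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
        C = np.exp(-dist / corr_len)            # assumed exponential covariance kernel
        vals, vecs = np.linalg.eigh(C)          # eigenvalues in ascending order
        idx = np.argsort(vals)[::-1][:n_modes]  # keep the n_modes largest
        return np.maximum(vals[idx], 0.0), vecs[:, idx]

    def sample_local_field(vals, vecs, rng):
        # KL sample: sum_k sqrt(lambda_k) * xi_k * phi_k with xi_k ~ N(0, 1)
        xi = rng.standard_normal(len(vals))
        return vecs @ (np.sqrt(vals) * xi)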

    Attention Scheme Inspired Softmax Regression

    Full text link
    Large language models (LLMs) have brought transformative changes to human society. One of the key computations in LLMs is the softmax unit. This operation is important in LLMs because it allows the model to generate a distribution over possible next words or phrases given a sequence of input words; this distribution is then used to select the most likely next word or phrase, based on the probabilities assigned by the model. The softmax unit plays a crucial role in training LLMs, as it allows the model to learn from the data by adjusting the weights and biases of the neural network. In convex optimization, for example when using the central path method to solve linear programming, the softmax function has been a crucial tool for controlling the progress and stability of the potential function [Cohen, Lee and Song STOC 2019; Brand SODA 2020]. In this work, inspired by the softmax unit, we define a softmax regression problem. Formally speaking, given a matrix $A \in \mathbb{R}^{n \times d}$ and a vector $b \in \mathbb{R}^n$, the goal is to use a greedy-type algorithm to solve \begin{align*} \min_{x} \| \langle \exp(Ax), {\bf 1}_n \rangle^{-1} \exp(Ax) - b \|_2^2. \end{align*} In a certain sense, our provable convergence result provides theoretical support for why greedy algorithms can be used to train the softmax function in practice.
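
    As a minimal sketch (illustrative only; the paper's greedy algorithm and step-size analysis are more refined), plain gradient descent on this objective looks as follows; the gradient uses the softmax Jacobian $\mathrm{diag}(p) - pp^\top$.

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())     # shift for numerical stability
        return e / e.sum()

    def softmax_regression_gd(A, b, x0, lr=0.1, steps=500):
        # Minimize || softmax(Ax) - b ||_2^2 by plain gradient descent
        x = x0.copy()
        for _ in range(steps):
            p = softmax(A @ x)
            r = p - b
            # chain rule with Jacobian J = diag(p) - p p^T (symmetric):
            # grad = 2 A^T J r = 2 A^T (p*r - p*(p . r))
            x -= lr * 2 * (A.T @ (p * r - p * (p @ r)))
        return x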

    Local Convergence of Approximate Newton Method for Two Layer Nonlinear Regression

    Full text link
    There have been significant advancements made by large language models (LLMs) in various aspects of our daily lives. LLMs serve as a transformative force in natural language processing, finding applications in text generation, translation, sentiment analysis, and question-answering. The accomplishments of LLMs have led to a substantial increase in research efforts in this domain. One specific two-layer regression problem has been well-studied in prior works, where the first layer is activated by a ReLU unit and the second layer is activated by a softmax unit. While previous works provide a solid analysis of building a two-layer regression, there is still a gap in the analysis of constructing regression problems with more than two layers. In this paper, we take a crucial step toward addressing this problem: we provide an analysis of a two-layer regression problem whose first layer, in contrast to previous works, is activated by a softmax unit. This sets the stage for future analyses of creating more activation functions based on the softmax function. Rearranging the softmax function leads to significantly different analyses. Our main results involve analyzing the convergence properties of an approximate Newton method used to minimize the regularized training loss. We prove that the Hessian of the loss function is positive definite and Lipschitz continuous under certain assumptions. This enables us to establish local convergence guarantees for the proposed training algorithm. Specifically, with an appropriate initialization and after $O(\log(1/\epsilon))$ iterations, our algorithm can find an $\epsilon$-approximate minimizer of the training loss with high probability. Each iteration requires approximately $O(\mathrm{nnz}(C) + d^\omega)$ time, where $d$ is the model size, $C$ is the input matrix, and $\omega < 2.374$ is the matrix multiplication exponent.
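
    Schematically, the training loop analyzed here is a locally convergent approximate Newton iteration. The sketch below is illustrative, with the Hessian approximation abstracted away as a callback; it shows the structure that yields the $O(\log(1/\epsilon))$ iteration count once the Hessian is positive definite and Lipschitz near the optimum.

    import numpy as np

    def approx_newton(grad_fn, hess_fn, x0, eps=1e-8, max_iter=100):
        # hess_fn may return an approximate Hessian (e.g. via sketching),
        # which is how the ~O(nnz(C) + d^omega) per-iteration cost arises.
        x = x0.copy()
        for _ in range(max_iter):
            g = grad_fn(x)
            if np.linalg.norm(g) < eps:   # local convergence: ~log(1/eps) steps
                break
            x -= np.linalg.solve(hess_fn(x), g)
        return x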

    Serial Dependence in Dermatological Judgments

    Get PDF
    This research was funded by the National Institutes of Health (NIH) grant number R01CA236793.

    Fooling Polarization-based Vision using Locally Controllable Polarizing Projection

    Full text link
    Polarization is a fundamental property of light that encodes abundant information regarding surface shape, material, illumination and viewing geometry. The computer vision community has witnessed a blossom of polarization-based vision applications, such as reflection removal, shape-from-polarization, transparent object segmentation and color constancy, partially due to the emergence of single-chip mono/color polarization sensors that make polarization data acquisition easier than ever. However, is polarization-based vision vulnerable to adversarial attacks? If so, is it possible to realize these attacks in the physical world without being perceived by human eyes? In this paper, we warn the community of the vulnerability of polarization-based vision, which can be more serious than that of RGB-based vision. By adapting a commercial LCD projector, we achieve locally controllable polarizing projection, which we successfully use to fool state-of-the-art polarization-based vision algorithms for glass segmentation and color constancy. Compared with existing physical attacks on RGB-based vision, which always suffer from a trade-off between attack efficacy and visual perceptibility, adversarial attacks based on polarizing projection are contact-free and visually imperceptible, since the naked human eye can rarely perceive the difference between maliciously manipulated polarized light and ordinary illumination. This poses unprecedented risks to polarization-based vision, in both the monochromatic and trichromatic domains, to which due attention should be paid and for which countermeasures should be considered.
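
    For context, here is a minimal sketch of the standard Stokes-parameter computation (textbook formulas, not taken from the paper) that turns a quad-angle polarization sensor's raw intensities into the degree and angle of linear polarization consumed by such vision algorithms; these are exactly the quantities a polarizing projection can silently manipulate.

    import numpy as np

    def linear_stokes(I0, I45, I90, I135):
        # Standard reconstruction from the four polarizer orientations
        s0 = 0.5 * (I0 + I45 + I90 + I135)       # total intensity
        s1 = I0 - I90
        s2 = I45 - I135
        dolp = np.sqrt(s1**2 + s2**2) / np.maximum(s0, 1e-12)  # degree of linear polarization
        aolp = 0.5 * np.arctan2(s2, s1)                        # angle of linear polarization
        return s0, dolp, aolp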