Solving Regularized Exp, Cosh and Sinh Regression Problems
In modern machine learning, attention computation is a fundamental task for
training large language models such as Transformer, GPT-4 and ChatGPT. In this
work, we study the exponential regression problem, which is inspired by the
softmax/exp unit in the attention mechanism of large language models. The
standard exponential regression problem is non-convex. We study a regularized
version of the exponential regression problem, which is convex, and use an
approximate Newton method to solve it in input-sparsity time.
Formally, in this problem, one is given a matrix $A$, a vector $b$, and any one
of the functions $\exp$, $\cosh$ and $\sinh$, denoted as $f$. The goal is to
find the optimal $x$ that minimizes the regularized version of
$0.5\|f(Ax) - b\|_2^2$. The straightforward method is the naive Newton's
method. Let $\mathrm{nnz}(A)$ denote the number of non-zero entries in the
matrix $A$. Let $\omega$ denote the exponent of matrix multiplication;
currently $\omega \approx 2.373$. Let $\epsilon$ denote the accuracy error. In
this paper, we make use of the input sparsity and propose an algorithm that
uses $\log(\|x_0 - x^*\|_2/\epsilon)$ iterations and
$\widetilde{O}(\mathrm{nnz}(A) + d^{\omega})$ time per iteration to solve the
problem.
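The Newton approach described above can be illustrated for the $f = \exp$ case with an exact (non-approximate) Newton step. The ridge-style regularizer and the dense linear solve below are illustrative assumptions, a sketch rather than the paper's input-sparsity algorithm:

```python
import numpy as np

def newton_exp_regression(A, b, lam=1.0, iters=20):
    """Minimize 0.5*||exp(Ax) - b||_2^2 + 0.5*lam*||x||_2^2 by Newton steps.

    The ridge term lam*I is an assumed stand-in for the paper's regularizer;
    it keeps the Hessian positive definite.
    """
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(iters):
        u = np.exp(A @ x)                             # element-wise exp(Ax)
        grad = A.T @ (u * (u - b)) + lam * x          # gradient of the loss
        w = u * (2.0 * u - b)                         # per-row Hessian weights
        H = A.T @ (w[:, None] * A) + lam * np.eye(d)  # A^T diag(w) A + lam*I
        x = x - np.linalg.solve(H, grad)              # exact Newton step
    return x
```

For sparse $A$ the dominant per-iteration cost is forming $A^\top \mathrm{diag}(w) A$, which is where input-sparsity-time techniques enter; the dense version above is only for illustration.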
Attention Scheme Inspired Softmax Regression
Large language models (LLMs) have brought transformative changes to human
society. One of the key computations in LLMs is the softmax unit. This operation is
important in LLMs because it allows the model to generate a distribution over
possible next words or phrases, given a sequence of input words. This
distribution is then used to select the most likely next word or phrase, based
on the probabilities assigned by the model. The softmax unit plays a crucial
role in training LLMs, as it allows the model to learn from the data by
adjusting the weights and biases of the neural network.
In the area of convex optimization, such as using the central path method to
solve linear programming, the softmax function has been used as a crucial tool
for controlling the progress and stability of the potential function [Cohen,
Lee and Song STOC 2019; Brand SODA 2020].
In this work, inspired by the softmax unit, we define a softmax regression
problem. Formally speaking, given a matrix $A$ and
a vector $b$, the goal is to use a greedy-type algorithm to
solve \begin{align*} \min_{x} \| \langle \exp(Ax), {\bf 1}_n \rangle^{-1}
\exp(Ax) - b \|_2^2. \end{align*} In a certain sense, our provable convergence
result provides theoretical support for why greedy algorithms can be used to
train the softmax function in practice.
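A minimal numerical sketch of the objective above, using plain gradient descent as a stand-in for the paper's greedy-type algorithm (the step size, iteration count, and toy data are illustrative assumptions):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())          # shift for numerical stability
    return e / e.sum()

def loss(A, x, b):
    """|| <exp(Ax), 1_n>^{-1} exp(Ax) - b ||_2^2, i.e. ||softmax(Ax) - b||_2^2."""
    return np.sum((softmax(A @ x) - b) ** 2)

def grad(A, x, b):
    s = softmax(A @ x)
    J = np.diag(s) - np.outer(s, s)  # Jacobian of softmax w.r.t. z = Ax
    return 2.0 * A.T @ (J @ (s - b))

def fit(A, b, lr=0.005, steps=3000):
    """Plain gradient descent on the softmax regression loss."""
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        x = x - lr * grad(A, x, b)
    return x
```

The inner-product normalization $\langle \exp(Ax), {\bf 1}_n \rangle^{-1}$ is exactly the softmax denominator, which is why the loss reduces to $\|\mathrm{softmax}(Ax) - b\|_2^2$.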
Local Convergence of Approximate Newton Method for Two Layer Nonlinear Regression
There have been significant advancements made by large language models (LLMs)
in various aspects of our daily lives. LLMs serve as a transformative force in
natural language processing, finding applications in text generation,
translation, sentiment analysis, and question-answering. The accomplishments of
LLMs have led to a substantial increase in research efforts in this domain. One
specific two-layer regression problem has been well-studied in prior works,
where the first layer is activated by a ReLU unit, and the second layer is
activated by a softmax unit. While previous works provide a solid analysis of
building a two-layer regression, there is still a gap in the analysis of
constructing regression problems with more than two layers.
In this paper, we take a crucial step toward addressing this problem: we
provide an analysis of a two-layer regression problem. In contrast to previous
works, our first layer is activated by a softmax unit. This sets the stage for
future analyses of creating more activation functions based on the softmax
function. Rearranging the softmax function leads to significantly different
analyses. Our main results involve analyzing the convergence properties of an
approximate Newton method used to minimize the regularized training loss. We
prove that the Hessian of the loss function is positive definite and
Lipschitz continuous under certain assumptions. This enables us to establish
local convergence guarantees for the proposed training algorithm. Specifically,
with an appropriate initialization and after $O(\log(1/\epsilon))$ iterations,
our algorithm can find an $\epsilon$-approximate minimizer of the training loss
with high probability. Each iteration requires approximately
$O(\mathrm{nnz}(A) + d^{\omega})$ time, where $d$ is the model size, $A$ is the
input matrix, and $\omega$ is the matrix multiplication exponent.
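The two Hessian conditions named above (positive definiteness under regularization, plus smoothness) can be spot-checked numerically on a toy regularized softmax-unit loss. This is an illustrative check under assumed data and regularization strength, not the paper's proof:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def reg_loss(A, b, lam):
    """Regularized softmax-unit regression loss, returned as a function of x."""
    return lambda x: 0.5 * np.sum((softmax(A @ x) - b) ** 2) + 0.5 * lam * x @ x

def numerical_hessian(f, x, h=1e-4):
    """Finite-difference Hessian of a scalar function f at x."""
    d = x.size
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            ei = np.zeros(d); ei[i] = h
            ej = np.zeros(d); ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei) - f(x + ej) + f(x)) / h**2
    return 0.5 * (H + H.T)             # symmetrize away numerical noise
```

With a modest regularization weight, the $\lambda I$ term dominates the bounded curvature of the softmax data term, so the smallest eigenvalue stays bounded away from zero at every test point.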
Plant Phenotyping on Mobile Devices
Plant phenotyping is a fast and non-destructive method to obtain the physiological features of plants, compared with expensive and time-consuming chemical analysis that requires plant sampling. Through plant phenotyping, scientists and farmers can assess plant health status more accurately than by visual inspection, thus avoiding wasted time and resources, and can even predict productivity. However, the size and price of current plant phenotyping equipment prevent it from being widely applied at the farmer-household level. Everyday field operation is rarely achieved because easy-to-carry, cost-effective equipment such as hyperspectral cameras, infrared cameras and thermal cameras is not readily available. A plant phenotyping tool on mobile devices will make plant phenotyping technology more accessible to ordinary farmers and researchers. This application incorporates physical optics, plant science models, and the image processing capability of smartphones. With our special optical design, multispectral instead of RGB (red, green and blue) images can be obtained from smartphones at fairly low cost. Through quick image processing on the smartphone, the app will provide accurate predictions of plant physiological features such as water, chlorophyll, and nitrogen content. Sophisticated prediction models provided by Purdue's plant phenotyping team are applied. Once widely adopted, the information collected by smartphones running the app will be sent back to Purdue's plant health big-data database. This feedback will not only allow us to improve our models, but also provide farmers and agricultural researchers easy access to real-time crop plant health data.
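The app's trait predictions start from multispectral bands. As an illustrative building block (not necessarily what Purdue's models use), a standard vegetation index such as NDVI can be computed per pixel from the near-infrared and red bands:

```python
import numpy as np

def ndvi(nir, red, eps=1e-8):
    """Per-pixel Normalized Difference Vegetation Index: (NIR - Red)/(NIR + Red)."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / (nir + red + eps)  # eps guards divide-by-zero pixels

# toy 2x2 reflectance bands
nir = np.array([[0.8, 0.6], [0.5, 0.9]])
red = np.array([[0.1, 0.2], [0.3, 0.1]])
print(ndvi(nir, red))
```

NDVI always lies in $[-1, 1]$; higher values generally indicate denser, healthier vegetation, which is why such indices feed chlorophyll- and nitrogen-related models.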
The Next-Gen Crop Nutrient Stress Identification with High-Precision Sensing Technology in Digital Agriculture
Crop yields are facing significant losses from nutrient deficiencies. Over-fertilizing also has negative economic and environmental impacts. It is challenging to optimize fertilizing without an accurate diagnosis. Recently, plant phenotyping has demonstrated outstanding capabilities in estimating crop traits. One of the leading technologies, LeafSpec, provides high-quality crop image data for improving phenotyping quality. In this study, novel algorithms are developed for LeafSpec to identify crop nutrient deficiencies more accurately. Combined with a UAV system, this technology will bring growers a robust solution for fertilizing diagnosis and scientific crop management.
Latency-aware Unified Dynamic Networks for Efficient Image Recognition
Dynamic computation has emerged as a promising avenue to enhance the
inference efficiency of deep networks. It allows selective activation of
computational units, leading to a reduction in unnecessary computations for
each input sample. However, the actual efficiency of these dynamic models can
deviate from theoretical predictions. This mismatch arises from: 1) the lack of
a unified approach due to fragmented research; 2) the focus on algorithm design
over critical scheduling strategies, especially in CUDA-enabled GPU contexts;
and 3) challenges in measuring practical latency, given that most libraries
cater to static operations. Addressing these issues, we unveil
Latency-Aware Unified Dynamic Networks (LAUDNet), a framework that integrates
three primary dynamic paradigms: spatially adaptive computation, dynamic layer
skipping, and dynamic channel skipping. To bridge the theoretical and practical
efficiency gap, LAUDNet merges algorithmic design with scheduling optimization,
guided by a latency predictor that accurately gauges dynamic operator latency.
We've tested LAUDNet across multiple vision tasks, demonstrating its capacity
to notably reduce the latency of models like ResNet-101 by over 50% on
platforms such as V100, RTX3090, and TX2 GPUs. Notably, LAUDNet stands out in
balancing accuracy and efficiency. Code is available at:
https://www.github.com/LeapLabTHU/LAUDNet
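Dynamic layer skipping, one of the three paradigms LAUDNet unifies, can be sketched as a per-sample gate around a residual block. The sigmoid-of-mean gate and the NumPy setting below are illustrative assumptions, not LAUDNet's learned gating:

```python
import numpy as np

def gated_residual_block(x, W, threshold=0.5):
    """Per-sample dynamic layer skipping around a residual ReLU block.

    x: (batch, features) activations; W: (features, features) weights.
    Samples whose gate score falls below the threshold skip the block
    entirely and pass through the identity (residual) path.
    """
    score = 1.0 / (1.0 + np.exp(-x.mean(axis=1)))  # cheap per-sample gate
    run = score > threshold                        # which samples execute
    out = x.copy()
    if run.any():
        out[run] = x[run] + np.maximum(x[run] @ W, 0.0)  # residual ReLU branch
    return out, run
```

Note that realized speedups depend on scheduling the sparse `run` sub-batch efficiently on GPU; closing that gap between theoretical FLOP savings and measured latency is exactly what LAUDNet's latency predictor targets.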
Progress and Opportunities of Foundation Models in Bioinformatics
Bioinformatics has witnessed a paradigm shift with the increasing integration
of artificial intelligence (AI), particularly through the adoption of
foundation models (FMs). These AI techniques have rapidly advanced, addressing
historical challenges in bioinformatics such as the scarcity of annotated data
and the presence of data noise. FMs are particularly adept at handling
large-scale, unlabeled data, a common scenario in biological contexts due to
the time-consuming and costly nature of experimentally determining labeled
data. This characteristic has allowed FMs to excel and achieve notable results
in various downstream validation tasks, demonstrating their ability to
represent diverse biological entities effectively. Undoubtedly, FMs have
ushered in a new era in computational biology, especially in the realm of deep
learning. The primary goal of this survey is to conduct a systematic
investigation and summary of FMs in bioinformatics, tracing their evolution,
current research status, and the methodologies employed. Central to our focus
is the application of FMs to specific biological problems, aiming to guide the
research community in choosing appropriate FMs for their research needs. We
delve into the specifics of the problem at hand including sequence analysis,
structure prediction, function annotation, and multimodal integration,
comparing the structures and advancements against traditional methods.
Furthermore, the review analyses challenges and limitations faced by FMs in
biology, such as data noise, model explainability, and potential biases.
Finally, we outline potential development paths and strategies for FMs in
future biological research, setting the stage for continued innovation and
application in this rapidly evolving field. This comprehensive review serves
not only as an academic resource but also as a roadmap for future explorations
and applications of FMs in biology.Comment: 27 pages, 3 figures, 2 table
Synthetic Datasets for Autonomous Driving: A Survey
Autonomous driving techniques have been flourishing in recent years while
thirsting for huge amounts of high-quality data. However, it is difficult for
real-world datasets to keep up with the pace of changing requirements due to
their expensive and time-consuming experimental and labeling costs. Therefore,
more and more researchers are turning to synthetic datasets to easily generate
rich and changeable data as an effective complement to the real world and to
improve the performance of algorithms. In this paper, we summarize the
evolution of synthetic dataset generation methods and review work to date on
synthetic datasets for single-task and multi-task categories in autonomous
driving research. We also discuss the role synthetic datasets play in the
evaluation and gap testing of autonomous driving-related algorithms, and their
positive effect on such testing, especially on trustworthiness and safety
aspects. Finally,
we discuss general trends and possible development directions. To the best of
our knowledge, this is the first survey focusing on the application of
synthetic datasets in autonomous driving. This survey also raises awareness of
the problems of real-world deployment of autonomous driving technology and
provides researchers with a possible solution. (Comment: 19 pages, 5 figures)
Prevalence and trend of hepatitis C virus infection among blood donors in Chinese mainland: a systematic review and meta-analysis
Background: Blood transfusion is one of the most common transmission pathways of hepatitis C virus (HCV). This paper aims to provide a comprehensive and reliable tabulation of available data on the epidemiological characteristics and risk factors for HCV infection among blood donors in the Chinese mainland, so as to help develop prevention strategies and guide further research.
Methods: A systematic review was constructed based on computerized literature databases. Infection rates and 95% confidence intervals (95% CI) were calculated using the approximate normal distribution model. Odds ratios and 95% CIs were calculated by fixed- or random-effects models. Data manipulation and statistical analyses were performed using STATA 10.0, and ArcGIS 9.3 was used for map construction.
Results: Two hundred and sixty-five studies met our inclusion criteria. The pooled prevalence of HCV infection among blood donors in the Chinese mainland was 8.68% (95% CI: 8.01%-9.39%). The epidemic was more severe in North and Central China, especially in Henan and Hebei, while a significantly lower rate was found in Yunnan. Notably, before 1998 the pooled prevalence of HCV infection among blood donors was 12.87% (95% CI: 11.25%-14.56%), but it decreased to 1.71% (95% CI: 1.43%-1.99%) after 1998. No significant difference was found in HCV infection rates between male and female blood donors, or among donors of different blood types. The prevalence of HCV infection was found to increase with age. During 1994-1995, the prevalence reached its highest level at 15.78% (95% CI: 12.21%-19.75%), and showed a decreasing trend in the following years. A significant difference was found among groups with different blood donation types: plasma donors had a relatively higher prevalence of HCV infection than whole blood donors (33.95% vs 7.9%).
Conclusions: The prevalence of HCV infection has decreased rapidly since 1998 and has remained at a low level in recent years, but some provinces still showed relatively higher prevalence than the general population. It is urgent to take efficient measures to prevent secondary HCV transmission and control chronic progression, and the keys to reducing HCV incidence among blood donors are to encourage truly voluntary blood donation, strictly implement the blood donation law, and avoid cross-infection.
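The "approximate normal distribution model" named in the Methods corresponds to the standard Wald interval for a proportion. A minimal sketch (the counts below are illustrative, not the meta-analysis data, and a real pooled estimate would combine per-study rates rather than raw totals):

```python
import math

def prevalence_ci(cases, n, z=1.96):
    """Approximate-normal (Wald) confidence interval for a prevalence proportion.

    cases: number of positive donors; n: total donors tested;
    z: normal quantile (1.96 for a 95% interval).
    """
    p = cases / n
    se = math.sqrt(p * (1.0 - p) / n)   # standard error of the proportion
    return p, max(0.0, p - z * se), min(1.0, p + z * se)
```

For example, 868 positives out of a hypothetical 10,000 donors gives a point estimate of 8.68% with a 95% CI of roughly 8.13%-9.23%; the width shrinks as the denominator grows.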
Misiroot: A Robotic Minimum Invasion in Situ Imaging System for Plant Root Phenotyping
Plant root phenotyping technologies play an important role in breeding, plant protection, and other plant science research projects. Root phenotyping customers urgently need technologies that are low-cost, in situ, non-destructive to the roots, and suitable for the natural soil environment. Many recently developed root phenotyping methods, such as minirhizotrons, CT, and MRI scanners, have unique advantages in observing plant roots, but they also have disadvantages and cannot meet all the critical requirements simultaneously. The study in this paper focuses on the development of a new plant root phenotyping robot, called "MISIRoot", that is minimally invasive to plants and works in situ in natural soil. The MISIRoot system (patent pending) mainly consists of an industrial-level robotic arm, a mini-size camera with a lighting set, a plant pot holding platform, and image processing software for root recognition and feature extraction. MISIRoot can take high-resolution color images of roots in soil with minimal disturbance to the root and reconstruct the plant roots' three-dimensional (3D) structure at an accuracy of 0.1 mm. In a test assay, well-watered and drought-stressed groups of corn plants were measured by MISIRoot at the V3, V4, and V5 stages. The system successfully acquired RGB color images of the roots and extracted 3D point cloud data showing the locations of the detected roots in the soil. Plants measured by MISIRoot and plants not measured (controls) were carefully compared with Purdue's Lilly 13-4 Hyperspectral Imaging Facility (reference). No significant differences were found between the two groups of plants at different growth stages. Therefore, it was concluded that MISIRoot measurements caused no significant disturbance to the corn plants' growth.