Early Stage Convergence and Global Convergence of Training Mildly Parameterized Neural Networks
The convergence of GD and SGD when training mildly parameterized neural
networks starting from random initialization is studied. For a broad range of
models and loss functions, including the most commonly used square loss and
cross entropy loss, we prove an ``early stage convergence'' result. We show
that the loss is decreased by a significant amount in the early stage of the
training, and this decrease is fast. Furthermore, for exponential type loss
functions, and under some assumptions on the training data, we show global
convergence of GD. Instead of relying on extreme over-parameterization, our
study is based on a microscopic analysis of the activation patterns for the
neurons, which helps us derive more powerful lower bounds for the gradient. The
results on activation patterns, which we call ``neuron partition'', help build
intuitions for understanding the behavior of neural networks' training
dynamics, and may be of independent interest.
Understanding Multi-phase Optimization Dynamics and Rich Nonlinear Behaviors of ReLU Networks
The training process of ReLU neural networks often exhibits complicated
nonlinear phenomena. The nonlinearity of models and non-convexity of loss pose
significant challenges for theoretical analysis. Therefore, most previous
theoretical works on the optimization dynamics of neural networks focus either
on local analysis (like the end of training) or approximate linear models (like
Neural Tangent Kernel). In this work, we conduct a complete theoretical
characterization of the training process of a two-layer ReLU network trained by
Gradient Flow on linearly separable data. In this specific setting, our
analysis captures the whole optimization process starting from random
initialization to final convergence. Despite the relatively simple model and
data that we studied, we reveal four different phases from the whole training
process showing a general simplifying-to-complicating learning trend. Specific
nonlinear behaviors can also be precisely identified and captured
theoretically, such as initial condensation, saddle-to-plateau dynamics,
plateau escape, changes of activation patterns, learning with increasing
complexity, etc. Comment: 88 pages
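As a rough illustration of what a change of activation pattern means for a ReLU layer, the sketch below records which hidden neurons are active on which inputs and counts the entries that flip between two weight snapshots. The function names, weights, and inputs are hypothetical and are not taken from the paper.

```python
from typing import List


def activation_pattern(weights: List[List[float]],
                       inputs: List[List[float]]) -> List[List[int]]:
    """Return P where P[k][i] is 1 if neuron k has a positive pre-activation on input i."""
    pattern = []
    for w in weights:
        row = []
        for x in inputs:
            pre_activation = sum(wj * xj for wj, xj in zip(w, x))
            row.append(1 if pre_activation > 0 else 0)
        pattern.append(row)
    return pattern


def pattern_changes(before: List[List[int]], after: List[List[int]]) -> int:
    """Count the (neuron, input) pairs whose on/off state differs between two patterns."""
    return sum(b != a
               for row_b, row_a in zip(before, after)
               for b, a in zip(row_b, row_a))


# Hypothetical weights of two hidden neurons at two points in training.
weights_t0 = [[1.0, -1.0], [-0.5, 0.2]]
weights_t1 = [[1.0, 0.5], [-0.5, 0.2]]
inputs = [[1.0, 2.0], [2.0, 1.0]]

p0 = activation_pattern(weights_t0, inputs)
p1 = activation_pattern(weights_t1, inputs)
print(p0, p1, pattern_changes(p0, p1))
```

Here the first neuron switches on for the first input between the two snapshots, while the second neuron stays inactive on both inputs.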
When does SGD favor flat minima? A quantitative characterization via linear stability
The observation that stochastic gradient descent (SGD) favors flat minima has
played a fundamental role in understanding implicit regularization of SGD and
guiding the tuning of hyperparameters. In this paper, we provide a quantitative
explanation of this striking phenomenon by relating the particular noise
structure of SGD to its \emph{linear stability} (Wu et al., 2018).
Specifically, we consider training over-parameterized models with square loss.
We prove that if a global minimum $\theta^*$ is linearly stable for SGD, then
it must satisfy $\|H(\theta^*)\|_F \leq O(\sqrt{B}/\eta)$, where
$\|H(\theta^*)\|_F$, $B$, and $\eta$ denote the Frobenius norm of the Hessian at $\theta^*$,
batch size, and learning rate, respectively. Otherwise, SGD will escape from
that minimum \emph{exponentially} fast. Hence, for minima accessible to SGD,
the flatness -- as measured by the Frobenius norm of the Hessian -- is bounded
independently of the model size and sample size. The key to obtaining these
results is exploiting the particular geometry awareness of SGD noise: 1) the
noise magnitude is proportional to loss value; 2) the noise directions
concentrate in the sharp directions of local landscape. This property of SGD
noise provably holds for linear networks and random feature models (RFMs) and
is empirically verified for nonlinear networks. Moreover, the validity and
practical relevance of our theoretical findings are justified by extensive
numerical experiments.
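To make the flatness measure concrete, here is a minimal sketch that computes the Frobenius norm of the Hessian for a linear model with square loss and compares it with a scale of the form sqrt(B)/eta. The constant in front of that scale is an assumption (set to 1 here), since the abstract only relates the bound to batch size and learning rate; the data and helper names are hypothetical.

```python
import math
from typing import List


def hessian_square_loss_linear(xs: List[List[float]]) -> List[List[float]]:
    """Hessian of L(w) = 1/(2n) * sum_i (w . x_i - y_i)^2, which equals (1/n) * sum_i x_i x_i^T."""
    n, d = len(xs), len(xs[0])
    return [[sum(x[a] * x[b] for x in xs) / n for b in range(d)]
            for a in range(d)]


def frobenius_norm(matrix: List[List[float]]) -> float:
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(v * v for row in matrix for v in row))


def stability_scale(batch_size: int, learning_rate: float,
                    constant: float = 1.0) -> float:
    """Illustrative scale constant * sqrt(B) / eta; the constant is an assumption."""
    return constant * math.sqrt(batch_size) / learning_rate


xs = [[1.0, 0.0], [0.0, 2.0]]           # hypothetical inputs
H = hessian_square_loss_linear(xs)
sharpness = frobenius_norm(H)
scale = stability_scale(batch_size=4, learning_rate=0.1)
print(sharpness, scale, sharpness <= scale)
```

For a linear model the Hessian does not depend on the parameters, so the comparison is the same at every minimum; for nonlinear models it would have to be evaluated at the minimum of interest.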
Understanding User Engagement in Online Communities during COVID-19 Pandemic: Evidence from Sentiment and Semantic Analysis on YouTube
Since the outbreak of COVID-19, the pandemic has changed the lives of many people and brought dramatic emotional experiences. YouTube saw the most significant growth of any social media app among American users during the pandemic, according to the Pew Research Center on 7th April 2021. Exposure to COVID-19 related news can have a significant impact on user engagement on social networks. Different news may trigger different emotions (i.e., anger, anticipation, disgust, fear, joy, sadness, surprise, or trust), and a user may engage differently in response to the news. On YouTube, user engagement is manifested through actions such as liking, disliking, commenting, or sharing videos. During the pandemic, many users provided constructive comments that are encouraging, respectful, and informative to support each other. We applied sentiment analysis in the study to investigate different emotions and applied semantic analysis to investigate positive appraisal (i.e., encouraging, respectful, and informative) to identify salient factors that can motivate user engagement. The findings of the work shed light on how social network platforms could encourage constructive comments to help people provide emotional support to each other during pandemics by using positive appraisal in online news comments. The first research objective is to study the impact of the sentiment valence of different emotions on people’s liking of news comments. News about COVID-19 on social networks may provide valuable information but also bring about public panic. In response to this COVID-19 related news, reviewers expressed their feelings by clicking the like or dislike buttons on videos and comments, or by writing comments under the video on YouTube. Some positive news was followed by comments expressing anticipation, joy, and trust, while negative news might trigger sadness, fear, disgust, or anger.
Our research focuses on sentiment analysis of news titles and the comments following each video. A news title provides important information about the video, showing the summary of the video and allowing people to get a first glimpse of its content. Through sentiment analysis of titles and comments, correlations could be found between title/comment sentiment and user engagement. The second research objective is to investigate the impact of comments’ positive appraisal (i.e., encouraging, respectful, and informative content) on user engagement. The informative comments under the negative news have significant implications for the audience. They can be considered as a complement to or judgment of the video content. Encouraging and respectful comments also help people build good conversations online. Our research focuses on semantic analysis of news titles and comments based on the three dimensions of positive appraisal and analyzes their impacts on user engagement, measured by likes on the corresponding comment. We discuss the correlation between video title sentiment and the positive appraisal that follows in the comments of the video to provide good conversations on the platform. A group of 38,085 online comments was collected from more than 400 different publishers from January 1st to January 30th, 2021, on YouTube. The dataset contains the most-viewed videos that were related to at least one of the following search queries: coronavirus, COVID-19, pandemic, or vaccine. The NRC lexicon is adopted in the sentiment analysis to identify different emotions in titles and comments of the video. We adopt the topic modeling method and build a classifier from the Yahoo News Annotated Comments Corpus to identify constructive online comments for specific topics. We also measure inter-annotator agreement and compare the reliability of manual annotation and the classifier. We find that longer titles and sad emotions can obtain more likes on the comments of COVID-19 related news.
During the pandemic, people tend to show their support when they find others are quite sad. We also expect to see correlations between some positive appraisals and user engagement.
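Lexicon-based emotion tagging of the kind described above can be sketched as a simple word lookup. The tiny lexicon below is invented for illustration and is not the NRC lexicon; the function name and the sample comment are also hypothetical.

```python
import re
from collections import Counter
from typing import Dict, Set

# Toy word-to-emotion mapping, made up for this example (NOT the NRC lexicon).
TOY_LEXICON: Dict[str, Set[str]] = {
    "hope": {"anticipation", "joy", "trust"},
    "afraid": {"fear"},
    "lost": {"sadness"},
    "thank": {"joy", "trust"},
}


def emotion_counts(text: str, lexicon: Dict[str, Set[str]]) -> Counter:
    """Count how often each emotion label is triggered by words in the text."""
    counts: Counter = Counter()
    for token in re.findall(r"[a-z']+", text.lower()):
        for emotion in lexicon.get(token, ()):
            counts[emotion] += 1
    return counts


comment = "I lost my job and I am afraid, but I hope things improve. Thank you all."
counts = emotion_counts(comment, TOY_LEXICON)
print(dict(counts))
```

A real pipeline would also need tokenization and lemmatization choices that the abstract does not specify, so this only shows the lookup step.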
AuE-IPA: An AU Engagement Based Infant Pain Assessment Method
Recent studies have found that pain in infancy has a significant impact on
infant development, including psychological problems, possible brain injury,
and pain sensitivity in adulthood. However, due to the lack of specialists and
the fact that infants are unable to express verbally their experience of pain,
it is difficult to assess infant pain. Most existing infant pain assessment
systems directly apply adult methods to infants ignoring the differences
between infant expressions and adult expressions. Meanwhile, as the study of
facial action coding system continues to advance, the use of action units (AUs)
opens up new possibilities for expression recognition and pain assessment. In
this paper, a novel AuE-IPA method is proposed for assessing infant pain by
leveraging different engagement levels of AUs. First, different engagement
levels of AUs in infant pain are revealed, by analyzing the class activation
map of an end-to-end pain assessment model. The intensities of top-engaged AUs
are then used in a regression model for achieving automatic infant pain
assessment. The proposed model is trained and evaluated on the YouTube
Immunization dataset, the YouTube Blood Test dataset, and the iCOPEVid dataset. The
experimental results show that our AuE-IPA method is more applicable to infants
and possesses stronger generalization ability than the end-to-end assessment model
and the classic PSPI metric.
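For reference, the PSPI baseline (Prkachin and Solomon Pain Intensity) is commonly defined as AU4 + max(AU6, AU7) + max(AU9, AU10) + AU43, where AU43 (eye closure) is binary and the other AUs are intensities from 0 to 5. The sketch below computes that score from a dictionary of AU intensities; the function name and the sample values are hypothetical, and it does not reproduce the AuE-IPA regression model.

```python
from typing import Dict


def pspi(au_intensity: Dict[int, float]) -> float:
    """PSPI = AU4 + max(AU6, AU7) + max(AU9, AU10) + AU43; missing AUs count as 0."""
    get = lambda au: au_intensity.get(au, 0.0)
    return get(4) + max(get(6), get(7)) + max(get(9), get(10)) + get(43)


# Hypothetical AU intensities for a single video frame.
frame = {4: 2, 6: 1, 7: 3, 9: 0, 10: 2, 43: 1}
score = pspi(frame)
print(score)  # 2 + 3 + 2 + 1 = 8
```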
Domain Adaptive Person Search via GAN-based Scene Synthesis for Cross-scene Videos
Person search has recently been a challenging task in the computer vision
domain, which aims to search for specific pedestrians from real
cameras. Nevertheless, most surveillance videos comprise only a handful of
images of each pedestrian, which often feature identical backgrounds and
clothing. Hence, it is difficult to learn more discriminative features for
person search in real scenes. To tackle this challenge, we draw on Generative
Adversarial Networks (GAN) to synthesize data from surveillance videos. GAN has
thrived in computer vision problems because it produces high-quality images
efficiently. We merely alter the popular Fast R-CNN model, which is capable of
processing videos and yielding accurate detection outcomes. In order to
appropriately relieve the pressure brought by the two-stage model, we design an
Assisted-Identity Query Module (AIDQ) to provide positive images for the behind
part. Besides, we propose a novel GAN-based Scene Synthesis model that can
synthesize high-quality cross-id person images for person search tasks. In
order to facilitate the feature learning of the GAN-based Scene Synthesis
model, we adopt an online learning strategy that collaboratively learns the
synthesized images and original images. Extensive experiments on two widely
used person search benchmarks, CUHK-SYSU and PRW, have shown that our method
has achieved great performance, and the extensive ablation study further
shows that our GAN-synthesized data can effectively increase the variability of
the datasets and be more realistic.
The relation between the rheological properties of gels and the mechanical properties of their corresponding aerogels
A series of low density, highly porous clay/poly(vinyl alcohol) composite aerogels, incorporating ammonium alginate, were fabricated via a convenient and eco-friendly freeze drying method. It is important to understand the rheological properties of precursor gels because they directly affect the form of aerogels and their processing behaviors. The introduction of ammonium alginate impacted the rheological properties of colloidal gels and improved the mechanical performance of the subject aerogels. The specific compositions and processing conditions applied to those colloidal gel systems brought about different aerogel morphologies, which in turn translated into the observed mechanical properties. The bridge between gel rheologies and aerogel structures is established in the present work. Postprint (published version)
An Explicit Method for Fast Monocular Depth Recovery in Corridor Environments
Monocular cameras are extensively employed in indoor robotics, but their
performance is limited in visual odometry, depth estimation, and related
applications due to the absence of scale information. Depth estimation refers to
the process of estimating a dense depth map from the corresponding input image.
Existing research mostly addresses this issue through deep learning-based
approaches, yet their inference speed is slow, leading to poor real-time
capabilities. To tackle this challenge, we propose an explicit method for rapid
monocular depth recovery specifically designed for corridor environments,
leveraging the principles of nonlinear optimization. We adopt the virtual
camera assumption to make full use of the prior geometric features of the
scene. The depth estimation problem is transformed into an optimization problem
by minimizing the geometric residual. Furthermore, a novel depth plane
construction technique is introduced to categorize spatial points based on
their possible depths, facilitating swift depth estimation in enclosed
structural scenarios, such as corridors. We also propose a new corridor
dataset, named Corr\_EH\_z, which contains images as captured by the UGV camera
of a variety of corridors. An exhaustive set of experiments in different
corridors reveals the efficacy of the proposed algorithm. Comment: 10 pages, 8 figures. arXiv admin note: text overlap with
arXiv:2111.08600 by other authors