203 research outputs found

    Early Stage Convergence and Global Convergence of Training Mildly Parameterized Neural Networks

    The convergence of GD and SGD when training mildly parameterized neural networks starting from random initialization is studied. For a broad range of models and loss functions, including the most commonly used square loss and cross entropy loss, we prove an ``early stage convergence'' result. We show that the loss is decreased by a significant amount in the early stage of the training, and this decrease is fast. Furthermore, for exponential type loss functions, and under some assumptions on the training data, we show global convergence of GD. Instead of relying on extreme over-parameterization, our study is based on a microscopic analysis of the activation patterns for the neurons, which helps us derive more powerful lower bounds for the gradient. The results on activation patterns, which we call ``neuron partition'', help build intuitions for understanding the behavior of neural networks' training dynamics, and may be of independent interest.

    Understanding Multi-phase Optimization Dynamics and Rich Nonlinear Behaviors of ReLU Networks

    The training process of ReLU neural networks often exhibits complicated nonlinear phenomena. The nonlinearity of models and non-convexity of loss pose significant challenges for theoretical analysis. Therefore, most previous theoretical works on the optimization dynamics of neural networks focus either on local analysis (like the end of training) or approximate linear models (like Neural Tangent Kernel). In this work, we conduct a complete theoretical characterization of the training process of a two-layer ReLU network trained by Gradient Flow on linearly separable data. In this specific setting, our analysis captures the whole optimization process starting from random initialization to final convergence. Despite the relatively simple model and data that we studied, we reveal four different phases from the whole training process showing a general simplifying-to-complicating learning trend. Specific nonlinear behaviors can also be precisely identified and captured theoretically, such as initial condensation, saddle-to-plateau dynamics, plateau escape, changes of activation patterns, learning with increasing complexity, etc. Comment: 88 pages

    When does SGD favor flat minima? A quantitative characterization via linear stability

    The observation that stochastic gradient descent (SGD) favors flat minima has played a fundamental role in understanding implicit regularization of SGD and guiding the tuning of hyperparameters. In this paper, we provide a quantitative explanation of this striking phenomenon by relating the particular noise structure of SGD to its \emph{linear stability} (Wu et al., 2018). Specifically, we consider training over-parameterized models with square loss. We prove that if a global minimum $\theta^*$ is linearly stable for SGD, then it must satisfy $\|H(\theta^*)\|_F \leq O(\sqrt{B}/\eta)$, where $\|H(\theta^*)\|_F$, $B$, $\eta$ denote the Frobenius norm of Hessian at $\theta^*$, batch size, and learning rate, respectively. Otherwise, SGD will escape from that minimum \emph{exponentially} fast. Hence, for minima accessible to SGD, the flatness -- as measured by the Frobenius norm of the Hessian -- is bounded independently of the model size and sample size. The key to obtaining these results is exploiting the particular geometry awareness of SGD noise: 1) the noise magnitude is proportional to loss value; 2) the noise directions concentrate in the sharp directions of local landscape. This property of SGD noise provably holds for linear networks and random feature models (RFMs) and is empirically verified for nonlinear networks. Moreover, the validity and practical relevance of our theoretical findings are justified by extensive numerical experiments.
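
The flatness condition stated in this abstract can be illustrated with a small sketch. It assumes a linear model with square loss written as (1/2n) * sum_i (x_i^T theta - y_i)^2, so that the Hessian is (1/n) X^T X. The constant hidden in the O(.) is not given in the abstract, so the parameter `c` below is a hypothetical placeholder, and all function names are illustrative rather than taken from the paper.

```python
import math


def frobenius_norm(matrix):
    """Square root of the sum of squared entries of a list-of-lists matrix."""
    return math.sqrt(sum(v * v for row in matrix for v in row))


def square_loss_hessian(X):
    """Hessian (1/n) X^T X of the loss (1/2n) * sum_i (x_i^T theta - y_i)^2.

    Assumption: a linear model; for it the Hessian does not depend on theta.
    """
    n, d = len(X), len(X[0])
    return [
        [sum(X[i][a] * X[i][b] for i in range(n)) / n for b in range(d)]
        for a in range(d)
    ]


def satisfies_flatness_bound(hessian, batch_size, learning_rate, c=1.0):
    """Check ||H||_F <= c * sqrt(B) / eta.

    `c` stands in for the unspecified constant in O(sqrt(B) / eta).
    """
    return frobenius_norm(hessian) <= c * math.sqrt(batch_size) / learning_rate


if __name__ == "__main__":
    H = square_loss_hessian([[1.0, 0.0], [0.0, 2.0]])
    print(satisfies_flatness_bound(H, batch_size=4, learning_rate=0.5))
    print(satisfies_flatness_bound(H, batch_size=1, learning_rate=1.0))
```

Under these assumptions, raising the learning rate or shrinking the batch size tightens the bound, so only flatter minima can pass the check.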

    Understanding User Engagement in Online Communities during COVID-19 Pandemic: Evidence from Sentiment and Semantic Analysis on YouTube

    Since the outbreak of COVID-19, the pandemic has changed the lives of many people and brought dramatic emotional experiences. Among many social media platforms, YouTube saw the most significant growth of any social media app among American users during the pandemic, according to the Pew Research Center on 7th April 2021. Exposure to COVID-19 related news can have a significant impact on user engagement on social networks. Different news may trigger different emotions (i.e., anger, anticipation, disgust, fear, joy, sadness, surprise, or trust), and a user may engage differently in response to the news. On YouTube, user engagement is manifested through actions such as liking, disliking, commenting, or sharing videos. During the pandemic, many users provide constructive comments that are encouraging, respectful, and informative to support each other. We applied sentiment analysis in the study to investigate different emotions and applied semantic analysis to investigate positive appraisal (i.e., encouraging, respectful, and informative) to identify salient factors that can motivate user engagement. The findings of the work shed light on how social network platforms could encourage constructive comments to help people provide emotional support to each other during pandemics through using positive appraisal in online news comments. The first research objective is to study the impact of sentiment valence of different emotions on people’s liking of news comments. News about COVID-19 on social networks may provide valuable information but also bring about public panic. In response to this COVID-19 related news, reviewers expressed their feelings by clicking the like or dislike buttons on the video and comments, or by writing comments under the video on YouTube. Some positive news was followed by comments expressing anticipation, joy, and trust, while negative news might trigger sadness, fear, disgust, or anger.
Our research focuses on sentiment analysis of news titles and the comments following each video. The news title provides important information about the video, showing the summary of the video and allowing people to get a first glimpse of the content of the video. Through sentiment analysis of title and comments, correlations could be found between title/comments sentiment and user engagement. The second research objective is to investigate the impact of comments’ positive appraisal (i.e., encouraging, respectful, and informative content) on user engagement. The informative comments under the negative news have significant implications for the audience. They can be considered as a complement or judgment of the video content. Encouraging and respectful comments also help people build good conversations online. Our research focuses on semantic analysis of news titles and comments based on the three dimensions of positive appraisal and analyzes their impacts on user engagement in liking the corresponding comment. We discuss the correlation between video title sentiment and the positive appraisal that follows in the comments of the video to provide good conversations on the platform. A group of 38,085 online comments was collected from more than 400 different publishers from January 1st to January 30th, 2021, on YouTube. The dataset contains the most-viewed videos that were related to at least one of the following search queries: coronavirus, COVID-19, pandemic, or vaccine. The NRC lexicon is adopted in the sentiment analysis to identify different emotions in titles and comments of the video. We adopt the topic modeling method and build a classifier from the Yahoo News Annotated Comments Corpus to identify constructive online comments for specific topics. We also measure inter-annotator agreement and compare the reliability of manual annotation and the classifier. We find that longer titles and sad emotions can obtain more likes on the comments of COVID-19 related news.
During the pandemic, people tend to show their support when they find others are quite sad. We also expect to see correlations between some positive appraisals and user engagement.
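
The lexicon-based emotion tagging mentioned in this abstract can be sketched roughly as follows. This is only an illustration of the general idea of counting emotion-bearing words: the tiny `TOY_LEXICON` below is made up for the example and does not reproduce actual NRC lexicon entries, and the tokenization rule and function name are assumptions, not the authors' pipeline.

```python
import re
from collections import Counter

# Hypothetical word-to-emotion mapping for illustration only.
# These are NOT real NRC lexicon entries.
TOY_LEXICON = {
    "hope": {"anticipation", "joy", "trust"},
    "afraid": {"fear"},
    "loss": {"sadness"},
    "angry": {"anger"},
}


def emotion_counts(text, lexicon=TOY_LEXICON):
    """Count how many tokens in `text` are associated with each emotion."""
    # Lowercase the text and keep runs of letters and apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        for emotion in lexicon.get(token, ()):
            counts[emotion] += 1
    return counts


if __name__ == "__main__":
    print(emotion_counts("I hope we recover from this loss. Hope!"))
```

Per-emotion counts of this kind, computed for titles and comments, could then be related to engagement measures such as the number of likes.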

    AuE-IPA: An AU Engagement Based Infant Pain Assessment Method

    Recent studies have found that pain in infancy has a significant impact on infant development, including psychological problems, possible brain injury, and pain sensitivity in adulthood. However, due to the lack of specialists and the fact that infants are unable to verbally express their experience of pain, it is difficult to assess infant pain. Most existing infant pain assessment systems directly apply adult methods to infants, ignoring the differences between infant expressions and adult expressions. Meanwhile, as the study of the facial action coding system continues to advance, the use of action units (AUs) opens up new possibilities for expression recognition and pain assessment. In this paper, a novel AuE-IPA method is proposed for assessing infant pain by leveraging different engagement levels of AUs. First, different engagement levels of AUs in infant pain are revealed by analyzing the class activation map of an end-to-end pain assessment model. The intensities of top-engaged AUs are then used in a regression model for achieving automatic infant pain assessment. The proposed model is trained and evaluated on the YouTube Immunization dataset, the YouTube Blood Test dataset, and the iCOPEVid dataset. The experimental results show that our AuE-IPA method is more applicable to infants and possesses stronger generalization ability than the end-to-end assessment model and the classic PSPI metric.
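
For context on the PSPI baseline named in this abstract, here is a minimal sketch of the commonly cited Prkachin and Solomon Pain Intensity score. The abstract itself does not give the formula; the version below follows the usual definition from the pain-assessment literature, and the function and argument names are illustrative.

```python
def pspi(au4, au6, au7, au9, au10, au43):
    """Prkachin and Solomon Pain Intensity, as commonly defined.

    AU4, AU6, AU7, AU9 and AU10 are action unit intensities
    (conventionally on a 0-5 scale); AU43 (eye closure) is 0 or 1.
    """
    return au4 + max(au6, au7) + max(au9, au10) + au43


if __name__ == "__main__":
    print(pspi(au4=2, au6=1, au7=3, au9=0, au10=4, au43=1))
```

Because the score uses a fixed set of AUs with fixed weights, it cannot adapt to AUs that engage differently in infants, which is the gap the abstract says AuE-IPA addresses.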

    Domain Adaptive Person Search via GAN-based Scene Synthesis for Cross-scene Videos

    Person search has recently been a challenging task in the computer vision domain, which aims to search specific pedestrians from real cameras. Nevertheless, most surveillance videos comprise only a handful of images of each pedestrian, which often feature identical backgrounds and clothing. Hence, it is difficult to learn more discriminative features for person search in real scenes. To tackle this challenge, we draw on Generative Adversarial Networks (GAN) to synthesize data from surveillance videos. GAN has thrived in computer vision problems because it produces high-quality images efficiently. We merely alter the popular Fast R-CNN model, which is capable of processing videos and yielding accurate detection outcomes. In order to appropriately relieve the pressure brought by the two-stage model, we design an Assisted-Identity Query Module (AIDQ) to provide positive images for the behind part. Besides, we propose a novel GAN-based Scene Synthesis model that can synthesize high-quality cross-id person images for person search tasks. In order to facilitate the feature learning of the GAN-based Scene Synthesis model, we adopt an online learning strategy that collaboratively learns the synthesized images and original images. Extensive experiments on two widely used person search benchmarks, CUHK-SYSU and PRW, have shown that our method has achieved great performance, and the extensive ablation study further justifies that our GAN-synthetic data can effectively increase the variability of the datasets and be more realistic.

    The relation between the rheological properties of gels and the mechanical properties of their corresponding aerogels

    A series of low density, highly porous clay/poly(vinyl alcohol) composite aerogels, incorporating ammonium alginate, were fabricated via a convenient and eco-friendly freeze drying method. It is important to understand the rheological properties of precursor gels because they directly affect the form of aerogels and their processing behaviors. The introduction of ammonium alginate impacted the rheological properties of colloidal gels and improved the mechanical performance of the subject aerogels. The specific compositions and processing conditions applied to those colloidal gel systems brought about different aerogel morphologies, which in turn translated into the observed mechanical properties. The bridge between gel rheologies and aerogel structures is established in the present work. Postprint (published version)

    An Explicit Method for Fast Monocular Depth Recovery in Corridor Environments

    Monocular cameras are extensively employed in indoor robotics, but their performance is limited in visual odometry, depth estimation, and related applications due to the absence of scale information. Depth estimation refers to the process of estimating a dense depth map from the corresponding input image. Existing research mostly addresses this problem through deep learning-based approaches, yet their inference speed is slow, leading to poor real-time capabilities. To tackle this challenge, we propose an explicit method for rapid monocular depth recovery specifically designed for corridor environments, leveraging the principles of nonlinear optimization. We adopt the virtual camera assumption to make full use of the prior geometric features of the scene. The depth estimation problem is transformed into an optimization problem by minimizing the geometric residual. Furthermore, a novel depth plane construction technique is introduced to categorize spatial points based on their possible depths, facilitating swift depth estimation in enclosed structural scenarios, such as corridors. We also propose a new corridor dataset, named Corr_EH_z, which contains images of a variety of corridors captured by the UGV camera. An exhaustive set of experiments in different corridors reveals the efficacy of the proposed algorithm. Comment: 10 pages, 8 figures. arXiv admin note: text overlap with arXiv:2111.08600 by other authors