256 research outputs found
Law of Balance and Stationary Distribution of Stochastic Gradient Descent
The stochastic gradient descent (SGD) algorithm is the algorithm we use to
train neural networks. However, it remains poorly understood how SGD
navigates the highly nonlinear and degenerate loss landscape of a neural
network. In this work, we prove that the minibatch noise of SGD regularizes the
solution towards a balanced solution whenever the loss function contains a
rescaling symmetry. Because the difference between a simple diffusion process
and SGD dynamics is the most significant when symmetries are present, our
theory implies that the loss function symmetries constitute an essential probe
of how SGD works. We then apply this result to derive the stationary
distribution of stochastic gradient flow for a diagonal linear network with
arbitrary depth and width. The stationary distribution exhibits complicated
nonlinear phenomena such as phase transitions, broken ergodicity, and
fluctuation inversion. These phenomena are shown to exist uniquely in deep
networks, implying a fundamental difference between deep and shallow models.
Comment: Preprint
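As a rough illustration of the rescaling symmetry this abstract refers to, the sketch below uses a hypothetical two-parameter model with per-sample loss (u*w*x - y)^2. The toy model, the function names, and the numbers are assumptions chosen for illustration; they are not the paper's setup or derivation. The loss is unchanged under u -> c*u, w -> w/c, and one per-sample SGD step multiplies the imbalance u^2 - w^2 by 1 - (lr*2*r*x)^2, where r is the residual. For a small enough learning rate, steps taken on samples with nonzero residual therefore shrink the imbalance.

```python
def loss_and_grads(u, w, x, y):
    """Per-sample loss (u*w*x - y)**2 and its gradients w.r.t. u and w."""
    r = u * w * x - y          # residual on this sample
    loss = r * r
    grad_u = 2.0 * r * w * x
    grad_w = 2.0 * r * u * x
    return loss, grad_u, grad_w


def sgd_step(u, w, x, y, lr):
    """One plain SGD step on a single sample (hypothetical toy setting)."""
    _, grad_u, grad_w = loss_and_grads(u, w, x, y)
    return u - lr * grad_u, w - lr * grad_w


def imbalance(u, w):
    """Quantity conserved by gradient flow under the rescaling symmetry."""
    return u * u - w * w


if __name__ == "__main__":
    u, w, x, y, lr = 1.5, 0.5, 1.0, 1.0, 0.1
    c = 2.0
    # Rescaling symmetry: the loss depends only on the product u*w.
    print(loss_and_grads(u, w, x, y)[0], loss_and_grads(c * u, w / c, x, y)[0])
    # A discrete SGD step changes the imbalance only at order lr**2.
    u_new, w_new = sgd_step(u, w, x, y, lr)
    print(imbalance(u, w), imbalance(u_new, w_new))
```

In this toy setting a "balanced" solution is one with u^2 = w^2. Full-batch gradient flow would leave the imbalance unchanged; the shrinkage here comes from the order-lr^2 term of the discrete per-sample update.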
The Attenuating Effect of Intelligent Agents and Agent Autonomy on Managers’ Ability to Diffuse Responsibility for and Engage in Earnings Management
Advances in IT suggest that computerized intelligent agents (IAs) may soon occupy many roles that presently employ human agents. A significant concern is the ethical conduct of those who use IAs, including their possible utilization by managers to engage in earnings management. Following economics and moral disengagement theory, we investigate how financial reporting decisions are affected when they are supported by the work of an IA versus a human agent, with varying autonomy. In a 2 x 2 between-participants experiment with experienced managers, we manipulate agent type and autonomy, finding that managers engage in less aggressive financial reporting decisions with IAs than with human agents, and engage in less aggressive reporting decisions with less autonomous agents than with more autonomous agents. Path analysis suggests that managers’ perception of control over their agent and ability to diffuse responsibility for their financial reporting decisions serially mediate the effect of agent type and autonomy on managers’ financial reporting decisions. Our results have implications for regulators and practitioners, where the adoption of computerized intelligent agents can attenuate managers’ earnings management activity by preventing them from diffusing responsibility for their actions to others
The Impact of Counter-Rumor Strategy and Source on Non-Professional Investors' Judgments over Social Media
Non-professional investors often rely on information obtained from social media to make investment decisions. Extant literature has not examined the most effective strategy for the target company to counter the rumors so that investors will be more willing to continue investing in the target firm. Drawing on source credibility theory and the moral intensity model, I propose that the most effective strategy would vary given different agents who are selected to counter the rumor. After conducting a 2 x 3 (counter-rumor source x counter-rumor strategy) experiment with 272 non-professional investors recruited from Amazon Mechanical Turk, my study shows that when an internal agent (e.g., the CEO) acts as a counter-rumor source, shareholders are more willing to invest in the company when the internal agent utilizes a denial strategy rather than a reassociation or a questioning strategy. In contrast, when an external agent (e.g., a famous food blogger) serves as the counter-rumor source, the external agent can also use a questioning strategy in addition to a denial strategy to motivate shareholders to be more willing to invest in the company; however, the external agent still needs to avoid engaging in a reassociation strategy. Moderated serial-mediation analysis shows that the persuasiveness of the counter-rumor information and investors' perceived rumor intensity serially mediate the effect of counter-rumor source on investors' willingness to invest, and this effect is conditioned on the different strategy used to counter the rumor. Overall, the main effect of counter-rumor source suggests that external agents are perceived as more persuasive, which leads investors to perceive less rumor intensity, making them more willing to invest in the target company. The results of my paper can thus inform companies' social media policy.
SGD with a Constant Large Learning Rate Can Converge to Local Maxima
Previous works on stochastic gradient descent (SGD) often focus on its
success. In this work, we construct worst-case optimization problems
illustrating that, when not in the regimes that the previous works often
assume, SGD can exhibit many strange and potentially undesirable behaviors.
Specifically, we construct landscapes and data distributions such that (1) SGD
converges to local maxima, (2) SGD escapes saddle points arbitrarily slowly,
(3) SGD prefers sharp minima over flat ones, and (4) AMSGrad converges to local
maxima. We also realize results in a minimal neural network-like example. Our
results highlight the importance of simultaneously analyzing the minibatch
sampling, discrete-time update rules, and realistic landscapes to understand
the role of SGD in deep learning.
Comment: ICLR 2022 Spotlight
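To make the first claim in this abstract concrete, here is a minimal sketch of how minibatch noise with a large constant learning rate can pull SGD toward a local maximum. The abstract does not describe the paper's construction, so everything here is an assumption for illustration: the one-dimensional per-sample loss x*w^2/2, the two-point distribution of the curvature x, and all function names. Because the mean curvature is negative, w = 0 is a maximum of the expected loss. The update is w <- w*(1 - lr*x), so log|w| performs a random walk with drift E[log|1 - lr*x|], and w converges to 0 when that drift is negative.

```python
import math
import random

# Hypothetical two-point distribution over the per-sample curvature x.
CURVATURES = (0.99, -0.5)
PROBS = (0.3, 0.7)


def mean_curvature():
    """E[x]; a negative value makes w = 0 a maximum of the expected loss."""
    return sum(p * x for x, p in zip(CURVATURES, PROBS))


def log_growth_rate(lr):
    """E[log|1 - lr*x|]: the average per-step change of log|w| under SGD."""
    return sum(p * math.log(abs(1.0 - lr * x))
               for x, p in zip(CURVATURES, PROBS))


def run_sgd(w0, lr, steps, seed=0):
    """Run SGD on the per-sample loss x * w**2 / 2 with batch size 1."""
    rng = random.Random(seed)
    w = w0
    for _ in range(steps):
        x = rng.choices(CURVATURES, weights=PROBS)[0]
        w -= lr * x * w  # gradient of x * w**2 / 2 with respect to w
    return w


if __name__ == "__main__":
    print("mean curvature:", mean_curvature())
    print("log growth rate at lr=1.0:", log_growth_rate(1.0))
    print("log growth rate at lr=0.01:", log_growth_rate(0.01))
    print("final |w| at lr=1.0:", abs(run_sgd(1.0, 1.0, 500)))
```

With these assumed numbers the drift is negative at lr = 1.0 and positive at lr = 0.01, so in this toy problem the attraction to the maximum appears only at the large learning rate.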
An Essential Farnesylated Kinesin in Trypanosoma brucei
Kinesins are a family of motor proteins conserved throughout eukaryotes. In our present study we characterize a novel kinesin, KinesinCaaX, orthologs of which are only found in the kinetoplastids and not other eukaryotes. KinesinCaaX has the CVIM amino acids at the C-terminus, and CVIM was previously shown to be an ideal signal for protein farnesylation in T. brucei. In this study we show KinesinCaaX is farnesylated using radiolabeling studies and that farnesylation is dependent on the CVIM motif. Using RNA interference, we show KinesinCaaX is essential for T. brucei proliferation. Additionally, RNAi KinesinCaaX-depleted T. brucei are 4-fold more sensitive to the protein farnesyltransferase (PFT) inhibitor LN-59, suggesting that KinesinCaaX is a target of PFT inhibitors' action to block proliferation of T. brucei. Using tetracycline-induced exogenous tagged KinesinCaaX and KinesinCVIMdeletion (non-farnesylated Kinesin) expression lines in T. brucei, we demonstrate KinesinCaaX is farnesylated in T. brucei cells and this farnesylation has functional effects. In cells expressing a CaaX-deleted version of Kinesin, the localization is more diffuse, which suggests correct localization depends on farnesylation. Through our investigation of cell cycle, nucleus and kinetoplast quantitation, and immunofluorescence assays, an important role is suggested for KinesinCaaX in the separation of nuclei and kinetoplasts during and after they have been replicated. Taken together, our work suggests KinesinCaaX is a target of PFT inhibition of T. brucei cell proliferation and KinesinCaaX functions through both the motor and farnesyl groups.
Editorial: Celebrating Microbial Diversity: The Many Cell Cycles of Eukaryotic Microbes.
Editorial on the Research Topic
Celebrating Microbial Diversity: The Many Cell Cycles of Eukaryotic Microbes
CM: ERC research grant ‘Plasmocycle’. ZL: NIH R01 grant AI101437. MB: Swiss National Science Foundation 31003A_179321
Self-distillation Regularized Connectionist Temporal Classification Loss for Text Recognition: A Simple Yet Effective Approach
Text recognition methods are gaining rapid development. Some advanced
techniques, e.g., powerful modules, language models, and un- and
semi-supervised learning schemes, consecutively push the performance on public
benchmarks forward. However, the problem of how to better optimize a text
recognition model from the perspective of loss functions is largely overlooked.
CTC-based methods, widely used in practice due to their good balance between
performance and inference speed, still grapple with accuracy degradation. This
is because CTC loss emphasizes the optimization of the entire sequence target
while neglecting to learn individual characters. We propose a self-distillation
scheme for CTC-based models to address this issue. It incorporates a framewise
regularization term in CTC loss to emphasize individual supervision, and
leverages the maximizing-a-posteriori of latent alignment to solve the
inconsistency problem that arises in distillation between CTC-based models. We
refer to the regularized CTC loss as Distillation Connectionist Temporal
Classification (DCTC) loss. DCTC loss is module-free, requiring no extra
parameters, longer inference lag, or additional training data or phases.
Extensive experiments on public benchmarks demonstrate that DCTC can boost text
recognition model accuracy by up to 2.6%, without any of these drawbacks.
Comment: Ziyin Zhang and Ning Lu are co-first authors
Detection and Elimination of Bathymetric Outliers in Multibeam Echosounder System Based on Robust Multi-quadric Method and Median Parameter Model
Re-channelization of turbidity currents in South China Sea abyssal plain due to seamounts and ridges
Turbidity currents can be characterized as net-erosive, net-depositional or net-bypassing. Whether a flow is erosive, depositional or bypasses depends on the flow velocity, concentration and size but these can also be impacted by external controls such as the degree of confinement, slope gradient and substrate type and erodibility. Our understanding of the relative importance of these controls comes from laboratory experiments and numerical modelling, as well as from field data due to the proliferation of high-resolution 3D seismic and bathymetric data, as well as the outcrop and rock record. In this study, based on extensive multibeam and seismic reflection surveys in combination with International Ocean Discovery Program cores from the South China Sea, we document a new mechanism of turbidity current transformation from depositional to erosive resulting in channel incision. We show how confinement by seamounts and bedrock highs of previously unconfined turbidity currents has resulted in the development of seafloor channels. These channels are inferred to be the result of confinement of flows, which have traversed the abyssal plain, leading to flow acceleration allowing them to erode the seafloor substrate. This interpretation is further supported by the coarsening of flow deposits within the area of the seamounts, indicating that confinement has increased flow competency, allowing turbidity currents to carry larger volumes of coarse sediment which has been deposited in this region. This basin-scale depositional pattern suggests that pre-established basin topography can have an important control on sedimentation which can impact characteristics such as potential hydrocarbon storage