17,535 research outputs found
Recommended from our members
Ensuring Access to Safe and Nutritious Food for All Through the Transformation of Food Systems
Advancing Model Pruning via Bi-level Optimization
The deployment constraints in practical applications necessitate the pruning
of large-scale deep learning models, i.e., promoting their weight sparsity. As
illustrated by the Lottery Ticket Hypothesis (LTH), pruning also has the
potential of improving their generalization ability. At the core of LTH,
iterative magnitude pruning (IMP) is the predominant pruning method to
successfully find 'winning tickets'. Yet, the computation cost of IMP grows
prohibitively as the targeted pruning ratio increases. To reduce the
computation overhead, various efficient 'one-shot' pruning methods have been
developed, but these schemes are usually unable to find winning tickets as good
as IMP. This raises the question of how to close the gap between pruning
accuracy and pruning efficiency? To tackle it, we pursue the algorithmic
advancement of model pruning. Specifically, we formulate the pruning problem
from a fresh and novel viewpoint, bi-level optimization (BLO). We show that the
BLO interpretation provides a technically-grounded optimization base for an
efficient implementation of the pruning-retraining learning paradigm used in
IMP. We also show that the proposed bi-level optimization-oriented pruning
method (termed BiP) is a special class of BLO problems with a bi-linear problem
structure. By leveraging such bi-linearity, we theoretically show that BiP can
be solved as easily as first-order optimization, thus inheriting the
computation efficiency. Through extensive experiments on both structured and
unstructured pruning with 5 model architectures and 4 data sets, we demonstrate
that BiP can find better winning tickets than IMP in most cases, and is
computationally as efficient as the one-shot pruning schemes, demonstrating 2-7
times speedup over IMP for the same level of model accuracy and sparsity.Comment: Thirty-sixth Conference on Neural Information Processing Systems
(NeurIPS 2022
Corporate Social Responsibility: the institutionalization of ESG
Understanding the impact of Corporate Social Responsibility (CSR) on firm performance as it relates to industries reliant on technological innovation is a complex and perpetually evolving challenge. To thoroughly investigate this topic, this dissertation will adopt an economics-based structure to address three primary hypotheses. This structure allows for each hypothesis to essentially be a standalone empirical paper, unified by an overall analysis of the nature of impact that ESG has on firm performance. The first hypothesis explores the evolution of CSR to the modern quantified iteration of ESG has led to the institutionalization and standardization of the CSR concept. The second hypothesis fills gaps in existing literature testing the relationship between firm performance and ESG by finding that the relationship is significantly positive in long-term, strategic metrics (ROA and ROIC) and that there is no correlation in short-term metrics (ROE and ROS). Finally, the third hypothesis states that if a firm has a long-term strategic ESG plan, as proxied by the publication of CSR reports, then it is more resilience to damage from controversies. This is supported by the finding that pro-ESG firms consistently fared better than their counterparts in both financial and ESG performance, even in the event of a controversy. However, firms with consistent reporting are also held to a higher standard than their nonreporting peers, suggesting a higher risk and higher reward dynamic. These findings support the theory of good management, in that long-term strategic planning is both immediately economically beneficial and serves as a means of risk management and social impact mitigation. Overall, this contributes to the literature by fillings gaps in the nature of impact that ESG has on firm performance, particularly from a management perspective
Neural Architecture Search: Insights from 1000 Papers
In the past decade, advances in deep learning have resulted in breakthroughs
in a variety of areas, including computer vision, natural language
understanding, speech recognition, and reinforcement learning. Specialized,
high-performing neural architectures are crucial to the success of deep
learning in these areas. Neural architecture search (NAS), the process of
automating the design of neural architectures for a given task, is an
inevitable next step in automating machine learning and has already outpaced
the best human-designed architectures on many tasks. In the past few years,
research in NAS has been progressing rapidly, with over 1000 papers released
since 2020 (Deng and Lindauer, 2021). In this survey, we provide an organized
and comprehensive guide to neural architecture search. We give a taxonomy of
search spaces, algorithms, and speedup techniques, and we discuss resources
such as benchmarks, best practices, other surveys, and open-source libraries
Isotopic piecewise affine approximation of algebraic or varieties
We propose a novel sufficient condition establishing that a piecewise affine
variety has the same topology as a variety of the sphere defined
by positively homogeneous functions. This covers the case of
varieties in the projective space . We prove that this condition
is sufficient in the case of codimension one and arbitrary dimension. We
describe an implementation working for homogeneous polynomials in arbitrary
dimension and codimension and give experimental evidences that our condition
might still be sufficient in codimension greater than one
Perfect is the enemy of test oracle
Automation of test oracles is one of the most challenging facets of software
testing, but remains comparatively less addressed compared to automated test
input generation. Test oracles rely on a ground-truth that can distinguish
between the correct and buggy behavior to determine whether a test fails
(detects a bug) or passes. What makes the oracle problem challenging and
undecidable is the assumption that the ground-truth should know the exact
expected, correct, or buggy behavior. However, we argue that one can still
build an accurate oracle without knowing the exact correct or buggy behavior,
but how these two might differ. This paper presents SEER, a learning-based
approach that in the absence of test assertions or other types of oracle, can
determine whether a unit test passes or fails on a given method under test
(MUT). To build the ground-truth, SEER jointly embeds unit tests and the
implementation of MUTs into a unified vector space, in such a way that the
neural representation of tests are similar to that of MUTs they pass on them,
but dissimilar to MUTs they fail on them. The classifier built on top of this
vector representation serves as the oracle to generate "fail" labels, when test
inputs detect a bug in MUT or "pass" labels, otherwise. Our extensive
experiments on applying SEER to more than 5K unit tests from a diverse set of
open-source Java projects show that the produced oracle is (1) effective in
predicting the fail or pass labels, achieving an overall accuracy, precision,
recall, and F1 measure of 93%, 86%, 94%, and 90%, (2) generalizable, predicting
the labels for the unit test of projects that were not in training or
validation set with negligible performance drop, and (3) efficient, detecting
the existence of bugs in only 6.5 milliseconds on average.Comment: Published in ESEC/FSE 202
Model Diagnostics meets Forecast Evaluation: Goodness-of-Fit, Calibration, and Related Topics
Principled forecast evaluation and model diagnostics are vital in fitting probabilistic models and forecasting outcomes of interest. A common principle is that fitted or predicted distributions ought to be calibrated, ideally in the sense that the outcome is indistinguishable from a random draw from the posited distribution. Much of this thesis is centered on calibration properties of various types of forecasts.
In the first part of the thesis, a simple algorithm for exact multinomial goodness-of-fit tests is proposed. The algorithm computes exact -values based on various test statistics, such as the log-likelihood ratio and Pearson\u27s chi-square. A thorough analysis shows improvement on extant methods. However, the runtime of the algorithm grows exponentially in the number of categories and hence its use is limited.
In the second part, a framework rooted in probability theory is developed, which gives rise to hierarchies of calibration, and applies to both predictive distributions and stand-alone point forecasts. Based on a general notion of conditional T-calibration, the thesis introduces population versions of T-reliability diagrams and revisits a score decomposition into measures of miscalibration, discrimination, and uncertainty. Stable and efficient estimators of T-reliability diagrams and score components arise via nonparametric isotonic regression and the pool-adjacent-violators algorithm. For in-sample model diagnostics, a universal coefficient of determination is introduced that nests and reinterprets the classical in least squares regression.
In the third part, probabilistic top lists are proposed as a novel type of prediction in classification, which bridges the gap between single-class predictions and predictive distributions. The probabilistic top list functional is elicited by strictly consistent evaluation metrics, based on symmetric proper scoring rules, which admit comparison of various types of predictions
Deep Transfer Learning Applications in Intrusion Detection Systems: A Comprehensive Review
Globally, the external Internet is increasingly being connected to the
contemporary industrial control system. As a result, there is an immediate need
to protect the network from several threats. The key infrastructure of
industrial activity may be protected from harm by using an intrusion detection
system (IDS), a preventive measure mechanism, to recognize new kinds of
dangerous threats and hostile activities. The most recent artificial
intelligence (AI) techniques used to create IDS in many kinds of industrial
control networks are examined in this study, with a particular emphasis on
IDS-based deep transfer learning (DTL). This latter can be seen as a type of
information fusion that merge, and/or adapt knowledge from multiple domains to
enhance the performance of the target task, particularly when the labeled data
in the target domain is scarce. Publications issued after 2015 were taken into
account. These selected publications were divided into three categories:
DTL-only and IDS-only are involved in the introduction and background, and
DTL-based IDS papers are involved in the core papers of this review.
Researchers will be able to have a better grasp of the current state of DTL
approaches used in IDS in many different types of networks by reading this
review paper. Other useful information, such as the datasets used, the sort of
DTL employed, the pre-trained network, IDS techniques, the evaluation metrics
including accuracy/F-score and false alarm rate (FAR), and the improvement
gained, were also covered. The algorithms, and methods used in several studies,
or illustrate deeply and clearly the principle in any DTL-based IDS subcategory
are presented to the reader
Discovering the hidden structure of financial markets through bayesian modelling
Understanding what is driving the price of a financial asset is a question that is currently mostly unanswered. In this work we go beyond the classic one step ahead prediction and instead construct models that create new information on the behaviour of these time series. Our aim is to get a better understanding of the hidden structures that drive the moves of each financial time series and thus the market as a whole.
We propose a tool to decompose multiple time series into economically-meaningful variables to explain the endogenous and exogenous factors driving their underlying variability. The methodology we introduce goes beyond the direct model forecast. Indeed, since our model continuously adapts its variables and coefficients, we can study the time series of coefficients and selected variables. We also present a model to construct the causal graph of relations between these time series and include them in the exogenous factors.
Hence, we obtain a model able to explain what is driving the move of both each specific time series and the market as a whole. In addition, the obtained graph of the time series provides new information on the underlying risk structure of this environment. With this deeper understanding of the hidden structure we propose novel ways to detect and forecast risks in the market. We investigate our results with inferences up to one month into the future using stocks, FX futures and ETF futures, demonstrating its superior performance according to accuracy of large moves, longer-term prediction and consistency over time. We also go in more details on the economic interpretation of the new variables and discuss the created graph structure of the market.Open Acces
- …