Improving Multi-label Classification Performance on Imbalanced Datasets Through SMOTE Technique and Data Augmentation Using IndoBERT Model
Sentiment and emotion analysis is a common classification task aimed at enhancing the benefit and comfort of consumers of a product. However, the data obtained often lack balance between the classes or aspects to be analyzed, a situation commonly known as an imbalanced dataset. Imbalanced datasets are a frequent challenge in machine learning tasks, particularly for text data. Our research tackles imbalanced datasets using two techniques, namely SMOTE and data augmentation. For the SMOTE technique, the text must first be converted to a numerical representation using TF-IDF. The classification model employed is the IndoBERT model. Both oversampling techniques address data imbalance by generating new, synthetic data, and the resulting dataset improves the classification model's performance. With the augmentation technique, performance improves by up to 20%, with accuracy reaching 78%, precision at 85%, recall at 82%, and an F1-score of 83%. The SMOTE technique achieves the best results of the two, raising the model's accuracy to 82%, with precision at 87%, recall at 85%, and an F1-score of 86%.
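As a rough illustration of the oversampling step described above, the sketch below (assuming Python with scikit-learn and imbalanced-learn) vectorises a toy corpus with TF-IDF and rebalances it with SMOTE; the texts, labels and parameters are made up, and the IndoBERT fine-tuning and multi-label handling from the paper are not shown.

```python
from collections import Counter
from sklearn.feature_extraction.text import TfidfVectorizer
from imblearn.over_sampling import SMOTE

# toy corpus with an imbalanced binary label (illustrative data only)
texts = ["produk bagus", "pengiriman lambat", "harga murah", "kualitas buruk",
         "sangat puas", "tidak puas", "puas sekali", "biasa saja"]
labels = [1, 0, 1, 0, 1, 1, 1, 1]  # class 1 dominates

# SMOTE operates on numerical features, so the text is vectorised with TF-IDF first
X = TfidfVectorizer().fit_transform(texts)
X_res, y_res = SMOTE(k_neighbors=1, random_state=0).fit_resample(X, labels)
print(X_res.shape, Counter(y_res))  # classes are now balanced
```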
Essays on human capital, sorting, and wages
This thesis is composed of three chapters that study vital linkages in the labour market: the accumulation of human capital on the job, the sorting between workers and firms, and the wages that arise through these processes.
In Chapter 1, I develop a dynamic model of sorting between workers and firms in which it is possible to endogenously invest in the worker's human capital. The capability of a worker-firm pair to produce both tradeable output and further human capital may depend non-parametrically on both the worker's current human capital and firm type. Supermodularity of the technology with respect to the types does not suffice for strict positive assortative matching (PAM) in the competitive equilibrium; much stronger assumptions must be imposed. If high-productivity firms are better at training and production is concave in human capital, then PAM is not guaranteed even if production is supermodular in the types. In particular, with enough concavity, the importance of getting low-skilled workers paired with the best firms may outweigh the effects of supermodularity. With simple examples, it is shown that randomisation in the matching process may be an endogenous outcome, even in the absence of search or informational frictions. I prove that under weaker conditions, it is sometimes possible to determine whether the correlation between worker and firm type will be positive or negative, without any knowledge of the distribution of types. Furthermore, I prove that under some conditions, workers sort in a manner such that the highest skilled workers see their wages increase at the fastest rate, giving firms a highly active role in the dispersion of wages and inequality over the life cycle.
Chapter 2 builds on the ideas of the first chapter but takes a data-driven, quantitative approach, adding more realistic features, namely search frictions and firm-specific human capital, and taking the model to the data. I document employer-provided training for full-time employed workers in the UK using an "effective training" measure that weights different types of self-reported work-related training. This form of training tends to be higher among already highly educated workers and is provided in greater amounts at larger firms. Moreover, occupations and industries that tend to pay higher wages also tend to provide more training; this is consistent with the idea that training enhances the productivity of the worker and that workers with high earning ability may sort into high-training environments. In conjunction with these findings, I develop a search model of the labour market that includes heterogeneity in both workers and firms. Workers vary in their level of human capital and firms vary in productivity. Worker-firm pairs can increase the worker's human capital at the cost of losing output. I show that this framework can replicate key facts from the data; namely, higher educated workers receive more training throughout their lifetime and earn more, and the firms that pay higher wages also provide more training. Finally, the model features inefficiently low human capital investment because the social returns are not fully internalised under random search; a policy of subsidising low-skilled young workers, funded by income taxation, is shown to improve aggregate welfare and social mobility in the model.
In Chapter 3, which is a co-authored project with Andy Snell, Heiko Stüber, and Jonathan Thomas, we document distinctive empirical features of wage pass-through in Germany that are consistent with a Thomas-Worrall wage contracting framework in the presence of both idiosyncratic and nonstationary aggregate productivity components. These empirical features are hard to reconcile with the predictions of search models based on period-by-period Nash bargaining over match surplus, and with the predictions of financial models where risk-neutral firms may costlessly shield risk-averse workers from idiosyncratic shocks (Guiso, Pistaferri et al. 2005).
Robust interventions in network epidemiology
Which individual should we vaccinate to minimize the spread of a disease? Designing optimal interventions of this kind can be formalized as an optimization problem on networks, in which we have to select a budgeted number of dynamically important nodes to receive treatment that optimizes a dynamical outcome. Describing this optimization problem requires specifying the network, a model of the dynamics, and an objective for the outcome of the dynamics. In real-world contexts, these inputs are vulnerable to misspecification---the network and dynamics must be inferred from data, and the decision-maker must operationalize some (potentially abstract) goal into a mathematical objective function. Moreover, the tools to make reliable inferences---on the dynamical parameters, in particular---remain limited due to computational problems and issues of identifiability. Given these challenges, models thus remain more useful for building intuition than for designing actual interventions. This thesis seeks to elevate complex dynamical models from intuition-building tools to methods for the practical design of interventions.
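To make the budgeted-selection formulation concrete, here is a toy sketch (in Python with networkx) that greedily chooses nodes to vaccinate so as to shrink the largest connected component, a crude structural proxy for a dynamical objective; the function and objective are illustrative and are not the methods developed in this thesis.

```python
import networkx as nx

def greedy_vaccination(G, budget):
    """Greedily select `budget` nodes whose removal most reduces the size of the
    largest connected component (a structural stand-in for limiting spread)."""
    H = G.copy()
    chosen = []
    for _ in range(budget):
        best_node, best_size = None, float("inf")
        for v in list(H.nodes()):
            K = H.copy()
            K.remove_node(v)
            size = max((len(c) for c in nx.connected_components(K)), default=0)
            if size < best_size:
                best_node, best_size = v, size
        chosen.append(best_node)
        H.remove_node(best_node)
    return chosen

G = nx.barabasi_albert_graph(200, 2, seed=0)
print(greedy_vaccination(G, budget=5))
```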
First, we circumvent the inference problem by searching for robust decisions that are insensitive to model misspecification. If these robust solutions work well across a broad range of structural and dynamic contexts, the issues associated with accurately specifying the problem inputs are largely moot. We explore the existence of these solutions across three facets of dynamic importance common in network epidemiology.
Second, we introduce a method for analytically calculating the expected outcome of a spreading process under various interventions. Our method is based on message passing, a technique from statistical physics that has received attention in a variety of contexts, from epidemiology to statistical inference. We combine several facets of the message-passing literature for network epidemiology. Our method allows us to test general probabilistic, temporal intervention strategies (such as seeding or vaccination). Furthermore, the method works on arbitrary networks without requiring the network to be locally tree-like. This method has the potential to improve our ability to discriminate between possible intervention outcomes.
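As a concrete (and much simplified) illustration of outcome estimation by message passing, the sketch below computes the expected final outbreak size of an independent-cascade-style process with uniform seeding and transmission probabilities, treating vaccinated nodes as never infected; unlike the method developed in the thesis, this textbook version is only exact on locally tree-like networks, and all names and parameters are illustrative.

```python
import networkx as nx

def expected_outbreak_size(G, seed_prob, trans_prob, vaccinated=frozenset(),
                           num_iters=200, tol=1e-8):
    """u[(i, j)]: probability that i never transmits the infection to j,
    computed while ignoring j's influence on i (standard message passing)."""
    u = {}
    for i, j in G.edges():
        u[(i, j)] = 1.0
        u[(j, i)] = 1.0
    for _ in range(num_iters):
        max_change = 0.0
        for (i, j), old in list(u.items()):
            if i in vaccinated:
                new = 1.0  # vaccinated nodes never transmit
            else:
                p_not_reached = 1.0 - seed_prob
                for k in G.neighbors(i):
                    if k != j:
                        p_not_reached *= u[(k, i)]
                new = 1.0 - trans_prob * (1.0 - p_not_reached)
            max_change = max(max_change, abs(new - old))
            u[(i, j)] = new
        if max_change < tol:
            break
    expected_size = 0.0
    for j in G.nodes():
        if j in vaccinated:
            continue
        p_not_infected = 1.0 - seed_prob
        for i in G.neighbors(j):
            p_not_infected *= u[(i, j)]
        expected_size += 1.0 - p_not_infected
    return expected_size

G = nx.karate_club_graph()
print(expected_outbreak_size(G, seed_prob=0.05, trans_prob=0.3, vaccinated={0, 33}))
```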
Overall, our work builds intuition about the decision landscape of designing interventions in spreading dynamics. This work also suggests a way forward for probing the decision-making landscape of other intervention contexts. More broadly, we provide a framework for exploring the boundaries of designing robust interventions with complex systems modeling tools.
Classical and quantum algorithms for scaling problems
This thesis is concerned with scaling problems, which have a plethora of connections to different areas of mathematics, physics and computer science. Although many structural aspects of these problems are understood by now, we only know how to solve them efficiently in special cases. We give new algorithms for non-commutative scaling problems with complexity guarantees that match the prior state of the art. To this end, we extend the well-known (self-concordance based) interior-point method (IPM) framework to Riemannian manifolds, motivated by its success in the commutative setting. Moreover, the IPM framework does not obviously suffer from the same obstructions to efficiency as previous methods. It also yields the first high-precision algorithms for other natural geometric problems in non-positive curvature. For the (commutative) problems of matrix scaling and balancing, we show that quantum algorithms can outperform the (already very efficient) state-of-the-art classical algorithms. Their time complexity can be sublinear in the input size; in certain parameter regimes they are also optimal, whereas in others we show no quantum speedup over the classical methods is possible. Along the way, we provide improvements over the long-standing state of the art for searching for all marked elements in a list, and computing the sum of a list of numbers. We identify a new application in the context of tensor networks for quantum many-body physics. We define a computable canonical form for uniform projected entangled pair states (as the solution to a scaling problem), circumventing previously known undecidability results. We also show, by characterizing the invariant polynomials, that the canonical form is determined by evaluating the tensor network contractions on networks of bounded size.
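For background on the commutative case mentioned above, here is a minimal sketch of the classical Sinkhorn iteration for matrix scaling (finding diagonal scalings that make a positive matrix approximately doubly stochastic); this is the textbook alternating normalisation, not the Riemannian interior-point or quantum algorithms developed in the thesis, and the names and tolerances are illustrative.

```python
import numpy as np

def sinkhorn_scaling(A, num_iters=500, tol=1e-9):
    """Find positive vectors r, c such that diag(r) @ A @ diag(c) is
    approximately doubly stochastic (all row and column sums equal to 1)."""
    A = np.asarray(A, dtype=float)
    r = np.ones(A.shape[0])
    c = np.ones(A.shape[1])
    for _ in range(num_iters):
        r = 1.0 / (A @ c)      # normalise row sums
        c = 1.0 / (A.T @ r)    # normalise column sums
        B = (A * r[:, None]) * c[None, :]
        if max(np.abs(B.sum(1) - 1).max(), np.abs(B.sum(0) - 1).max()) < tol:
            break
    return r, c, B

rng = np.random.default_rng(0)
r, c, B = sinkhorn_scaling(rng.random((4, 4)) + 0.1)
print(B.sum(axis=0), B.sum(axis=1))  # both close to all-ones
```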
Exploring the complementarity between foreign technology, embedded technology and increase of productive capacity
This study analyzes the complementarity of foreign technology acquired under license agreements, technology embedded in machinery and equipment, and increases in a company's productive capacity. We use panel data on Brazilian manufacturing companies from the World Bank Surveys and estimate random effects models by maximum likelihood. The results indicate that foreign technology, embedded technology and increases in productive capacity have a positive and significant impact on labor productivity. The complementarity test reveals that the relationship between the two technologies analyzed is conditionally substitutive and that the relationship between each of these technologies and the increase in productive capacity is conditionally complementary. Junta de Extremadura | Ref. GR1809
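A hedged sketch of how such a specification might look in Python with statsmodels, using a firm-level random intercept fitted by maximum likelihood and interaction terms as a stand-in for the complementarity test; the file, column names and exact model are hypothetical and not taken from the study.

```python
import pandas as pd
import statsmodels.formula.api as smf

# hypothetical firm-year panel; file and column names are illustrative
df = pd.read_csv("brazil_manufacturing_panel.csv")

# random-intercept model of log labour productivity, fitted by ML (reml=False);
# interaction terms probe (conditional) complementarity between the inputs
model = smf.mixedlm(
    "log_labour_productivity ~ foreign_tech * embedded_tech"
    " + foreign_tech * capacity_increase + embedded_tech * capacity_increase",
    data=df,
    groups=df["firm_id"],
)
result = model.fit(reml=False)
print(result.summary())
```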
LIPIcs, Volume 251, ITCS 2023, Complete Volume
For One and All: Individual and Group Fairness in the Allocation of Indivisible Goods
Fair allocation of indivisible goods is a well-explored problem. Traditionally, research focused on individual fairness - are individual agents satisfied with their allotted share? - and group fairness - are groups of agents treated fairly? In this paper, we explore the coexistence of individual envy-freeness (i-EF) and its group counterpart, group weighted envy-freeness (g-WEF), in the allocation of indivisible goods. We propose several polynomial-time algorithms that provably achieve i-EF and g-WEF simultaneously in various degrees of approximation under three different conditions on the agents' valuation functions: (i) when agents have identical additive valuation functions, i-EFX and g-WEF1 can be achieved simultaneously; (ii) when agents within a group share a common valuation function, an allocation satisfying both i-EF1 and g-WEF1 exists; and (iii) when agents' valuations for goods within a group differ, we show that while maintaining i-EF1, we can achieve a 1/3-approximation to ex-ante g-WEF1. Our results thus provide a first step towards connecting individual and group fairness in the allocation of indivisible goods, in hopes of its useful application to domains requiring the reconciliation of diversity with individual demands.
Comment: Appears in the 22nd International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2023
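For context, the sketch below (in Python) implements the classic round-robin procedure, which for additive valuations is known to produce an allocation that is envy-free up to one good (i-EF1); it is only a baseline illustration and does not implement the paper's algorithms for combining individual and group weighted envy-freeness.

```python
def round_robin_allocation(valuations, agent_order=None):
    """Agents take turns picking their most-valued remaining good.
    For additive valuations this yields an EF1 allocation."""
    num_agents, num_goods = len(valuations), len(valuations[0])
    order = agent_order or list(range(num_agents))
    remaining = set(range(num_goods))
    bundles = {a: [] for a in range(num_agents)}
    turn = 0
    while remaining:
        agent = order[turn % num_agents]
        pick = max(remaining, key=lambda g: valuations[agent][g])
        bundles[agent].append(pick)
        remaining.remove(pick)
        turn += 1
    return bundles

# illustrative example with 3 agents and 5 goods
vals = [[5, 1, 3, 2, 4], [2, 6, 1, 3, 1], [4, 4, 2, 1, 5]]
print(round_robin_allocation(vals))
```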
Data-efficient neural network training with dataset condensation
The state of the art in many data-driven fields, including computer vision and natural language processing, typically relies on training larger models on more data. OpenAI reports that the computational cost of achieving the state of the art has doubled every 3.4 months in the deep learning era. In contrast, GPU computing power doubles only every 21.4 months, which is significantly slower. Thus, advancing deep learning performance by consuming more hardware resources is not sustainable. How to reduce the training cost while preserving generalization performance is a long-standing goal in machine learning. This thesis investigates a largely under-explored yet promising solution, dataset condensation, which aims to condense a large training set into a small set of informative synthetic samples, such that deep models trained on the condensed set achieve performance close to models trained on the original dataset. In this thesis, we investigate how to condense image datasets for classification tasks and propose three methods for image dataset condensation. Our methods can also be applied to condense other kinds of datasets for different learning tasks, such as text data, graph data and medical images, as discussed in Section 6.1.
First, we propose a principled method that formulates learning a small synthetic set as a gradient matching problem between the gradients of deep neural network weights computed on the original data and on the synthetic data. A new gradient/weight matching loss is designed for robust matching across different neural architectures. We evaluate its performance on several image classification benchmarks and explore the use of our method in continual learning and neural architecture search.
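A minimal PyTorch-style sketch of the gradient-matching idea for a single batch is shown below; the actual method matches gradients layer-wise along a training trajectory and alternates with network updates, so this is only a simplified illustration with assumed names.

```python
import torch
import torch.nn.functional as F

def gradient_matching_loss(net, x_real, y_real, x_syn, y_syn):
    """Distance between the weight gradients induced by a real batch and by the
    synthetic batch; minimising it w.r.t. the synthetic images (which require
    gradients) encourages both batches to produce similar weight updates."""
    params = [p for p in net.parameters() if p.requires_grad]
    g_real = torch.autograd.grad(F.cross_entropy(net(x_real), y_real), params)
    g_real = [g.detach() for g in g_real]
    # create_graph=True lets the matching loss backpropagate into x_syn
    g_syn = torch.autograd.grad(F.cross_entropy(net(x_syn), y_syn), params,
                                create_graph=True)
    dist = 0.0
    for gr, gs in zip(g_real, g_syn):
        gr, gs = gr.flatten(), gs.flatten()
        dist = dist + (1.0 - torch.dot(gr, gs) / (gr.norm() * gs.norm() + 1e-8))
    return dist
```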
In the second work, we further improve the data-efficiency of training neural networks with synthetic data by enabling effective data augmentation. Specifically, we propose Differentiable Siamese Augmentation, which learns synthetic data that can be used more effectively with data augmentation and thus yields better performance when networks are trained with augmentation. Experiments verify that the proposed method obtains substantial gains over the state of the art.
While training deep models on the small set of condensed images can be extremely fast, synthesizing those images remains computationally expensive due to the complex bi-level optimization. Finally, we propose a simple yet effective method that synthesizes condensed images by matching the feature distributions of the synthetic and original training images when embedded by randomly sampled deep networks. Thanks to its efficiency, we apply our method to more realistic and larger datasets with sophisticated neural architectures and obtain a significant performance boost.
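The distribution-matching step can be sketched even more simply: embed both batches with a randomly initialised, frozen network and match their mean features, with no inner optimisation loop. The snippet below is an assumed simplification of that idea, not the thesis's exact loss.

```python
import torch

def distribution_matching_loss(embed_net, x_real, x_syn):
    """Match mean feature embeddings of real and synthetic batches under a
    frozen, randomly initialised embedding network (no bi-level optimisation)."""
    with torch.no_grad():
        mu_real = embed_net(x_real).mean(dim=0)
    mu_syn = embed_net(x_syn).mean(dim=0)
    return ((mu_real - mu_syn) ** 2).sum()
```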
In summary, this manuscript presents several important contributions that improve data efficiency of training deep neural networks by condensing large datasets into significantly smaller synthetic ones. The innovations focus on principled methods based on gradient matching, higher data-efficiency with differentiable Siamese augmentation, and extremely simple and fast distribution matching without bilevel optimization. The proposed methods are evaluated on popular image classification datasets, namely MNIST, FashionMNIST, SVHN, CIFAR10/100 and TinyImageNet. The code is available at https://github.com/VICO-UoE/DatasetCondensation
Towards Open Temporal Graph Neural Networks
Graph neural networks (GNNs) for temporal graphs have recently attracted increasing attention, where a common assumption is that the class set of nodes is closed. In real-world scenarios, however, the class set often grows dynamically over time, giving rise to an open-set problem. This brings two major challenges for existing dynamic GNN methods: (i) How to propagate appropriate information in an open temporal graph, where new-class nodes are often linked to old-class nodes? This creates a sharp tension: typical GNNs tend to make the embeddings of connected nodes similar, whereas we want the embeddings of these interacting nodes to remain distinguishable because they belong to different classes. (ii) How to avoid catastrophic forgetting of old classes when learning the new classes that appear in temporal graphs? In this paper, we propose a general and principled learning approach for open temporal graphs, called OTGNet, that addresses both challenges. We assume the knowledge of a node can be disentangled into class-relevant and class-agnostic components, and accordingly explore a new message passing mechanism that extends the information bottleneck principle to propagate only class-agnostic knowledge between nodes of different classes, avoiding the aggregation of conflicting information. Moreover, we devise a strategy to select both important and diverse triad sub-graph structures for effective class-incremental learning. Extensive experiments on three real-world datasets from different domains demonstrate the superiority of our method compared to the baselines.
Comment: ICLR 2023 Oral
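To illustrate the disentangling idea schematically, the toy PyTorch layer below splits each node embedding into a class-relevant and a class-agnostic half and propagates only the class-agnostic half along edges whose endpoints have different labels; it is an assumed simplification for intuition, not the OTGNet architecture.

```python
import torch
import torch.nn as nn

class ClassAgnosticPropagation(nn.Module):
    """Toy message-passing layer: cross-class edges carry only the
    class-agnostic half of the source embedding."""

    def __init__(self, dim):
        super().__init__()
        assert dim % 2 == 0
        self.msg_same = nn.Linear(dim, dim)       # messages on same-class edges
        self.msg_diff = nn.Linear(dim // 2, dim)  # messages on cross-class edges

    def forward(self, h, edge_index, labels):
        # h: (num_nodes, dim), edge_index: (2, num_edges), labels: (num_nodes,)
        src, dst = edge_index
        _, agnostic = h.chunk(2, dim=-1)
        same = (labels[src] == labels[dst]).unsqueeze(-1).float()
        msgs = same * self.msg_same(h[src]) + (1 - same) * self.msg_diff(agnostic[src])
        out = torch.zeros_like(h)
        out.index_add_(0, dst, msgs)
        return h + out

# small illustrative call with random features and made-up labels
layer = ClassAgnosticPropagation(dim=8)
h_new = layer(torch.randn(5, 8),
              torch.tensor([[0, 1, 2], [1, 2, 3]]),
              torch.tensor([0, 0, 1, 1, 2]))
```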