Multidisciplinary perspectives on Artificial Intelligence and the law
This open access book presents an interdisciplinary, multi-authored, edited collection of chapters on Artificial Intelligence ("AI") and the Law. AI technology has come to play a central role in the modern data economy. Through a combination of increased computing power, the growing availability of data and the advancement of algorithms, AI has now become an umbrella term for some of the most transformational technological breakthroughs of this age. The importance of AI stems from both the opportunities that it offers and the challenges that it entails. While AI applications hold the promise of economic growth and efficiency gains, they also create significant risks and uncertainty. The potential and perils of AI have thus come to dominate modern discussions of technology and ethics, and although AI was initially allowed to largely develop without guidelines or rules, few would deny that the law is set to play a fundamental role in shaping the future of AI. As the debate over AI is far from over, the need for rigorous analysis has never been greater. This book thus brings together contributors from different fields and backgrounds to explore how the law might provide answers to some of the most pressing questions raised by AI. An outcome of the Católica Research Centre for the Future of Law and its interdisciplinary working group on Law and Artificial Intelligence, it includes contributions by leading scholars in the fields of technology, ethics and the law.
LIPIcs, Volume 251, ITCS 2023, Complete Volume
EmbedDistill: A Geometric Knowledge Distillation for Information Retrieval
Large neural models (such as Transformers) achieve state-of-the-art
performance for information retrieval (IR). In this paper, we aim to improve
distillation methods that pave the way for the resource-efficient deployment of
such models in practice. Inspired by our theoretical analysis of the
teacher-student generalization gap for IR models, we propose a novel
distillation approach that leverages the relative geometry among queries and
documents learned by the large teacher model. Unlike existing teacher
score-based distillation methods, our proposed approach employs embedding
matching tasks to provide a stronger signal to align the representations of the
teacher and student models. In addition, it utilizes query generation to
explore the data manifold to reduce the discrepancies between the student and
the teacher where training data is sparse. Furthermore, our analysis also
motivates novel asymmetric architectures for student models which realize
better embedding alignment without increasing online inference cost. On
standard benchmarks like MSMARCO, we show that our approach successfully
distills from both dual-encoder (DE) and cross-encoder (CE) teacher models to
1/10th size asymmetric students that can retain 95-97% of the teacher
performance.
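The contrast between conventional score-based distillation and the embedding-matching idea described above can be sketched in a few lines. This is an illustrative toy on plain-Python lists with names of our own choosing, not the paper's implementation:

```python
def score_distill_loss(t_scores, s_scores):
    # Conventional score-based distillation: match teacher and student
    # query-document scores directly (mean squared error).
    return sum((t - s) ** 2 for t, s in zip(t_scores, s_scores)) / len(t_scores)

def embed_distill_loss(t_embs, s_embs):
    # Embedding-matching distillation: align each student embedding with
    # the corresponding teacher embedding, which also preserves the
    # relative geometry among query and document representations.
    total = 0.0
    for t, s in zip(t_embs, s_embs):
        total += sum((a - b) ** 2 for a, b in zip(t, s))
    return total / len(t_embs)
```

The embedding objective gives a per-dimension alignment signal even when two students would produce identical scores, which is the "stronger signal" the abstract alludes to.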
Exploring the socioeconomic and environmental factors influencing smallholder macadamia production and productivity in Malawi.
Macadamia (Macadamia integrifolia Maiden & Betche) is a highly valued crop in Malawi. The crop is a vital source of food security and ecosystem services, and its high-export cash value makes it a key contributor to the country's economy. Malawi ranks seventh in global macadamia production, comprising two subsectors: smallholders and commercial estates. However, significant yield gaps have been reported between smallholder and commercial estate producers. While commercial estates achieve higher average annual tree yields (30 kg tree⁻¹ year⁻¹), smallholder yields remain consistently low, averaging 10 kg tree⁻¹ year⁻¹ or below. Improving macadamia productivity among smallholders can help reduce poverty, improve household food security, and promote economic growth in Malawi.
Despite the significant contributions of smallholders in the Malawian macadamia subsector, research on the factors influencing the crop's productivity has primarily focused on commercial estate production. To address this knowledge gap, this Ph.D. thesis focuses on smallholder macadamia production in Malawi. Firstly, the thesis examines the socioeconomic characteristics of smallholder macadamia farmers, including demographics, cultivar preferences, and production constraints. Secondly, it evaluates the climatic factors influencing smallholder macadamia production and predicts the current and future suitable geographical areas for the crop. Lastly, it assesses the soil fertility status of smallholder macadamia farms in relation to macadamia production requirements.
Results of this study reveal that the majority (62%) of macadamia smallholders are over 50 years of age and consider farming their main occupation. However, this poses significant risks to the macadamia subsector, as older farmers are risk-averse and less innovative, hindering their willingness to adopt new agricultural technologies and ability to learn. Regarding cultivar preferences, the study finds that smallholder macadamia farmers prefer high-yielding cultivars with superior nut qualities, such as large and heavy nuts, and extended flowering periods. The most preferred macadamia cultivars in descending order are Hawaiian Agricultural Experimental Station (HAES) 660, 800, 816, and 246, which are the "core" of established cultivars in Malawi. The study identifies insect pests, diseases, market availability, strong winds, and a lack of agricultural extension services as the most significant challenges affecting smallholder macadamia farmers.
The study's suitability analysis reveals that the ensemble model has an excellent fit and high performance in predicting the current agro-climatically suitable areas for macadamia production (AUC = 0.90). The findings show that precipitation-related variables (60.2%) are more important in determining the suitable areas for growing macadamia than temperature-related variables (39.8%). The model results show that 57% (53,925 km²) of Malawi is currently suitable for macadamia cultivation, with the central region having the highest suitability (25.8%, 24,327 km²) and the southern region the lowest (10.7%, 10,257 km²). Optimal suitability (26%, 24,565 km²) is observed in the highland areas with elevations ranging from 1000–1400 metres above sea level (m.a.s.l.). Under the intermediate emission scenario (RCP 4.5) and the pessimistic scenario (RCP 8.5), the impact models predict net losses of 18% (17,015 km²) and 21.6% (20,414 km²), respectively, in the extent of suitable areas for macadamia in the 2050s.
The results of the soil fertility analysis indicate suboptimal fertility among the sampled macadamia farms. The majority of the soils are strongly acidic and deficient in essential nutrients required for the healthy growth of macadamia trees. Moreover, the average cation exchange capacity (1.67 cmol(+) kg⁻¹) and the soil organic matter content (≤ 1%) are below the minimum optimal levels required for macadamia trees. These findings indicate that soil fertility is one of the primary limiting factors to the crop's productivity, even in areas with suitable climatic conditions. Therefore, addressing the soil fertility issues is crucial to improving the land suitability of the smallholder farms for macadamia, which can lead to optimal yields.
This study extends the frontiers of knowledge concerning the macadamia subsector in Malawi by providing insights into the smallholder macadamia farming systems, including demographics, cultivar preferences, and production constraints. It also provides novel empirical evidence on the climate factors that influence the suitability of rainfed macadamia cultivation and identifies current and future suitable growing areas in the country. Additionally, the study addresses the research gap on the soil fertility status of Malawian smallholder macadamia farms. Therefore, the findings of this research have practical implications for various areas such as macadamia cultivar introductions and breeding, land use planning, soil fertility management, and policy formulation for agricultural extension services, inputs, and marketing of the crop.
Promoting Generalization for Exact Solvers via Adversarial Instance Augmentation
Machine learning has been successfully applied to improve the efficiency of
Mixed-Integer Linear Programming (MILP) solvers. However, the learning-based
solvers often suffer from severe performance degradation on unseen MILP
instances -- especially on large-scale instances from a perturbed environment
-- due to the limited diversity of training distributions. To tackle this
problem, we propose a novel approach, Adversarial Instance Augmentation
(AdaSolver), which does not require knowledge of the problem type for new
instance generation and promotes data diversity for learning-based branching
modules in branch-and-bound (B&B) solvers. We use the bipartite graph
representations for MILP instances and obtain various perturbed instances to
regularize the solver by augmenting the graph structures with a learned
augmentation policy. The major technical contribution of AdaSolver is that we
formulate the non-differentiable instance augmentation as a contextual bandit
problem and adversarially train the learning-based solver and augmentation
policy, enabling efficient gradient-based training of the augmentation policy.
To the best of our knowledge, AdaSolver is the first general and effective
framework for understanding and improving the generalization of both
imitation-learning-based (IL-based) and reinforcement-learning-based (RL-based)
B&B solvers. Extensive experiments demonstrate that by producing various
augmented instances, AdaSolver leads to a remarkable efficiency improvement
across various distributions.
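The contextual-bandit view of instance augmentation can be illustrated with a toy epsilon-greedy bandit over augmentation actions. Everything here is a hypothetical stand-in for AdaSolver's learned policy: the action names, the scalar reward (which in the paper would be the solver's performance degradation on the augmented instance), and the epsilon-greedy rule itself:

```python
import random

class AugmentationBandit:
    """Toy bandit over graph-augmentation actions (illustrative only).

    The reward is the simulated performance drop of a learned solver on
    the augmented instance, so the bandit adversarially favours the
    augmentations the solver currently handles worst.
    """

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = actions
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.value = {a: 0.0 for a in actions}   # running mean reward
        self.count = {a: 0 for a in actions}

    def select(self):
        # Explore with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.value[a])

    def update(self, action, reward):
        # Incremental mean update of the action-value estimate.
        self.count[action] += 1
        self.value[action] += (reward - self.value[action]) / self.count[action]
```

In AdaSolver the analogous policy is trained with gradients against the solver, but the bandit structure (choose augmentation, observe solver degradation, update) is the same.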
LIPIcs, Volume 261, ICALP 2023, Complete Volume
Regret-Optimal Federated Transfer Learning for Kernel Regression with Applications in American Option Pricing
We propose an optimal iterative scheme for federated transfer learning, where
a central planner has access to several datasets for the same learning model.
Our objective is to minimize the cumulative deviation of the generated
parameters, across all iterations, from the specialized parameters obtained
for each dataset, while respecting the loss function for the model produced
by the algorithm upon halting. We only allow continual communication between
each of the specialized models (nodes/agents) and the central planner
(server) at each iteration (round). For the case where the model is a
finite-rank kernel regression, we derive explicit updates for the
regret-optimal algorithm. By leveraging symmetries within the regret-optimal
algorithm, we further develop a nearly regret-optimal heuristic that runs
with fewer elementary operations in terms of the dimension of the parameter
space. Additionally, we investigate the adversarial robustness of the
regret-optimal algorithm, showing that an adversary which perturbs training
pairs by a bounded amount across all training sets cannot reduce the
regret-optimal algorithm's regret by more than an amount depending on the
aggregate number of training pairs. To validate our theoretical findings, we
conduct numerical experiments in the context of American option pricing,
utilizing a randomly generated finite-rank kernel.
Comment: 54 pages, 3 figures
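For a finite-rank kernel k(x, x') = φ(x)·φ(x'), kernel ridge regression collapses to a closed-form linear solve in the feature space, which is what makes explicit parameter updates tractable in settings like the one above. A minimal plain-Python sketch (the helper names are ours, not the authors'):

```python
def solve(A, b):
    # Gaussian elimination with partial pivoting for small dense systems.
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_finite_rank(phi, X, y, lam):
    # Ridge regression in the feature space of a finite-rank kernel:
    # w = (Phi^T Phi + lam * I)^{-1} Phi^T y, with Phi the feature matrix.
    feats = [phi(x) for x in X]
    p = len(feats[0])
    A = [[lam * (i == j) + sum(f[i] * f[j] for f in feats) for j in range(p)]
         for i in range(p)]
    b = [sum(f[i] * yi for f, yi in zip(feats, y)) for i in range(p)]
    return solve(A, b)
```

Because the parameter vector w lives in the finite-dimensional feature space, each federated round only needs to exchange and update this p-dimensional object.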
Meta-learning to optimise: loss functions and update rules
Meta-learning, aka “learning to learn”, aims to extract invariant meta-knowledge from a
group of tasks in order to improve the generalisation of the base models in the novel
tasks. The learned meta-knowledge takes various forms, such as neural architecture,
network initialization, loss function and optimisers. In this thesis, we study learning to
optimise through meta-learning, with two main components: loss function learning and
optimiser learning. At a high level, these two components interact closely:
optimisers provide update rules to modify the model parameters through the gradient
information generated from the loss function. We work on the meta-model's re-usability
across tasks. In the ideal case, the learned meta-model should provide a “plug-and-play”
drop-in which can be used without further modification or computational expense with
any new dataset or even new model architecture. We apply these ideas to address three
challenges in machine learning, namely improving the convergence rate of optimisers,
learning with noisy labels, and learning models that are robust to domain shift.
We first study how to meta-learn loss functions. Unlike most prior work parameterising
a loss function in a black-box fashion with neural networks, we meta-learn a Taylor
polynomial loss and apply it to improve the robustness of the base model to label
noise in the training data. The good performance of deep neural networks relies on
gold-standard labelled data. However, in practice, wrongly labelled data is common due
to human error and imperfect automatic annotation processes. We draw inspiration
from hand-designed losses that modify the training dynamic to reduce the impact of
noisy labels. Going beyond existing hand-designed robust losses, we develop a bi-level
optimisation meta-learner Automated Robust Loss (ARL) that discovers novel robust
losses that outperform the best prior hand-designed robust losses.
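The idea of a Taylor-polynomial loss can be illustrated by truncating the Taylor expansion of cross-entropy around p = 1: the truncated polynomial stays bounded as p → 0, which damps the gradient contributed by confidently mislabelled examples. The coefficients below are the fixed cross-entropy ones; ARL would instead meta-learn them against noisy labels (a sketch under our own naming, not the thesis code):

```python
import math

def cross_entropy(p):
    # Standard cross-entropy on the true-class probability p.
    return -math.log(p)

def taylor_loss(p, coeffs):
    # Polynomial loss in u = (1 - p). With coeffs [1, 1/2, 1/3, ...]
    # this recovers the Taylor series of -log(p) around p = 1:
    #   -log(p) = sum_{k>=1} (1 - p)^k / k.
    # Truncating (or re-weighting) the coefficients yields a bounded,
    # noise-robust loss; meta-learning tunes the coefficients.
    u = 1.0 - p
    return sum(c * u ** (k + 1) for k, c in enumerate(coeffs))
```

A short truncation such as coeffs = [1.0, 0.5] keeps the loss finite even at p near 0, unlike cross-entropy, which diverges.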
A second contribution, ITL, extends the loss function learning idea to the problem of
Domain Generalisation (DG). DG is the challenging scenario of deploying a model
trained on one data distribution to a novel data distribution. Compared to ARL where
the target loss function is optimised by a genetic-based algorithm, ITL benefits from
gradient-based optimisation of loss parameters. By leveraging the mathematical guarantee
from the Implicit Function Theorem, the hypergradient required to update the loss
can be efficiently computed without differentiating through the whole base model training
trajectory. This reduces the computational cost dramatically in the meta-learning
stage and accelerates the loss function learning process by providing a more accurate
hypergradient. Applying our learned loss to the DG problem, we are able to learn base
models that exhibit increased robustness to domain shift compared to the state-of-the-art.
Importantly, the modular plug-and-play nature of our learned loss means that it
is simple to use, requiring just a few lines of code change to standard Empirical Risk
Minimisation (ERM) learners.
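The Implicit Function Theorem hypergradient is especially transparent in the scalar case. In this toy sketch (names are ours), the inner training loss is L_tr(θ, λ) = (θ - λ)², so the inner optimum is θ*(λ) = λ, and the outer validation loss is L_val(θ) = θ², whose analytic hypergradient is d(λ²)/dλ = 2λ:

```python
def ift_hypergradient(d2_tr_dtheta2, d2_tr_dtheta_dlam, dval_dtheta):
    # Scalar Implicit Function Theorem hypergradient:
    #   dL_val/dlam = -(d2L_tr/dtheta dlam) / (d2L_tr/dtheta2) * dL_val/dtheta,
    # evaluated at the inner optimum theta*(lam). No differentiation
    # through the inner training trajectory is needed.
    return -(d2_tr_dtheta_dlam / d2_tr_dtheta2) * dval_dtheta
```

For L_tr = (θ - λ)² the second derivatives are 2 and -2, and at θ* = λ the outer gradient is 2λ, so the formula returns -(-2/2)·2λ = 2λ, matching the analytic answer. This trajectory-free computation is the source of the cost savings described above.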
We finally study accelerating the optimisation process itself by designing a meta-learning
algorithm that searches for efficient optimisers, which is termed MetaMD. We
tackle this problem by meta-learning Mirror Descent-based optimisers through learning
the strongly convex function parameterising a Bregman divergence. While standard
meta-learners require a validation set to define a meta-objective for learning, MetaMD
instead optimises the convergence rate bound. The resulting learned optimiser uniquely
has mathematically guaranteed convergence and generalisation properties.
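The mirror descent step itself is compact: map the iterate through the gradient of a strongly convex mirror map ψ, take a gradient step in the dual space, and map back. With ψ fixed to negative entropy the update becomes exponentiated gradient on the probability simplex, sketched below; MetaMD's contribution, not shown here, is meta-learning ψ (i.e., the Bregman divergence) rather than fixing it:

```python
import math

def mirror_descent_step(x, grad, lr):
    # Mirror descent with the negative-entropy mirror map
    # psi(x) = sum_i x_i * log(x_i), i.e., exponentiated gradient
    # on the simplex: multiply by exp(-lr * grad), then renormalise.
    z = [xi * math.exp(-lr * g) for xi, g in zip(x, grad)]
    s = sum(z)
    return [zi / s for zi in z]
```

On a linear loss c·x over the simplex, repeated steps concentrate all mass on the coordinate with the smallest cost, exactly as the convergence bound for this mirror map predicts.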
Intelligent Data Analysis for Energy Management
Predictive data analysis has been identified as essential to support intelligent energy management for better energy sustainability and efficiency. Previous studies have shown that predicted energy information can benefit consumers economically by optimising energy usage, while assisting energy suppliers in efficiently planning power distribution and implementing demand-response (DR) energy management. Recent advances in the Internet of Things (IoT) and Information and Communication Technologies (ICT) simplify the collection of desired energy data streams for further informatics analysis. With such energy data, machine learning (ML) can effectively infer future knowledge associated with online energy resource scheduling, e.g., renewable energy generation, load demands and electricity prices. Although some early efforts have been dedicated to incorporating ML into energy management, computation resource limitations and data scarcity are two pressing challenges for on-site predictive energy analysis. Due to privacy concerns, users prefer on-premise model establishment instead of placing the training task in the cloud and sharing sensitive energy data. However, most ML algorithms rely heavily on substantial computational resources and vast amounts of labelled data to succeed, requirements that users are often unable to fulfil in real-world scenarios. To this end, this thesis takes different perspectives to propose several affordable solutions for performing on-demand intelligent data analysis on local resource-constrained devices. In addition, three algorithm-specific training frameworks have been developed to address data shortage by leveraging easily obtainable but extensive data sources based on transfer learning and federated learning. We implement our design under practical settings for photovoltaic (PV) power prediction and non-intrusive load monitoring (NILM) as case studies to fully evaluate their performance.
Sample-Specific Debiasing for Better Image-Text Models
Self-supervised representation learning on image-text data facilitates
crucial medical applications, such as image classification, visual grounding,
and cross-modal retrieval. One common approach involves contrasting
semantically similar (positive) and dissimilar (negative) pairs of data points.
Drawing negative samples uniformly from the training data set introduces false
negatives, i.e., samples that are treated as dissimilar but belong to the same
class. In healthcare data, the underlying class distribution is nonuniform,
implying that false negatives occur at a highly variable rate. To improve the
quality of learned representations, we develop a novel approach that corrects
for false negatives. Our method can be viewed as a variant of debiased
contrastive learning that uses estimated sample-specific class probabilities.
We provide theoretical analysis of the objective function and demonstrate the
proposed approach on both image and paired image-text data sets. Our
experiments demonstrate empirical advantages of sample-specific debiasing.
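The false-negative correction the abstract describes can be sketched in the style of debiased contrastive learning: the naive average over sampled "negatives" is corrected by tau_plus, the probability that a drawn negative actually shares the anchor's class, and making tau_plus sample-specific (estimated per anchor) is the variant discussed here. All names and constants below are illustrative, not the authors' code:

```python
import math

def debiased_contrastive_loss(pos_sim, neg_sims, tau_plus, t=0.5):
    # pos_sim:  similarity of the anchor to its positive pair
    # neg_sims: similarities to sampled "negatives"
    # tau_plus: estimated probability that a sampled negative is a
    #           false negative -- sample-specific in this setting
    pos_exp = math.exp(pos_sim / t)
    neg_exps = [math.exp(s / t) for s in neg_sims]
    n = len(neg_sims)
    naive = sum(neg_exps) / n
    # Debias the negative term for false negatives, clamping from
    # below (debiased contrastive learning uses e^{-1/t} as the floor).
    g = max((naive - tau_plus * pos_exp) / (1.0 - tau_plus),
            math.exp(-1.0 / t))
    return -math.log(pos_exp / (pos_exp + n * g))
```

Setting tau_plus = 0 recovers the standard contrastive objective; a larger tau_plus discounts the negative term, reflecting that some "negatives" are really positives under a nonuniform class distribution.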