Fairness-aware Machine Learning in Educational Data Mining
Fairness is an essential requirement of every educational system, which is reflected in a variety of educational activities. With the extensive use of Artificial Intelligence (AI) and Machine Learning (ML) techniques in education, researchers and educators can analyze educational (big) data and propose new (technical) methods in order to support teachers, students, or administrators of (online) learning systems in the organization of teaching and learning. Educational data mining (EDM) applies and develops data mining (DM) and ML techniques to deal with educational problems, such as student performance prediction and student grouping. However, ML-based decisions in education can be based on protected attributes, such as race or gender, leading to discrimination against individual students or subgroups of students. Therefore, ensuring fairness in ML models also contributes to equity in educational systems. In addition, bias can also appear in the data obtained from learning environments. Hence, bias-aware exploratory educational data analysis is important to support unbiased decision-making in EDM.
In this thesis, we address the aforementioned issues and propose methods that mitigate discriminatory outcomes of ML algorithms in EDM tasks. Specifically, we make the following contributions:
We perform bias-aware exploratory analysis of educational datasets using Bayesian networks to identify the relationships among attributes in order to understand bias in the datasets. We focus the exploratory data analysis on features having a direct or indirect relationship with the protected attributes w.r.t. prediction outcomes.
We perform a comprehensive evaluation of the sufficiency of various group fairness measures in predictive models for student performance prediction problems. A variety of experiments on various educational datasets with different fairness measures are performed to provide users with a broad view of unfairness from diverse aspects.
We deal with the student grouping problem in collaborative learning. We introduce the fair-capacitated clustering problem that takes into account cluster fairness and cluster cardinalities. We propose two approaches, namely hierarchical clustering and partitioning-based clustering, to obtain fair-capacitated clustering.
We introduce the multi-fair capacitated (MFC) students-topics grouping problem that satisfies students' preferences while ensuring balanced group cardinalities and maximizing the diversity of members regarding the protected attribute. We propose three approaches: a greedy heuristic approach, a knapsack-based approach using vanilla maximal 0-1 knapsack formulation, and an MFC knapsack approach based on group fairness knapsack formulation.
In short, the findings described in this thesis demonstrate the importance of fairness-aware ML in educational settings. We show that bias-aware data analysis, fairness measures, and fairness-aware ML models are essential aspects to ensure fairness in EDM and the educational environment.
Ministry of Science and Culture of Lower Saxony/LernMINT/51410078/E
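As a rough illustration of what a group fairness measure computes, the sketch below evaluates statistical parity difference, one commonly used measure. It is an assumption that this measure is among those evaluated in the thesis; the function name, the sign convention (non-protected rate minus protected rate), and the toy data are hypothetical.

```python
from typing import Sequence


def statistical_parity_difference(y_pred: Sequence[int], protected: Sequence[int]) -> float:
    """Return P(pred = 1 | non-protected) - P(pred = 1 | protected).

    y_pred holds binary predictions (1 = positive outcome, e.g. "pass").
    protected holds group membership (1 = protected group, 0 = otherwise).
    The sign convention is an assumption; some authors use the reverse.
    """
    if len(y_pred) != len(protected):
        raise ValueError("y_pred and protected must have the same length")
    prot = [p for p, g in zip(y_pred, protected) if g == 1]
    non_prot = [p for p, g in zip(y_pred, protected) if g == 0]
    if not prot or not non_prot:
        raise ValueError("both groups must contain at least one sample")
    return sum(non_prot) / len(non_prot) - sum(prot) / len(prot)


# Toy data (hypothetical): three non-protected and three protected students.
spd = statistical_parity_difference([1, 0, 1, 1, 0, 0], [0, 0, 0, 1, 1, 1])
```

A value of 0 means both groups receive positive predictions at the same rate; under this convention, a positive value means the protected group receives them less often.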
LIPIcs, Volume 251, ITCS 2023, Complete Volume
Using hydrological models and digital soil mapping for the assessment and management of catchments: A case study of the Nyangores and Ruiru catchments in Kenya (East Africa)
Human activities on land have a direct and cumulative impact on water and other natural resources within a catchment. This land-use change can have hydrological consequences on the local and regional scales. Sound catchment assessment is not only critical to understanding processes and functions but also important in identifying priority management areas. The overarching goal of this doctoral thesis was to design a methodological framework for catchment assessment (dependent upon data availability) and propose practical catchment management strategies for sustainable water resources management. The Nyangores and Ruiru reservoir catchments located in Kenya, East Africa were used as case studies. A properly calibrated Soil and Water Assessment Tool (SWAT) hydrologic model coupled with a generic land-use optimization tool (Constrained Multi-Objective Optimization of Land-use Allocation-CoMOLA) was applied to identify and quantify functional trade-offs between environmental sustainability and food production in the ‘data-available’ Nyangores catchment. This was determined using a four-dimension objective function defined as (i) minimizing sediment load, (ii) maximizing stream low flow and (iii and iv) maximizing the crop yields of maize and soybeans, respectively.
Additionally, three different optimization scenarios, represented as (i) agroforestry (Scenario 1), (ii) agroforestry + conservation agriculture (Scenario 2) and (iii) conservation agriculture (Scenario 3), were compared. For the data-scarce Ruiru reservoir catchment, alternative methods using digital soil mapping of soil erosion proxies (aggregate stability using Mean Weight Diameter) and spatial-temporal soil loss analysis using empirical models (the Revised Universal Soil Loss Equation-RUSLE) were used. The lack of adequate data necessitated a data-collection phase which used conditional Latin Hypercube Sampling. This sampling technique reduced the need for intensive soil sampling while still capturing spatial variability. The results revealed that for the Nyangores catchment, adoption of both agroforestry and conservation agriculture (Scenario 2) led to the smallest trade-off among the different objectives, i.e., a 3.6% change in forests combined with a 35% change in conservation agriculture resulted in the largest reduction in sediment loads (78%), increased low flow (+14%) and only slightly decreased crop yields (3.8% for both maize and soybeans). Therefore, the advanced use of hydrologic models with optimization tools allows for the simultaneous assessment of different outputs/objectives and is ideal for areas with adequate data to properly calibrate the model. For the Ruiru reservoir catchment, digital soil mapping (DSM) of aggregate stability revealed that susceptibility to erosion exists for cropland (food crops), tea and roadsides, which are mainly located in the eastern part of the catchment, as well as deforested areas on the western side. This confirmed that with limited soil samples and the use of computing power, machine learning and freely available covariates, DSM can effectively be applied in data-scarce areas. Moreover, uncertainty in the predictions can be incorporated using prediction intervals.
The spatial-temporal analysis showed that bare land (which has the lowest areal proportion) was the largest contributor to erosion. Two peak soil loss periods corresponding to the two rainy periods of March–May and October–December were identified. Thus, yearly soil erosion risk maps misrepresent the true dimensions of soil loss, with averages disguising areas of low and high potential. Also, a small portion of the catchment can be responsible for a large proportion of the total erosion. For both catchments, agroforestry (combining both the use of trees and conservation farming) is the most feasible catchment management strategy (CMS) for solving the major water quantity and quality problems. Finally, thriving catchments that aim at both sustainability and resilience require urgent collaborative action by all stakeholders. The necessary stakeholders in both the Nyangores and Ruiru reservoir catchments must be involved in catchment assessment in order to identify the catchment problems, mitigation strategies/roles and responsibilities, while keeping in mind that some risks need to be shared and negotiated, as will the benefits.
TABLE OF CONTENTS
DECLARATION OF CONFORMITY
DECLARATION OF INDEPENDENT WORK AND CONSENT
LIST OF PAPERS
ACKNOWLEDGEMENTS
THESIS AT A GLANCE
SUMMARY
List of Figures
List of Tables
ABBREVIATION
PART A: SYNTHESIS
1. INTRODUCTION
1.1 Catchment management
1.2 Tools to support catchment assessment and management
1.3 Catchment management strategies (CMSs)
1.4 Concept and research objectives
2. MATERIAL AND METHODS
2.1. STUDY AREA
2.1.1. Nyangores catchment
2.1.2. Ruiru reservoir catchment
2.2. Using SWAT conceptual model and land-use optimization
2.3. Using soil erosion proxies and empirical models
3. RESULTS AND DISCUSSION
3.1. Assessing multi-metric calibration performance using the SWAT model
3.2. Land-use optimization using SWAT-CoMOLA for the Nyangores catchment
3.3. Digital soil mapping of soil aggregate stability
3.4. Spatio-temporal analysis using the revised universal soil loss equation (RUSLE)
4. CRITICAL ASSESSMENT OF THE METHODS USED
4.1. Assessing suitability of data for modelling and overcoming data challenges
4.2. Selecting catchment management strategies based on catchment assessment
5. CONCLUSION AND RECOMMENDATIONS
6. REFERENCES
PART B: PAPERS
PAPER I
PAPER II
PAPER III
PAPER IV
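The point in the abstract above that yearly soil erosion maps hide seasonal peaks can be illustrated with the standard RUSLE product A = R × K × LS × C × P. The sketch below is not the thesis workflow. All factor values are hypothetical, and it assumes that only the rainfall erosivity R varies by month while K, LS, C and P stay constant, which is a simplification.

```python
def rusle_soil_loss(r: float, k: float, ls: float, c: float, p: float) -> float:
    """Soil loss A = R * K * LS * C * P for one grid cell and one period."""
    return r * k * ls * c * p


# Hypothetical factor values for a single grid cell (not taken from the thesis).
K, LS, C, P = 0.25, 1.8, 0.30, 1.0

# Hypothetical monthly rainfall erosivity with two rainy seasons
# (March-May and October-December), January first.
monthly_r = [20, 25, 90, 160, 120, 30, 15, 15, 25, 100, 130, 70]

# Monthly soil loss, assuming only R changes from month to month.
monthly_loss = [rusle_soil_loss(r, K, LS, C, P) for r in monthly_r]

# Annual soil loss from the summed erosivity; equals the sum of monthly losses.
annual_loss = rusle_soil_loss(sum(monthly_r), K, LS, C, P)

# Month with the highest loss (1 = January).
peak_month = monthly_loss.index(max(monthly_loss)) + 1
```

Dividing `annual_loss` by 12 would give a flat monthly average, whereas `monthly_loss` shows that a few months carry most of the total.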
Distributed energy efficient channel allocation in underlay multicast D2D communications
In this paper, we address the optimization of the energy efficiency of underlay multicast device-to-device (D2MD) communications on cellular networks. In particular, we maximize the energy efficiency of both the global network and the individual users considering various fairness factors such as maximum power and minimum rate constraints. For this, we employ a canonical mixed-integer non-linear formulation of the joint power control and resource allocation problem. To cope with its NP-hard nature, we propose a two-stage semi-distributed solution. In the first stage, we find a stable, yet sub-optimal, channel allocation for D2MD groups using a cooperative coalitional game framework that allows co-channel transmission over a set of shared resource blocks and/or transmission over several different channels per D2MD group. In the second stage, a central entity determines the optimal transmission power for each user in the system via fractional programming. We performed extensive simulations to analyze the resulting energy efficiency and attainable transmission rates. The results show that the performance of our semi-distributed approach is very close to that obtained with a pure optimal centralized one.
Ministerio de Ciencia, Innovación y Universidades | Ref. GO2EDGERED2018-102563-T
Agencia Estatal de Investigación | Ref. TEC2017-85587-R
Agencia Estatal de Investigación | Ref. RED2018-102563-
Learning in Repeated Multi-Unit Pay-As-Bid Auctions
Motivated by Carbon Emissions Trading Schemes, Treasury Auctions, and
Procurement Auctions, which all involve the auctioning of homogeneous multiple
units, we consider the problem of learning how to bid in repeated multi-unit
pay-as-bid auctions. In each of these auctions, a large number of (identical)
items are to be allocated to the largest submitted bids, where the price of
each of the winning bids is equal to the bid itself. The problem of learning
how to bid in pay-as-bid auctions is challenging due to the combinatorial
nature of the action space. We overcome this challenge by focusing on the
offline setting, where the bidder optimizes their vector of bids while only
having access to the past submitted bids by other bidders. We show that the
optimal solution to the offline problem can be obtained using a polynomial time
dynamic programming (DP) scheme. We leverage the structure of the DP scheme to
design online learning algorithms with polynomial time and space complexity
under full information and bandit feedback settings. We achieve an upper bound
on regret of and respectively, where is the number of units demanded by the
bidder, is the total number of auctions, and is the size of
the discretized bid space. We accompany these results with a regret lower
bound, which match the linear dependency in . Our numerical results suggest
that when all agents behave according to our proposed no regret learning
algorithms, the resulting market dynamics mainly converge to a welfare
maximizing equilibrium where bidders submit uniform bids. Lastly, our
experiments demonstrate that the pay-as-bid auction consistently generates
significantly higher revenue compared to its popular alternative, the uniform
price auction.
Comment: 51 pages, 12 figures
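The allocation rule described in the abstract above (units go to the largest submitted bids, and each winning bid pays its own amount) can be sketched as follows. This is a minimal illustration of a single pay-as-bid round, not the paper's learning algorithm. The function name, the data layout, and the tie-breaking rule (alphabetical by bidder name) are assumptions.

```python
from typing import Dict, List, Tuple


def pay_as_bid(bids: Dict[str, List[float]], num_units: int) -> Dict[str, Tuple[int, float]]:
    """Run one pay-as-bid round.

    bids maps each bidder to a vector of per-unit bids.
    Returns, per bidder, (units won, total payment).
    """
    # Flatten into (bid, bidder) pairs, highest bid first.
    # Ties are broken by bidder name, which is an assumption of this sketch.
    ranked = sorted(
        ((bid, name) for name, vector in bids.items() for bid in vector),
        key=lambda pair: (-pair[0], pair[1]),
    )
    result = {name: (0, 0.0) for name in bids}
    # The num_units highest bids win, and each winner pays exactly that bid.
    for bid, name in ranked[:num_units]:
        units, paid = result[name]
        result[name] = (units + 1, paid + bid)
    return result


# Hypothetical bids: four units are auctioned among three bidders.
outcome = pay_as_bid({"a": [9.0, 6.0, 2.0], "b": [8.0, 5.0], "c": [7.0]}, num_units=4)
```

Here the winning bids are 9, 8, 7 and 6, so bidder "a" wins two units and pays 15 in total, while "b" and "c" win one unit each.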
Spatial-temporal domain charging optimization and charging scenario iteration for EV
Environmental problems have become increasingly serious around the world. With lower carbon emissions, Electric Vehicles (EVs) have been utilized on a large scale over the past few years. However, EVs are limited by battery capacity and require frequent charging. Currently, EVs suffer from long charging times and charging congestion. Therefore, EV charging optimization is vital to ensure drivers’ mobility. This study first presents a literature analysis of the current taxonomy of charging modes to elucidate the advantages and disadvantages of different charging modes. For the plug-in charging mode, an Urgency First Charging (UFC) scheduling policy is proposed with collaborative optimization of the spatial-temporal domain. The UFC policy allows EVs with charging urgency to receive preemptive charging services. As the conventional plug-in charging mode is limited by the deployment of Charging Stations (CSs), this study further introduces and optimizes Vehicle-to-Vehicle (V2V) charging. This aims to maximize the utilization of charging infrastructure and to balance the grid load. The proposed reservation-based V2V charging scheme optimizes pair matching of EVs based on minimized distance. Meanwhile, this V2V scheme allows more EVs to get fully charged via parking lot allocation based on minimized waiting time. Constrained by its shortcomings (rigid location of CSs and slow charging power under V2V converters), a single charging mode can hardly meet a large number of parallel charging requests. Thus, this study further proposes a hybrid charging mode, which utilizes the advantages of the plug-in and V2V modes to alleviate the pressure on the grid. Finally, this study addresses the potential problems of EV charging with a view to further optimizing EV charging in subsequent studies.
Utilitarian Welfare Optimization in the Generalized Vertex Coloring Games: An Implication to Venue Selection in Events Planning
We consider a general class of multi-agent games in networks, namely the
generalized vertex coloring games (G-VCGs), inspired by real-life applications
of the venue selection problem in events planning. Under a given mechanism,
each agent receives a utility that depends on the current coloring assignment;
striving to maximize their own utility while restricted to local information,
agents self-organize when choosing another
color. Our focus is on maximizing some utilitarian-looking welfare objective
function concerning the cumulative utilities across the network in a
decentralized fashion. Firstly, we investigate a special class of the
G-VCGs, namely Identical Preference VCGs (IP-VCGs) which recovers the
rudimentary work by \cite{chaudhuri2008network}. We reveal its convergence even
under a completely greedy policy and completely synchronous settings, with a
stochastic bound on the converging rate provided. Secondly, regarding the
general G-VCGs, a greediness-preserving Metropolis-Hastings-based policy is
proposed for each agent to initiate with the limited information and its
optimality under asynchronous settings is proved using theories from the
regular perturbed Markov processes. The policy was also empirically witnessed
to be robust under independently synchronous settings. Thirdly, in the spirit
of "robust coloring", we include an expected loss term in our objective
function to balance between the utilities and robustness. An optimal coloring
for this robust welfare optimization would be derived through a second-stage
MH-policy driven algorithm. Simulation experiments are given to showcase the
efficiency of our proposed strategy.
Comment: 35 pages
The Complexity of Fair Division of Indivisible Items with Externalities
We study the computational complexity of fairly allocating a set of
indivisible items under externalities. In this recently-proposed setting, in
addition to the utility the agent gets from their bundle, they also receive
utility from items allocated to other agents. We focus on the extended
definitions of envy-freeness up to one item (EF1) and of envy-freeness up to
any item (EFX), and we provide the landscape of their complexity for several
different scenarios. We prove that it is NP-complete to decide whether there
exists an EFX allocation, even when there are only three agents, or even when
there are only six different values for the items. We complement these negative
results by showing that when both the number of agents and the number of
different values for items are bounded by a parameter the problem becomes
fixed-parameter tractable. Furthermore, we prove that two-valued and
binary-valued instances are equivalent and that EFX and EF1 allocations
coincide for this class of instances. Finally, motivated by real-life
scenarios, we focus on a class of structured valuation functions, which we term
agent/item-correlated. We prove their equivalence to the "standard" setting
without externalities. Therefore, all previous results for EF1 and EFX apply
immediately for these valuations.
Computing Pareto-Optimal and Almost Envy-Free Allocations of Indivisible Goods
We study the problem of fair and efficient allocation of a set of indivisible
goods to agents with additive valuations using the popular fairness notions of
envy-freeness up to one good (EF1) and equitability up to one good (EQ1) in
conjunction with Pareto-optimality (PO). There exists a pseudo-polynomial time
algorithm to compute an EF1+PO allocation and a non-constructive proof of the
existence of allocations that are both EF1 and fractionally Pareto-optimal
(fPO), which is a stronger notion than PO. We present a pseudo-polynomial time
algorithm to compute an EF1+fPO allocation, thereby improving the earlier
results. Our techniques also enable us to show that an EQ1+fPO allocation
always exists when the values are positive and that it can be computed in
pseudo-polynomial time.
We also consider the class of k-ary instances where k is a constant,
i.e., each agent has at most k different values for the goods. For such
instances, we show that an EF1+fPO allocation can be computed in strongly
polynomial time. When all values are positive, we show that an EQ1+fPO
allocation for such instances can be computed in strongly polynomial time.
Next, we consider instances where the number of agents is constant and show
that an EF1+PO (likewise, an EQ1+PO) allocation can be computed in polynomial
time. These results significantly extend the polynomial-time computability
beyond the known cases of binary or identical valuations.
We also design a polynomial-time algorithm that computes a Nash welfare
maximizing allocation when there are constantly many agents with constantly many
different values for the goods. Finally, on the complexity side, we show that
the problem of computing an EF1+fPO allocation lies in the complexity class
PLS.
Comment: 23 pages. A preliminary version appeared at AAAI 202
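To make the EF1 notion used in the abstract above concrete, the sketch below checks whether a given allocation of goods is envy-free up to one good under additive, non-negative valuations. It only verifies a candidate allocation and does not compute one, so it is not the paper's algorithm. The function name and data layout are assumptions.

```python
from typing import List, Sequence


def is_ef1(values: Sequence[Sequence[float]], bundles: Sequence[List[int]]) -> bool:
    """Check envy-freeness up to one good (EF1).

    values[i][g] is agent i's non-negative value for good g (additive valuations).
    bundles[i] lists the goods allocated to agent i.
    """
    n = len(bundles)
    for i in range(n):
        own = sum(values[i][g] for g in bundles[i])
        for j in range(n):
            if i == j or not bundles[j]:
                continue  # an empty bundle cannot be envied
            other = sum(values[i][g] for g in bundles[j])
            # With non-negative values, removing the good that agent i values
            # most in j's bundle reduces i's envy the most.
            best_removed = max(values[i][g] for g in bundles[j])
            if own < other - best_removed:
                return False
    return True


# Hypothetical instances with two agents and three goods.
fair = is_ef1([[5, 3, 1], [4, 4, 2]], [[0], [1, 2]])
unfair = is_ef1([[1, 1, 1], [1, 1, 1]], [[], [0, 1, 2]])
```

In the second instance the first agent receives nothing and still envies the other agent after any single good is removed, so the check returns False.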