Deep Learning for Cancer Prognosis Prediction Using Portrait Photos by StyleGAN Embedding
Survival prediction for cancer patients is critical for optimal treatment
selection and patient management. Current patient survival prediction methods
typically extract survival information from patients' clinical record data or
biological and imaging data. In practice, experienced clinicians can have a
preliminary assessment of patients' health status based on patients' observable
physical appearances, which are mainly facial features. However, such
assessment is highly subjective. In this work, the efficacy of objectively
capturing and using prognostic information contained in conventional portrait
photographs using deep learning for survival prediction purposes is
investigated for the first time. A pre-trained StyleGAN2 model is fine-tuned on
a custom dataset of our cancer patients' photos to empower its generator with
generative ability suitable for patients' photos. The StyleGAN2 is then used to
embed the photographs into its highly expressive latent space. Utilizing
state-of-the-art survival analysis models and based on StyleGAN's latent space
photo embeddings, this approach achieved a C-index of 0.677, which is notably
higher than chance and evidences the prognostic value embedded in simple 2D
facial images. In addition, thanks to StyleGAN's interpretable latent space,
our survival prediction model can be validated for relying on essential facial
features, eliminating any biases from extraneous information like clothing or
background. Moreover, a health attribute is obtained from regression
coefficients, which has important potential value for patient care.
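As an aside on the metric reported above: the C-index measures how often, among comparable patient pairs, the patient who died earlier was assigned the higher risk score. A minimal pure-Python sketch of Harrell's concordance index (illustrative only, not the authors' pipeline; all names are hypothetical):

```python
import itertools

def concordance_index(times, events, risk_scores):
    """Harrell's C-index: fraction of comparable pairs in which the
    patient with the shorter survival time has the higher risk score.
    Ties in risk score count as 0.5."""
    concordant, comparable = 0.0, 0
    for (t_i, e_i, r_i), (t_j, e_j, r_j) in itertools.combinations(
            zip(times, events, risk_scores), 2):
        # Order the pair so t_i <= t_j.
        if t_i > t_j:
            (t_i, e_i, r_i), (t_j, e_j, r_j) = (t_j, e_j, r_j), (t_i, e_i, r_i)
        # A pair is comparable only if the earlier time is an observed event.
        if t_i < t_j and e_i:
            comparable += 1
            if r_i > r_j:
                concordant += 1
            elif r_i == r_j:
                concordant += 0.5
    return concordant / comparable

# Toy data: survival times, event indicators (1 = death observed), risk scores.
times = [5, 10, 12, 3]
events = [1, 1, 0, 1]
risks = [0.9, 0.3, 0.2, 0.8]
c = concordance_index(times, events, risks)  # 5 of 6 comparable pairs concordant
```

A C-index of 0.5 corresponds to random ordering, which is why 0.677 is read as "notably higher than chance".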
EmbedDistill: A Geometric Knowledge Distillation for Information Retrieval
Large neural models (such as Transformers) achieve state-of-the-art
performance for information retrieval (IR). In this paper, we aim to improve
distillation methods that pave the way for the resource-efficient deployment of
such models in practice. Inspired by our theoretical analysis of the
teacher-student generalization gap for IR models, we propose a novel
distillation approach that leverages the relative geometry among queries and
documents learned by the large teacher model. Unlike existing teacher
score-based distillation methods, our proposed approach employs embedding
matching tasks to provide a stronger signal to align the representations of the
teacher and student models. In addition, it utilizes query generation to
explore the data manifold to reduce the discrepancies between the student and
the teacher where training data is sparse. Furthermore, our analysis also
motivates novel asymmetric architectures for student models which realize
better embedding alignment without increasing online inference cost. On
standard benchmarks like MSMARCO, we show that our approach successfully
distills from both dual-encoder (DE) and cross-encoder (CE) teacher models to
1/10th size asymmetric students that can retain 95-97% of the teacher
performance.
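The embedding-matching idea can be pictured with a toy loss that combines score distillation with direct alignment of query and document vectors. This is a hedged sketch assuming teacher and student share an embedding dimension (the paper's asymmetric students would need a projection); `alpha` and the array names are hypothetical:

```python
import numpy as np

def embed_match_loss(t_q, t_d, s_q, s_d, alpha=0.5):
    """Toy distillation objective: match teacher/student relevance
    scores (query-document dot products) and, in addition, align the
    student's embeddings with the teacher's geometry directly."""
    score_loss = np.mean((t_q @ t_d.T - s_q @ s_d.T) ** 2)
    embed_loss = np.mean((t_q - s_q) ** 2) + np.mean((t_d - s_d) ** 2)
    return alpha * score_loss + (1 - alpha) * embed_loss

rng = np.random.default_rng(0)
t_q, t_d = rng.normal(size=(4, 8)), rng.normal(size=(16, 8))  # teacher vectors
s_q = t_q + 0.1 * rng.normal(size=(4, 8))                     # noisy student
s_d = t_d + 0.1 * rng.normal(size=(16, 8))
loss = embed_match_loss(t_q, t_d, s_q, s_d)
```

The embedding term is the extra signal relative to score-only distillation: it is zero only when the student reproduces the teacher's representations, not merely its scores.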
Ethical Questions Raised by AI-Supported Mentoring in Higher Education
Mentoring is a highly personal and individual process, in which mentees take advantage of
expertise and experience to expand their knowledge and to achieve individual goals. The
emerging use of AI in mentoring processes in higher education not only necessitates the
adherence to applicable laws and regulations (e.g., relating to data protection and nondiscrimination)
but further requires a thorough understanding of ethical norms, guidelines,
and unresolved issues (e.g., integrity of data, safety and security of systems,
confidentiality, avoiding bias, ensuring trust in and transparency of algorithms).
Mentoring in Higher Education requires one of the highest degrees of trust, openness,
and social–emotional support, as much is at stake for mentees, especially their
academic attainment, career options, and future life choices. However, ethical
compromises seem to be common when digital systems are introduced, and the
underlying ethical questions in AI-supported mentoring are still insufficiently addressed
in research, development, and application. One of the challenges is to strive for privacy and
data economy on the one hand, while Big Data is the prerequisite of AI-supported
environments on the other hand. How can ethical norms and general guidelines of
AIED be respected in complex digital mentoring processes? This article strives to start
a discourse on the relevant ethical questions and in this way raise awareness for the ethical
development and use of future data-driven, AI-supported mentoring environments in
higher education.
Improving Heterogeneous Graph Learning with Weighted Mixed-Curvature Product Manifold
In graph representation learning, it is important that the complex geometric
structure of the input graph, e.g. hidden relations among nodes, is well
captured in embedding space. However, standard Euclidean embedding spaces have
a limited capacity in representing graphs of varying structures. A promising
candidate for the faithful embedding of data with varying structure is product
manifolds of component spaces of different geometries (spherical, hyperbolic,
or Euclidean). In this paper, we take a closer look at the structure of product
manifold embedding spaces and argue that each component space in a product
contributes differently to expressing structures in the input graph, hence
should be weighted accordingly. This is different from previous works which
consider the roles of different components equally. We then propose
WEIGHTED-PM, a data-driven method for learning embedding of heterogeneous
graphs in weighted product manifolds. Our method utilizes the topological
information of the input graph to automatically determine the weight of each
component in product spaces. Extensive experiments on synthetic and real-world
graph datasets demonstrate that WEIGHTED-PM is capable of learning better graph
representations with lower geometric distortion from input data, and performs
better on multiple downstream tasks, such as word similarity learning, top-
recommendation, and knowledge graph embedding.
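A weighted product-manifold distance of the kind described can be sketched directly: each component space contributes its own geodesic distance, scaled by a per-component weight, and the product distance is the square root of the weighted sum of squares. A minimal numerical illustration (the component choices and weights here are hypothetical, not WEIGHTED-PM itself, which learns the weights from graph topology):

```python
import numpy as np

def sphere_dist(u, v):
    """Great-circle distance between unit vectors on the sphere."""
    return np.arccos(np.clip(np.dot(u, v), -1.0, 1.0))

def poincare_dist(u, v):
    """Geodesic distance in the Poincare ball model of hyperbolic space."""
    sq = np.sum((u - v) ** 2)
    denom = (1 - np.sum(u ** 2)) * (1 - np.sum(v ** 2))
    return np.arccosh(1 + 2 * sq / denom)

def weighted_product_dist(x, y, weights):
    """Distance in a weighted product manifold S x H x E: the sqrt of
    the weighted sum of squared component distances."""
    (xs, xh, xe), (ys, yh, ye) = x, y
    d2 = (weights[0] * sphere_dist(xs, ys) ** 2
          + weights[1] * poincare_dist(xh, yh) ** 2
          + weights[2] * np.sum((xe - ye) ** 2))
    return np.sqrt(d2)

x = (np.array([1.0, 0.0]), np.array([0.0, 0.0]), np.array([0.0]))
y = (np.array([0.0, 1.0]), np.array([0.0, 0.0]), np.array([3.0]))
d = weighted_product_dist(x, y, weights=[1.0, 1.0, 1.0])
```

Setting a component's weight toward zero effectively removes that geometry from the embedding space, which is the flexibility the paper argues uniform-weight product manifolds lack.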
The Globalization of Artificial Intelligence: African Imaginaries of Technoscientific Futures
Imaginaries of artificial intelligence (AI) have transcended geographies of the Global North and become increasingly entangled with narratives of economic growth, progress, and modernity in Africa. This raises several issues such as the entanglement of AI with global technoscientific capitalism and its impact on the dissemination of AI in Africa. The lack of African perspectives on the development of AI exacerbates concerns of raciality and inclusion in the scientific research, circulation, and adoption of AI. My argument in this dissertation is that innovation in AI, in both its sociotechnical imaginaries and political economies, excludes marginalized countries, nations and communities in ways that not only bar their participation in the reception of AI, but also as being part and parcel of its creation.
Underpinned by decolonial thinking, and perspectives from science and technology studies and African studies, this dissertation looks at how AI is reconfiguring the debate about development and modernization in Africa and the implications for local sociotechnical practices of AI innovation and governance. I examined AI in international development and industry across Kenya, Ghana, and Nigeria, by tracing Canada’s AI4D Africa program and following AI start-ups at AfriLabs. I used multi-sited case studies and discourse analysis to examine the data collected from interviews, participant observations, and documents.
In the empirical chapters, I first examine how local actors understand the notion of decolonizing AI and show that it has become a sociotechnical imaginary. I then investigate the political economy of AI in Africa and argue that despite Western efforts to integrate the African AI ecosystem globally, the AI epistemic communities in the continent continue to be excluded from dominant AI innovation spaces. Finally, I examine the emergence of a Pan-African AI imaginary and argue that AI governance can be understood as a state-building experiment in post-colonial Africa. The main issue at stake is that the lack of African perspectives in AI leads to negative impacts on innovation and limits the fair distribution of the benefits of AI across nations, countries, and communities, while at the same time excluding globally marginalized epistemic communities from the imagination and creation of AI.
Mutual Wasserstein Discrepancy Minimization for Sequential Recommendation
Self-supervised sequential recommendation significantly improves
recommendation performance by maximizing mutual information with well-designed
data augmentations. However, the mutual information estimation is based on the
calculation of Kullback Leibler divergence with several limitations, including
asymmetrical estimation, the exponential need of the sample size, and training
instability. Also, existing data augmentations are mostly stochastic and can
potentially break sequential correlations with random modifications. These two
issues motivate us to investigate an alternative robust mutual information
measurement capable of modeling uncertainty and alleviating KL divergence
limitations. To this end, we propose a novel self-supervised learning framework
based on Mutual WasserStein discrepancy minimization (MStein) for sequential
recommendation. We propose the Wasserstein Discrepancy Measurement to measure
the mutual information between augmented sequences. Wasserstein Discrepancy
Measurement builds upon the 2-Wasserstein distance, which is more robust, more
efficient in small batch sizes, and able to model the uncertainty of stochastic
augmentation processes. We also propose a novel contrastive learning loss based
on Wasserstein Discrepancy Measurement. Extensive experiments on four benchmark
datasets demonstrate the effectiveness of MStein over baselines. Further
quantitative analyses show robustness against perturbations and training
efficiency with respect to batch size. Finally, an analysis of the improvements
indicates better representations of popular users or items with significant uncertainty. The
source code is at https://github.com/zfan20/MStein.
Comment: Updated with the correction of the asymmetric mistake on the mutual information connection.
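For context on the measurement itself: when stochastic embeddings are modeled as Gaussians with diagonal covariance, the 2-Wasserstein distance has a simple closed form, which is part of what makes it cheap and stable in small batches. A sketch of that closed form (illustrative only, not the authors' exact implementation):

```python
import numpy as np

def w2_gaussian(mu1, sigma1, mu2, sigma2):
    """Closed-form 2-Wasserstein distance between two Gaussians with
    diagonal covariance:
    W2^2 = ||mu1 - mu2||^2 + ||sigma1 - sigma2||^2."""
    return np.sqrt(np.sum((mu1 - mu2) ** 2) + np.sum((sigma1 - sigma2) ** 2))

mu1, s1 = np.array([0.0, 0.0]), np.array([1.0, 1.0])
mu2, s2 = np.array([3.0, 4.0]), np.array([1.0, 1.0])
d = w2_gaussian(mu1, s1, mu2, s2)  # sqrt(9 + 16) = 5.0
```

Unlike KL divergence, this quantity is symmetric in its two arguments and stays finite even for non-overlapping distributions, which matches the robustness claims in the abstract.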
Multi-objective optimization-based collective opinion generation with fairness concern
The generation of collective opinion based on probability distribution function (PDF) aggregation is gradually becoming a critical approach for tackling immense and delicate assessment and evaluation tasks in decision analysis. However, the existing collective opinion generation approaches fail to model the behavioral characteristics associated with individuals, and thus cannot reflect the fairness concerns among them when they consciously or unconsciously incorporate their judgments on the fairness level of distribution into the formulations of individual opinions. In this study, we propose a multi-objective optimization-driven collective opinion generation approach that generalizes the bi-objective optimization-based PDF aggregation paradigm. In doing so, we adapt the notion of a fairness concern utility function to characterize the influence of fairness inclusion and take its maximization as an additional objective, together with the criteria of consensus and confidence levels, in generating collective opinion. The formulation of fairness concern is then transformed into the congregation of individual fairness concern utilities through the use of aggregation functions. We regard the generalized extended Bonferroni mean (BM) as an elaborated framework for aggregating individual fairness concern utilities. In this way, we establish the concept of a BM-type collective fairness concern utility to empower the multi-objective optimization-driven collective opinion generation approach with the capacity to model different structures associated with an expert group with fairness concern. The application of the proposed fairness-aware framework to the maturity assessment of building information modeling demonstrates the effectiveness and efficiency of the multi-objective optimization-driven approach for generating collective opinion when accomplishing complicated assessment and evaluation tasks with data scarcity.
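The generalized extended BM used in the paper is more elaborate than an abstract can convey, but the classical Bonferroni mean it builds on is easy to state: BM^{p,q}(a_1,...,a_n) = ((1/(n(n-1))) * sum over i != j of a_i^p * a_j^q)^(1/(p+q)). A minimal sketch:

```python
from itertools import permutations

def bonferroni_mean(values, p=1, q=1):
    """Classical Bonferroni mean BM^{p,q}: averages the products
    a_i^p * a_j^q over all ordered pairs with i != j, then takes
    the (p+q)-th root."""
    n = len(values)
    s = sum(a ** p * b ** q for a, b in permutations(values, 2))
    return (s / (n * (n - 1))) ** (1 / (p + q))

# For identical inputs the BM reduces to the common value itself.
m = bonferroni_mean([2.0, 2.0, 2.0])  # 2.0
```

Because every term couples two different experts' values, the BM captures interrelationships within the group, which is what makes it a natural scaffold for aggregating fairness concern utilities.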
Reinforcement learning in large state action spaces
Reinforcement learning (RL) is a promising framework for training intelligent agents which learn to optimize long term utility by directly interacting with the environment. Creating RL methods which scale to large state-action spaces is a critical problem towards ensuring real world deployment of RL systems. However, several challenges limit the applicability of RL to large scale settings. These include difficulties with exploration, low sample efficiency, computational intractability, task constraints like decentralization and lack of guarantees about important properties like performance, generalization and robustness in potentially unseen scenarios.
This thesis is motivated towards bridging the aforementioned gap. We propose several principled algorithms and frameworks for studying and addressing the above challenges in RL. The proposed methods cover a wide range of RL settings (single- and multi-agent systems (MAS) with all the variations in the latter, prediction and control, model-based and model-free methods, value-based and policy-based methods). In this work we propose the first results on several different problems: e.g. tensorization of the Bellman equation which allows exponential sample efficiency gains (Chapter 4), provable suboptimality arising from structural constraints in MAS (Chapter 3), combinatorial generalization results in cooperative MAS (Chapter 5), generalization results on observation shifts (Chapter 7), and learning deterministic policies in a probabilistic RL framework (Chapter 6). Our algorithms exhibit provably enhanced performance and sample efficiency along with better scalability. Additionally, we also shed light on generalization aspects of the agents under different frameworks. These properties have been driven by the use of several advanced tools (e.g. statistical machine learning, state abstraction, variational inference, tensor theory).
In summary, the contributions in this thesis significantly advance progress towards making RL agents ready for large-scale, real-world applications.
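As background for the Bellman-equation results mentioned above (the thesis's tensorized variants are beyond an abstract), standard value iteration repeatedly applies the Bellman backup V(s) <- max_a sum_{s'} P(s'|s,a)[R(s,a,s') + gamma * V(s')] until convergence. A minimal sketch with hypothetical transition and reward tensors:

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Tabular value iteration. P and R are (A, S, S) arrays:
    P[a, s, s2] is the transition probability and R[a, s, s2] the
    reward for taking action a in state s and landing in s2."""
    V = np.zeros(P.shape[1])
    while True:
        Q = np.sum(P * (R + gamma * V), axis=2)  # (A, S) action values
        V_new = Q.max(axis=0)                    # greedy Bellman backup
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new

# Deterministic 2-state toy MDP: state 0 steps to state 1 for reward 1,
# state 1 is absorbing with reward 0.
P = np.zeros((1, 2, 2)); P[0, 0, 1] = 1.0; P[0, 1, 1] = 1.0
R = np.zeros((1, 2, 2)); R[0, 0, 1] = 1.0
V = value_iteration(P, R, gamma=0.5)  # V = [1.0, 0.0]
```

The (A, S, S) tensors here also hint at why the backup scales poorly with state-action dimensionality, the scalability problem the thesis targets.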
Bayesian Forecasting in Economics and Finance: A Modern Review
The Bayesian statistical paradigm provides a principled and coherent approach
to probabilistic forecasting. Uncertainty about all unknowns that characterize
any forecasting problem -- model, parameters, latent states -- can be
quantified explicitly and factored into the forecast distribution via the
process of integration or averaging. Allied with the elegance of the method,
Bayesian forecasting is now underpinned by the burgeoning field of Bayesian
computation, which enables Bayesian forecasts to be produced for virtually any
problem, no matter how large, or complex. The current state of play in Bayesian
forecasting in economics and finance is the subject of this review. The aim is
to provide the reader with an overview of modern approaches to the field, set
in some historical context; and with sufficient computational detail given to
assist the reader with implementation.
Comment: The paper is now published online at: https://doi.org/10.1016/j.ijforecast.2023.05.00
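The "integration or averaging" step can be made concrete with a toy conjugate model: draw parameters from the posterior, then simulate new observations given each draw, so the forecast distribution averages the likelihood over parameter uncertainty. A hedged sketch (the model and all numbers are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: y_t ~ Normal(theta, 1), with a Normal(0, 10^2) prior on theta.
y = rng.normal(2.0, 1.0, size=50)

# Conjugate posterior for theta (prior mean 0, known unit likelihood variance).
prior_var, like_var = 100.0, 1.0
post_var = 1.0 / (1.0 / prior_var + len(y) / like_var)
post_mean = post_var * (y.sum() / like_var)

# Forecast distribution: integrate the likelihood over the posterior by
# simulation -- draw theta, then draw y_new | theta for each draw.
theta_draws = rng.normal(post_mean, np.sqrt(post_var), size=10_000)
y_forecast = rng.normal(theta_draws, np.sqrt(like_var))
```

The simulated `y_forecast` is wider than any single-`theta` predictive because it carries parameter uncertainty as well as observation noise, which is exactly the coherence property the review emphasizes.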
Improvable Gap Balancing for Multi-Task Learning
In multi-task learning (MTL), gradient balancing has recently attracted more
research interest than loss balancing since it often leads to better
performance. However, loss balancing is much more efficient than gradient
balancing, and thus it is still worth further exploration in MTL. Note that
prior studies typically ignore that there exist varying improvable gaps across
multiple tasks, where the improvable gap per task is defined as the distance
between the current training progress and desired final training progress.
Therefore, after loss balancing, the performance imbalance still arises in many
cases. In this paper, following the loss balancing framework, we propose two
novel improvable gap balancing (IGB) algorithms for MTL: one takes a simple
heuristic, and the other (for the first time) deploys deep reinforcement
learning for MTL. Particularly, instead of directly balancing the losses in
MTL, both algorithms choose to dynamically assign task weights for improvable
gap balancing. Moreover, we combine IGB and gradient balancing to show the
complementarity between the two types of algorithms. Extensive experiments on
two benchmark datasets demonstrate that our IGB algorithms lead to the best
results in MTL via loss balancing and achieve further improvements when
combined with gradient balancing. Code is available at
https://github.com/YanqiDai/IGB4MTL.
Comment: Accepted for the 39th Conference on Uncertainty in Artificial Intelligence (UAI 2023).
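One simple way to picture improvable-gap balancing is a heuristic in which each task's weight grows with how far its loss still is from its starting point. This is an illustrative sketch in the spirit of the heuristic IGB variant, not the paper's exact algorithm (the softmax form and the scaling are assumptions):

```python
import numpy as np

def igb_weights(current_losses, initial_losses, temperature=1.0):
    """Heuristic improvable-gap weighting: a task whose loss is still
    close to its initial value (large remaining gap) receives a larger
    weight. Weights come from a softmax over per-task progress ratios
    and are scaled so they average to 1."""
    gaps = np.asarray(current_losses) / np.asarray(initial_losses)
    exp = np.exp(gaps / temperature)
    return len(gaps) * exp / exp.sum()

# Task 0 has barely improved (gap ~0.9) while task 1 is nearly trained
# (gap ~0.1), so task 0 receives the larger weight.
w = igb_weights([0.9, 0.05], [1.0, 0.5])
```

The weighted loss sum(w[k] * L[k]) then steers optimization toward under-trained tasks, addressing the performance imbalance that plain loss balancing leaves behind.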