Algorithmisch basierte Entscheidungsunterstützungssysteme für die deutsche Kinder- und Jugendhilfe? Messages from Research
The article by Phillip Gillingham and Timo Ackermann addresses the introduction of data-based, electronic decision support systems in child and youth welfare internationally. The authors point to serious conceptual problems with these systems, which mean that they neither make reliable predictions nor are ethically justifiable. Against this background, they ask whether the expected costs of developing such programs for German child and youth welfare would not be better invested in further professionalizing the practitioners and organizations working in the field.
C-Rex: A Comprehensive System for Recommending In-Text Citations with Explanations
Finding suitable citations for scientific publications can be challenging and time-consuming. To this end, context-aware citation recommendation approaches that recommend publications as candidates for in-text citations have been developed. In this paper, we present C-Rex, a web-based demonstration system available at http://c-rex.org for context-aware citation recommendation based on the Neural Citation Network [5] and millions of publications from the Microsoft Academic Graph. Our system is one of the first online context-aware citation recommendation systems and the first to incorporate not only a deep learning recommendation approach, but also explanation components to help users better understand why papers were recommended. In our offline evaluation, our model performs similarly to the one presented in the original paper and can serve as a basic framework for further implementations. In our online evaluation, we found that the explanations of recommendations increased users’ satisfaction.
Multi-Target Prediction: A Unifying View on Problems and Methods
Multi-target prediction (MTP) is concerned with the simultaneous prediction
of multiple target variables of diverse type. Due to its enormous application
potential, it has developed into an active and rapidly expanding research field
that combines several subfields of machine learning, including multivariate
regression, multi-label classification, multi-task learning, dyadic prediction,
zero-shot learning, network inference, and matrix completion. In this paper, we
present a unifying view on MTP problems and methods. First, we formally discuss
commonalities and differences between existing MTP problems. To this end, we
introduce a general framework that covers the above subfields as special cases.
As a second contribution, we provide a structured overview of MTP methods. This
is accomplished by identifying a number of key properties, which distinguish
such methods and determine their suitability for different types of problems.
Finally, we also discuss a few challenges for future research.
Bayesian Cluster Enumeration Criterion for Unsupervised Learning
We derive a new Bayesian Information Criterion (BIC) by formulating the
problem of estimating the number of clusters in an observed data set as
maximization of the posterior probability of the candidate models. Given that
some mild assumptions are satisfied, we provide a general BIC expression for a
broad class of data distributions. This serves as a starting point when
deriving the BIC for specific distributions. Along this line, we provide a
closed-form BIC expression for multivariate Gaussian distributed variables. We
show that incorporating the data structure of the clustering problem into the
derivation of the BIC results in an expression whose penalty term is different
from that of the original BIC. We propose a two-step cluster enumeration
algorithm. First, a model-based unsupervised learning algorithm partitions the
data according to a given set of candidate models. Subsequently, the number of
clusters is determined as the one associated with the model for which the
proposed BIC is maximal. The performance of the proposed two-step algorithm is
tested using synthetic and real data sets. Comment: 14 pages, 7 figures
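The two-step procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses a simple 1-D k-means as the partitioning step and the standard Schwarz BIC (written in its "larger is better" form) as the selection criterion, not the penalty term derived in the paper. The names `kmeans_1d`, `schwarz_bic`, and `enumerate_clusters` are hypothetical.

```python
import math

def kmeans_1d(xs, k, iters=50):
    """Step 1: partition 1-D data into k clusters with Lloyd's algorithm."""
    s = sorted(xs)
    # Deterministic initialisation: spread the initial centres over the sorted data.
    centres = [s[(2 * j + 1) * len(s) // (2 * k)] for j in range(k)]
    labels = [0] * len(xs)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: (x - centres[j]) ** 2) for x in xs]
        for j in range(k):
            members = [x for x, lab in zip(xs, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centre
                centres[j] = sum(members) / len(members)
    return labels, centres

def schwarz_bic(xs, labels, centres):
    """Standard BIC (assumption: Gaussian clusters with one shared variance)."""
    n, k = len(xs), len(centres)
    sse = sum((x - centres[lab]) ** 2 for x, lab in zip(xs, labels))
    var = max(sse / n, 1e-12)  # guard against a zero variance
    loglik = -0.5 * n * math.log(2 * math.pi * var) - sse / (2 * var)
    for j in range(k):
        n_j = labels.count(j)
        if n_j:
            loglik += n_j * math.log(n_j / n)  # contribution of cluster weights
    q = 2 * k  # k means + (k - 1) weights + 1 shared variance
    return loglik - 0.5 * q * math.log(n)

def enumerate_clusters(xs, candidates):
    """Step 2: return the candidate k whose partition maximises the criterion."""
    scores = {}
    for k in candidates:
        labels, centres = kmeans_1d(xs, k)
        scores[k] = schwarz_bic(xs, labels, centres)
    return max(scores, key=scores.get), scores

# Made-up toy data: three well-separated groups.
data = [0.0, 0.1, 0.2, -0.1, -0.2, 5.0, 5.1, 4.9, 5.2, 4.8,
        10.0, 10.1, 9.9, 10.2, 9.8]
best_k, bic_scores = enumerate_clusters(data, range(1, 6))
```

Swapping `schwarz_bic` for a different criterion leaves the two-step structure unchanged, which is the point of separating the partitioning step from the scoring step.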
Group-Feature (Sensor) Selection With Controlled Redundancy Using Neural Networks
In this paper, we present a novel embedded feature selection method based on
a Multi-layer Perceptron (MLP) network and generalize it for group-feature or
sensor selection problems, which can control the level of redundancy among the
selected features or groups. Additionally, we have generalized the group lasso
penalty for feature selection to encompass a mechanism for selecting valuable
group features while simultaneously maintaining a control over redundancy. We
establish the monotonicity and convergence of the proposed algorithm, with a
smoothed version of the penalty terms, under suitable assumptions. Experimental
results on several benchmark datasets demonstrate the promising performance of
the proposed methodology for both feature selection and group feature selection
over some state-of-the-art methods.
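The abstract does not spell out the generalized penalty, so the following is only a sketch of the standard group lasso term that it builds on: each group of weights contributes its Euclidean norm, scaled by the square root of the group size. The redundancy-control mechanism of the paper is not reproduced here, and the function name, the flat weight list, and the `lam` parameter are assumptions made for illustration.

```python
import math

def group_lasso_penalty(weights, groups, lam=1.0):
    """Standard group lasso: lam * sum over groups of sqrt(|g|) * ||w_g||_2.

    weights: flat list of weights (hypothetical layout)
    groups:  list of index lists, one per feature group or sensor
    """
    total = 0.0
    for idx in groups:
        norm = math.sqrt(sum(weights[i] ** 2 for i in idx))
        total += math.sqrt(len(idx)) * norm
    return lam * total

# Two groups of two weights each; the second group is entirely zero,
# so it adds nothing to the penalty.
w = [3.0, 4.0, 0.0, 0.0]
penalty = group_lasso_penalty(w, [[0, 1], [2, 3]])
```

Because the norm is taken per group rather than per weight, the penalty tends to drive whole groups to zero together, which is what makes it suitable for discarding an entire sensor at once.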
Efficiently Factorizing Boolean Matrices using Proximal Gradient Descent
Addressing the interpretability problem of NMF on Boolean data, Boolean Matrix Factorization (BMF) uses Boolean algebra to decompose the input into low-rank Boolean factor matrices. These matrices are highly interpretable and very useful in practice, but they come at the high computational cost of solving an NP-hard combinatorial optimization problem. To reduce the computational burden, we propose to relax BMF continuously using a novel elastic-binary regularizer, from which we derive a proximal gradient algorithm. Through an extensive set of experiments, we demonstrate that our method works well in practice: On synthetic data, we show that it converges quickly, recovers the ground truth precisely, and estimates the simulated rank exactly. On real-world data, we improve upon the state of the art in recall, loss, and runtime, and a case study from the medical domain confirms that our results are easily interpretable and semantically meaningful.
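To make the Boolean decomposition concrete, here is a minimal sketch of the Boolean matrix product and the resulting reconstruction error. It illustrates only the model that BMF fits, not the elastic-binary regularizer or the proximal gradient algorithm proposed in the paper. The function names and the example matrices are made up for illustration.

```python
def boolean_product(u, v):
    """Boolean matrix product: entry (i, j) is OR over k of (u[i][k] AND v[k][j])."""
    return [[int(any(a and b for a, b in zip(row, col))) for col in zip(*v)]
            for row in u]

def reconstruction_error(a, u, v):
    """Number of entries where the Boolean product of u and v differs from a."""
    b = boolean_product(u, v)
    return sum(x != y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))

# Rank-2 example: the middle row belongs to both factors.
U = [[1, 0],
     [1, 1],
     [0, 1]]
V = [[1, 1, 0],
     [0, 1, 1]]
A = boolean_product(U, V)
error = reconstruction_error(A, U, V)
```

With ordinary arithmetic the centre entry of the product would be 2; under Boolean algebra overlapping factors saturate at 1, which is why the factors stay directly readable as overlapping binary patterns.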