Multi-Target Prediction: A Unifying View on Problems and Methods
Multi-target prediction (MTP) is concerned with the simultaneous prediction
of multiple target variables of diverse type. Due to its enormous application
potential, it has developed into an active and rapidly expanding research field
that combines several subfields of machine learning, including multivariate
regression, multi-label classification, multi-task learning, dyadic prediction,
zero-shot learning, network inference, and matrix completion. In this paper, we
present a unifying view on MTP problems and methods. First, we formally discuss
commonalities and differences between existing MTP problems. To this end, we
introduce a general framework that covers the above subfields as special cases.
As a second contribution, we provide a structured overview of MTP methods. This
is accomplished by identifying a number of key properties, which distinguish
such methods and determine their suitability for different types of problems.
Finally, we also discuss a few challenges for future research.
A Deep Learning System for Predicting Size and Fit in Fashion E-Commerce
Personalized size and fit recommendations bear crucial significance for any
fashion e-commerce platform. Predicting the correct fit drives customer
satisfaction and benefits the business by reducing costs incurred due to
size-related returns. Traditional collaborative filtering algorithms seek to
model customer preferences based on their previous orders. A typical challenge
for such methods stems from extreme sparsity of customer-article orders. To
alleviate this problem, we propose a deep learning based content-collaborative
methodology for personalized size and fit recommendation. Our proposed method
can ingest arbitrary customer and article data and can model multiple
individuals or intents behind a single account. The method optimizes a global
set of parameters to learn population-level abstractions of size and fit
relevant information from observed customer-article interactions. It further
employs customer and article specific embedding variables to learn their
properties. Together with learned entity embeddings, the method maps additional
customer and article attributes into a latent space to derive personalized
recommendations. Application of our method to two publicly available datasets
demonstrates an improvement over the state-of-the-art published results. On two
proprietary datasets, one containing fit feedback from fashion experts and the
other involving customer purchases, we further outperform comparable
methodologies, including a recent Bayesian approach for size recommendation.
Comment: Published at the Thirteenth ACM Conference on Recommender Systems
(RecSys '19), September 16-20, 2019, Copenhagen, Denmark
Link Prediction in Complex Networks: A Survey
Link prediction in complex networks has attracted increasing attention from
both physical and computer science communities. The algorithms can be used to
extract missing information, identify spurious interactions, evaluate network
evolving mechanisms, and so on. This article summarizes recent progress on
link prediction algorithms, emphasizing the contributions from physical
perspectives and approaches, such as the random-walk-based methods and the
maximum likelihood methods. We also introduce three typical applications:
reconstruction of networks, evaluation of network evolving mechanisms, and
classification of partially labelled networks. Finally, we introduce some
applications and outline future challenges of link prediction algorithms.
Comment: 44 pages, 5 figures
Representing Conversations for Scalable Overhearing
Open distributed multi-agent systems are gaining interest in the academic
community and in industry. In such open settings, agents are often coordinated
using standardized agent conversation protocols. The representation of such
protocols (for analysis, validation, monitoring, etc.) is an important aspect of
multi-agent applications. Recently, Petri nets have been shown to be an
interesting approach to such representation, and radically different approaches
using Petri nets have been proposed. However, their relative strengths and
weaknesses have not been examined. Moreover, their scalability and suitability
for different tasks have not been addressed. This paper addresses both these
challenges. First, we analyze existing Petri net representations in terms of
their scalability and appropriateness for overhearing, an important task in
monitoring open multi-agent systems. Then, building on the insights gained, we
introduce a novel representation using Colored Petri nets that explicitly
represent legal joint conversation states and messages. This representation
approach offers significant improvements in scalability and is particularly
suitable for overhearing. Furthermore, we show that this new representation
offers a comprehensive coverage of all conversation features of FIPA
conversation standards. We also present a procedure for transforming AUML
conversation protocol diagrams (a standard human-readable representation) into
our Colored Petri net representation.
Supporting Skill Assessment in Learning Experiences Based on Serious Games Through Process Mining Techniques
Learning experiences based on serious games are employed in multiple contexts. Players carry out multiple interactions during gameplay to solve the challenges they face. Those interactions can be registered in logs as large data sets, providing the assessment process with objective information about the skills employed. Most assessment methods in learning experiences based on serious games rely on manual approaches, which do not scale well when the amount of data increases. We propose an automated method to analyse students’ interactions and assess their skills in learning experiences based on serious games. The method takes into account not only the final model obtained by the student, but also the process followed to obtain it, extracted from game logs. The assessment method groups students according to their in-game errors and in-game outcomes. Then, the models for the most and the least successful students are discovered using process mining techniques. Similarities in their behaviour are analysed through conformance checking techniques to compare all the students with the most successful ones. Finally, the similarities found are quantified to build a classification of the students’ assessments. We have employed this method with Computer Science students playing a serious game to solve design problems in a course on databases. The findings show that process mining techniques can mitigate the limitations of skill assessment methods in game-based learning experiences.
The role of ontology in information management
The question posed in this thesis is how the use of ontologies by information systems affects their development and their performance. Several aspects of ontologies are presented, namely design and implementation issues, representational languages, and tools for ontology manipulation. The effects of combining ontologies and information systems are then investigated. An ontology-based tool to identify email message features is presented, and its implementation and execution details are discussed. The use of ontologies by information systems provides a better understanding of their requirements, reduces their development time, and supports knowledge management during execution time.
Probabilistic models for mining imbalanced relational data
Most data mining and pattern recognition techniques are designed for learning from flat data files with the assumption of equal populations per class. However, most real-world data are stored as rich relational databases that generally have imbalanced class distributions. For such domains, a rich relational technique is required to accurately model the different objects and relationships in the domain, which cannot be easily represented as a set of simple attributes, and at the same time handle the imbalanced class problem. Motivated by the significance of mining imbalanced relational databases that represent the majority of real-world data, learning techniques for mining imbalanced relational domains are investigated. In this thesis, the employment of probabilistic models in mining relational databases is explored, in particular Probabilistic Relational Models (PRMs), which were proposed as an extension of attribute-based Bayesian networks. The effectiveness of PRMs in mining real-world databases was explored by learning PRMs from a real-world university relational database. A visual data mining tool is also proposed to aid the interpretation of the outcomes of the learned PRM models. Despite the effectiveness of PRMs in relational learning, the performance of PRMs as predictive models is significantly hindered by the imbalanced class problem. This is because PRMs share the assumption, common to other learning techniques, of relatively balanced class distributions in the training data. Therefore, this thesis proposes a number of models that build on the effectiveness of PRMs in relational learning and extend it to mining imbalanced relational domains. The first model introduced in this thesis examines the problem of mining imbalanced relational domains for a single two-class attribute. The model is proposed by enriching PRM learning with the ensemble learning technique.
The premise behind this model is that an ensemble of models would attain better performance than a single model, as an instance misclassified by one of the models can often be correctly classified by the others. Based on this approach, another model is introduced to address the problem of mining multiple imbalanced attributes, in which it is important to predict several attributes rather than a single one. In this model, the ensemble bagging sampling approach is exploited to attain a single model for mining several attributes. Finally, the thesis outlines the problem of imbalanced multi-class classification and introduces a generalized framework to handle this problem for both relational and non-relational domains.