943 research outputs found
On the Feature Discovery for App Usage Prediction in Smartphones
With the increasing number of mobile Apps developed, they are now closely
integrated into daily life. In this paper, we develop a framework to predict
mobile Apps that are most likely to be used regarding the current device status
of a smartphone. Such an Apps usage prediction framework is a crucial
prerequisite for fast App launching, intelligent user experience, and power
management of smartphones. By analyzing real App usage log data, we discover
two kinds of features: The Explicit Feature (EF) from sensing readings of
built-in sensors, and the Implicit Feature (IF) from App usage relations. The
IF feature is derived by constructing the proposed App Usage Graph (abbreviated
as AUG) that models App usage transitions. In light of AUG, we are able to
discover usage relations among Apps. Since users may have different usage
behaviors on their smartphones, we further propose one personalized feature
selection algorithm. We explore minimum description length (MDL) from the
training data and select those features which need less length to describe the
training data. The personalized feature selection can successfully reduce the
log size and the prediction time. Finally, we adopt the kNN classification
model to predict Apps usage. Note that through the features selected by the
proposed personalized feature selection algorithm, we only need to keep these
features, which in turn reduces the prediction time and avoids the curse of
dimensionality when using the kNN classifier. We conduct a comprehensive
experimental study based on a real mobile App usage dataset. The results
demonstrate the effectiveness of the proposed framework and show the predictive
capability for App usage prediction.Comment: 10 pages, 17 figures, ICDM 2013 short pape
Characterizing Impression-Aware Recommender Systems
Impression-aware recommender systems (IARS) are a type of recommenders that learn user preferences using their interactions and the recommendations (also known as impressions) shown to users. The community’s interest in this type of recommenders has steadily increased in recent years. To aid in characterizing this type of recommenders, we propose a theoretical framework to define IARS and classify the recommenders present in the state-of-the-art. We start this work by defining core concepts related to this type of recommenders, such as impressions and user feedback. Based on this theoretical framework, we identify and define three properties and three taxonomies that characterize IARS. Lastly, we undergo a systematic literature review where we discover and select papers belonging to the state-of-the-art. Our review analyzes papers under the properties and taxonomies we propose; we highlight the most and least common properties and taxonomies used in the literature, their relations, and their evolution over time, among others
Impressions in Recommender Systems: Present and Future
Impressions are a novel data source providing researchers and practitioners with more details about user interactions and their context. In particular, an impression contain the items shown on screen to users, alongside users' interactions toward such items. In recent years, interest in impressions has thrived, and more papers use impressions in recommender systems. Despite this, the literature does not contain a comprehensive review of the current topics and future directions. This work summarizes impressions in recommender systems under three perspectives: recommendation models, datasets with impressions, and evaluation methodologies. Then, we propose several future directions with an emphasis on novel approaches. This work is part of an ongoing review of impressions in recommender systems
Graph based Anomaly Detection and Description: A Survey
Detecting anomalies in data is a vital task, with numerous high-impact applications in areas such as security, finance, health care, and law enforcement. While numerous techniques have been developed in past years for spotting outliers and anomalies in unstructured collections of multi-dimensional points, with graph data becoming ubiquitous, techniques for structured graph data have been of focus recently. As objects in graphs have long-range correlations, a suite of novel technology has been developed for anomaly detection in graph data. This survey aims to provide a general, comprehensive, and structured overview of the state-of-the-art methods for anomaly detection in data represented as graphs. As a key contribution, we give a general framework for the algorithms categorized under various settings: unsupervised vs. (semi-)supervised approaches, for static vs. dynamic graphs, for attributed vs. plain graphs. We highlight the effectiveness, scalability, generality, and robustness aspects of the methods. What is more, we stress the importance of anomaly attribution and highlight the major techniques that facilitate digging out the root cause, or the ‘why’, of the detected anomalies for further analysis and sense-making. Finally, we present several real-world applications of graph-based anomaly detection in diverse domains, including financial, auction, computer traffic, and social networks. We conclude our survey with a discussion on open theoretical and practical challenges in the field
FairGen: Towards Fair Graph Generation
There have been tremendous efforts over the past decades dedicated to the
generation of realistic graphs in a variety of domains, ranging from social
networks to computer networks, from gene regulatory networks to online
transaction networks. Despite the remarkable success, the vast majority of
these works are unsupervised in nature and are typically trained to minimize
the expected graph reconstruction loss, which would result in the
representation disparity issue in the generated graphs, i.e., the protected
groups (often minorities) contribute less to the objective and thus suffer from
systematically higher errors. In this paper, we aim to tailor graph generation
to downstream mining tasks by leveraging label information and user-preferred
parity constraint. In particular, we start from the investigation of
representation disparity in the context of graph generative models. To mitigate
the disparity, we propose a fairness-aware graph generative model named
FairGen. Our model jointly trains a label-informed graph generation module and
a fair representation learning module by progressively learning the behaviors
of the protected and unprotected groups, from the `easy' concepts to the `hard'
ones. In addition, we propose a generic context sampling strategy for graph
generative models, which is proven to be capable of fairly capturing the
contextual information of each group with a high probability. Experimental
results on seven real-world data sets, including web-based graphs, demonstrate
that FairGen (1) obtains performance on par with state-of-the-art graph
generative models across six network properties, (2) mitigates the
representation disparity issues in the generated graphs, and (3) substantially
boosts the model performance by up to 17% in downstream tasks via data
augmentation
Current Challenges and Visions in Music Recommender Systems Research
Music recommender systems (MRS) have experienced a boom in recent years,
thanks to the emergence and success of online streaming services, which
nowadays make available almost all music in the world at the user's fingertip.
While today's MRS considerably help users to find interesting music in these
huge catalogs, MRS research is still facing substantial challenges. In
particular when it comes to build, incorporate, and evaluate recommendation
strategies that integrate information beyond simple user--item interactions or
content-based descriptors, but dig deep into the very essence of listener
needs, preferences, and intentions, MRS research becomes a big endeavor and
related publications quite sparse.
The purpose of this trends and survey article is twofold. We first identify
and shed light on what we believe are the most pressing challenges MRS research
is facing, from both academic and industry perspectives. We review the state of
the art towards solving these challenges and discuss its limitations. Second,
we detail possible future directions and visions we contemplate for the further
evolution of the field. The article should therefore serve two purposes: giving
the interested reader an overview of current challenges in MRS research and
providing guidance for young researchers by identifying interesting, yet
under-researched, directions in the field
Automatic Feature Engineering for Time Series Classification: Evaluation and Discussion
Time Series Classification (TSC) has received much attention in the past two
decades and is still a crucial and challenging problem in data science and
knowledge engineering. Indeed, along with the increasing availability of time
series data, many TSC algorithms have been suggested by the research community
in the literature. Besides state-of-the-art methods based on similarity
measures, intervals, shapelets, dictionaries, deep learning methods or hybrid
ensemble methods, several tools for extracting unsupervised informative summary
statistics, aka features, from time series have been designed in the recent
years. Originally designed for descriptive analysis and visualization of time
series with informative and interpretable features, very few of these feature
engineering tools have been benchmarked for TSC problems and compared with
state-of-the-art TSC algorithms in terms of predictive performance. In this
article, we aim at filling this gap and propose a simple TSC process to
evaluate the potential predictive performance of the feature sets obtained with
existing feature engineering tools. Thus, we present an empirical study of 11
feature engineering tools branched with 9 supervised classifiers over 112 time
series data sets. The analysis of the results of more than 10000 learning
experiments indicate that feature-based methods perform as accurately as
current state-of-the-art TSC algorithms, and thus should rightfully be
considered further in the TSC literature
- …