34,995 research outputs found
QueRIE: Collaborative Database Exploration
Interactive database exploration is a key task in information mining. However, users who lack SQL expertise or familiarity with the database schema face great difficulties in performing this task. To aid these users, we developed the QueRIE system for personalized query recommendations. QueRIE continuously monitors the user’s querying behavior and finds matching patterns in the system’s query log, in an attempt to identify previous users with similar information needs. Subsequently, QueRIE uses these “similar” users and their queries to recommend queries that the current user may find interesting. In this work we describe an instantiation of the QueRIE framework, where the active user’s session is represented by a set of query fragments. The recorded fragments are used to identify similar query fragments in the previously recorded sessions, which are in turn assembled in potentially interesting queries for the active user. We show through experimentation that the proposed method generates meaningful recommendations on real-life traces from the SkyServer database and propose a scalable design that enables the incremental update of similarities, making real-time computations on large amounts of data feasible. Finally, we compare this fragment-based instantiation with our previously proposed tuple-based instantiation discussing the advantages and disadvantages of each approach
BioWorkbench: A High-Performance Framework for Managing and Analyzing Bioinformatics Experiments
Advances in sequencing techniques have led to exponential growth in
biological data, demanding the development of large-scale bioinformatics
experiments. Because these experiments are computation- and data-intensive,
they require high-performance computing (HPC) techniques and can benefit from
specialized technologies such as Scientific Workflow Management Systems (SWfMS)
and databases. In this work, we present BioWorkbench, a framework for managing
and analyzing bioinformatics experiments. This framework automatically collects
provenance data, including both performance data from workflow execution and
data from the scientific domain of the workflow application. Provenance data
can be analyzed through a web application that abstracts a set of queries to
the provenance database, simplifying access to provenance information. We
evaluate BioWorkbench using three case studies: SwiftPhylo, a phylogenetic tree
assembly workflow; SwiftGECKO, a comparative genomics workflow; and RASflow, a
RASopathy analysis workflow. We analyze each workflow from both computational
and scientific domain perspectives, by using queries to a provenance and
annotation database. Some of these queries are available as a pre-built feature
of the BioWorkbench web application. Through the provenance data, we show that
the framework is scalable and achieves high-performance, reducing up to 98% of
the case studies execution time. We also show how the application of machine
learning techniques can enrich the analysis process
Deep Learning for Link Prediction in Dynamic Networks using Weak Estimators
Link prediction is the task of evaluating the probability that an edge exists in a network, and it has useful applications in many domains. Traditional approaches rely on measuring the similarity between two nodes in a static context. Recent research has focused on extending link prediction to a dynamic setting, predicting the creation and destruction of links in networks that evolve over time. Though a difficult task, the employment of deep learning techniques have shown to make notable improvements to the accuracy of predictions. To this end, we propose the novel application of weak estimators in addition to the utilization of traditional similarity metrics to inexpensively build an effective feature vector for a deep neural network. Weak estimators have been used in a variety of machine learning algorithms to improve model accuracy, owing to their capacity to estimate changing probabilities in dynamic systems. Experiments indicate that our approach results in increased prediction accuracy on several real-world dynamic networks
Recommender Systems
The ongoing rapid expansion of the Internet greatly increases the necessity
of effective recommender systems for filtering the abundant information.
Extensive research for recommender systems is conducted by a broad range of
communities including social and computer scientists, physicists, and
interdisciplinary researchers. Despite substantial theoretical and practical
achievements, unification and comparison of different approaches are lacking,
which impedes further advances. In this article, we review recent developments
in recommender systems and discuss the major challenges. We compare and
evaluate available algorithms and examine their roles in the future
developments. In addition to algorithms, physical aspects are described to
illustrate macroscopic behavior of recommender systems. Potential impacts and
future directions are discussed. We emphasize that recommendation has a great
scientific depth and combines diverse research fields which makes it of
interests for physicists as well as interdisciplinary researchers.Comment: 97 pages, 20 figures (To appear in Physics Reports
WISER: A Semantic Approach for Expert Finding in Academia based on Entity Linking
We present WISER, a new semantic search engine for expert finding in
academia. Our system is unsupervised and it jointly combines classical language
modeling techniques, based on text evidences, with the Wikipedia Knowledge
Graph, via entity linking.
WISER indexes each academic author through a novel profiling technique which
models her expertise with a small, labeled and weighted graph drawn from
Wikipedia. Nodes in this graph are the Wikipedia entities mentioned in the
author's publications, whereas the weighted edges express the semantic
relatedness among these entities computed via textual and graph-based
relatedness functions. Every node is also labeled with a relevance score which
models the pertinence of the corresponding entity to author's expertise, and is
computed by means of a proper random-walk calculation over that graph; and with
a latent vector representation which is learned via entity and other kinds of
structural embeddings derived from Wikipedia.
At query time, experts are retrieved by combining classic document-centric
approaches, which exploit the occurrences of query terms in the author's
documents, with a novel set of profile-centric scoring strategies, which
compute the semantic relatedness between the author's expertise and the query
topic via the above graph-based profiles.
The effectiveness of our system is established over a large-scale
experimental test on a standard dataset for this task. We show that WISER
achieves better performance than all the other competitors, thus proving the
effectiveness of modelling author's profile via our "semantic" graph of
entities. Finally, we comment on the use of WISER for indexing and profiling
the whole research community within the University of Pisa, and its application
to technology transfer in our University
Target Apps Selection: Towards a Unified Search Framework for Mobile Devices
With the recent growth of conversational systems and intelligent assistants
such as Apple Siri and Google Assistant, mobile devices are becoming even more
pervasive in our lives. As a consequence, users are getting engaged with the
mobile apps and frequently search for an information need in their apps.
However, users cannot search within their apps through their intelligent
assistants. This requires a unified mobile search framework that identifies the
target app(s) for the user's query, submits the query to the app(s), and
presents the results to the user. In this paper, we take the first step forward
towards developing unified mobile search. In more detail, we introduce and
study the task of target apps selection, which has various potential real-world
applications. To this aim, we analyze attributes of search queries as well as
user behaviors, while searching with different mobile apps. The analyses are
done based on thousands of queries that we collected through crowdsourcing. We
finally study the performance of state-of-the-art retrieval models for this
task and propose two simple yet effective neural models that significantly
outperform the baselines. Our neural approaches are based on learning
high-dimensional representations for mobile apps. Our analyses and experiments
suggest specific future directions in this research area.Comment: To appear at SIGIR 201
- …