Let Your CyberAlter Ego Share Information and Manage Spam
Almost all of us have multiple cyberspace identities, and these
cyberalter egos are networked together to form a vast cyberspace social
network. This network is distinct from the world-wide-web (WWW), which is being
queried and mined to the tune of billions of dollars every day, and until
recently it has gone largely unexplored. Empirically, the cyberspace social
networks have been found to possess many of the same complex features that
characterize their real counterparts, including scale-free degree distributions,
low diameter, and extensive connectivity. We show that these topological
features make the latent networks particularly suitable for explorations and
management via local-only messaging protocols. Cyberalter egos can
communicate via their direct links (i.e., using only their own address books)
and set up a highly decentralized and scalable message passing network that can
allow large-scale sharing of information and data. As one particular example of
such collaborative systems, we provide a design of a spam filtering system, and
our large-scale simulations show that the system achieves a spam detection rate
close to 100%, while the false positive rate is kept around zero. This system
has several advantages over other recent proposals: (i) It uses an already
existing network, created by the same social dynamics that govern our daily
lives, and no dedicated peer-to-peer (P2P) systems or centralized server-based
systems need be constructed; (ii) It utilizes a percolation search algorithm
that makes the query-generated traffic scalable; (iii) The network has a
built-in trust system (just as in social networks) that can be used to thwart
malicious attacks; (iv) It can be implemented right now as a plugin to popular
email programs, such as MS Outlook, Eudora, and Sendmail.
Comment: 13 pages, 10 figures
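The percolation search named in point (ii) can be illustrated with a minimal sketch. It assumes a simplified variant in which a query is forwarded along each address-book link independently with a fixed probability; the graph, the function name `percolation_broadcast`, and the probability values are illustrative assumptions, not the protocol described in the paper.

```python
import random
from collections import deque

def percolation_broadcast(contacts, source, forward_prob, rng):
    """Return the set of nodes reached when each link forwards the query
    independently with probability forward_prob (bond percolation)."""
    reached = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in contacts.get(node, ()):
            # Each node only consults its own address book (local-only messaging).
            if neighbor not in reached and rng.random() < forward_prob:
                reached.add(neighbor)
                queue.append(neighbor)
    return reached

# Hypothetical address books: one well-connected "hub" and four other users.
contacts = {
    "hub": ["a", "b", "c", "d"],
    "a": ["hub", "b"],
    "b": ["hub", "a"],
    "c": ["hub"],
    "d": ["hub"],
}

rng = random.Random(0)
everyone = percolation_broadcast(contacts, "a", 1.0, rng)     # every link forwards
nobody_else = percolation_broadcast(contacts, "a", 0.0, rng)  # no link forwards
```

Intermediate probabilities trade coverage for traffic: in a network with high-degree hubs, a query that reaches a hub is likely to spread widely even when most individual links drop it.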
QueRIE: Collaborative Database Exploration
Interactive database exploration is a key task in information mining. However, users who lack SQL expertise or familiarity with the database schema face great difficulties in performing this task. To aid these users, we developed the QueRIE system for personalized query recommendations. QueRIE continuously monitors the user’s querying behavior and finds matching patterns in the system’s query log, in an attempt to identify previous users with similar information needs. Subsequently, QueRIE uses these “similar” users and their queries to recommend queries that the current user may find interesting. In this work we describe an instantiation of the QueRIE framework, where the active user’s session is represented by a set of query fragments. The recorded fragments are used to identify similar query fragments in the previously recorded sessions, which are in turn assembled into potentially interesting queries for the active user. We show through experimentation that the proposed method generates meaningful recommendations on real-life traces from the SkyServer database and propose a scalable design that enables the incremental update of similarities, making real-time computations on large amounts of data feasible. Finally, we compare this fragment-based instantiation with our previously proposed tuple-based instantiation, discussing the advantages and disadvantages of each approach.
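The fragment-based idea (a session as a set of query fragments, matched against earlier sessions) can be sketched as follows. This is only an illustration: the abstract does not say which similarity measure QueRIE uses, so Jaccard similarity is an assumption here, and the fragment strings and the function `recommend_fragments` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend_fragments(active, past_sessions, top_k=2):
    """Score fragments the active user has not used yet by the summed
    similarity of the past sessions that contain them."""
    scores = {}
    for session in past_sessions:
        sim = jaccard(active, session)
        if sim == 0.0:
            continue  # unrelated session contributes nothing
        for fragment in session - active:
            scores[fragment] = scores.get(fragment, 0.0) + sim
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda f: (-scores[f], f))
    return ranked[:top_k]

# Hypothetical fragments extracted from logged queries.
active = {"FROM PhotoObj", "WHERE ra > ?"}
past_sessions = [
    {"FROM PhotoObj", "WHERE ra > ?", "WHERE dec < ?"},
    {"FROM PhotoObj", "SELECT objID"},
    {"FROM SpecObj", "SELECT z"},
]
suggestions = recommend_fragments(active, past_sessions)
```

The suggested fragments would still have to be assembled into complete queries, a step this sketch leaves out.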
DataHub: Collaborative Data Science & Dataset Version Management at Scale
Relational databases have limited support for data collaboration, where teams
collaboratively curate and analyze large datasets. Inspired by software version
control systems like git, we propose (a) a dataset version control system,
giving users the ability to create, branch, merge, difference and search large,
divergent collections of datasets, and (b) a platform, DataHub, that gives
users the ability to perform collaborative data analysis building on this
version control system. We outline the challenges in providing dataset version
control at scale.
Comment: 7 pages
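The version-control operations listed in (a) can be made concrete with a toy in-memory sketch. It is not DataHub's design or API: the class `DatasetRepo`, its method names, and the union-based merge are assumptions chosen for brevity, and every version is stored in full rather than as a compact delta.

```python
class DatasetRepo:
    """Toy version store: each version is an immutable set of rows."""

    def __init__(self, rows):
        self.versions = {0: frozenset(rows)}
        self.parents = {0: ()}
        self.branches = {"main": 0}
        self._next_id = 1

    def commit(self, branch, rows, extra_parents=()):
        """Record a new version on a branch and advance the branch head."""
        vid = self._next_id
        self._next_id += 1
        self.versions[vid] = frozenset(rows)
        self.parents[vid] = (self.branches[branch],) + tuple(extra_parents)
        self.branches[branch] = vid
        return vid

    def branch(self, new_name, from_branch):
        """Create a branch pointing at the head of an existing branch."""
        self.branches[new_name] = self.branches[from_branch]

    def diff(self, old, new):
        """Rows added and rows removed going from version old to version new."""
        a, b = self.versions[old], self.versions[new]
        return b - a, a - b

    def merge(self, into, other):
        """Naive merge: union of the rows at both branch heads."""
        other_head = self.branches[other]
        rows = self.versions[self.branches[into]] | self.versions[other_head]
        return self.commit(into, rows, extra_parents=(other_head,))

repo = DatasetRepo([("alice", 30), ("bob", 25)])
repo.branch("cleanup", "main")
v1 = repo.commit("cleanup", [("alice", 30), ("bob", 26)])
v2 = repo.commit("main", [("alice", 30), ("bob", 25), ("carol", 41)])
added, removed = repo.diff(0, v1)
merged = repo.merge("main", "cleanup")
```

The union merge keeps both versions of the edited `bob` row, which shows why a real system needs key-aware conflict handling on top of set operations.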
Improving Ontology Recommendation and Reuse in WebCORE by Collaborative Assessments
In this work, we present an extension of CORE [8], a tool for Collaborative Ontology Reuse and Evaluation. The system receives an informal description of a specific semantic domain and determines which ontologies from a repository are the most appropriate to describe the given domain. For this task, the environment is divided into three modules. The first component receives the problem description as a set of terms, and allows the user to refine and enlarge it using WordNet. The second module applies multiple automatic criteria to evaluate the ontologies of the repository, and determines which ones best fit the problem description. A ranked list of ontologies is returned for each criterion, and the lists are combined by means of rank fusion techniques. Finally, the third component uses manual user evaluations in order to incorporate a human, collaborative assessment of the ontologies. The new version of the system incorporates several novelties, such as its implementation as a web application; the incorporation of an NLP module to manage the problem definitions; modifications to the automatic ontology retrieval strategies; and a collaborative framework to find potentially relevant terms according to previous user queries. Finally, we present some early experiments on ontology retrieval and evaluation, showing the benefits of our system.
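Combining one ranked list per criterion into a single ranking can be sketched with a Borda count, one common rank fusion technique. The abstract does not say which fusion method WebCORE uses, so the choice of Borda count, the function `borda_fuse`, and the ontology names are assumptions for illustration.

```python
def borda_fuse(rankings):
    """Combine ranked lists: an item at position i of a list of length n
    earns n - i points; an item missing from a list earns nothing from it."""
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] = scores.get(item, 0) + (n - position)
    # Highest total first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))

# Hypothetical rankings of three ontologies under three evaluation criteria.
per_criterion = [
    ["onto_A", "onto_B", "onto_C"],
    ["onto_B", "onto_A", "onto_C"],
    ["onto_B", "onto_C", "onto_A"],
]
fused = borda_fuse(per_criterion)
```

Here `onto_B` collects 8 points, `onto_A` 6, and `onto_C` 4, so the fused ranking places `onto_B` first even though one criterion preferred `onto_A`.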
A Personalized System for Conversational Recommendations
Searching for and making decisions about information is becoming increasingly
difficult as the amount of information and number of choices increase.
Recommendation systems help users find items of interest of a particular type,
such as movies or restaurants, but are still somewhat awkward to use. Our
solution is to take advantage of the complementary strengths of personalized
recommendation systems and dialogue systems, creating personalized aides. We
present a system -- the Adaptive Place Advisor -- that treats item selection as
an interactive, conversational process, with the program inquiring about item
attributes and the user responding. Individual, long-term user preferences are
unobtrusively obtained in the course of normal recommendation dialogues and
used to direct future conversations with the same user. We present a novel user
model that influences both item search and the questions asked during a
conversation. We demonstrate the effectiveness of our system in significantly
reducing the time and number of interactions required to find a satisfactory
item, as compared to a control group of users interacting with a non-adaptive
version of the system.