16,017 research outputs found
Dynamic Thresholding Mechanisms for IR-Based Filtering in Efficient Source Code Plagiarism Detection
To solve time inefficiency issue, only potential pairs are compared in
string-matching-based source code plagiarism detection; wherein potentiality is
defined through a fast-yet-order-insensitive similarity measurement (adapted
from Information Retrieval) and only pairs which similarity degrees are higher
or equal to a particular threshold is selected. Defining such threshold is not
a trivial task considering the threshold should lead to high efficiency
improvement and low effectiveness reduction (if it is unavoidable). This paper
proposes two thresholding mechanisms---namely range-based and pair-count-based
mechanism---that dynamically tune the threshold based on the distribution of
resulted similarity degrees. According to our evaluation, both mechanisms are
more practical to be used than manual threshold assignment since they are more
proportional to efficiency improvement and effectiveness reduction.Comment: The 2018 International Conference on Advanced Computer Science and
Information Systems (ICACSIS
The crustal dynamics intelligent user interface anthology
The National Space Science Data Center (NSSDC) has initiated an Intelligent Data Management (IDM) research effort which has, as one of its components, the development of an Intelligent User Interface (IUI). The intent of the IUI is to develop a friendly and intelligent user interface service based on expert systems and natural language processing technologies. The purpose of such a service is to support the large number of potential scientific and engineering users that have need of space and land-related research and technical data, but have little or no experience in query languages or understanding of the information content or architecture of the databases of interest. This document presents the design concepts, development approach and evaluation of the performance of a prototype IUI system for the Crustal Dynamics Project Database, which was developed using a microcomputer-based expert system tool (M. 1), the natural language query processor THEMIS, and the graphics software system GSS. The IUI design is based on a multiple view representation of a database from both the user and database perspective, with intelligent processes to translate between the views
Intent-Aware Contextual Recommendation System
Recommender systems take inputs from user history, use an internal ranking
algorithm to generate results and possibly optimize this ranking based on
feedback. However, often the recommender system is unaware of the actual intent
of the user and simply provides recommendations dynamically without properly
understanding the thought process of the user. An intelligent recommender
system is not only useful for the user but also for businesses which want to
learn the tendencies of their users. Finding out tendencies or intents of a
user is a difficult problem to solve.
Keeping this in mind, we sought out to create an intelligent system which
will keep track of the user's activity on a web-application as well as
determine the intent of the user in each session. We devised a way to encode
the user's activity through the sessions. Then, we have represented the
information seen by the user in a high dimensional format which is reduced to
lower dimensions using tensor factorization techniques. The aspect of intent
awareness (or scoring) is dealt with at this stage. Finally, combining the user
activity data with the contextual information gives the recommendation score.
The final recommendations are then ranked using filtering and collaborative
recommendation techniques to show the top-k recommendations to the user. A
provision for feedback is also envisioned in the current system which informs
the model to update the various weights in the recommender system. Our overall
model aims to combine both frequency-based and context-based recommendation
systems and quantify the intent of a user to provide better recommendations.
We ran experiments on real-world timestamped user activity data, in the
setting of recommending reports to the users of a business analytics tool and
the results are better than the baselines. We also tuned certain aspects of our
model to arrive at optimized results.Comment: Presented at the 5th International Workshop on Data Science and Big
Data Analytics (DSBDA), 17th IEEE International Conference on Data Mining
(ICDM) 2017; 8 pages; 4 figures; Due to the limitation "The abstract field
cannot be longer than 1,920 characters," the abstract appearing here is
slightly shorter than the one in the PDF fil
- …