13,693 research outputs found
Exploiting Cognitive Structure for Adaptive Learning
Adaptive learning, also known as adaptive teaching, relies on learning path
recommendation, which sequentially recommends personalized learning items
(e.g., lectures, exercises) to satisfy the unique needs of each learner.
Although it is well known that modeling the cognitive structure including
knowledge level of learners and knowledge structure (e.g., the prerequisite
relations) of learning items is important for learning path recommendation,
existing methods for adaptive learning often separately focus on either
knowledge levels of learners or knowledge structure of learning items. To fully
exploit the multifaceted cognitive structure for learning path recommendation,
we propose a Cognitive Structure Enhanced framework for Adaptive Learning,
named CSEAL. By viewing path recommendation as a Markov Decision Process and
applying an actor-critic algorithm, CSEAL can sequentially identify the right
learning items to different learners. Specifically, we first utilize a
recurrent neural network to trace the evolving knowledge levels of learners at
each learning step. Then, we design a navigation algorithm on the knowledge
structure to ensure the logicality of learning paths, which reduces the search
space in the decision process. Finally, the actor-critic algorithm is used to
determine what to learn next and whose parameters are dynamically updated along
the learning path. Extensive experiments on real-world data demonstrate the
effectiveness and robustness of CSEAL.Comment: Accepted by KDD 2019 Research Track. In Proceedings of the 25th ACM
SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD'19
MinoanER: Schema-Agnostic, Non-Iterative, Massively Parallel Resolution of Web Entities
Entity Resolution (ER) aims to identify different descriptions in various
Knowledge Bases (KBs) that refer to the same entity. ER is challenged by the
Variety, Volume and Veracity of entity descriptions published in the Web of
Data. To address them, we propose the MinoanER framework that simultaneously
fulfills full automation, support of highly heterogeneous entities, and massive
parallelization of the ER process. MinoanER leverages a token-based similarity
of entities to define a new metric that derives the similarity of neighboring
entities from the most important relations, as they are indicated only by
statistics. A composite blocking method is employed to capture different
sources of matching evidence from the content, neighbors, or names of entities.
The search space of candidate pairs for comparison is compactly abstracted by a
novel disjunctive blocking graph and processed by a non-iterative, massively
parallel matching algorithm that consists of four generic, schema-agnostic
matching rules that are quite robust with respect to their internal
configuration. We demonstrate that the effectiveness of MinoanER is comparable
to existing ER tools over real KBs exhibiting low Variety, but it outperforms
them significantly when matching KBs with high Variety.Comment: Presented at EDBT 2001
Counteracting Bloom Filter Encoding Techniques for Private Record Linkage
Record Linkage is a process of combining records representing same entity spread across multiple and different data sources, primarily for data analytics. Traditionally, this could be performed with comparing personal identifiers present in data (e.g., given name, surname, social security number etc.). However, sharing information across databases maintained by disparate organizations leads to exchange of personal information pertaining to an individual. In practice, various statutory regulations and policies prohibit the disclosure of such identifiers. Private record linkage (PRL) techniques have been implemented to execute record linkage without disclosing any information about other dissimilar records.
Various techniques have been proposed to implement PRL, including cryptographically secure multi-party computational protocols. However, these protocols have been debated over the scalability factors as they are computationally extensive by nature. Bloom filter encoding (BFE) for private record linkage has become a topic of recent interest in the medical informatics community due to their versatility and ability to match records approximately in a manner that is (ostensibly) privacy-preserving. It also has the advantage of computing matches directly in plaintext space making them much faster than their secure mutli-party computation counterparts. The trouble with BFEs lies in their security guarantees: by their very nature BFEs leak information to assist in the matching process. Despite this known shortcoming, BFEs continue to be studied in the context of new heuristically designed countermeasures to address known attacks.
A new class of set-intersection attack is proposed in this thesis which re-examines the security of BFEs by conducting experiments, demonstrating an inverse relationship between security and accuracy.
With real-world deployment of BFEs in the health information sector approaching, the results from this work will generate renewed discussion around the security of BFEs as well as motivate research into new, more efficient multi-party protocols for private approximate matching
Generative Filtering for Recursive Bayesian Inference with Streaming Data
In the streaming data setting, where data arrive continuously or in frequent
batches and there is no pre-determined amount of total data, Bayesian models
can employ recursive updates, incorporating each new batch of data into the
model parameters' posterior distribution. Filtering methods are currently used
to perform these updates efficiently, however, they suffer from eventual
degradation as the number of unique values within the filtered samples
decreases. We propose Generative Filtering, a method for efficiently performing
recursive Bayesian updates in the streaming setting. Generative Filtering
retains the speed of a filtering method while using parallel updates to avoid
degenerate distributions after repeated applications. We derive rates of
convergence for Generative Filtering and conditions for the use of sufficient
statistics instead of fully storing all past data. We investigate the
alleviation of filtering degradation through simulation and Ecological species
count data
- …